feat: add 'local' network + accept standalone aggregator status='ok'#2

Closed
vrogojin wants to merge 3 commits into fix/aggregator-critical-fail-unreachable from feat/local-network-and-standalone-status

@vrogojin
Contributor

Summary

Companion work for the sphere-sdk hermetic e2e stack (`tests/e2e/local-infra/`), which boots a local aggregator in `BFT_ENABLED=false` standalone mode. Stacked on top of #1 (which adds the critical-check verdict rule); merge #1 first, then this PR.

Two changes:

  1. `NETWORKS.local` in `src/networks.mjs`: endpoints for the docker-compose stack. Aggregator `http://127.0.0.1:3001`, nostr `ws://127.0.0.1:7777`, ipfs `http://127.0.0.1:8082`. Fulcrum/Market are intentionally non-local (no local counterpart yet). Faucet is `null`: the local faucet is DM-driven, not HTTP, so the existing faucet probe doesn't apply.

  2. Aggregator `/health` body parser accepts both shapes:

    • BFT mode (testnet/mainnet): `{"status":"healthy","database":"ok","aggregators":{...}}`
    • Standalone mode (the new local stack): `{"status":"ok","role":"standalone","details":{"database":"connected",...}}`

    Previously the standalone shape was reported as `degraded` even when the aggregator was fully functional. Now both are accepted; the verdict drops to `degraded` only on genuine unhealthy state.
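
As a rough illustration of the first change, the new entry might look like the sketch below. The key names (`aggregator`, `nostr`, `ipfs`, `faucet`) and the `probeTargets` helper are assumptions made for this example and are not copied from `src/networks.mjs`; only the endpoint values and the `faucet: null` convention come from the description above.

```javascript
// Hypothetical shape of the new entry; key names are assumed for illustration.
const NETWORKS = {
  local: {
    aggregator: 'http://127.0.0.1:3001',
    nostr: 'ws://127.0.0.1:7777',
    ipfs: 'http://127.0.0.1:8082',
    // Fulcrum and Market keep their non-local endpoints (values omitted here).
    faucet: null, // the local faucet is DM-driven, so there is nothing to probe over HTTP
  },
};

// Hypothetical helper: list the services to probe, skipping any configured as null.
function probeTargets(network) {
  return Object.entries(network)
    .filter(([, endpoint]) => endpoint !== null)
    .map(([service]) => service);
}

console.log(probeTargets(NETWORKS.local)); // [ 'aggregator', 'nostr', 'ipfs' ]
```

Treating `null` as "skip cleanly" rather than as an error keeps an optional service from dragging the overall verdict down.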

Verified against the live local stack

Before:
```
⚠ aggregator http://127.0.0.1:3001 - DEGRADED
✗ health unhealthy: {"status":"ok","role":"standalone",...}
```

After:
```
✅ aggregator http://127.0.0.1:3001 - HEALTHY (4/4 checks passed)
✓ health 15ms ok role=standalone (db connected, 15ms)
✓ json-rpc 3ms OK — result={"blockNumber":"17"}
✓ submit_commitment 26ms accepted (status=SUCCESS, 26ms)
✓ get_inclusion_proof 2ms proof returned in 2ms
```

Tests

30/30 pass. Two new tests pin the new shape:

  • `NETWORKS.local` has loopback endpoints + `faucet === null`
  • `NETWORKS` keys updated to `['dev','local','mainnet','testnet']`

Test plan

  • `npm test` — 30/30 pass
  • Live probe against the local stack: `unicity-infra-probe --network local --only aggregator` → HEALTHY + exit 0
  • Live probe against testnet still works (no regression to the BFT-mode parser path)

Bumps to 0.4.2.

Companion work for the sphere-sdk hermetic e2e stack
(tests/e2e/local-infra/) which boots a local aggregator in
BFT_ENABLED=false standalone mode. Two changes layered on top of #1's
critical-check verdict fix:

1. NETWORKS.local — endpoints for the docker-compose stack:
   - aggregator http://127.0.0.1:3001
   - nostr     ws://127.0.0.1:7777
   - ipfs      http://127.0.0.1:8082
   Fulcrum/Market intentionally non-local (no local counterpart yet).
   Faucet null — the local faucet is DM-driven, not HTTP, so the
   existing faucet probe doesn't apply.

2. aggregator /health body parser accepts both shapes:
   - BFT mode:        {"status":"healthy","database":"ok","aggregators":{...}}
   - Standalone mode: {"status":"ok","role":"standalone","details":{"database":"connected",...}}
   Previously the standalone shape was reported as `degraded` even
   when the aggregator was fully functional (submit_commitment +
   get_inclusion_proof both passing).

Verified against the live local stack:
  unicity-infra-probe --network local --only aggregator
  → HEALTHY (4/4 checks passed), exit 0.

Tests: 30/30 pass. Two new tests cover NETWORKS.local shape +
NETWORKS.local.faucet === null.

Bumps to 0.4.2.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a local network configuration for docker-compose environments and updates the aggregator health probe to support both BFT and standalone response formats. The reviewer suggested refactoring the aggregator health check logic to explicitly distinguish between these two modes to prevent potential false positives and improve code clarity.

Comment thread src/probes/aggregator.mjs
Comment on lines +62 to +74
    const isHealthyStr = body?.status === 'healthy' || body?.status === 'ok';
    const databaseOk =
      body?.database === 'ok' ||
      body?.details?.database === 'connected' ||
      // Both legacy fields absent → assume OK (some standalone builds
      // omit the database line entirely). Liveness has already been
      // confirmed by the HTTP 200; the functional check below will catch
      // any actual DB outage.
      (body?.database === undefined && body?.details?.database === undefined);
    const allShardsOk = body?.aggregators
      ? Object.values(body.aggregators).every((v) => v === 'ok')
      : true; // standalone has no shards
    const happy = isHealthyStr && databaseOk && allShardsOk;

high

The logic to determine if the aggregator is healthy is spread across several variables and has a fallback case that could lead to false positives. For instance, a minimal response like {"status":"ok"} would be incorrectly classified as healthy.

To improve clarity and correctness, I suggest refactoring this to explicitly check for the two distinct valid response shapes: one for BFT mode and one for standalone mode. This makes the code's intent clearer and avoids ambiguity. Using the role field, as described in the PR, would make the standalone check more specific.

    // Two health-body shapes are normal. We check for each shape explicitly.
    //   - BFT mode: { status: 'healthy', database: 'ok', aggregators: { … } }
    //   - Standalone mode: { status: 'ok', role: 'standalone', details: { database: 'connected' } }
    const isBftHealthy =
      body?.status === 'healthy' &&
      body?.database === 'ok' &&
      (body?.aggregators ? Object.values(body.aggregators).every((v) => v === 'ok') : true);

    const isStandaloneHealthy =
      body?.status === 'ok' &&
      body?.role === 'standalone' &&
      // `database` can be 'connected' or absent for some older standalone builds.
      (body?.details?.database === 'connected' || body?.details?.database === undefined);

    const happy = isBftHealthy || isStandaloneHealthy;
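
To make the difference concrete, the two rules can be wrapped in functions and run against sample bodies. This is only a sketch: the function names are made up for this comparison, and the shard key `shard0` is a placeholder because the real `aggregators` map is elided in the PR description.

```javascript
// Permissive rule, as in the diff under review.
function happyPermissive(body) {
  const isHealthyStr = body?.status === 'healthy' || body?.status === 'ok';
  const databaseOk =
    body?.database === 'ok' ||
    body?.details?.database === 'connected' ||
    (body?.database === undefined && body?.details?.database === undefined);
  const allShardsOk = body?.aggregators
    ? Object.values(body.aggregators).every((v) => v === 'ok')
    : true;
  return isHealthyStr && databaseOk && allShardsOk;
}

// Shape-specific rule, as suggested in this review comment.
function happyStrict(body) {
  const isBftHealthy =
    body?.status === 'healthy' &&
    body?.database === 'ok' &&
    (body?.aggregators ? Object.values(body.aggregators).every((v) => v === 'ok') : true);
  const isStandaloneHealthy =
    body?.status === 'ok' &&
    body?.role === 'standalone' &&
    (body?.details?.database === 'connected' || body?.details?.database === undefined);
  return isBftHealthy || isStandaloneHealthy;
}

// Sample bodies; `shard0` is a placeholder key, not a real shard name.
const samples = {
  bft: { status: 'healthy', database: 'ok', aggregators: { shard0: 'ok' } },
  standalone: { status: 'ok', role: 'standalone', details: { database: 'connected' } },
  minimal: { status: 'ok' },
};

for (const [name, body] of Object.entries(samples)) {
  console.log(`${name}: permissive=${happyPermissive(body)} strict=${happyStrict(body)}`);
}
// bft: permissive=true strict=true
// standalone: permissive=true strict=true
// minimal: permissive=true strict=false
```

Only the minimal body separates the two rules, which is the false positive described above.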

vrogojin and others added 2 commits May 23, 2026 11:59
…ical checks)

The agent-facing guide had drifted from the code: faucet probe added in
0.4.0, `local` docker-compose network in 0.4.2, and the aggregator
critical-check verdict rule (sphere-sdk #191 follow-up) were all
documented in commits but not in the contributors' single source of
truth. New agents were re-deriving these from git log — which is what
this file exists to prevent. Also adds the required Claude Code header,
a Common commands section, and pins the `faucet: null` clean-skip
pattern so the next "optional service" addition follows precedent
instead of inventing a new convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous faucet probe sent a deliberately-invalid nametag and
treated the faucet's "Nametag not found" rejection as proof-of-life.
That correctly exercised the HTTP/parse/resolve pipeline but couldn't
catch the failure mode that actually matters to downstream e2e suites:
the faucet accepts a real mint request, returns success:true, and yet
no token ever lands at the recipient.

A live testnet run also surfaced a confusing UX consequence — the
"Nametag not found" string next to a green-tick check reads as a
contradiction even when the verdict is correct, leading operators to
distrust the probe.

This commit replaces the rejection-handshake with the full mint round-
trip. The probe now spins up an ephemeral Sphere wallet, mints a
single-use nametag on the L3 aggregator, publishes the kind:30078
binding, requests 1 raw unit (1e-6) of USDU from the faucet, and waits
up to 10s for the corresponding kind:31113 token-transfer event to
arrive. The SDK handles NIP-04 decryption + Token deserialization. We
then compare the delivered token's coinId + amount against the faucet's
own HTTP-response declaration (amountInSmallestUnits) — independent
proof the mint actually landed.
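
A minimal sketch of the wait-then-compare step, under stated assumptions: `withTimeout` and `receiptMatches` are hypothetical helpers, the delivered token is modelled as a plain `{ coinId, amount }` object, and `'USDU'` stands in for whatever the real coin identifier is. None of this is the SDK's API; only `amountInSmallestUnits` and the 10s window are taken from the text above.

```javascript
// Hypothetical helper: reject if `promise` has not settled within `ms`.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Clear the timer either way so a finished probe does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Compare the faucet's HTTP declaration with the token that actually arrived.
// BigInt avoids precision loss if amounts exceed Number.MAX_SAFE_INTEGER.
function receiptMatches(declared, delivered) {
  return (
    declared.coinId === delivered.coinId &&
    BigInt(declared.amountInSmallestUnits) === BigInt(delivered.amount)
  );
}

// Usage with a stubbed arrival; a real probe would resolve this from the relay event.
const declared = { coinId: 'USDU', amountInSmallestUnits: '1' };
const arrival = new Promise((resolve) =>
  setTimeout(() => resolve({ coinId: 'USDU', amount: '1' }), 50),
);

withTimeout(arrival, 10_000, 'receipt').then((delivered) => {
  console.log(receiptMatches(declared, delivered) ? 'receipt ok' : 'receipt mismatch');
});
```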

The faucet has no probe-only mode and no direct-pubkey shortcut, so
verifying real delivery requires running as a one-shot wallet. The
trade-off taken to keep the implementation tractable was pulling in
@unicitylabs/sphere-sdk as a dependency, which violates three of the
project's "Hard rules" in CLAUDE.md (minimal-deps, no-SDK-coupling,
stateless-on-relay). All three rules are now explicitly scoped down
with "with one exception" carve-outs and a "The faucet exception"
section that records the rationale and what to revisit if the faucet
ever grows a probe-only mode.

BREAKING CHANGE: the faucet probe's check names changed
(request/health → wallet-setup/request/receipt) and all three are now
critical:true. JSON consumers that filter on the previous check names
will need to update. End-to-end wall-clock is now ~8–12s (up from
<500ms); the orchestration layer auto-bumps the faucet's timeout
ceiling to at least 30s.
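
For JSON consumers, a migration check along these lines may help. It is a sketch only: the `{ name, critical }` entry shape and both helper names are assumptions, since the probe's JSON schema is not shown here. The check names and the 30s floor come from the paragraph above.

```javascript
// Check names after this change, per the BREAKING CHANGE note.
const FAUCET_CHECKS = ['wallet-setup', 'request', 'receipt'];
const FAUCET_MIN_TIMEOUT_MS = 30_000;

// Hypothetical: raise a requested timeout to the faucet's floor, never lower it.
function faucetTimeout(requestedMs) {
  return Math.max(requestedMs, FAUCET_MIN_TIMEOUT_MS);
}

// Hypothetical: report which of the new checks are absent from a result list,
// so a consumer still filtering on the old names fails loudly.
function missingFaucetChecks(checks) {
  const seen = new Set(checks.map((c) => c.name));
  return FAUCET_CHECKS.filter((name) => !seen.has(name));
}

console.log(faucetTimeout(5_000)); // 30000
console.log(missingFaucetChecks([{ name: 'request', critical: true }])); // [ 'wallet-setup', 'receipt' ]
```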

The probe now leaves a kind:30078 event on the Nostr relay + a nametag
NFT on the L3 aggregator + consumes 1 USDU raw unit (≈ economically
zero) per run. Documented in CLAUDE.md "Stateless on the
relay/gateway side, with one exception".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vrogojin vrogojin closed this May 24, 2026
@vrogojin vrogojin deleted the feat/local-network-and-standalone-status branch May 24, 2026 13:59