feat: sync improvements, verified commitment trees, block sync fixes #143
p0mvn wants to merge 16 commits into
…oint verifier (#124)
* perf(consensus): precompute auth data root concurrently in the checkpoint verifier
The ZIP-244 authorizing-data commitment (Block::auth_data_root, the Merkle root over the per-transaction auth digests) is one of the two dominant serial costs of the finalized committer on heavy shielded blocks. Unlike the note-commitment tree update it depends only on the block's own transactions, not chain state, so it can be computed ahead of the committer.
Computing it inline in `check_block` does NOT help: the checkpoint verifier is wrapped in a tower `Buffer` (single worker), and `check_block` runs on that serialized path, so the work just moves to another single-threaded stage.
Instead, compute it in the per-block task the verifier already `tokio::spawn`s to commit each verified block. That task runs off the buffer worker, one per block, so many blocks' auth digests are computed concurrently (via `spawn_blocking`), overlapping with and ahead of the single-threaded committer. The committer uses the precomputed value (carried on `SemanticallyVerifiedBlock::auth_data_root`), falling back to computing it when absent.
Only Nu5-onward blocks bind the auth data in their block commitment.
Consensus-neutral: the value is byte-identical to recomputing it at commit time; an end-to-end differential mainnet sync is the proof, since a wrong auth data root fails the commitment check and rejects the block.
* spawn auth data root pre-compute after verifying checkpoint
* simplify comment
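The "use the precomputed value, fall back to computing it when absent" pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Zebra's code: `VerifiedBlock`, `compute_auth_data_root`, `precompute_all`, and `root_at_commit` are hypothetical names, the digest is a toy FNV-style hash rather than the ZIP-244 BLAKE2b tree, and `std::thread::spawn` stands in for the verifier's `tokio::spawn` plus `spawn_blocking`.

```rust
use std::thread;

/// Hypothetical stand-in for a verified block; only the field name
/// `auth_data_root` comes from the commit message.
struct VerifiedBlock {
    transactions: Vec<Vec<u8>>,
    auth_data_root: Option<u64>,
}

/// Toy digest (NOT ZIP-244): it depends only on the block's own
/// transactions, never on chain state.
fn compute_auth_data_root(transactions: &[Vec<u8>]) -> u64 {
    transactions
        .iter()
        .flatten()
        .fold(0xcbf2_9ce4_8422_2325u64, |acc, b| {
            (acc ^ u64::from(*b)).wrapping_mul(0x0000_0100_0000_01b3)
        })
}

/// Off the serialized path: one worker per block fills in the root.
fn precompute_all(blocks: Vec<VerifiedBlock>) -> Vec<VerifiedBlock> {
    let handles: Vec<_> = blocks
        .into_iter()
        .map(|mut block| {
            thread::spawn(move || {
                block.auth_data_root = Some(compute_auth_data_root(&block.transactions));
                block
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}

/// Committer side: use the carried value, or compute it when absent.
fn root_at_commit(block: &VerifiedBlock) -> u64 {
    block
        .auth_data_root
        .unwrap_or_else(|| compute_auth_data_root(&block.transactions))
}
```

Because the fallback recomputes the same function over the same input, `root_at_commit` returns the same value whether or not the precompute ran, which is the property the commit message calls consensus-neutral.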
Computing a v5+ transaction's txid (`Transaction::hash`) and its ZIP-244 authorizing-data digest (`auth_digest`) each independently convert the whole transaction to its librustzcash representation (re-serialize + re-parse), which dominates the per-transaction cost on heavy shielded blocks. The checkpoint commit path paid this twice: once building the transaction hashes in `CheckpointVerifiedBlock::new`, and again computing the auth data root. Add `Transaction::txid_and_auth_digest`, which performs one conversion and returns both. `SemanticallyVerifiedBlock::with_hash` now computes the transaction hashes and the auth data root together from that single shared conversion (the auth digest is nearly free once the txid is computed), so the auth data root is carried on the block and the separate per-block conversion in the checkpoint verifier's commit task is removed. Byte-identical to the separate computations (differential proptest `txid_and_auth_digest_matches_separate`); an end-to-end mainnet sync is the consensus proof.
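The saving from sharing one conversion can be illustrated with a toy model that counts conversions. Everything except the method names `hash`, `auth_digest`, and `txid_and_auth_digest` is an assumption made for the sketch: the `Tx` and `Converted` types, the split of the bytes into two halves, and the FNV-style `toy_digest` are hypothetical and unrelated to the real librustzcash representation.

```rust
use std::cell::Cell;

/// Hypothetical transaction wrapper that counts expensive conversions.
struct Tx {
    bytes: Vec<u8>,
    conversions: Cell<u32>,
}

/// Stand-in for the result of the expensive re-serialize + re-parse step.
struct Converted {
    effecting: Vec<u8>,
    authorizing: Vec<u8>,
}

/// Toy digest (NOT BLAKE2b).
fn toy_digest(data: &[u8]) -> u64 {
    data.iter().fold(0xcbf2_9ce4_8422_2325u64, |acc, b| {
        (acc ^ u64::from(*b)).wrapping_mul(0x0000_0100_0000_01b3)
    })
}

impl Tx {
    /// The costly step that both digests depend on.
    fn convert(&self) -> Converted {
        self.conversions.set(self.conversions.get() + 1);
        let mid = self.bytes.len() / 2;
        Converted {
            effecting: self.bytes[..mid].to_vec(),
            authorizing: self.bytes[mid..].to_vec(),
        }
    }

    /// Separate computations: one conversion each.
    fn hash(&self) -> u64 {
        toy_digest(&self.convert().effecting)
    }

    fn auth_digest(&self) -> u64 {
        toy_digest(&self.convert().authorizing)
    }

    /// Combined computation: one conversion, both digests.
    fn txid_and_auth_digest(&self) -> (u64, u64) {
        let converted = self.convert();
        (
            toy_digest(&converted.effecting),
            toy_digest(&converted.authorizing),
        )
    }
}
```

Calling `hash` and `auth_digest` separately bumps the counter twice, while `txid_and_auth_digest` bumps it once and returns the same pair, which mirrors the differential check the commit message describes.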
…ck writer (#128) * perf(state): serialize raw transactions in parallel when writing blocks * perf(state): compute block size in parallel + run block-write batch prep in dedicated pool * comment
…reshold (#138)
The checkpoint committer serializes each block's raw transactions (block.rs) and sums the per-transaction sizes (chain.rs) on the rayon pool. That fan-out is a clear win for the large blocks in the heavy shielded region, but for the small blocks of the early chain the rayon fork-join cost (waking workers, distributing the items, joining) outweighs the work itself.
Gate both parallel paths on PARALLEL_BLOCK_TX_THRESHOLD (16 transactions): blocks at or above it keep the parallel path, smaller blocks run sequentially. The output is byte-identical either way, so this is purely a scheduling change.
Measured with two fresh-from-genesis mainnet syncs of the same binary, gate toggled, over a matched height window (per-block, committer-thread metrics that are independent of peer/download luck):
batch_prep          1.45ms -> 1.31ms  (-10%)
write_block_total   6.38ms -> 6.08ms  ( -5%)
Stable across sub-windows (batch_prep -8% to -13%). The heavy shielded region is unaffected: those blocks have >= 16 transactions and keep the parallel path.
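A threshold gate of this kind can be sketched as below. Only the constant name and its value (16) come from the commit message; the function names, the fixed worker count, and the use of `std::thread::scope` in place of the rayon pool are assumptions made to keep the example dependency-free.

```rust
use std::thread;

/// Name and value taken from the commit message.
const PARALLEL_BLOCK_TX_THRESHOLD: usize = 16;

/// Sequential path: sum the serialized size of each transaction.
fn sequential_total(txs: &[Vec<u8>]) -> usize {
    txs.iter().map(|tx| tx.len()).sum()
}

/// Parallel path: split into chunks and sum each chunk on its own thread.
/// (Scoped threads stand in for the rayon fan-out.)
fn parallel_total(txs: &[Vec<u8>]) -> usize {
    const WORKERS: usize = 4; // hypothetical worker count
    let chunk_len = ((txs.len() + WORKERS - 1) / WORKERS).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = txs
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || sequential_total(chunk)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<usize>()
    })
}

/// The gate: blocks at or above the threshold fan out, smaller ones do not.
fn block_size(txs: &[Vec<u8>]) -> usize {
    if txs.len() >= PARALLEL_BLOCK_TX_THRESHOLD {
        parallel_total(txs)
    } else {
        sequential_total(txs)
    }
}
```

Both branches compute the same sum, so the gate only changes where the work runs, never the result.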
In the checkpoint range, per-transaction CPU is dominated by computing the v5 txid and ZIP-244 authorizing-data digest. Both went through `Transaction::to_librustzcash`, which serializes the whole transaction and reparses it — decompressing every Jubjub/Pallas curve point — purely so librustzcash can re-serialize those same bytes into the BLAKE2b digest tree. A `perf` flamegraph of the heavy shielded region (mainnet 1.72M–1.73M) attributes ~44% of all CPU to these reparses (leaves are `bls12_381::Scalar::square` / `sqrt_tonelli_shanks` from point decompression); the BLAKE2b hashing itself is <1%. The decompressed points are never needed in the checkpoint range (no proof/signature verification). Compute the txid and auth digest directly from Zebra's already-parsed `Transaction` fields, feeding their canonical bytes straight into the same BLAKE2b tree (`transaction::zip244`). This removes the reparse entirely for the digest path. v6 transactions (unstable `tx_v6`) still use librustzcash. This is consensus-critical and byte-identical to librustzcash: proven by a differential property test (`native_zip244_matches_librustzcash`) over thousands of random v5 transactions, the existing ZIP-244 known-answer vectors, and a clean differential mainnet checkpoint sync.
#133) V5 transaction deserialization re-ran the full Transaction::to_librustzcash conversion and discarded the result, purely to reject transactions that Zebra can parse but librustzcash cannot. That conversion decompresses every Jubjub and Pallas curve point. A flamegraph of the heavy shielded region attributes about 25 to 30 percent of checkpoint-sync CPU to this single discarded reparse, and after the native ZIP-244 digest change it is the largest remaining cost. The check is redundant for rejecting untrusted transactions. Every transaction from a peer, the mempool, or sendrawtransaction is converted via CachedFfiTransaction::new before the semantic verifier accepts it, so a non-convertible v5 transaction is still rejected there with a clean error, including fully shielded transactions whose bundles are derived from that same conversion. Blocks below the checkpoints are trusted by their hash and validated against the header merkle root built from the native transaction IDs, and the checkpoint commit path no longer calls to_librustzcash for v5. Zebra's own deserializer still rejects the non-canonical encodings it validates (for example an identity-point Orchard rk), so only the librustzcash-specific re-validation moves from parse time to verification time. The pre-NU5 consensus branch id rejection added by the same upstream change is kept, since it is independent and cheap.
…tic path (#136) * proto(chain): defer Sapling value-commitment point decompression PROTOTYPE for benchmarking lever #1. After the native-digest and dropped-reparse changes, a flamegraph of the checkpoint heavy region attributes about 60% of CPU to Sapling Jubjub point decompression (a field square root in jubjub::AffinePoint::from_bytes), almost entirely the value commitment cv on every spend and output. Checkpoint sync never uses cv as a point: it verifies no signatures or proofs, and the note-commitment tree uses cm_u, not cv. Store cv as its canonical 32-byte encoding and decompress lazily, only when a consumer needs the point. Deserialization just copies the bytes, serialization and the txid digest use them directly, and the binding-signature verification in the semantic verifier decompresses on demand via ValueCommitment::commitment. This mirrors what Orchard already does for rk, which is why Orchard decompression is negligible in the profile. Prototype caveat: ValueCommitment::commitment panics on a non-canonical encoding rather than returning an error, and the not-small-order check now happens at the point of use instead of at parse. Correct for checkpoint sync (block hashes are trusted) and exercised by the unit tests, but the production version must make the accessor fallible so the semantic and mempool paths reject a malformed point cleanly instead of panicking. * proto(chain): also defer Sapling ephemeral_key point decompression Extends the lazy value-commitment prototype. With cv deferred, the profile showed the remaining ~50% of heavy-region CPU is the other per-output Jubjub point, the ephemeral_key, decompressed at parse. The validator only needs its bytes (txid digest and serialization); the point is needed only for wallet trial-decryption. Store ephemeral_key as its canonical 32-byte encoding and skip decompression at deserialization, like cv. 
Same prototype caveat: the not-small-order consensus check is deferred and must be re-added on the semantic and mempool paths in a production version. * proto(chain): validate lazy Sapling cv/epk consensus safety The deferred not-small-order checks for cv and ephemeral_key are not actually missing on the consensus path: librustzcash enforces them for every untrusted transaction, which all go through to_librustzcash (CachedFfiTransaction::new) on the semantic and mempool paths. cv is rejected at read (zcash_primitives read_value_commitment uses from_bytes_not_small_order); epk is rejected at verify (sapling-crypto verifier check_output uses epk.is_small_order). The checkpoint verifier trusts block hashes and does not need them. Add a regression test that constructs a v5 transaction with a small-order cv and epk and asserts both the deferral (Zebra now deserializes it) and the safety net (to_librustzcash rejects it), plus that the exact library detection functions flag the point. Correct the type docs accordingly. * perf(chain): enforce deferred Sapling cv/epk check on the semantic path Hardens the lazy Sapling cv/ephemeral_key prototype into a safer design. The lazy types keep point decompression off the checkpoint-sync hot path (the measured ~2.5x win), but the not-small-order consensus check is now re-enforced explicitly by Zebra on the untrusted boundary instead of relying solely on librustzcash. Add `Transaction::sapling_point_encodings_are_valid` (and the underlying `ShieldedData::point_encodings_are_valid`, `ValueCommitment::is_valid_not_small_order`, `EphemeralPublicKey::is_valid_not_small_order`), and call it from `verify_v4_transaction` / `verify_v5_transaction`, returning `TransactionError::SmallOrder` for a small-order or off-curve cv or epk. This runs on the semantic verification path and the mempool, which process untrusted transactions; the checkpoint verifier never calls it (it trusts block hashes), so the checkpoint throughput is unchanged. 
This restores a Zebra-side, auditable enforcement of the rule and makes the epk check isolatedly testable (it runs independently of proof verification). Spend rk is still validated at deserialization. Validated by `sapling_point_encodings_check_rejects_bad_points` and the existing lazy-cv/epk tests. * fix(consensus): run the deferred Sapling cv/epk check before to_librustzcash Adversarial review of the lazy Sapling change found one non-consensus issue: a small-order or off-curve cv failed inside CachedFfiTransaction::new (mapped to UnsupportedByNetworkUpgrade, mempool misbehavior score 0) before the explicit SmallOrder check ran, so a peer spamming bad-cv transactions received a lighter penalty than before the change (when it was a deserialization error). Move the sapling_point_encodings_are_valid check into the verifier's early quick checks, before the state lookups and the librustzcash conversion. Now a bad cv or epk fails fast with TransactionError::SmallOrder (score 100), restoring the peer penalty and making the check the primary, version-agnostic enforcer for v4, v5, and v6. Remove the now-redundant per-version copies. No consensus behavior change: the same transactions are accepted and rejected. The review confirmed no path commits or relays a transaction with a bad point without this check or checkpoint hash-trust, the commitment() panic is not reachable in release (no non-test caller), and there is no DoS amplification. * refactor(chain): make ValueCommitment::commitment fallible Removes the latent panic in `ValueCommitment::commitment`, which is the only caller-facing point that could decompress a deferred (unvalidated) value commitment. It now returns `Option`, so a future caller must handle an invalid encoding instead of getting a hidden panic, eliminating a possible DoS if the helper were ever moved onto a production path. `ShieldedData::binding_verification_key` (its only caller, used in tests) now propagates the `Option`. 
No production code calls either; the consensus encoding check happens on the semantic path via `sapling_point_encodings_are_valid`. * test(consensus): end-to-end reject of a Sapling output with an invalid epk Adds the missing end-to-end test for the deferred Sapling cv/epk check: it takes a real Sapling-output transaction, corrupts the first output's ephemeral key to an off-curve point, and runs it through the full transaction Verifier, asserting TransactionError::SmallOrder. The state service is unreachable!, proving the check fires in the early quick checks before any state lookup, and that the rejection is the explicit SmallOrder error rather than a later proof failure. This closes the last gap from the security review: the epk rejection is now confirmed by execution through the live verifier, not only by the isolated check and the librustzcash backstop. * consensus equivalence tests
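The shape of the final design (store the encoding, decode lazily through a fallible accessor, and run an explicit early check on the untrusted path) can be sketched like this. The names `commitment`, `is_valid_not_small_order`, and `TransactionError::SmallOrder` come from the commit message; `LazyValueCommitment`, `Point`, `quick_check`, and above all the validity rule are hypothetical. The real check decompresses a Jubjub point and tests its order; the toy rule below only models "some encodings are rejected".

```rust
/// Error name from the commit message; the enum itself is a sketch.
#[derive(Debug, PartialEq)]
enum TransactionError {
    SmallOrder,
}

/// Toy stand-in for a decompressed Jubjub point.
#[derive(Debug, PartialEq)]
struct Point(u64);

/// Stores only the canonical 32-byte encoding; parsing just copies bytes.
#[derive(Clone, Copy)]
struct LazyValueCommitment([u8; 32]);

impl LazyValueCommitment {
    /// Fallible accessor: decode on demand, `None` for a bad encoding.
    ///
    /// Toy rule (assumption): all-zero bytes model a small-order point and
    /// a set top bit models an off-curve encoding.
    fn commitment(&self) -> Option<Point> {
        let bytes = &self.0;
        if bytes.iter().all(|b| *b == 0) || (bytes[31] & 0x80) != 0 {
            return None;
        }
        let mut first = [0u8; 8];
        first.copy_from_slice(&bytes[..8]);
        Some(Point(u64::from_le_bytes(first)))
    }

    fn is_valid_not_small_order(&self) -> bool {
        self.commitment().is_some()
    }
}

/// Early quick check on the untrusted path, before any expensive work.
fn quick_check(cvs: &[LazyValueCommitment]) -> Result<(), TransactionError> {
    if cvs.iter().all(|cv| cv.is_valid_not_small_order()) {
        Ok(())
    } else {
        Err(TransactionError::SmallOrder)
    }
}
```

A checkpoint-style caller would never call `quick_check` or `commitment`, so it pays only for the byte copy; an untrusted-path caller gets an explicit `SmallOrder` error instead of a panic.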
…s reads (#140) * Update zebra-state/src/request.rs Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com> * Update zebra-state/src/request.rs Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com> * perf(state): parallelize and de-duplicate the committer's UTXO/address reads Before building the write batch, the checkpoint committer reads every transparent input's UTXO and every changed address's balance from RocksDB, one `zs_get` at a time on the writer thread. In the transparent-heavy ranges (~100-330K) these cache-served but serial point lookups dominate the per-block write time while the other cores sit idle (CPU ~2/8). The spent-UTXO path also re-derives each input's transaction location twice: once directly and once inside `utxo()`. Two changes in `write_block`: - Read the output location once and reuse it via `utxo_by_location` instead of letting `utxo()` look it up again (3 reads/input -> 2). - Fan the spent-UTXO and address-balance reads across the rayon pool (the writer already runs inside COMMIT_COMPUTE_POOL) once a block has enough inputs/addresses to amortize the fork-join cost, gated by PARALLEL_BLOCK_READ_THRESHOLD (16). The reads are read-only and land in order-independent maps, so the committed batch is byte-identical to the sequential path. Measured over a full mainnet genesis sync, comparing the same binary with and without this change, per-100K committer-thread metrics (peer-independent): range prep_reads write_block_total 100k 7.57 -> 2.64 ms 15.71 -> 10.38 ms 200k 8.94 -> 3.75 ms 19.01 -> 14.30 ms 300k 10.89 -> 3.52 ms 20.32 -> 13.07 ms 400k 2.33 -> 1.05 ms 4.84 -> 3.05 ms prep_reads drops 55-68% and write_block_total 25-37% across the transparent band, moving the bottleneck there onto rocksdb commit. No effect on low-input blocks (gated to sequential) or the heavy shielded region (few transparent inputs). 
* clean up and tests * comment * clean up comment * fix(state): remove duplicate finalized block import --------- Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
#144)
* perf(state): precompute note-commitment tree hashing off the committer [prototype]
Move the dominant per-block committer cost (the Sapling/Orchard note-commitment tree update) off the single serial committer thread. `parallel_append` is split into `precompute_subtree_roots` (the per-leaf Merkle hashing, position-independent: it needs only the starting note count, not the frontier's hashes) and `graft` (the cheap O(log N) merge on the committer). The finalized write loop runs a 1-block look-ahead: before committing block N it spawns block N+1's hashing on the commit-compute pool, so the hashing overlaps N's commit on otherwise-idle cores; the committer then only grafts. A size-match guard makes a stale precompute fall back to inline hashing, so this can only affect speed, never correctness.
Byte-identical to the inline append (differential proptests over the split and the tracked-subtree boundary). `NOTE_PRECOMPUTE_DISABLE` env var forces the inline path for single-binary A/B benchmarking.
A/B over the sandblast region (1.71M-1.735M): committer update_trees -54% (12.5 -> 5.7 ms/block). Throughput was flat there because that window is feed-bound (downloads buffered, CPU ~3.5/8), not committer-bound; the win applies where the committer is the gate. Prototype pending the feed instrumentation.
* tests, clean up, changelog
* fix(chain): use checked arithmetic for precompute capacity check
The precompute batch path takes a caller-supplied start_size, so start_size + nodes.len() could wrap past the MAX_LEAVES capacity check (and panic on overflow in debug builds) for values near u64::MAX, building an inconsistent precompute that could later panic in graft. Use checked_add and reject over-capacity sizes with a clean MaxDepthExceeded error. Adds a regression test.
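The overflow fix amounts to replacing a plain `+` with `checked_add` before the capacity comparison. A minimal sketch, assuming a depth-32 tree for the value of `MAX_LEAVES` and using a hypothetical `check_capacity` helper (the error variant name is the one given in the commit message):

```rust
/// Error variant name from the commit message; the enum is a sketch.
#[derive(Debug, PartialEq)]
enum BatchFrontierError {
    MaxDepthExceeded,
}

/// Assumed capacity of a depth-32 note-commitment tree.
const MAX_LEAVES: u64 = 1 << 32;

/// Returns the tree size after appending `batch_len` leaves, or an error
/// if the sum overflows `u64` or exceeds the tree capacity.
fn check_capacity(start_size: u64, batch_len: usize) -> Result<u64, BatchFrontierError> {
    // `usize` is at most 64 bits on supported targets, so this is lossless.
    let len = batch_len as u64;
    start_size
        .checked_add(len) // None on wrap-around instead of a debug panic
        .filter(|end| *end <= MAX_LEAVES)
        .ok_or(BatchFrontierError::MaxDepthExceeded)
}
```

With the unchecked form, `u64::MAX + 1` would wrap to `0` in release builds and pass the capacity comparison; here it is rejected the same way as any other over-capacity size.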
* fix(chain): return recoverable errors from precompute helpers instead of panicking
The precompute and graft helpers enforced caller-controlled preconditions with panicking assertions: precompute_subtree_roots / precompute_append_batch_with_subtree on an empty batch, and graft on a frontier size that did not match the precompute's start position. Reachable via the public BlockNotePrecompute path, these turned invalid input into a process panic. Replace the assertions with recoverable BatchFrontierError variants (EmptyBatch, PrecomputeStartMismatch) and map them through a new NoteCommitmentTreeError::InvalidPrecompute in the Sapling/Orchard wrappers. The in-node path is unaffected (it guards empty note sets and size-matches before applying). Adds tests.
* perf(chain): run Sapling and Orchard precompute concurrently
BlockNotePrecompute::compute hashed the two pools sequentially. Although each pool's append is internally parallel, the two no longer overlapped the way update_trees_parallel's per-pool spawn_fifo tasks did. Restore the cross-pool overlap with rayon::join.
* perf(chain): gate note-commitment precompute parallelism on batch size
Below PARALLEL_HASH_THRESHOLD (16) note commitments, the per-leaf Merkle hashing now runs entirely serially: benchmarks show that for small batches the rayon join/par_iter overhead matches or exceeds the hashing it parallelizes (crossover ~16 for both Sapling Pedersen and Orchard Sinsemilla), and most blocks outside the sandblast region are small. The gate is on the whole-batch decision only; above the threshold each chunk still splits down to the leaves, so medium batches keep their internal parallelism. BlockNotePrecompute::compute likewise only spawns the cross-pool rayon::join when a pool is large enough to repay it. Adds the precompute_threshold benchmark (and bench-only precompute_then_graft_root shims) used to find the crossover. Correctness is unchanged and covered by the existing differential proptests.
* fix(state): make the look-ahead note precompute cancellable The finalized write loop starts the next block's note-commitment precompute before the current block has committed, to overlap the hashing with the commit. A current block that fails to commit (e.g. a checkpoint-range block whose authorizing-data commitment is only rejected at finalized-state commit) leaves that speculative work unwanted, and the spawned task previously had no cancellation path: it hashed the discarded child in full before noticing the receiver was dropped. Thread an Arc<AtomicBool> cancellation flag through spawn_note_precompute into BlockNotePrecompute::compute. The two pools are now hashed sequentially (each still internally parallel) so the flag is checked between them; the writer trips it whenever it drops a pending precompute (commit failure, parent-failure skip, height mismatch, or hash mismatch), bounding the wasted work for a discarded child to at most one pool. Correctness is unaffected (the committer still size-checks before applying). Adds a cancellation test. Also normalizes the prior 'graft' terminology to 'apply_precompute'. * perf(chain): keep the cross-pool join in the cancellable precompute Restore the rayon::join (and small-block sequential gating) for the two pools in BlockNotePrecompute::compute, which was dropped when compute was made cancellable. Cancellation is now done by checking the flag up front and at the start of each pool's hashing rather than strictly between the pools, so the cross-pool overlap is preserved while a cancel that lands before a pool starts still skips its work. * fix(chain): bind note precompute to its block, not just the tree size A BlockNotePrecompute was selected solely by start_size == tree.count(), and in that branch the block's own note-commitment arguments were ignored in favor of the precompute's leaves. 
A precompute accidentally paired with a different block of the same starting tree size would therefore be grafted, silently producing a wrong note-commitment root. The node avoided this by pairing each precompute with the exact block hash in the write loop, but that invariant lived outside zebra-chain's API. Record the block hash in BlockNotePrecompute::compute and have update_trees_parallel_with apply the precompute only when its block_hash matches the block being committed; a mismatch falls back to inline hashing (correct, just slower). Adds a test that a precompute for a different block at the same starting size is rejected.
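The binding rule (apply a precompute only when both its block hash and its starting size match, otherwise hash inline) can be sketched as follows. `BlockNotePrecompute` and its `compute` constructor are named in the commit message, but the fields, the `Tree` type, `update_tree_with`, and the toy `digest_leaves` / `graft` arithmetic are assumptions for illustration; the real code hashes a Merkle frontier.

```rust
type BlockHash = [u8; 32];

/// Sketch of a precompute that records which block it was built for.
struct BlockNotePrecompute {
    block_hash: BlockHash,
    start_size: u64,
    leaves_digest: u64,
}

/// Toy stand-in for the expensive per-leaf hashing.
fn digest_leaves(leaves: &[u64]) -> u64 {
    leaves
        .iter()
        .fold(17u64, |acc, leaf| acc.wrapping_mul(31).wrapping_add(*leaf))
}

/// Toy stand-in for the cheap merge on the committer.
fn graft(root: u64, leaves_digest: u64) -> u64 {
    root.rotate_left(7) ^ leaves_digest
}

impl BlockNotePrecompute {
    fn compute(block_hash: BlockHash, start_size: u64, leaves: &[u64]) -> Self {
        Self { block_hash, start_size, leaves_digest: digest_leaves(leaves) }
    }
}

struct Tree {
    count: u64,
    root: u64,
}

/// Appends `leaves` for `block_hash`. Returns `true` if the precompute was
/// used, `false` if the inline fallback ran.
fn update_tree_with(
    tree: &mut Tree,
    block_hash: BlockHash,
    leaves: &[u64],
    precompute: Option<&BlockNotePrecompute>,
) -> bool {
    // Both the block and the starting size must match.
    let usable =
        precompute.filter(|p| p.block_hash == block_hash && p.start_size == tree.count);
    let digest = match usable {
        Some(p) => p.leaves_digest,
        None => digest_leaves(leaves), // correct, just slower
    };
    tree.root = graft(tree.root, digest);
    tree.count += leaves.len() as u64;
    usable.is_some()
}
```

A precompute built for a different block at the same starting size is ignored, so the resulting root always depends on the committed block's own leaves.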
…type] (#151)
When a required (head-of-line) block registry-misses, re-dispatch its backoff retry as a fan-out to several random ready peers, ignoring inventory markers, and take the first peer that delivers it. This bypasses stale 'missing' markers (pool.route_inv.notfound.all_missing), the measured cause of ordered-commit stalls, where ready peers actually have the block.
- zebra-network: new Request::HedgedBlocksByHash routing directive + route_hedge (reuses select_random_ready_peers; rewrites to per-peer BlocksByHash); falls back to the same NotFoundRegistry as route_inv so sync retry/backoff is unchanged.
- zebrad sync: download_and_verify_hedged + hol_hedge_fanout, env-gated via SYNC_HOL_HEDGE_FANOUT (default 0 = off). Only the registry-miss retry hedges; the 2s backoff and #105 gating are unchanged.
- Test: peer_set_route_hedge_bypasses_missing_markers.
Also fixes a pre-existing compile break in sync/tests/vectors.rs (Downloads::new missing Network arg) so the test target builds; the 5 stale Commit-vs-CommitCheckpointPrecomputed failures there are pre-existing precompute drift, unrelated to this change.
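The hedging idea, sending the same request to several peers and taking the first one that delivers, can be sketched with plain threads and a channel. This is not the zebra-network implementation: the `Peer` type, `hedged_fetch`, and the numeric `block_id` are hypothetical, and real peers would be async services keyed by block hash.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical peer: returns the block bytes if it has the block.
type Peer = Box<dyn Fn(u32) -> Option<Vec<u8>> + Send>;

/// Asks every peer for `block_id` concurrently and returns the first
/// block delivered, or `None` if no peer has it.
fn hedged_fetch(peers: Vec<Peer>, block_id: u32) -> Option<Vec<u8>> {
    let (tx, rx) = mpsc::channel();
    for peer in peers {
        let tx = tx.clone();
        thread::spawn(move || {
            if let Some(block) = peer(block_id) {
                // Ignore send errors: the receiver may already have a winner.
                let _ = tx.send(block);
            }
        });
    }
    // Drop the original sender so `recv` fails once every peer has
    // answered "not found" instead of blocking forever.
    drop(tx);
    rx.recv().ok()
}
```

The `None` case is where a caller would fall back to its ordinary retry and backoff handling, as the commit message describes for the registry-miss path.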
…'s UTXO reads (#158) * perf(state): parallelize per-block serialization in the finalized block writer (#128) * perf(state): serialize raw transactions in parallel when writing blocks * perf(state): compute block size in parallel + run block-write batch prep in dedicated pool * comment * perf(state): gate parallel block batch-prep on a transaction-count threshold (#138) The checkpoint committer serializes each block's raw transactions (block.rs) and sums the per-transaction sizes (chain.rs) on the rayon pool. That fan-out is a clear win for the large blocks in the heavy shielded region, but for the small blocks of the early chain the rayon fork-join cost (waking workers, distributing the items, joining) outweighs the work itself. Gate both parallel paths on PARALLEL_BLOCK_TX_THRESHOLD (16 transactions): blocks at or above it keep the parallel path, smaller blocks run sequentially. The output is byte-identical either way, so this is purely a scheduling change. Measured with two fresh-from-genesis mainnet syncs of the same binary, gate toggled, over a matched height window (per-block, committer-thread metrics that are independent of peer/download luck): batch_prep 1.45ms -> 1.31ms (-10%) write_block_total 6.38ms -> 6.08ms ( -5%) Stable across sub-windows (batch_prep -8% to -13%). The heavy shielded region is unaffected: those blocks have >= 16 transactions and keep the parallel path. * perf(state): overlap raw-transaction serialization with the committer's UTXO reads In checkpoint sync through the shielded sandblast region the finalized committer is the serial bottleneck. The `tx_by_loc` raw-transaction serialization (re-serializing each transaction to bytes) runs sequentially after the spent-UTXO reads on the committer's critical path. Run it concurrently with those reads via `rayon::join`: serialization is CPU-bound while the reads wait on disk, so they overlap. 
The bytes are threaded as `precomputed_raw_txs` into `prepare_block_batch`, which uses them directly; the semantic path passes `None` and serializes inline as before. Output is byte-identical and there is no on-disk-format change. Matched A/B on mainnet 1.81-1.9M (archive mode): ~0.8-1.2 ms less total committer time per block (peer-independent) and ~+5-6% throughput.
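The overlap and the optional precomputed input can be sketched as below. `precomputed_raw_txs` and `prepare_block_batch` are names from the commit message; the `Tx` type, `serialize`, `read_spent_utxo`, and `serialize_while_reading` are hypothetical, and `std::thread::scope` stands in for `rayon::join` so the example needs no external crate.

```rust
use std::thread;

/// Hypothetical transaction type.
struct Tx {
    version: u8,
    payload: Vec<u8>,
}

/// Toy serialization: version byte followed by the payload.
fn serialize(tx: &Tx) -> Vec<u8> {
    let mut bytes = vec![tx.version];
    bytes.extend_from_slice(&tx.payload);
    bytes
}

/// Toy stand-in for a database point lookup.
fn read_spent_utxo(input: u32) -> u64 {
    u64::from(input) * 1_000
}

/// Checkpoint path: serialize on another thread while this one reads.
fn serialize_while_reading(txs: &[Tx], inputs: &[u32]) -> (Vec<Vec<u8>>, Vec<u64>) {
    thread::scope(|s| {
        let raw_txs = s.spawn(|| txs.iter().map(serialize).collect::<Vec<_>>());
        let utxos: Vec<u64> = inputs.iter().map(|i| read_spent_utxo(*i)).collect();
        (raw_txs.join().expect("serialization panicked"), utxos)
    })
}

/// Uses the precomputed bytes when supplied, otherwise serializes inline.
fn prepare_block_batch(txs: &[Tx], precomputed_raw_txs: Option<Vec<Vec<u8>>>) -> Vec<Vec<u8>> {
    precomputed_raw_txs.unwrap_or_else(|| txs.iter().map(serialize).collect())
}
```

Passing `Some(raw)` from `serialize_while_reading` and passing `None` produce the same batch bytes; only the point at which serialization happens differs.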
…ommit-compute pool (#247) The committer is not a member of COMMIT_COMPUTE_POOL, so install() is a synchronous cross-thread handoff that parks the committer until a pool worker runs the job. The look-ahead note-commitment precompute keeps those workers busy, so this second per-block handoff waits on a contended pool and that wait dominates the isolation it was meant to provide. Run write_block directly on the committer thread; its internal rayon uses the global pool. Measured net win (committer -12%, +5% throughput) on the sandblast region.
* perf(state): parallelize and de-duplicate the committer's UTXO/address reads (#140)
* Update zebra-state/src/request.rs
Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
* Update zebra-state/src/request.rs
Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
* perf(state): parallelize and de-duplicate the committer's UTXO/address reads
Before building the write batch, the checkpoint committer reads every transparent
input's UTXO and every changed address's balance from RocksDB, one `zs_get` at a
time on the writer thread. In the transparent-heavy ranges (~100-330K) these
cache-served but serial point lookups dominate the per-block write time while the
other cores sit idle (CPU ~2/8). The spent-UTXO path also re-derives each input's
transaction location twice: once directly and once inside `utxo()`.
Two changes in `write_block`:
- Read the output location once and reuse it via `utxo_by_location` instead of
letting `utxo()` look it up again (3 reads/input -> 2).
- Fan the spent-UTXO and address-balance reads across the rayon pool (the writer
already runs inside COMMIT_COMPUTE_POOL) once a block has enough inputs/addresses
to amortize the fork-join cost, gated by PARALLEL_BLOCK_READ_THRESHOLD (16).
The reads are read-only and land in order-independent maps, so the committed batch
is byte-identical to the sequential path.
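The "3 reads/input -> 2" saving from the first change can be made concrete with a toy store that counts point lookups. Only `utxo()` and `utxo_by_location` are names from the commit message; the `Db` type, its maps, the `tx_location` method, and the two `spend_*` helpers are assumptions for the sketch.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Toy database that counts point lookups.
struct Db {
    tx_loc_by_hash: HashMap<u64, u32>,
    utxo_by_loc: HashMap<(u32, u32), u64>,
    reads: Cell<u32>,
}

impl Db {
    fn tx_location(&self, tx_hash: u64) -> Option<u32> {
        self.reads.set(self.reads.get() + 1);
        self.tx_loc_by_hash.get(&tx_hash).copied()
    }

    fn utxo_by_location(&self, loc: (u32, u32)) -> Option<u64> {
        self.reads.set(self.reads.get() + 1);
        self.utxo_by_loc.get(&loc).copied()
    }

    /// Convenience lookup: resolves the location, then reads the UTXO.
    fn utxo(&self, outpoint: (u64, u32)) -> Option<u64> {
        let tx_loc = self.tx_location(outpoint.0)?;
        self.utxo_by_location((tx_loc, outpoint.1))
    }
}

/// Before: the location is read directly and again inside `utxo()`.
fn spend_before(db: &Db, outpoint: (u64, u32)) -> Option<((u32, u32), u64)> {
    let tx_loc = db.tx_location(outpoint.0)?;
    let value = db.utxo(outpoint)?;
    Some(((tx_loc, outpoint.1), value))
}

/// After: the location is read once and reused.
fn spend_after(db: &Db, outpoint: (u64, u32)) -> Option<((u32, u32), u64)> {
    let loc = (db.tx_location(outpoint.0)?, outpoint.1);
    let value = db.utxo_by_location(loc)?;
    Some((loc, value))
}
```

Both helpers return the same location and value; the second simply skips the repeated location lookup.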
Measured over a full mainnet genesis sync, comparing the same binary with and
without this change, per-100K committer-thread metrics (peer-independent):
range   prep_reads             write_block_total
100k     7.57 -> 2.64 ms       15.71 -> 10.38 ms
200k     8.94 -> 3.75 ms       19.01 -> 14.30 ms
300k    10.89 -> 3.52 ms       20.32 -> 13.07 ms
400k     2.33 -> 1.05 ms        4.84 ->  3.05 ms
prep_reads drops 55-68% and write_block_total 25-37% across the transparent band,
moving the bottleneck there onto rocksdb commit. No effect on low-input blocks
(gated to sequential) or the heavy shielded region (few transparent inputs).
* clean up and tests
* comment
* clean up comment
* fix(state): remove duplicate finalized block import
---------
Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
* perf(state): parallelize per-block serialization in the finalized block writer (#128)
* perf(state): serialize raw transactions in parallel when writing blocks
* perf(state): compute block size in parallel + run block-write batch prep in dedicated pool
* comment
* perf(state): POC skip note-commitment recompute on checkpoint sync via supplied roots (#165)
Behind a default-off flag, the checkpoint committer can skip the per-block
note-commitment frontier recompute (`update_trees_parallel`, the dominant
checkpoint-sync CPU cost) when the per-block Sapling/Orchard roots are supplied
externally, folding them into the anchor set and history tree instead.
This is an experiment to bracket the achievable speedup; it sources roots from a
recorded fixture and is NOT shippable (no untrusted-source verification yet).
* perf(state): add read-only verifier for supplied commitment roots vs headers (#167)
Adds `commitment_aux_verify::verify_commitment_roots`, a read-only check that
replays per-block Sapling/Orchard roots into the ZIP-221 ChainHistory MMR and
confirms each against the block header commitments, reusing the existing
`block_commitment_is_valid_for_chain_history` and `HistoryTree::push` (no new
crypto). It returns the first height whose header rejects the roots folded in.
This is the "verify" half of the verified-commitment-trees work, as a standalone
function (no commit-path change) that a later verify-before-commit step will wrap.
A block's commitment commits to its parent's history tree, so a root at height H
is confirmed when H+1 is processed; this one-block lag is part of the contract.
Tested: a V1 (Heartwood/Canopy) vector test asserts real roots verify and a wrong
root is rejected at H+1; an ignored test verifies the real NU5/V2 range against a
synced archive fork (10,001 blocks; corrupted root rejected at H+1).
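The one-block lag can be illustrated with a toy model. The sketch below is not the ZIP-221 MMR: `fold` is a made-up mixing function, and `SuppliedRoots`, `build_header_commitments`, and `verify_roots` are hypothetical names. It only shows why a wrong root at height H is reported at H+1.

```rust
#[derive(Clone, Debug)]
struct SuppliedRoots {
    sapling_root: u64,
    orchard_root: u64,
}

/// Toy stand-in for pushing a block's roots into the history tree.
fn fold(history: u64, sapling_root: u64, orchard_root: u64) -> u64 {
    history.rotate_left(5).wrapping_mul(0x9E37_79B9_7F4A_7C15)
        ^ sapling_root
        ^ orchard_root.rotate_left(32)
}

/// Builds the commitment each header would carry: the header at index `i`
/// commits to the history before block `i`'s own roots are folded in.
fn build_header_commitments(seed: u64, roots: &[SuppliedRoots]) -> Vec<u64> {
    let mut history = seed;
    let mut out = vec![history];
    for r in roots {
        history = fold(history, r.sapling_root, r.orchard_root);
        out.push(history);
    }
    out
}

/// Replays supplied roots and returns the first height whose header rejects
/// the history folded so far. `roots[i]` and `headers[i]` are at `start + i`.
fn verify_roots(
    start: u32,
    seed: u64,
    roots: &[SuppliedRoots],
    headers: &[u64],
) -> Result<(), u32> {
    let mut history = seed;
    for (i, commitment) in headers.iter().enumerate() {
        if *commitment != history {
            return Err(start + i as u32);
        }
        if let Some(r) = roots.get(i) {
            history = fold(history, r.sapling_root, r.orchard_root);
        }
    }
    Ok(())
}
```

In this model, corrupting the root supplied for height 101 leaves the header at 101 valid and fails the header at 102, which matches the lag described above.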
* perf(state): verify supplied commitment roots before the fast commit (#169)
Wires the read-only verifier into the checkpoint committer's fast path: before
committing a block with fixture-supplied Sapling/Orchard roots, verify them
against the next block's header commitment (verify-before-commit), and refuse to
persist a root that fails.
A block's roots are only committed by the *next* block's header (the ZIP-221
one-block lag), so when the write loop has buffered the successor, its commitment
check is run against the candidate history tree; the successor's auth data root is
already precomputed by the checkpoint verifier, so this is cheap. The genuine sync
tip (no successor yet) commits after only the in-arrears check, and its own roots
are verified when the next block arrives.
Fast mode freezes the note-commitment frontier, so a block that fails to verify
cannot be recomputed in place (recompute would append to the stale seed frontier
and produce a wrong root). The path is therefore verify-or-error: a wrong fixture
root is rejected rather than silently miscommitted.
Adds a deterministic edge-case test that commits a valid generated chain across
the Heartwood (history-tree creation) and NU5 (V1->V2) boundaries: the correct
fixture produces byte-identical anchors + history to the legacy path, and a
corrupted root is rejected at its own commit.
* perf(state): persist fast-synced DBs and hand off the verified frontier at the checkpoint (#176)
* perf(state): check each header commitment once in the fast commit path
The verify-before-commit fast path ran two commitment checks per block that are
the same computation: a block's own commitment check `C(X, T_{X-1})` is
identical to the previous block's verify-ahead, which already computed
`C(X, T_{X-1})` one commit earlier. So each inter-block header commitment was
verified twice.
Cache the look-ahead result as `(next_height, next_hash)` on the
single-threaded committer and skip a block's own check when the previous
block's look-ahead already validated exactly it. The guard is hash identity,
and heights are monotonic, so a stale or cloned cache entry can never cause a
false skip. The cache is cleared on the no-successor sync tip and on
non-fast/legacy blocks.
Steady state drops from two commitment checks per block to one (legacy parity)
while still attesting every root. The verify-or-error contract is unchanged:
both checks still propagate errors, and a wrong root is still caught by the
look-ahead (untouched), so the rejection height is identical.
Adds a `prevalidated_count` counter and a standalone test isolating the dedup:
the second consecutive fast block skips its redundant check, and a stale cache
entry (right height, wrong hash) does not cause a false skip. The existing
fast-path proptest also asserts the dedup count across the Heartwood and NU5
boundaries.
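A minimal sketch of the look-ahead dedup, under the assumption that the committer tracks a single cached `(height, hash)` pair. `BlockId`, `Committer`, and `checks_run` are hypothetical names; only `prevalidated_count` comes from the description above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BlockId {
    height: u32,
    hash: [u8; 32],
}

#[derive(Default)]
struct Committer {
    /// The block the previous look-ahead already validated, if any.
    prevalidated: Option<(u32, [u8; 32])>,
    /// Number of commitment checks actually executed.
    checks_run: u64,
    /// Number of own-block checks skipped thanks to the cache.
    prevalidated_count: u64,
}

impl Committer {
    /// Commits `block`; `successor` is the buffered next block, if any.
    fn commit(&mut self, block: BlockId, successor: Option<BlockId>) {
        // Own check: skip only when the cache names exactly this block.
        if self.prevalidated == Some((block.height, block.hash)) {
            self.prevalidated_count += 1;
        } else {
            self.checks_run += 1;
        }
        // Look-ahead: validate the successor now and remember it.
        // With no successor (sync tip) the cache is cleared.
        self.prevalidated = match successor {
            Some(next) => {
                self.checks_run += 1;
                Some((next.height, next.hash))
            }
            None => None,
        };
    }
}
```

After the first block, each commit runs one check (the look-ahead) instead of two, and an entry with the right height but a different hash never matches.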
* perf(state): persist fast-synced DBs and hand off the verified frontier at the checkpoint
Make verified-commitment-trees fast sync produce a persistent, reopenable,
RPC-safe pruned database that hands off correctly to post-checkpoint semantic
verification (merged increments 4+5).
- Fast-sync marker: a `fast_sync_metadata` column family (sibling to
`pruning_metadata`, not a reuse — pruning drops tx bytes and keeps trees,
fast-sync drops trees; a DB can be both), DB format minor bump to 27.3.0, and a
one-way reopen guard refusing archive mode.
- Read/validity guards: per-height tree reads return `None` below the handoff
(before the backward search, so no stale tree and no panic); `z_gettreestate`
returns a typed archive-mode error below the handoff; the genesis-root and
subtree format-validity checks skip fast-synced databases.
- Checkpoint handoff: verify the supplied final frontier
(`frontier.root() == verified root`) and write it as the real tip treestate via
the normal write path, so post-checkpoint semantic verification resumes from a
correct frontier. Frontiers are supplied via a `VCT_FRONTIERS` sidecar.
- Sapling-era direct-header root check below Heartwood, where the ZIP-221 MMR does
not exist and the commitment check is a no-op.
- VCT fixture/sidecar scaffolding moved into a dedicated `vct.rs` submodule so the
commit path holds only the handoff hook.
Validated on a real mainnet fork: byte-identical consensus state 2,000 blocks past
the checkpoint; archive reopen refused, pruned reopen resumes with 0 panics; live
`z_gettreestate` guarded below / served above the checkpoint.
* perf(state): embed VCT handoff frontier (#177)
Bundle the verified mainnet handoff frontier with the VCT fast-sync path so runtime sidecars cannot drift from the checkpoint list.
* perf(state): factor the fast-path root source behind a CommitmentRootSource seam (#178)
The verified-commitment-trees fast path reads per-block roots and the checkpoint
handoff frontier from a fixture/embedded blob. This factors *where* that data
comes from behind a `CommitmentRootSource` trait, so the committer reads roots /
handoff height / final frontiers through one seam regardless of source. This is
not a new mode: the two enduring paths remain standard local tree rebuilding and
the fast verified path; the fixture is just one (scaffolding) source, to be
replaced by a transport-backed peer source over `tree_aux` later.
New `commitment_aux` module holds the seam and payload types (`BlockCommitmentRoots`,
`FinalFrontiers` moved here so the dependency runs one-way), the trait, the
`FixtureSource` / `VecRootSource` implementations, and the **producer** half:
`produce_block_roots` / `produce_final_frontiers` derive the same payload from an
existing database's per-height trees — the read path a serving node runs, minus
the network. `VctState` now holds a `Box<dyn CommitmentRootSource>` and delegates
its data accessors to it; the commit path is unchanged (behavior-preserving).
Adds a round-trip test: build an archive state over a generated valid-commitment
chain crossing Heartwood and NU5, produce the roots/frontier from that database,
then drive a fresh fast-sync state that consumes the produced payload and assert
byte-identical anchors + history-tree hash, plus that the produced frontier agrees
with the legacy tip and the produced root at the handoff. This is coverage the
existing equivalence test lacks: there the roots are captured from the committer's
inline-returned trees, here they come from the database read path a server runs.
No roots-index column family, no DB-format change, and no networking: the producer
derives from existing archive per-height trees. Serving the read path to peers and
letting fast-synced nodes re-serve roots are later increments.
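The seam can be pictured roughly as follows. The type names come from the description above, but the method names (`roots_at`, `handoff_height`, `fast_roots_for`) and the field layouts are assumptions made for illustration, not the merged signatures.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BlockCommitmentRoots {
    height: u32,
    sapling_root: [u8; 32],
    orchard_root: [u8; 32],
}

/// Where the fast path reads its per-block roots from (hypothetical methods).
trait CommitmentRootSource {
    fn roots_at(&self, height: u32) -> Option<BlockCommitmentRoots>;
    fn handoff_height(&self) -> u32;
}

/// An in-memory source, keyed by height.
struct VecRootSource {
    roots: BTreeMap<u32, BlockCommitmentRoots>,
    handoff: u32,
}

impl VecRootSource {
    fn new(roots: Vec<BlockCommitmentRoots>, handoff: u32) -> Self {
        let roots = roots.into_iter().map(|r| (r.height, r)).collect();
        Self { roots, handoff }
    }
}

impl CommitmentRootSource for VecRootSource {
    fn roots_at(&self, height: u32) -> Option<BlockCommitmentRoots> {
        self.roots.get(&height).copied()
    }
    fn handoff_height(&self) -> u32 {
        self.handoff
    }
}

/// The committer-side state only sees the trait object.
struct VctState {
    source: Box<dyn CommitmentRootSource>,
}

impl VctState {
    /// Roots for the fast path, or `None` above the handoff or when unheld.
    fn fast_roots_for(&self, height: u32) -> Option<BlockCommitmentRoots> {
        if height > self.source.handoff_height() {
            return None;
        }
        self.source.roots_at(height)
    }
}
```

Swapping the fixture for a peer-backed source then means providing another `CommitmentRootSource` implementation, with no change on the committer side.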
* perf(state): add the tree_aux commitment-roots wire type and a fillable PeerSource (#182)
Foundational layer for the `tree_aux` peer source (increment 6a) — the wire data
model and the consumer-side source, ahead of the Zakura transport itself.
- Add `BlockCommitmentRoots { height, sapling_root, orchard_root }` to `zebra-chain`
(`parallel/commitment_aux.rs`) with `ZcashSerialize`/`ZcashDeserialize`, so
`zebra-network` and `zebra-state` share the wire payload without a dependency cycle.
`zebra-state` now uses this one type (no duplicate). The final frontier is embedded
in the binary, not on the wire, so this is the only `tree_aux` wire payload.
- Add `PeerSource`: a fillable `CommitmentRootSource` backed by a shared, height-keyed
roots cache, with a `PeerSourceWriter` handle the future `tree_aux_driver` fills as
verified root ranges arrive from peers. The handoff frontier is held immutably from
the embedded constant; only roots come over the network.
Tests: `BlockCommitmentRoots` wire round-trip; and
`vct_peer_source_filled_incrementally_drives_byte_identical_state` — fills a
`PeerSource` in two chunks via its writer (as the driver would when ranges arrive) and
drives the committer to byte-identical consensus state, proving the fillable source is
a drop-in for the fixture. The Zakura `tree_aux` stream + driver + two-node run follow.
* perf(network): add the tree_aux stream wire codec and serving RequestResponseService (#183)
The core of the verified-commitment-trees peer source (increment 6a) transport: the
`tree_aux` Zakura stream as a one-shot request/response service. A client sends
`GetRoots{start,count}` and the server answers `Roots{...}` from local state.
- `tree_aux/wire.rs`: `TreeAuxMessage` (Status, GetRoots/Roots, RangeUnavailable) + byte
codec, DoS bounds (max roots/request, max message bytes), stream kind 7 / capability
`1<<4`. Roots-only — the final frontier is embedded in the binary, not on the wire.
- `tree_aux/service.rs`: `TreeAuxService` implementing `RequestResponseService`, serving
`GetRoots` from a `TreeAuxStatePort` (a trait the node implements over `zebra-state`'s
`produce_block_roots`, so `zebra-network` keeps no dependency on `zebra-state`).
Modeled on `legacy_gossip`/`header_sync` but far smaller: a request/response service
needs no ordered-stream reactor or scheduler.
Tests: wire round-trip for every message; over-limit and trailing-byte rejection; the
service serves a held range and reports an unheld range unavailable. fmt + clippy clean.
Still to wire (follow-up): register the service in the handler, the client-side driver
that fills `PeerSource` (header-sync-aligned), startup/config, and the two-node run.
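A rough sketch of a bounded request codec with the rejection cases listed in the tests above. The byte layout (a one-byte tag followed by little-endian fields), the limit value, and the error names are assumptions for illustration; the real `tree_aux/wire.rs` encoding is not shown in this log.

```rust
/// Hypothetical DoS bound; the source names a max-roots-per-request limit
/// but does not state its value here.
const MAX_ROOTS_PER_REQUEST: u32 = 1_000;

#[derive(Debug, PartialEq, Eq)]
enum TreeAuxMessage {
    GetRoots { start: u32, count: u32 },
    RangeUnavailable,
}

#[derive(Debug, PartialEq, Eq)]
enum DecodeError {
    Truncated,
    UnknownTag(u8),
    OverLimit,
    TrailingBytes,
}

fn encode(msg: &TreeAuxMessage) -> Vec<u8> {
    match msg {
        TreeAuxMessage::GetRoots { start, count } => {
            let mut out = vec![0u8];
            out.extend_from_slice(&start.to_le_bytes());
            out.extend_from_slice(&count.to_le_bytes());
            out
        }
        TreeAuxMessage::RangeUnavailable => vec![1u8],
    }
}

fn decode(bytes: &[u8]) -> Result<TreeAuxMessage, DecodeError> {
    let (&tag, rest) = bytes.split_first().ok_or(DecodeError::Truncated)?;
    match tag {
        0 => {
            if rest.len() < 8 {
                return Err(DecodeError::Truncated);
            }
            if rest.len() > 8 {
                return Err(DecodeError::TrailingBytes);
            }
            let start = u32::from_le_bytes([rest[0], rest[1], rest[2], rest[3]]);
            let count = u32::from_le_bytes([rest[4], rest[5], rest[6], rest[7]]);
            // Reject oversized requests before any state is read.
            if count > MAX_ROOTS_PER_REQUEST {
                return Err(DecodeError::OverLimit);
            }
            Ok(TreeAuxMessage::GetRoots { start, count })
        }
        1 if rest.is_empty() => Ok(TreeAuxMessage::RangeUnavailable),
        1 => Err(DecodeError::TrailingBytes),
        other => Err(DecodeError::UnknownTag(other)),
    }
}
```

Checking the count at decode time means an over-limit request is refused before the service touches local state.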
* perf(network): exchange commitment roots between two peers over tree_aux (#185)
Wire the tree_aux serving service into a working peer exchange and prove it with a
two-node integration test — the "proof of peers" for the verified-commitment-trees
peer source (increment 6a).
- Make the outbound request/response path (`write_outbound_request_frame_inner`)
stream-kind-aware: the legacy request stream keeps its legacy-message-specific
response budget, while generic streams (tree_aux) read response frames bounded only
by the stream frame cap and a small response-frame count. Previously every outbound
request was validated as a legacy request, so a tree_aux `GetRoots` was rejected as
an "unsupported legacy request message type".
- Teach `app_frame_cap_for_stream_kind` about tree_aux (kind 7) so larger roots
responses are not capped to the control-frame limit.
Test: `two_nodes_exchange_roots_over_tree_aux` stands up two real Zakura nodes over the
loopback transport, negotiates the tree_aux capability, and has the client fetch
`GetRoots` from the server — asserting the roots received over the wire match the
server's holdings. fmt + clippy clean; the legacy request/response path is unchanged
(a logical no-op for non-tree_aux streams).
* perf(network): add the tree_aux client driver (fetch_roots) with a two-node test (#186)
The client side of the verified-commitment-trees peer source: fetch_roots pulls a
height range of verified per-block commitment roots from connected peers (bounded
GetRoots requests, advancing by what each peer returns) and delivers each contiguous
batch to a sink. The node wires that sink to a PeerSource so the fast committer reads
peer-fetched roots through the existing seam; the committer re-verifies every root
against its own headers, so the fetch carries no trust.
Test: client_driver_fetches_a_root_range_over_tree_aux drives fetch_roots over the real
loopback transport against a serving peer and asserts the collected range matches the
server's holdings. fmt + clippy clean.
* perf(state): add the BlockRoots read request for tree_aux serving (#187)
The state-side serving read path for the verified-commitment-trees peer source
(design §9): a `ReadRequest::BlockRoots { start_height, count }` returning
`ReadResponse::BlockRoots(Vec<BlockCommitmentRoots>)` with the per-block commitment
roots a node holds for that range, derived from its per-height trees via the existing
`produce_block_roots` (now `pub(crate)` and re-exported from `finalized_state`).
The handler clamps the range to the finalized tip and serves nothing on a fast-synced
node (which lacks the historical per-height trees below its handoff and would serve
from a roots index instead, not yet wired) — so it never panics on absent trees. The
`tree_aux` server (`TreeAuxStatePort`) reads through this request; the zebrad wiring
that connects them is a follow-up.
Read-only and additive: no Request maps to it, and existing reads are unchanged
(`vct_db_produced_payload_round_trips` and the commitment_aux tests still pass).
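The clamping behavior can be sketched as a pure function. `servable_range` and its parameters are hypothetical; the sketch assumes heights are plain `u32` values and that an empty answer is represented as `None`.

```rust
use std::ops::RangeInclusive;

/// Returns the heights a node would serve for a `BlockRoots` request,
/// or `None` when it serves nothing (hypothetical helper).
fn servable_range(
    start: u32,
    count: u32,
    finalized_tip: Option<u32>,
    fast_synced: bool,
) -> Option<RangeInclusive<u32>> {
    // A fast-synced node lacks historical per-height trees, so it serves nothing.
    if fast_synced || count == 0 {
        return None;
    }
    let tip = finalized_tip?;
    if start > tip {
        return None;
    }
    // Clamp the end of the range to the finalized tip.
    let end = start.saturating_add(count - 1).min(tip);
    Some(start..=end)
}
```

For example, a request for 5 heights starting at 10 against a tip of 12 yields heights 10 through 12, and a request starting above the tip yields nothing.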
* perf(zebrad): fast-sync commitment roots from peers via tree_aux (#188)
* perf(zebrad): fast-sync commitment roots from peers via tree_aux
Wire the verified-commitment-trees peer source into a running node and make
it the default committer source on networks with embedded final frontiers.
- network: make TreeAuxStatePort async and thread an optional port through
init_with_zakura_header_sync -> spawn_zakura_endpoint_with_header_sync_driver
-> service_registry, registering TreeAuxService under the Zakura sync path.
- state: expose the PeerSource write handle (TreeAuxRootsWriter) via a
process-global so the driver and committer share one root cache; default
VctState::from_config to the peer source where embedded frontiers exist
(Mainnet), keeping explicit VCT_FAST/VCT_CAPTURE overrides and a VCT_LEGACY
opt-out. A height the peers cannot supply falls back to the legacy path, so the
committed state stays bit-identical to legacy.
- zebrad: add StateTreeAuxPort over ReadRequest::BlockRoots and a one-shot
tree_aux driver spawned alongside the header-sync driver.
Make the note-precompute skip per-next-block (vct_fast_will_apply) so legacy
fallback blocks keep their precompute overlap instead of a coarse fast flag.
* tests
* perf(state): test the source-mode precedence and add fast/legacy commit metrics
Review follow-ups for the tree_aux wiring:
- Factor the from_config source precedence into a pure select_source_mode and
unit-test it (locks the peer-source-default flip, the VCT_LEGACY opt-out, the
no-embedded-frontier legacy path, and the fixture/capture overrides) without
touching process env or the embedded files.
- Make StateTreeAuxPort generic over the read service and unit-test the serve
mapping: BlockRoots passthrough, and read-error / wrong-response both degrade
to an empty (unavailable) range.
- Add live observability counters for the commit path: state.vct.fast.block.count,
state.vct.legacy.block.count, and state.vct.prevalidated.block.count, so the
fast-vs-legacy ratio is visible at runtime (previously only an in-memory count
in a VCT_DIGEST shutdown log).
- Drop stale dead_code allows on PeerSource now that it is wired in.
* fix(state): verify supplied orchard roots below NU5 in the fast path (#190)
The verified-commitment-trees fast path folds the supplied per-block
Sapling/Orchard roots into the anchor set for every block below the
checkpoint. Sapling roots are authenticated against the header (directly
below Heartwood, via the ZIP-221 MMR from Heartwood on), but the Orchard
root below NU5 was never checked: the V1 history leaf (Heartwood..Canopy)
ignores the Orchard root, and there is no MMR below Heartwood.
So for the entire Sapling..NU5 range the supplied Orchard root influenced
`orchard_anchors` with no header authentication. On a legacy sync the
Orchard tree is the empty default across that range, so an untrusted
source could inject an Orchard anchor the recompute path never produces,
violating the design's trust boundary (every peer-provided root must be
checked against a header commitment before it influences the anchor set)
and consensus equivalence. The hole was masked only because the source
was a trusted fixture; the in-flight tree_aux peer source would arm it.
The Orchard tree is provably empty below NU5 (no Orchard actions are
allowed there), so pin the supplied Orchard root to the empty-tree root
for heights below NU5 activation, mirroring the existing Sapling-below-
Heartwood direct-header check. The check is a direct comparison (no
one-block lag), so a wrong root is rejected at the block's own commit.
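A minimal sketch of the pin, assuming roots are 32-byte arrays and heights are `u32`. The function name matches the one in the tests below, but the signature and the error type here are hypothetical.

```rust
/// Hypothetical error for a non-empty Orchard root below NU5.
#[derive(Debug, PartialEq, Eq)]
struct OrchardRootBelowNu5Mismatch {
    height: u32,
}

/// Below NU5 the supplied Orchard root must equal the empty-tree root.
/// At or above NU5 this check accepts any root, since it is authenticated elsewhere.
fn verify_supplied_orchard_root_below_nu5(
    height: u32,
    nu5_activation: u32,
    supplied_root: [u8; 32],
    empty_tree_root: [u8; 32],
) -> Result<(), OrchardRootBelowNu5Mismatch> {
    if height >= nu5_activation {
        return Ok(());
    }
    if supplied_root == empty_tree_root {
        Ok(())
    } else {
        Err(OrchardRootBelowNu5Mismatch { height })
    }
}
```

Since the comparison needs only the block's own height and root, a mismatch surfaces at that block's commit rather than one block later.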
Tests:
- Unit test on `verify_supplied_orchard_root_below_nu5`: the empty root
is accepted below NU5, a non-empty root is rejected with the dedicated
error, and any root is accepted at/above NU5 (authenticated by the MMR).
- The fast-path equivalence proptest now generates height-faithful chains
(height-based network upgrades instead of forcing NU5 at every height),
so Orchard data appears only at/after NU5 as on a real chain; forcing NU5
at every height was the consensus-invalid generation that hid the bug. It adds a negative
case asserting a corrupted below-NU5 Orchard root is rejected at its
own commit.
Also adds the nu6_3 activation field to the VCT proptest initializers,
which the rebased base left unset (the field was added to
ConfiguredActivationHeights upstream), so the test target compiles.
AI assistance: written with Claude Code (audit, implementation, tests).
* test(zebrad): integration-test tree_aux serving over the wire; add a Regtest frontier override (#191)
* test(zebrad): integration-test tree_aux serving over the wire; add a Regtest frontier override
Closes the one unverified seam in the verified-commitment-trees `tree_aux` peer
source: a real node serving per-block roots from local state over the wire. The
transport (zebra-network two-node + codec) and the committer (zebra-state
PeerSource + handoff) were already unit-tested; this proves the production serving
stack over the real transport on real state.
Integration test `tree_aux_serves_real_state_roots_over_the_wire`
(`zebrad/.../zakura/tree_aux_driver.rs`): a real `populated_state` finalized DB
serves roots through the production `StateTreeAuxPort` -> `TreeAuxService` over the
real loopback Zakura transport (`ZakuraTestNode`); a peer's `fetch_roots` receives
exactly what the state serves via `ReadRequest::BlockRoots`. A negative case fetches
a range above the tip: `fetch_roots` errors, so the committer keeps that range on the
legacy path (safe by construction, never wrong state).
Also adds a Regtest handoff-frontier override so the fast path can be exercised
deterministically on Regtest (whose checkpoint list is derived at runtime, so there
is no committed frontier to embed):
- Loader: `embedded_final_frontiers` gains a Regtest-only arm that loads the frontier
from the `VCT_REGTEST_FRONTIER` file, validated against the Regtest checkpoint
height. Mainnet still uses the embedded constant and never reads the env.
- Producer: `VCT_CAPTURE_FRONTIER` (+ `VCT_CAPTURE_FRONTIER_HEIGHT`) dumps the tip
treestate frontier on the legacy commit path, so a synced node can generate the
fixture the loader reads. `FinalFrontiers::to_bytes` is now compiled outside tests.
Unit tests cover the loader round-trip and the height-mismatch rejection. The
fast-path-engaged signal needed for the higher e2e layers already exists
(`state.vct.fast.block.count`). The full two-process docker regtest e2e (a node
fast-syncing from a peer over the network) is the production-grade follow-up, now
unblocked by this Regtest override.
* better naming
* fix(state): lock the checkpoint auth-data-root cache to its block (#192)
The finalized checkpoint commit path trusts a precomputed `AuthDataRoot`
carried on `CheckpointVerifiedBlock` for the ZIP-244 `hashBlockCommitments`
header check (`block_commitment_is_valid_for_chain_history`): for NU5+
blocks it uses `precomputed_auth_data_root.unwrap_or_else(|| block.auth_data_root())`,
so a `Some` value suppresses recomputation from the block's transactions.
Every `Some` value is computed from the block by the constructors
(`prepare_block_data`), so the cache is correct by construction. But the
public API let it be desynced after construction: `SemanticallyVerifiedBlock`
exposed a `pub auth_data_root`, `CheckpointVerifiedBlock` implemented
`DerefMut` to it, and both are re-exported. A holder could swap `block`
(or overwrite the cache) while keeping a stale root, and if the header
matched the stale root the committer would finalize a block without
proving the header binds the block's actual authorizing data.
Lock the (block, auth-data-root) pair together so it cannot be forged
across the crate boundary, while keeping the precompute performance win:
- Make `auth_data_root` `pub(crate)` (only this crate's constructors, which
derive it from the block, can set it) and document the invariant.
- Remove `DerefMut` for `CheckpointVerifiedBlock` (the only type whose
cache the committer trusts), so a holder cannot mutate the block or the
cache after construction. Reads keep working via `Deref`.
- Add `CheckpointVerifiedBlock::set_deferred_pool_balance_change`, the one
field the checkpoint verifier legitimately sets post-construction, and
route the verifier and the `new` constructor through it.
- Add `SemanticallyVerifiedBlock::from_semantic_data` so the semantic
verifier builds the block through a checked entry point (auth-data root
left unset) instead of a struct literal.
Construction (`with_hash` / `From<Arc<Block>>` / `new`) and the trusted-cache
consumers (`finalized_state.rs` current-block and look-ahead checks) are
unchanged, so the optimization and consensus behavior are preserved.
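The locking pattern can be sketched with module privacy standing in for the crate boundary. The digest below is a toy, not the ZIP-244 computation, and the struct layout is simplified (one type instead of the `SemanticallyVerifiedBlock` / `CheckpointVerifiedBlock` pair); `cached_auth_data_root` and `deferred_pool_balance_change` are hypothetical accessors.

```rust
mod verified {
    use std::ops::Deref;
    use std::sync::Arc;

    pub struct Block {
        pub transactions: Vec<Vec<u8>>,
    }

    impl Block {
        /// Toy digest over the transactions (not the ZIP-244 computation).
        pub fn auth_data_root(&self) -> u64 {
            self.transactions
                .iter()
                .flatten()
                .fold(17u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(u64::from(b)))
        }
    }

    /// Fields are private, so the (block, root) pair cannot be desynced
    /// from outside this module.
    pub struct CheckpointVerifiedBlock {
        block: Arc<Block>,
        auth_data_root: u64,
        deferred_pool_balance_change: Option<i64>,
    }

    impl CheckpointVerifiedBlock {
        /// The only constructor: the cached root is derived from the block.
        pub fn new(block: Arc<Block>) -> Self {
            let auth_data_root = block.auth_data_root();
            Self { block, auth_data_root, deferred_pool_balance_change: None }
        }

        pub fn cached_auth_data_root(&self) -> u64 {
            self.auth_data_root
        }

        /// The one field that may legitimately change after construction.
        pub fn set_deferred_pool_balance_change(&mut self, change: Option<i64>) {
            self.deferred_pool_balance_change = change;
        }

        pub fn deferred_pool_balance_change(&self) -> Option<i64> {
            self.deferred_pool_balance_change
        }
    }

    /// Read access only: there is deliberately no `DerefMut` impl.
    impl Deref for CheckpointVerifiedBlock {
        type Target = Block;
        fn deref(&self) -> &Block {
            &self.block
        }
    }
}
```

Outside the module, code can read the block through `Deref` and call the setter, but it has no way to replace the block or overwrite the cached root.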
Tests: assert every `CheckpointVerifiedBlock` constructor caches the
block's own auth-data root, and that the semantic constructor leaves it
unset. The no-`DerefMut` / crate-private guarantees are enforced at compile
time.
AI assistance: investigated and implemented with Claude Code.
* comments
* add response and message bounds
* fix(state): roll back the Zakura header store with the body chain (#198)
`rollback_finalized_state` rolled back the block/tx/UTXO/tree/nullifier CFs
but left the Zakura header store (`zakura_header_*`) untouched. Because the
header store races ahead of the body chain and is keyed independently, a
rolled-back database kept header rows -- and a `BestHeaderTip` -- far above
the new body tip.
That inconsistency stalls Zakura block (body) sync on the resulting node:
`missing_block_bodies` only offers heights that already have a stored header,
so the contiguous floor body (`target_height + 1`) is never requestable, the
reorder buffer never drains, the verified tip is frozen, and after the
5-minute body-sync stall timeout the node falls back to legacy ChainSync.
Roll the Zakura header store back too: delete every `zakura_header_*` entry
above `target_height`, scanning from the (possibly higher) Zakura header tip
down. After this a rolled-back DB's `BestHeaderTip` is <= the body tip,
header-sync re-validates contiguously from `target_height + 1`, the floor body
is requestable, and block-sync advances.
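A toy model of the truncation, using in-memory maps in place of the `zakura_header_*` column families. `HeaderStore` and its methods are hypothetical, and the sketch uses `BTreeMap::split_off` rather than a scan down from the header tip.

```rust
use std::collections::BTreeMap;

#[derive(Default)]
struct HeaderStore {
    hash_by_height: BTreeMap<u32, [u8; 32]>,
    height_by_hash: BTreeMap<[u8; 32], u32>,
}

impl HeaderStore {
    fn insert(&mut self, height: u32, hash: [u8; 32]) {
        self.hash_by_height.insert(height, hash);
        self.height_by_hash.insert(hash, height);
    }

    /// Deletes every header above `target_height` from both indexes and
    /// returns how many heights were removed.
    fn delete_above(&mut self, target_height: u32) -> usize {
        let removed = match target_height.checked_add(1) {
            // `split_off` keeps keys < first and returns keys >= first.
            Some(first) => self.hash_by_height.split_off(&first),
            None => BTreeMap::new(),
        };
        for hash in removed.values() {
            self.height_by_hash.remove(hash);
        }
        removed.len()
    }

    fn best_header_tip(&self) -> Option<u32> {
        self.hash_by_height.keys().next_back().copied()
    }
}
```

After `delete_above(target_height)`, the best header tip is at most `target_height`, and an empty store is left untouched.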
* test(state): cover delete_zakura_headers_above truncation; fix its rustfmt (#202)
PR #198 added `delete_zakura_headers_above` (rolling the Zakura header store
back with the body chain) without unit coverage, and its CF-handle lines were
left unformatted (`cargo fmt --check` failed on rollback.rs:864).
Add two unit tests against an ephemeral state DB:
- the populated case asserts heights above the target are removed from all four
zakura_header_* CFs, including the hash->height index, while heights at or
below the target are retained;
- the empty-store case asserts truncation is a no-op and does not panic on the
empty-tip lookup.
Run rustfmt over the function so the crate is fmt-clean again.
* fix(zebrad): start tree_aux root fetch at the verified tip, not genesis (#201)
The tree_aux driver hard-coded its root fetch to begin at genesis
(`fetch_roots(.., Height(1), ..)`). The committer only ever looks up a
fast root for blocks it is about to commit, i.e. the range
`[verified_tip + 1, checkpoint]`. Heights at or below the verified tip are
already committed and their roots are never queried.
On a node that starts well above genesis (e.g. from a snapshot), fetching
from genesis spends the whole fetch streaming already-committed roots and
never reaches the window the committer actually needs before it commits
those blocks. Every block then falls back to the legacy note-commitment
recompute path (`vct_fast_blocks = 0`), defeating the fast path.
Read the verified tip (`max(finalized_tip, best_tip)`) once at startup and
begin the fetch at `verified_tip + 1`. A genesis-empty node still yields
`Height(1)`, so the from-genesis behavior falls out exactly when the node
really is at genesis. The fetched range stays a superset of what the
committer commits even if the tip advances during the fetch (extra cached
roots below its position are harmless).
Verified end-to-end against a local archive peer from a mid-chain snapshot:
the fetch starts at `from_height = verified_tip + 1`, completes in ~1s, and
the committer reports `vct_fast = 20001, vct_legacy = 0` (previously 0 fast
/ all legacy).
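The start-height rule can be sketched as a small pure function. `fetch_window` is a hypothetical helper; returning `None` when the node is already at or past the checkpoint is an assumption added for the sketch.

```rust
use std::ops::RangeInclusive;

/// Computes the root-fetch window `[verified_tip + 1, checkpoint]`,
/// where the verified tip is `max(finalized_tip, best_tip)`.
fn fetch_window(
    finalized_tip: Option<u32>,
    best_tip: Option<u32>,
    checkpoint: u32,
) -> Option<RangeInclusive<u32>> {
    // `Option`'s ordering treats `None` as smaller than any `Some`.
    let start = match finalized_tip.max(best_tip) {
        Some(tip) => tip.saturating_add(1),
        // An empty state starts at height 1, as in the from-genesis case.
        None => 1,
    };
    if start > checkpoint {
        None
    } else {
        Some(start..=checkpoint)
    }
}
```

A node restored from a snapshot at height 500,000 therefore begins at 500,001 instead of streaming roots it will never query.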
* docs: restore the verified-commitment-trees design doc (now tracked) (#210)
The verified-commitment-trees design doc was kept as an untracked working file
and was lost when a shared worktree was cleaned. Restore it as a tracked file —
rebuilt from the increment-6a plan, the increment roadmap, the startup-wiring
work, and the serving-availability discussion — so it cannot be lost again.
Records the settled decisions (roots-on-wire / embedded frontier, header-sync
alignment, verified-tip fetch window, peer-source default), the source seam and
tree_aux architecture, the serving-availability analysis (roots-index CF vs
indexing-follower resync), the increment roadmap, the delivered startup wiring,
the fast/legacy observability counters, and the testing strategy.
* fix(state): refuse instead of corrupting on a frozen verified-commitment-trees frontier (#211)
During a verified-commitment-trees fast sync, fast blocks fold the supplied
note-commitment roots into the anchor set and history MMR but never advance the
per-height trees, so the on-disk frontier is "frozen" until the checkpoint
handoff writes the real one. While frozen, a legacy recompute would extend the
stale frontier and fold a wrong root into the MMR, corrupting consensus state.
This makes the committer fail closed in that window instead:
- A supplied root that fails any verification step is evicted from its source
(so a re-fetch from another peer can replace it) and rejected with the typed,
retryable VctSuppliedRootUnavailable error, rather than retried forever or
recomputed locally. This keeps one malicious peer from halting the sync.
- A frozen-frontier height with no valid supplied root refuses with the same
retryable error and leaves the database untouched, instead of recomputing
against the stale frontier.
- The frozen flag is seeded from the durable fast-sync marker on open, not just
tracked in-session, so a fast sync interrupted by a restart (frozen frontier
persisted, tip below the handoff) still refuses on the first post-restart
height with a missing root.
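The fail-closed policy can be summarized as a decision function. This is a simplified sketch: `RootCache`, `CommitDecision`, and `decide` are hypothetical, roots are modeled as `u64`, and verification is passed in as a closure.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
enum CommitDecision {
    Fast,
    LegacyRecompute,
}

#[derive(Debug, PartialEq, Eq)]
enum CommitError {
    /// Retryable: the caller may try again once a root has been refetched.
    VctSuppliedRootUnavailable { height: u32 },
}

struct RootCache {
    roots: HashMap<u32, u64>,
}

fn decide(
    cache: &mut RootCache,
    height: u32,
    frontier_frozen: bool,
    verify: impl Fn(u64) -> bool,
) -> Result<CommitDecision, CommitError> {
    match cache.roots.get(&height).copied() {
        Some(root) if verify(root) => Ok(CommitDecision::Fast),
        Some(_) => {
            // Evict the bad root so a refetch from another peer can replace it.
            cache.roots.remove(&height);
            Err(CommitError::VctSuppliedRootUnavailable { height })
        }
        // Never recompute against a stale (frozen) frontier.
        None if frontier_frozen => Err(CommitError::VctSuppliedRootUnavailable { height }),
        None => Ok(CommitDecision::LegacyRecompute),
    }
}
```

The legacy recompute is reachable only when the frontier is not frozen, which is the property that keeps a stale frontier from being extended.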
* docs: rewrite the verified-commitment-trees design doc from the merged code (#212)
The design doc was previously reconstructed from a partially lost working
copy, and its flat section numbering no longer resolved to the `design §N`
references in the source, nor did it reflect the decisions made after the
restore.
Rewrite it from the PR #189 commit history, reconciled against the merged
code:
- Restore a section structure where the code's `design §5.1/§5.2/§5.4`,
`§6.1`, `§9`, `§11` references resolve to the sections of the same number.
- Document the fail-closed frozen-frontier policy (evict + retryable
`VctSuppliedRootUnavailable`, frozen flag seeded from the durable marker on
restart), replacing the stale "stays in legacy mode" description.
- Add the below-NU5 Orchard pin and below-Heartwood Sapling direct-header
checks, the auth-data-root cache lock, the verified-tip fetch window, and
the Zakura header-store rollback supporting fix.
- Ground every claim in the code: type/function names, the tree_aux stream
constants and DoS bounds, the DB format bump, the live metrics, and the
test names. Add a file map.
AI disclosure: written with Claude Code (commit/code review, drafting).
* fix(state): reject invalid VCT handoff roots (#215)
Avoid panicking when a peer-supplied handoff root disagrees with the embedded final frontier, and cover the no-successor handoff case with a regression test.
* feat(state): drive VCT fast-sync mode from checkpoint_sync, not env vars (#216)
Reframe verified-commitment-trees mode selection around user-facing config
instead of the VCT_LEGACY env opt-out and the enable_verified_commitment_trees
POC flag. The fast verified path (skip the per-block note-commitment-tree
recompute below the last checkpoint) is now the default whenever a node syncs
under checkpoint trust, for both the Archive and Pruned storage modes;
consensus.checkpoint_sync = false is the only mode that fully reconstructs the
trees per block.
- zebra-state Config gains a serde-skipped checkpoint_sync mirror, set by
zebrad from consensus.checkpoint_sync at startup.
- select_source_mode gates the peer/fast default on checkpoint_sync and the
embedded-frontier presence; VCT_FAST/VCT_CAPTURE remain test-only overrides;
the VCT_LEGACY opt-out is removed.
- A completed fast-synced DB now reopens in any storage mode (Archive is fast
by default, so the old archive-reopen refusal no longer applies; fast sync
deletes nothing). The missing per-height trees stay an RPC-boundary limitation.
- A narrow new guard refuses to open an interrupted fast sync (frozen frontier,
tip below the handoff) with checkpoint_sync = false, which would otherwise
stall every below-handoff block forever with no root source to recover.
Updates the design doc and unit tests accordingly.
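One way to picture the mode selection as a pure function is shown below. The input struct, the `SourceMode` variants, and the relative order of the two test-only overrides are assumptions; only the gating on `checkpoint_sync` and embedded-frontier presence is taken from the description above.

```rust
#[derive(Debug, PartialEq, Eq)]
enum SourceMode {
    Legacy,
    Peer,
    Fixture,
    Capture,
}

/// Hypothetical snapshot of the inputs, gathered once at startup so the
/// selection itself touches no process environment.
struct SourceInputs {
    checkpoint_sync: bool,
    has_embedded_frontier: bool,
    vct_fast_fixture: bool,
    vct_capture: bool,
}

fn select_source_mode(inputs: &SourceInputs) -> SourceMode {
    // Test-only overrides are consulted first (their relative order is assumed).
    if inputs.vct_capture {
        return SourceMode::Capture;
    }
    if inputs.vct_fast_fixture {
        return SourceMode::Fixture;
    }
    // The peer/fast default needs both checkpoint trust and an embedded frontier.
    if inputs.checkpoint_sync && inputs.has_embedded_frontier {
        SourceMode::Peer
    } else {
        SourceMode::Legacy
    }
}
```

Keeping the function free of I/O is what lets the precedence be unit-tested without setting environment variables.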
* fix(state): defer unverifiable VCT fast commits (#228)
Ensure VCT fast-path roots are only persisted after successor-header verification or trusted handoff frontier verification, avoiding poisoned tip commits.
* feat(state): VCT peer-source root refetch, tip-defer, and stall recovery (#217)
* feat(state): VCT peer-source root refetch, tip-defer, and stall recovery
Make the verified-commitment-trees fast path recover from missing or
unverifiable peer-supplied roots instead of wedging or silently stalling.
Refetch mechanism:
- The committer signals a targeted single-height refetch
(`request_peer_root_refetch`) over a process-global broadcast channel; the
`tree_aux` driver stays alive after its initial fetch and services these on
demand (`handle_refetch_request`).
- A frozen-frontier root miss parks the checkpoint block in place and retries the
same commit once the cache refills — without resetting the block queue.
Tip-deferral (correctness):
- An untrusted (peer) source now defers a fast block whose own root has no buffered
successor to confirm it (the one-block lag), rather than committing it on faith. A
wrong tip root is rejected before it is persisted instead of one block too late,
when it would be irreversibly on disk and could wedge the sync. The handoff
(frontier-pinned) and below-Heartwood (directly verified) blocks are exempt, as is
a trusted local fixture (`requires_verified_successor`).
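The tip-deferral rule can be read as a small decision function. The sketch below is an illustration under assumptions: `DeferralPolicy`, `FastCommit`, and `decide_fast_commit` are hypothetical names, and only the `requires_verified_successor` flag comes from the commit text.

```rust
/// Outcome for a fast-path block whose root came from the root source.
#[derive(Debug, PartialEq, Eq)]
enum FastCommit {
    CommitNow,
    DeferUntilSuccessor,
}

/// Hypothetical policy inputs; field names are illustrative.
struct DeferralPolicy {
    /// True for an untrusted peer source, false for a trusted local fixture.
    requires_verified_successor: bool,
    /// Checkpoint handoff height (its root is pinned by the frontier).
    handoff_height: u32,
    /// Blocks below this height are verified directly.
    heartwood_height: u32,
}

/// Decides whether a fast block may be committed before its successor arrives.
fn decide_fast_commit(
    policy: &DeferralPolicy,
    height: u32,
    successor_buffered: bool,
) -> FastCommit {
    let exempt = !policy.requires_verified_successor
        || height == policy.handoff_height
        || height < policy.heartwood_height;

    if exempt || successor_buffered {
        FastCommit::CommitNow
    } else {
        // No buffered successor can confirm this root yet, so wait one block.
        FastCommit::DeferUntilSuccessor
    }
}
```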
Stall observability:
- Differentiated retry waits (fast poll for await-successor, slower for refetch); a
height stuck on a retryable stall past a threshold escalates to an error-level log
and the `state.vct.root.stalled.height` gauge, so a genuinely unservable root is
visible instead of a silent infinite loop.
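A rough sketch of the differentiated waits and the escalation threshold. All durations and names here (`StallKind`, `StallTracker`, the three constants) are assumptions chosen for illustration; the commit does not state the real values.

```rust
use std::time::Duration;

/// Why the committer is retrying the same height.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StallKind {
    AwaitSuccessor,
    AwaitRefetch,
}

// Hypothetical values: a fast poll for the successor, a slower one for a refetch.
const AWAIT_SUCCESSOR_WAIT: Duration = Duration::from_millis(50);
const REFETCH_WAIT: Duration = Duration::from_secs(1);
const STALL_THRESHOLD: Duration = Duration::from_secs(60);

fn retry_wait(kind: StallKind) -> Duration {
    match kind {
        StallKind::AwaitSuccessor => AWAIT_SUCCESSOR_WAIT,
        StallKind::AwaitRefetch => REFETCH_WAIT,
    }
}

/// Tracks how long a single height has been stuck on retryable stalls.
struct StallTracker {
    height: Option<u32>,
    stalled_for: Duration,
}

impl StallTracker {
    fn new() -> Self {
        Self { height: None, stalled_for: Duration::ZERO }
    }

    /// Records one retry at `height`. Returns true once the accumulated wait
    /// reaches the threshold, which is when a caller would log at error level
    /// and set the stalled-height gauge.
    fn record_retry(&mut self, height: u32, kind: StallKind) -> bool {
        if self.height != Some(height) {
            // A new height means progress was made, so the timer restarts.
            self.height = Some(height);
            self.stalled_for = Duration::ZERO;
        }
        self.stalled_for += retry_wait(kind);
        self.stalled_for >= STALL_THRESHOLD
    }
}
```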
Also updates the disk-format column-family snapshots for the `fast_sync_metadata` CF.
Tests: committer deferral + recovery, retryable-error classification, the
source-trust boundary, and driver refetch handling; the peer-source equivalence
test now buffers successors for each commit.
* test(state): harden VCT peer root recovery
Add regression coverage and observability for peer-root refetch recovery so a bad or stalled tree_aux peer cannot silently wedge the VCT fast path.
* feat(state): serve tree_aux roots from a per-height index on fast-synced nodes (#219)
* feat(state): serve tree_aux roots from a per-height index, not just trees
Fixes the root-serving collapse: under checkpoint sync, Mainnet nodes default to
VCT fast mode and mark their DB fast-synced, and `BlockRoots` serving was gated on
`!is_fast_synced()`, so once a node fast-synced it served an *empty* root list,
turning the root-serving fleet into root-consumers (bodies available, roots empty).
A fast-synced node verified every root it folded in; it just never persisted them in
a height-keyed, servable form (the anchor sets are keyed by root, for membership). So
persist them: a compact `commitment_roots_by_height` column family (64 bytes/height)
that *every* node writes for each committed block, on both the fast and legacy commit
paths (design §4). `BlockRoots` now serves from this index — removing the
`!is_fast_synced()` gate — and falls back to deriving from per-height trees only for a
pre-index archive database (where the index is empty).
- New CF `commitment_roots_by_height` + `CommitmentRootsByHeight` (32+32-byte value);
written in `prepare_trees_batch` (fast and legacy), read via
`commitment_roots_by_height_range` (contiguous prefix, gap-free for serving).
- Finalized rollback truncates the index above the target, like the per-height trees.
- DB format minor 3 -> 4 (additive; existing DBs open with an empty index and serve
from trees as before).
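For a concrete picture of the 32+32-byte value, here is a sketch of a fixed-width encoding. It assumes the two roots are the Sapling and Orchard roots, stored in that order; the struct name `CommitmentRoots` and its methods are hypothetical, not the crate's `CommitmentRootsByHeight` API.

```rust
/// Hypothetical mirror of the per-height value: two 32-byte roots.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CommitmentRoots {
    sapling: [u8; 32],
    orchard: [u8; 32],
}

impl CommitmentRoots {
    /// Encodes the roots as 64 bytes (assumed order: Sapling, then Orchard).
    fn to_bytes(&self) -> [u8; 64] {
        let mut out = [0u8; 64];
        out[..32].copy_from_slice(&self.sapling);
        out[32..].copy_from_slice(&self.orchard);
        out
    }

    /// Decodes a stored value, rejecting anything that is not exactly 64 bytes.
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 64 {
            return None;
        }
        let mut sapling = [0u8; 32];
        let mut orchard = [0u8; 32];
        sapling.copy_from_slice(&bytes[..32]);
        orchard.copy_from_slice(&bytes[32..]);
        Some(Self { sapling, orchard })
    }
}
```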
Test: `vct_fast_sync_handoff_marks_database_and_resumes` now asserts a fast-synced node
(no per-height trees below the handoff) serves the below-handoff roots from the index,
byte-identical to the legacy/archive node's per-height-tree roots.
Note: this restores serving for nodes that fast-sync *after* this change. Existing
fast-synced DBs still need a resync or a backfill task to populate the index for their
historical range (design §4 follow-up); deploy tooling should also treat root-serving
nodes explicitly rather than relying on them not having fast-synced.
* address rollback
* refactor(state): share VCT commitment root verification (#230)
Route the checkpoint fast commit path through the shared VCT root verifier so tests and production exercise the same consensus-critical checks.
* perf(state): bound VCT peer root cache (#231)
* refactor(state): share VCT commitment root verification
Route the checkpoint fast commit path through the shared VCT root verifier so tests and production exercise the same consensus-critical checks.
* perf(state): bound VCT peer root cache
* fix(state): repair incompatible history tree on open (#232)
Rebuild incompatible stored tip history trees before background format checks can panic, while preserving fast-synced archive reopen behavior.
* perf(state): overlap raw-transaction serialization with the committer's UTXO reads (#158)
* perf(state): parallelize per-block serialization in the finalized block writer (#128)
* perf(state): serialize raw transactions in parallel when writing blocks
* perf(state): compute block size in parallel + run block-write batch prep in dedicated pool
* comment
* perf(state): gate parallel block batch-prep on a transaction-count threshold (#138)
The checkpoint committer serializes each block's raw transactions (block.rs)
and sums the per-transaction sizes (chain.rs) on the rayon pool. That fan-out
is a clear win for the large blocks in the heavy shielded region, but for the
small blocks of the early chain the rayon fork-join cost (waking workers,
distributing the items, joining) outweighs the work itself.
Gate both parallel paths on PARALLEL_BLOCK_TX_THRESHOLD (16 transactions):
blocks at or above it keep the parallel path, smaller blocks run sequentially.
The output is byte-identical either way, so this is purely a scheduling change.
Measured with two fresh-from-genesis mainnet syncs of the same binary, gate
toggled, over a matched height window (per-block, committer-thread metrics that
are independent of peer/download luck):
batch_prep 1.45ms -> 1.31ms (-10%)
write_block_total 6.38ms -> 6.08ms (-5%)
Stable across sub-windows (batch_prep -8% to -13%). The heavy shielded region
is unaffected: those blocks have >= 16 transactions and keep the parallel path.
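The gating idea can be sketched as follows. This is an illustration only: `std::thread::scope` stands in for the rayon pool so the example needs no external crates, and `serialize_tx` and `serialize_block` are hypothetical stand-ins for the real serialization code. Only the threshold name and value come from the commit.

```rust
use std::thread;

/// Blocks with fewer transactions than this run sequentially.
const PARALLEL_BLOCK_TX_THRESHOLD: usize = 16;

/// Stand-in for per-transaction serialization (length byte plus payload).
fn serialize_tx(tx: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(tx.len() + 1);
    out.push(tx.len() as u8);
    out.extend_from_slice(tx);
    out
}

/// Serializes every transaction, going parallel only for large blocks.
fn serialize_block(txs: &[Vec<u8>]) -> Vec<Vec<u8>> {
    if txs.len() < PARALLEL_BLOCK_TX_THRESHOLD {
        // Small block: the fork-join overhead would outweigh the work.
        return txs.iter().map(|tx| serialize_tx(tx)).collect();
    }

    // Large block: split in two and serialize the halves concurrently.
    let (left, right) = txs.split_at(txs.len() / 2);
    thread::scope(|s| {
        let handle = s.spawn(|| left.iter().map(|tx| serialize_tx(tx)).collect::<Vec<_>>());
        let mut second: Vec<Vec<u8>> = right.iter().map(|tx| serialize_tx(tx)).collect();
        let mut first = handle.join().expect("serialization thread panicked");
        // Concatenating in order keeps the output identical to the sequential path.
        first.append(&mut second);
        first
    })
}
```

Both branches produce the same bytes in the same order, which is what makes the gate a pure scheduling change.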
* perf(state): overlap raw-transaction serialization with the committer's UTXO reads
In checkpoint sync through the shielded sandblast region the finalized
committer is the serial bottleneck. The `tx_by_loc` raw-transaction
serialization (re-serializing each transaction to bytes) runs sequentially
after the spent-UTXO reads on the committer's critical path.
Run it concurrently with those reads via `rayon::join`: serialization is
CPU-bound while the reads wait on disk, so they overlap. The bytes are
threaded as `precomputed_raw_txs` into `prepare_block_batch`, which uses
them directly; the semantic path passes `None` and serializes inline as
before. Output is byte-identical and there is no on-disk-format change.
Matched A/B on mainnet 1.81-1.9M (archive mode): ~0.8-1.2 ms less total
committer time per block (peer-independent) and ~+5-6% throughput.
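A sketch of the overlap, again using `std::thread::scope` in place of `rayon::join` so it stays dependency-free. The function bodies are toy stand-ins, and `commit_checkpoint_block` is a hypothetical name; only `precomputed_raw_txs` and `prepare_block_batch` are taken from the commit text, and their real signatures may differ.

```rust
use std::thread;

/// Stand-in for the CPU-bound raw-transaction serialization.
fn serialize_raw_txs(txs: &[Vec<u8>]) -> Vec<Vec<u8>> {
    txs.iter()
        .map(|tx| {
            let mut out = vec![tx.len() as u8];
            out.extend_from_slice(tx);
            out
        })
        .collect()
}

/// Stand-in for the disk-bound spent-UTXO reads.
fn read_spent_utxos(outpoints: &[u32]) -> Vec<u64> {
    outpoints.iter().map(|o| u64::from(*o) * 10).collect()
}

/// Uses the precomputed bytes when present, otherwise serializes inline.
fn prepare_block_batch(
    txs: &[Vec<u8>],
    precomputed_raw_txs: Option<Vec<Vec<u8>>>,
) -> Vec<Vec<u8>> {
    precomputed_raw_txs.unwrap_or_else(|| serialize_raw_txs(txs))
}

/// Checkpoint path: serialize on another thread while the reads run here.
fn commit_checkpoint_block(txs: &[Vec<u8>], outpoints: &[u32]) -> (Vec<u64>, Vec<Vec<u8>>) {
    let (utxos, raw) = thread::scope(|s| {
        let serializer = s.spawn(|| serialize_raw_txs(txs));
        let utxos = read_spent_utxos(outpoints);
        (utxos, serializer.join().expect("serialization thread panicked"))
    });
    (utxos, prepare_block_batch(txs, Some(raw)))
}
```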
* fix(state): serve VCT commitment roots without panicking; fix design-doc paths (#233)
Two leftover items from the verified-commitment-trees review:
- `produce_block_roots` (the `ReadRequest::BlockRoots` / `tree_aux` serving read)
derived each root from a per-height tree with `.expect()`, so an unexpectedly
absent tree on this peer-triggered read would panic the node. The caller already
restricts it to a non-fast-synced database within the tip, where the trees exist;
as defense-in-depth it now stops at the first absent height and serves the
contiguous prefix instead. The wire client validates contiguity and treats a short
batch as partial progress.
- Five stale comment references to `verified-commitment-trees-poc.md` now point at the
tracked `verified-commitment-trees.md`, so a `design §N` reference resolves.
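The first item above (serve the contiguous prefix instead of panicking) might look like the sketch below. It assumes a height-keyed map as the lookup; `contiguous_roots` and `next_request_start` are hypothetical helpers, not the signature of `produce_block_roots`.

```rust
use std::collections::BTreeMap;

type Root = [u8; 32];

/// Serves roots for `start..start + count`, stopping at the first absent height
/// rather than unwrapping a missing entry.
fn contiguous_roots(roots: &BTreeMap<u32, Root>, start: u32, count: u32) -> Vec<Root> {
    (start..start.saturating_add(count))
        .map_while(|height| roots.get(&height).copied())
        .collect()
}

/// Client side: a short batch is partial progress, so the next request
/// resumes right after the last height that was served.
fn next_request_start(start: u32, served: &[Root]) -> u32 {
    start.saturating_add(served.len() as u32)
}
```

An absent first height yields an empty batch, which the caller can treat as "nothing available yet" without any special casing.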
* fix(state): keep tree_aux roots handles per state (#236)
* fix(state): parse VCT final frontiers safely (#237)
* fix(state): stage VCT peer roots after full fetch (#238)
* fix(state): pin Orchard roots when NU5 is unconfigured (#239)
* fix(network): enforce tree_aux response message cap (#240)
* fix(state): bound VCT peer root cache (#241)
* fix(state): prevent VCT prevalidation cache replay
The VCT fast path cached look-ahead authentication by a successor block's real block hash, but consumed that cache by comparing against the public checkpoint wrapper hash. Since CheckpointVerifiedBlock::with_hash can carry a caller-supplied hash, an in-process checkpoint commit caller could replay stale prevalidation onto a different block and skip its NU5 hashBlockCommitments check.
Bind the prevalidation skip to the wrapped block's real block.hash(), and clear cached VCT prevalidation when the finalized write loop drops wrong-height lookahead or resets discarded checkpoint state. Add regression coverage for forged wrapper hashes, cache clearing, and normal dedup resuming after a clear.
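To illustrate why the skip must key on the wrapped block's real hash, here is a small sketch. Every type in it (`Block`, `CheckpointWrapper`, `PrevalidationCache`) is a hypothetical stand-in, and the hash function is a toy digest rather than a cryptographic one.

```rust
type Hash = [u8; 32];

struct Block {
    payload: Vec<u8>,
}

impl Block {
    /// Toy stand-in for the real block hash (not cryptographic).
    fn hash(&self) -> Hash {
        let mut out = [0u8; 32];
        for (i, byte) in self.payload.iter().enumerate() {
            out[i % 32] ^= byte.wrapping_add(i as u8);
        }
        out
    }
}

/// A wrapper that carries a caller-supplied hash next to the block.
struct CheckpointWrapper {
    block: Block,
    claimed_hash: Hash,
}

/// Remembers which successor block was already authenticated by look-ahead.
#[derive(Default)]
struct PrevalidationCache {
    prevalidated: Option<Hash>,
}

impl PrevalidationCache {
    fn record(&mut self, successor: &Block) {
        self.prevalidated = Some(successor.hash());
    }

    /// Returns true if the commitment check may be skipped. The comparison
    /// recomputes the hash from the wrapped block and ignores `claimed_hash`.
    fn take_skip(&mut self, wrapper: &CheckpointWrapper) -> bool {
        if self.prevalidated == Some(wrapper.block.hash()) {
            self.prevalidated = None;
            true
        } else {
            false
        }
    }

    /// Called when look-ahead is dropped or checkpoint state is reset.
    fn clear(&mut self) {
        self.prevalidated = None;
    }
}
```

A wrapper whose `claimed_hash` matches the cached value but whose block differs gets no skip, so its commitment check still runs.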
Tested with:
- cargo test -p zebra-state vct_clear_prevalidation_cache_disarms_skip_then_dedup_resumes
- cargo test -p zebra-state vct_dedup_skips_redundant_check_and_guards_stale_cache
- cargo fmt --all -- --check
- cargo test -p zebra-state service::finalized_state::tests::prop
* refactor(state): remove obsolete VCT fixture source
Remove the env-backed VCT fixture/capture root source now that peer tree_aux is the production source for verified commitment-tree roots. Source selection now resolves only to peer mode under checkpoint sync with embedded frontiers, or legacy recompute when checkpoint sync is disabled or no embedded frontier exists.
Keep the source seam for the peer cache, move the shared RootMap fixture source behind cfg(test), and delete VecRootSource in favor of the single test-only FixtureSource adapter. The legacy commit path no longer records root/frontier fixtures, and the design doc now describes VCT_FAST/VCT_FIXTURE/VCT_CAPTURE as removed transient scaffolding.
Verification:
- cargo fmt --all -- --check
- cargo test -p zebra-state vct_
- cargo test -p zebra-network tree_aux
- cargo test -p zebra-state
- cargo clippy -p zebra-state --all-targets -- -A unexpected-cfgs -D warnings
* test(state): pin embedded frontier roots
Assert the Sapling, Orchard, and Sprout roots decoded from the embedded Mainnet final-frontier payload against pinned byte constants. This gives CI a stable guard for the Sprout handoff frontier, which has no header commitment to verify at runtime, and also catches accidental regeneration changes to the Sapling and Orchard embedded roots.
Also import IntoDisk where AdvertisedBodySize implements it so the targeted zebra-state test compiles.
* refactor(state): make VCT successor policy explicit
Move the successor-verification policy out of CommitmentRootSource and into VctState data so peer vs fixture trust is set once at construction. Route finalized-state deferral checks through a single VCT predicate to avoid re-deriving the same conditions in multiple layers.
Update tests to pass the trust policy explicitly and add a focused regression test proving the policy belongs to VctState rather than the root source. Verified with cargo fmt --all -- --check and cargo test -p zebra-state vct --lib.
* refactor(state): simplify VCT fast-path handoff handling
Unify checkpoint handoff frontier root validation so Sapling and Orchard mismatches share the same typed retryable rejection path.
Preserve VCT successor prevalidation across await-successor deferrals so retrying a deferred peer-sourced block can reuse the predecessor look-ahead instead of rechecking its own commitment.
Document the Mainnet frontier regeneration flow and add focused test coverage for the preserved deferral retry dedup.
Tested with:
- cargo fmt --all -- --check
- cargo test -p zebra-state vct_peer_source_defers_unverifiable_tip_root_until_successor
- cargo test -p zebra-state vct_dedup_skips_redundant_check_and_guards_stale_cache
* feat(state): add VCT frontier regeneration
Add a state-backed final-frontier byte producer and parser validation path so checkpoint maintenance can regenerate the embedded Mainnet VCT frontier using the same serialization and height checks that node startup uses.
Extend zebra-checkpoints with explicit Mainnet frontier artifact flags, keeping checkpoint stdout stable while writing the frontier as a side artifact from a synced Zebra state. Wire the checkpoint-generation and checkpoint-update workflows to upload, require, validate, and install mainnet-frontier.bin when Mainnet checkpoints advance.
Add local compatibility coverage for DB-produced frontier bytes written to disk and parsed through the node loader validation path, plus CLI argument and auto-height tests. Document the tool usage and local verification flow in the VCT design doc.
Verification:
- cargo fmt --all -- --check
- cargo test -p zebra-state final_frontier
- cargo test -p zebra-state vct_
- cargo test -p zebra-utils --features zebra-checkpoints
- cargo test -p zebrad --features zebra-checkpoints checkpoints
- cargo clippy -p zebra-state --lib --tests -- -A unexpected-cfgs -D warnings
- cargo clippy -p zebra-utils --features zebra-checkpoints --bin zebra-checkpoints -- -A unexpected-cfgs -A clippy::unwrap_in_result -A clippy::clone_on_copy -D warnings
- cargo test -p zebra-state
* feat(state): add VCT fast sync kill switch
Expose consensus.disable_vct_fast_sync as an initial-rollout force-disable knob next to checkpoint_sync, then mirror it into state so checkpoint sync can stay enabled while VCT uses manual tree recomputation.
Keep the interrupted-fast-sync reopen guard fail-closed whenever no VCT root source is active, including when the kill switch is enabled. Document the safe switching boundaries and update generated config output so operators can discover the knob under [consensus].
Add coverage for mode selection, consensus config conversion, completed fast-sync reopen in archive and pruned modes, unsafe interrupted-fast-sync reopen, and switching between fast and manual recomputation at safe boundaries.
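A sketch of how the kill switch could feed mode selection and the reopen guard, based on the behaviour these notes describe. `Config`, `RootSourceMode`, `select_mode`, and `check_reopen` are hypothetical names; only the `checkpoint_sync` and `disable_vct_fast_sync` knobs appear in the commit text.

```rust
#[derive(Debug, PartialEq, Eq)]
enum RootSourceMode {
    /// Fast path: roots come from peers and are verified before use.
    Peer,
    /// Manual recomputation of the note-commitment trees.
    LegacyRecompute,
}

/// Hypothetical view of the two `[consensus]` knobs involved.
struct Config {
    checkpoint_sync: bool,
    disable_vct_fast_sync: bool,
}

/// The fast path needs checkpoint sync, an embedded frontier, and no kill switch.
fn select_mode(config: &Config, has_embedded_frontier: bool) -> RootSourceMode {
    if config.checkpoint_sync && has_embedded_frontier && !config.disable_vct_fast_sync {
        RootSourceMode::Peer
    } else {
        RootSourceMode::LegacyRecompute
    }
}

/// Fail closed: an interrupted fast sync cannot resume without a root source.
fn check_reopen(mode: &RootSourceMode, interrupted_fast_sync: bool) -> Result<(), &'static str> {
    if interrupted_fast_sync && *mode == RootSourceMode::LegacyRecompute {
        Err("interrupted fast sync needs an active VCT root source")
    } else {
        Ok(())
    }
}
```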
* refactor(state): consolidate VCT database format bump
Fold the unreleased VCT state format markers into 27.3.0 so the database version matches the consolidated format changes.
* feat(zebrad): harden tree_aux peer refetch policy
Track tree_aux root provenance in the zebrad driver rather than zebra-state, preserving the state/network crate boundary while letting rejected roots be attributed to the peer that supplied them. A rejected supplier now has all still-cached roots bulk-evicted, is excluded from tree_aux selection during a hard-failure cooldown, and is only disconnected after repeat failures in the decay window.
Expose a provenance-preserving fetch helper in zebra-network and a bulk invalidation API on the state peer-source writer so the driver can recover from poisoned root windows without grinding height-by-height. Document the final adversarial peer policy, observability, and test coverage in the VCT design.
Tested with:
- cargo test -p zebrad tree_aux_driver
- cargo test -p zebra-state peer_source_bulk_invalidate_evicts_multiple_roots
- cargo test -p zebra-network zakura::tree_aux
* comments
* feat(zakura): harden tree_aux peer liveness
Add stream-local peer policy for tree_aux fetches so transient request failures have memory without being treated as verified bad content. The network fetch path now reports per-peer request outcomes, supports normal/demoted/excluded peer preferences, and keeps soft-failed peers eligible as fallback while moving them behind healthy peers. The zebrad driver records bounded soft demotions separately from hard verification failures, clears them on successful responses, and keeps hard-failure cooldown/disconnect behavior authoritative.
Bound tree_aux liveness attacks in the fetch loop. Root responses now must make minimum progress on large requests, preventing one-root prefixes from amplifying a 4000-root fetch into thousands of round trips while still allowing small tail ranges. Fetches now use a tied, bounded hedge: start with one preferred peer, add up to two more after a short delay if the request is still unanswered, and stop after the first hedge group so one attempt no longer walks every peer through 30-second timeouts.
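The minimum-progress rule can be sketched as below. The floor of 256 roots is an assumed value picked for illustration (the commit does not state the real one), and `check_progress` and `max_round_trips` are hypothetical names.

```rust
/// Hypothetical floor: a response to a large request must carry at least this many roots.
const MIN_PROGRESS_ROOTS: usize = 256;

#[derive(Debug, PartialEq, Eq)]
enum ResponseCheck {
    Accept,
    TooShort,
    TooLong,
}

/// Validates a response length against the requested count.
fn check_progress(requested: usize, received: usize) -> ResponseCheck {
    if received > requested {
        return ResponseCheck::TooLong;
    }
    // Small tail ranges only need to be answered in full.
    let required = requested.min(MIN_PROGRESS_ROOTS);
    if received < required {
        ResponseCheck::TooShort
    } else {
        ResponseCheck::Accept
    }
}

/// Upper bound on round trips when every accepted response meets the floor.
fn max_round_trips(total_roots: usize) -> usize {
    (total_roots + MIN_PROGRESS_ROOTS - 1) / MIN_PROGRESS_ROOTS
}
```

Under this assumed floor a 4000-root fetch needs at most 16 accepted responses, instead of up to 4000 when one-root prefixes are allowed.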
Document the updated failure policy and liveness tradeoffs in the verified commitment tree design note. Add coverage for peer ordering, soft demotion expiry/clearing, hard-over-soft precedence, bounded soft-failure memory, minimum-progress validation, hedged fetch latency, cancellation safety for losing hedged requests, bounded hedge width, and retry recovery when the first hedge group soft-fails before an honest fourth peer is surfaced.
Verification:
- cargo fmt --all -- --check
- cargo test -p zebra-network tree_aux
- cargo test -p zebrad tree_aux_driver
- git diff --check
- ReadLints on changed files
* fix(state): guard VCT fast path at handoff
Keep the verified-commitment-trees fast path bounded to heights at or below the checkpoint handoff inside the finalized committer. This makes the handoff invariant explicit at the point where the committer decides whether to keep the frontier frozen, so any stale or over-eager root cache entry above the handoff is ignored and post-handoff blocks resume legacy recompute from the verified frontier.
Extend the VCT mode-switch regression to poison a cached root at handoff + 1 and assert the fast counter stops at the handoff while anchors, history, and tip frontiers remain byte-identical to the legacy recompute path. Also refresh the VCT design note and changelog entry to describe the default-on fast-sync behavior and kill switch.
Tests:
- cargo fmt --all -- --check
- cargo test -p zebra-state vct_mode_switches_continue_from_safe_boundaries
- git diff --check
* address comment
* clippy and docs
* clean up document
* lints
* lint
* chore: remove committed AI-workspace scratch notes from repo root
These root-level research/scratch markdown files (CHECKPOINT_SYNC_FINDINGS,
RUNBOOK, HANDOFF, SAPLING_HASH_RESULTS, etc.) are AI-workspace notes that were
committed inadvertently and fail the repo Docs Check (markdownlint MD0xx and a
codespell typo). They are not project documentation; remove them.
* renames
* improve comments
* renames
* renames
* renames
* renames
* renames
* more edits
* more comments
* build issue
* refactor!: move aux tree to the headersync message
(cherry picked from commit faa17f6e79fac9ea2dbb21548747567c69771630)
* test(network): align header-sync tests with non-finalized tree-aux-root rejection
The reactor already rejects header-sync responses that carry tree-aux roots
on a non-finalized range (`UnrequestedTreeAuxRoots` at decode,
`MalformedMessage` at the reactor), but the tests still sent roots on those
ranges. Switch non-finalized test messages to roots-free builders, add the
`finalized_*` opt-in builders for finalized ranges, and add guard tests:
- `decode_rejects_tree_aux_roots_when_not_requested`
- `non_finalized_response_carrying_tree_aux_roots_is_malformed`
Also wait on the backfilled headers landing (not the pre-set finalized
height) before asserting the backward checkpoint-range commit trace.
(cherry picked from commit 0872488b679ce451ce3a8f8876c7865a77bb8265)
* feat(network): trace header-carried tree-aux roots and vct fast-path hits
Add observability for the header-carried tree-aux-roots feature:
- header-sync `headers_served`/`headers_received` rows now carry
`want_tree_aux_roots` and `tree_aux_roots_len`, and the header-sync
driver's commit-state rows carry `tree_aux_roots_len`
- new `state.vct.fast_path.{hit,miss}` counters record whether a finalized
commit consumed peer-supplied roots to skip the note-commitment rebuild
- `insert_bool` trace helper and a small `root_bytes` if/else cleanup
(cherry picked from commit 37dfbb80f9947062bbcbbcb3e4f1e183a9d488e1)
* fix(zebrad): serve the aligned tree-aux root prefix when roots lag headers
tree_aux_roots_for_served_header_range returned an empty vec whenever the
available roots did not cover every requested header height. Because served
headers normally run ahead of committed/provisional roots, that empty-on-gap
behavior meant no roots were ever served over the header-sync path, silently
disabling header-carried tree-aux roots. Stop at the first missing or
misaligned height and return the aligned prefix collected so far, matching
the served_header_tree_aux_roots_require_a_complete_aligned_prefix test.
* refactor!: remove the separate tree_aux fetch stream
Header-carried tree-aux roots (the header-sync Headers message plus
CommitHeaderRange persistence to zakura_header_commitment_roots_by_height)
fully replace the old separate tree_aux request/response stream, so remove it.
- zebra-network: delete the zakura/tree_aux client/server module, its stream
kind, capability registration, and the tree_aux_port plumbing through
init_with_zakura_header_sync / spawn_zakura_endpoint / service_registry.
- zebrad: delete tree_aux_driver (StateTreeAuxPort serving + run_tree_aux_driver
fetch loop) and its start.rs wiring; drop the now-dead tree_aux_roots_writer
argument from drive_zakura_header_sync_actions.
- zebra-state: remove TreeAuxRootsWriter, PeerSourceWriter, PeerSourceHandle, and
the peer-root refetch signal; drop the writer from zebra_state::init's return.
PeerSource stays as the DB-backed reader the committer uses. A missing or
rejected root now waits for header sync to deliver a replacement via the
in-place commit retry, or (on an archive node) recomputes from the per-height
trees; it no longer refetches over a separate stream.
zakura-commit-bench --with-roots is disabled pending a re-port onto the
CommitHeaderRange path.
(cherry picked from commit 979254f3edcd6d382d986f6a15473cb2cdaa4da6)
* fix(zakura): handle incomplete header roots
* comment
* service comments
* update docs
* propagate and debug log errors for observability
* lints
* lints
* fix(zakura): request header roots through checkpoint handoff
* feat(state): stitch tree_aux root serving across the vct upgrade height
Record a one-time `vct_upgrade_height` marker `U` (the lowest height this
binary commits, and the lowest height in the `commitment_roots_by_height`
serving index) in a new `vct_upgrade_metadata` column family. Written once on
the first committed block and never moved.
Root serving (`ReadRequest::BlockRoots`) now stitches the per-height trees
below `U` with the serving index at and above `U`, so a node that upgraded
mid-chain serves a range crossing `U` as one gap-free batch instead of the
short index-only prefix that stalled the fetch client's minimum-progress
check. A pre-index archive node (no `U`) still derives the whole range from
the trees.
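A minimal sketch of the stitch, assuming two height-keyed maps stand in for "roots derivable from per-height trees" and the serving index. `RootStore` and its methods are hypothetical; the real read goes through `ReadRequest::BlockRoots`.

```rust
use std::collections::BTreeMap;

type Root = [u8; 32];

/// Hypothetical view of the two root sources around the upgrade height `U`.
struct RootStore {
    /// `None` models a pre-index archive node with no marker.
    upgrade_height: Option<u32>,
    /// Roots derivable from per-height trees.
    roots_from_trees: BTreeMap<u32, Root>,
    /// The `commitment_roots_by_height` serving index.
    index: BTreeMap<u32, Root>,
}

impl RootStore {
    /// Picks the source for one height: trees below `U`, index at and above it.
    fn root_at(&self, height: u32) -> Option<Root> {
        match self.upgrade_height {
            Some(u) if height >= u => self.index.get(&height).copied(),
            _ => self.roots_from_trees.get(&height).copied(),
        }
    }

    /// Serves one gap-free batch, even when the range crosses `U`.
    fn block_roots(&self, start: u32, count: u32) -> Vec<Root> {
        (start..start.saturating_add(count))
            .map_while(|height| self.root_at(height))
            .collect()
    }
}
```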
Historical note-commitment tree availability is now the band `[U, H)` (H =
checkpoint handoff) via a new `vct_tree_absent` helper: trees are present
below `U` (pre-upgrade) and at/above `H` (semantic sync), absent only in
between. For a genesis fast-sync (`U = 0`) this reduces exactly to the prior
`height < H` behaviour.
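The band check reduces to a short predicate. The signature below is an assumption (the real `vct_tree_absent` helper may take different arguments), as is treating a missing marker as "trees present everywhere", which matches the pre-index archive case described above.

```rust
/// Returns true when the per-height note-commitment tree is absent at `height`,
/// i.e. when `height` lies in the band `[U, H)`.
///
/// `upgrade_height` is `U` (or `None` when no marker was written) and
/// `handoff_height` is `H`.
fn vct_tree_absent(height: u32, upgrade_height: Option<u32>, handoff_height: u32) -> bool {
    match upgrade_height {
        Some(u) => u <= height && height < handoff_height,
        // Assumption: without a marker, every height is served from trees.
        None => false,
    }
}
```

With `U = 0` the lower bound is always satisfied, so the predicate is exactly `height < H`.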
* test(state): refresh column-family snapshots for vct column families
Regenerate the column_family_names and per-CF raw-data/empty snapshots to
reflect the vct_sync_metadata rename, the zakura_header_commitment_roots_by_height
feature CF, and the new vct_upgrade_metadata CF added by the upgrade-height stitch.
* feat!: enforce ranged header requests have roots (#282)
* feat!: enforce ranged header requests have roots
* test(zebrad): re-export header root-coverage helpers for driver tests
The zakura_header_sync_driver_tests module imports block_roots_cover_range
and root_covered_query_best_header_tip through super::zakura::, but the
zakura mod never re-exported them, so the zebrad lib test target failed to
compile (E0432, with a cascading E0282). Add both to the #[cfg(test)]
re-export block. They are pub(crate) in header_sync_driver and already used
by production code in that module.
* docs: PR #282 review notes and header-sync-roots follow-ups
* comments
---------
Co-authored-by: roman <roman@osmosis.team>
---------
Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: evan-forbes <evan.samuel.forbes@gmail.com>
Co-authored-by: Evan Forbes <42654277+evan-forbes@users.noreply.github.com>
Excluded
#151. This PR is to be reverted. Using it as a persistent workaround to eliminate network quality issues in tests.