
fix(state): repair pruned pre-Ironwood history tree in place instead of forcing a re-sync#242

Open: czarcas7ic wants to merge 2 commits into add-ironwood-v6-value-pool from adam/fix-pruned-ironwood-migration

@czarcas7ic czarcas7ic commented Jun 23, 2026

Summary

The Ironwood history-tree upgrade rebuilds the tip ZIP-221 tree from finalized blocks. A node pruned before the Ironwood bump no longer has those blocks, so the upgrade fails with a fatal "delete the cache directory and re-sync from genesis" error — including for pruned snapshots upgrading to Ironwood.

The on-disk failure is purely a buffer-width change: adding V3 (Ironwood) node data grew the history-tree Entry from 253 to 326 bytes. A pre-Ironwood tip's V1/V2 node data is consensus-fixed and still present in the stored entry. This adds an in-place re-encode fallback: when the from-blocks rebuild reports MissingData, read the stored old-format tip entry, widen each peak's buffer to the current size (padding only), and write it back — no blocks required.
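
The widening step described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the two widths come from the description, while `widen_entry` and `widen_peaks` are hypothetical names and the entries are modelled as plain byte arrays.

```rust
/// Entry widths quoted in the PR description.
const OLD_MAX_ENTRY_SIZE: usize = 253;
const MAX_ENTRY_SIZE: usize = 326;

/// Widens one old-format entry by appending zero padding.
/// Hypothetical helper: the name and signature are illustrative only.
fn widen_entry(old: &[u8; OLD_MAX_ENTRY_SIZE]) -> [u8; MAX_ENTRY_SIZE] {
    // Start from an all-zero buffer so the new tail is pure padding.
    let mut wide = [0u8; MAX_ENTRY_SIZE];
    // Copy the stored V1/V2 bytes into the prefix, unchanged.
    wide[..OLD_MAX_ENTRY_SIZE].copy_from_slice(old);
    wide
}

/// Widens every peak of a stored tip tree; no block data is consulted.
fn widen_peaks(old_peaks: &[[u8; OLD_MAX_ENTRY_SIZE]]) -> Vec<[u8; MAX_ENTRY_SIZE]> {
    old_peaks.iter().map(widen_entry).collect()
}
```

Because the old bytes are copied verbatim and only zeros are appended, the operation needs nothing beyond the stored entry itself.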

Why it's consensus-safe

The inner zcash_history reader consumes only the meaningful node-data prefix and ignores trailing zero padding, and version dispatch keys on the tip height — so a pre-NU6.3 tip's peaks are parsed as V1/V2 exactly as written. The re-encoded tree is therefore byte-for-byte equal (same peaks, same size, same MMR root) to what the from-blocks rebuild and a fresh sync produce. Sync-time block-commitment validation already guarantees the stored tree equals the canonical tree, so reusing it cannot diverge.
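
A toy model of that argument, under loudly stated assumptions: the field layout, the activation height, and every name below are invented for illustration and are not the zcash_history format. It only shows why a reader that dispatches on tip height and consumes a fixed prefix returns the same value for an old-width buffer and its zero-padded counterpart.

```rust
/// Hypothetical activation height, for illustration only.
const NU6_3_ACTIVATION_HEIGHT: u32 = 1_000;

/// Toy node data: the real V1/V2/V3 layouts have many more fields.
#[derive(Debug, PartialEq, Eq)]
enum NodeData {
    V2 { subtree_work: u64 },
    V3 { subtree_work: u64, ironwood_field: u64 },
}

/// Reads a little-endian u64 at `at`, or None if the buffer is too short.
fn read_u64(buf: &[u8], at: usize) -> Option<u64> {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(buf.get(at..at + 8)?);
    Some(u64::from_le_bytes(bytes))
}

/// Picks the version from the tip height, then reads only that version's
/// prefix. Bytes after the prefix are never inspected.
fn read_node(buf: &[u8], tip_height: u32) -> Option<NodeData> {
    let subtree_work = read_u64(buf, 0)?;
    if tip_height < NU6_3_ACTIVATION_HEIGHT {
        Some(NodeData::V2 { subtree_work })
    } else {
        let ironwood_field = read_u64(buf, 8)?;
        Some(NodeData::V3 { subtree_work, ironwood_field })
    }
}
```

In this model, a tip below the activation height parses identically from a 16-byte buffer and from the same bytes followed by any amount of zero padding, which mirrors the claim made for the re-encoded peaks.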

The archive (from-blocks) path is unchanged: the fallback runs only on the previously-fatal pruned path, and a genuinely unreadable entry still fails loudly with the re-sync message.

…of forcing a re-sync

The Ironwood history-tree upgrade rebuilds the tip ZIP-221 tree from finalized
blocks. A node pruned before the Ironwood bump no longer has those blocks, so the
upgrade fails with a fatal "delete the cache directory and re-sync" error.

The on-disk failure is purely a buffer-width change: adding V3 (Ironwood) node
data grew the history-tree Entry from 253 to 326 bytes. A pre-Ironwood tip's
V1/V2 node data is consensus-fixed and still present in the stored entry. This
adds an in-place re-encode fallback: when the from-blocks rebuild reports
MissingData, read the stored old-format tip entry, widen each peak's buffer to
the current size (padding only), and write it back -- no blocks required.

The inner reader consumes only the meaningful node-data prefix and ignores the
trailing zero padding, and version dispatch keys on the tip height, so a
pre-NU6.3 tip's peaks are parsed as V1/V2 exactly as written. The re-encoded tree
is therefore byte-for-byte equal (same peaks, same size, same MMR root) to what
the from-blocks rebuild and a fresh sync produce. The archive from-blocks path is
unchanged; the re-encode runs only on the previously-fatal pruned path, and a
genuinely unreadable entry still fails loudly.

v12-auditor Bot commented Jun 23, 2026

Note: Audit complete. V12 found one issue worth reviewing.

Finding: F-91761
Severity: 🟡 Medium
Details: Shallow History-Tree Repair

The new MissingData fallback rewrites an unreadable old history-tree entry by deserializing it as OldHistoryTreeParts, zero-padding each entry, and saving current-format HistoryTreeParts bytes. It treats the repair as successful when needs_rebuild() returns false, but that predicate only bincode-deserializes the wrapper and never reconstructs a NonEmptyHistoryTree, checks network_kind, validates peak/size consistency, or compares current_height with the finalized tip. A malformed but structurally old-format cache or restored snapshot can therefore be marked upgraded with stale or invalid history-tree metadata when pruning prevents the block-based rebuild from running. Later history-tree readers call with_network().expect(...), and checkpoint commits use db.history_tree() as the parent chain-history root, so the invalid value is detected only after the shallow repair has succeeded and the upgrade validation can advance the version marker.
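
One way to picture the deeper checks the finding asks for is the sketch below. It is not part of this PR and does not use zebra-state's real types: `TreeParts`, `RepairError`, and `validate_repaired` are hypothetical, and the field names are assumptions modelled on the ones the finding mentions. The peak/size check relies on a general MMR property: a tree with n leaves has 2n - popcount(n) nodes and popcount(n) peaks.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NetworkKind {
    Mainnet,
    Testnet,
}

/// Hypothetical stand-in for the stored tree parts; field names are assumptions.
struct TreeParts {
    network_kind: NetworkKind,
    size: u32,                     // total number of MMR nodes
    peaks: BTreeMap<u32, Vec<u8>>, // node index -> entry bytes
    current_height: u32,
}

#[derive(Debug, PartialEq, Eq)]
enum RepairError {
    WrongNetwork,
    InconsistentPeaks,
    StaleHeight,
}

/// Returns the leaf count of an MMR with `size` nodes, if such an MMR exists.
fn leaves_for_size(size: u32) -> Option<u64> {
    let size = u64::from(size);
    // Try each possible peak count p and solve size = 2n - p for n.
    (1..=32u64).find_map(|p| {
        let doubled = size + p;
        let n = doubled / 2;
        (doubled % 2 == 0 && u64::from(n.count_ones()) == p).then_some(n)
    })
}

/// Checks a repaired entry before the upgrade is marked complete.
fn validate_repaired(
    parts: &TreeParts,
    expected_network: NetworkKind,
    finalized_tip: u32,
) -> Result<(), RepairError> {
    if parts.network_kind != expected_network {
        return Err(RepairError::WrongNetwork);
    }
    let leaves = leaves_for_size(parts.size).ok_or(RepairError::InconsistentPeaks)?;
    if parts.peaks.len() as u32 != leaves.count_ones() {
        return Err(RepairError::InconsistentPeaks);
    }
    if parts.current_height != finalized_tip {
        return Err(RepairError::StaleHeight);
    }
    Ok(())
}
```

A real check would presumably also reconstruct the NonEmptyHistoryTree, as the finding notes; the sketch covers only the metadata comparisons that need no tree reconstruction.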

Analyzed three files, diff abf6844...2d1e710.

…re-encode

Adds the regression tests requested before this leaves draft:
- reencode_round_trip_restores_current_format_bytes: synthesizes a genuine
  pre-Ironwood (253-byte-entry) blob from a real synced tip tree, confirms it is
  unreadable in the current format (reproducing the Ironwood UnexpectedEof), then
  asserts the re-encode reproduces the original current-format bytes exactly and
  yields the same MMR root.
- pruned_old_format_database_repairs_in_place_via_reencode_fallback: a pruned
  old-format database (source blocks gone) is repaired in place via the fallback,
  leaving the history root unchanged.

Also pins OLD_MAX_ENTRY_SIZE at compile time (= MAX_ENTRY_SIZE - V3 node-data
delta) and adds a test-only old-width encoder (cfg(test)) used to synthesize the
genuine old blob.

cargo test -p zebra-state -- rebuild_history_tree: 6 passed, 0 failed.
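
For readers unfamiliar with the pattern, a compile-time pin and an old-width round trip might look roughly like this. It is a sketch under assumptions, not the PR's code: `V3_NODE_DATA_DELTA` is derived here as 326 - 253 = 73, and `narrow_entry` and `widen_entry` are hypothetical helpers operating on plain byte arrays.

```rust
const MAX_ENTRY_SIZE: usize = 326;
/// Bytes added by V3 node data. Assumption: derived as 326 - 253.
const V3_NODE_DATA_DELTA: usize = 73;
const OLD_MAX_ENTRY_SIZE: usize = MAX_ENTRY_SIZE - V3_NODE_DATA_DELTA;

// Compile-time pin: the build fails if the old width drifts from 253.
const _: () = assert!(OLD_MAX_ENTRY_SIZE == 253);

/// Old-width encoder in the spirit of the test-only helper: truncates a
/// current-format entry. Returns None if the dropped tail is not all zeros,
/// since that would discard real data rather than padding.
fn narrow_entry(wide: &[u8; MAX_ENTRY_SIZE]) -> Option<[u8; OLD_MAX_ENTRY_SIZE]> {
    let (head, tail) = wide.split_at(OLD_MAX_ENTRY_SIZE);
    if tail.iter().any(|b| *b != 0) {
        return None;
    }
    let mut old = [0u8; OLD_MAX_ENTRY_SIZE];
    old.copy_from_slice(head);
    Some(old)
}

/// Re-encode: copy the old bytes into a zeroed current-width buffer.
fn widen_entry(old: &[u8; OLD_MAX_ENTRY_SIZE]) -> [u8; MAX_ENTRY_SIZE] {
    let mut wide = [0u8; MAX_ENTRY_SIZE];
    wide[..OLD_MAX_ENTRY_SIZE].copy_from_slice(old);
    wide
}
```

Narrowing a pre-Ironwood entry and widening it again reproduces the original bytes exactly, which is the property the round-trip test asserts for the whole tree.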
@czarcas7ic czarcas7ic marked this pull request as ready for review June 24, 2026 01:16