@golddydev golddydev commented Oct 13, 2025

feat: add historical SPDD storage and query capabilities

Summary

This PR implements persistent storage and querying of historical Stake Pool Delegation Distribution (SPDD) data using the fjall embedded key-value database. This enables querying stake distribution snapshots by epoch and pool ID.

Changes

New Features

  • SPDD Historical Storage: Added SPDDStore module to persist SPDD state per epoch with retention configuration.
  • New Queries:
    • GetSPDDByEpoch: Query all delegations for a specific epoch
    • GetSPDDByEpochAndPool: Query delegations for a specific pool in an epoch
  • Configuration Option: store-spdd-history flag (default: false) to enable/disable SPDD persistence
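The options above might be wired up in configuration roughly like this. This is a hypothetical fragment: the section name and exact key spellings are assumptions for illustration, based on the flags described in this PR (the PR mentions updating omnibus.toml).

```toml
# Hypothetical omnibus.toml fragment; section and key names are assumptions.
[module.accounts-state]
store-spdd-history = true      # default is false; enables SPDD persistence
spdd-retention-epochs = 73     # keep roughly one year of epochs, or "none" for all
```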

Core Implementation

  • Database Layer (spo_distribution_store.rs):

    • Key encoding: epoch (8 bytes) + pool_id (28 bytes) + stake_key (28 bytes)
    • Value: stake amount (8 bytes)
    • Batch writes with 10K batch size for performance
    • Prefix-based queries for efficient lookups
  • State Integration:

    • dump_spdd_state() method added to extract current SPDD state
    • Automatic persistence at epoch boundaries (stored under epoch + 1, since the snapshot represents the active stake for the next epoch)
    • Query handlers integrated into accounts_state module
  • Retention Epochs:

    • spdd-retention-epochs: Set to an unsigned integer (u64) to keep only the last N epochs of SPDD history, or "none" to keep all history.
  • Network Support:

    • Added GetNetworkName query to parameters-state module
    • REST endpoints dynamically query current network (mainnet/testnet)
    • Stake addresses generated with correct network prefix
    • Returns 500 error for unknown networks
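The key layout above (epoch + pool_id + stake_key) is what makes the two prefix-based queries cheap. A minimal sketch of that encoding follows; the function names are assumptions for illustration, not the PR's actual code. Big-endian epoch bytes make keys sort chronologically, so both GetSPDDByEpoch and GetSPDDByEpochAndPool reduce to prefix scans.

```rust
/// Build the 64-byte key: epoch (8 bytes) + pool_id (28) + stake_key (28).
/// Big-endian epoch bytes keep keys sorted by epoch in the store.
fn encode_key(epoch: u64, pool_id: &[u8; 28], stake_key: &[u8; 28]) -> [u8; 64] {
    let mut key = [0u8; 64];
    key[..8].copy_from_slice(&epoch.to_be_bytes());
    key[8..36].copy_from_slice(pool_id);
    key[36..64].copy_from_slice(stake_key);
    key
}

/// Prefix for a GetSPDDByEpoch-style scan: every key for `epoch`
/// starts with these 8 bytes.
fn epoch_prefix(epoch: u64) -> [u8; 8] {
    epoch.to_be_bytes()
}

/// Prefix for a GetSPDDByEpochAndPool-style scan: epoch + pool_id.
fn epoch_pool_prefix(epoch: u64, pool_id: &[u8; 28]) -> [u8; 36] {
    let mut p = [0u8; 36];
    p[..8].copy_from_slice(&epoch.to_be_bytes());
    p[8..].copy_from_slice(pool_id);
    p
}

fn main() {
    let key = encode_key(500, &[1u8; 28], &[2u8; 28]);
    // A full key always starts with its epoch prefix and its epoch+pool prefix.
    assert!(key.starts_with(&epoch_prefix(500)));
    assert!(key.starts_with(&epoch_pool_prefix(500, &[1u8; 28])));
    println!("key layout ok: {} bytes", key.len());
}
```

With this layout, a per-pool query hands the 36-byte prefix to the store's prefix iterator and reads only that pool's delegations, never the whole epoch.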

Developer Experience

  • Serialization Utilities: Added generic DisplayFromBech32<PREFIX> with HRP prefix support for cleaner Bech32 serialization/deserialization

Other

  • Added .gitignore entry for *_db database directories

Testing

  • Unit tests included for SPDD store operations (store, query by epoch, query by epoch and pool)

Notes

  • This PR assumes that we are running on mainnet when we convert stake key hash to bech32 stake address.
  • SPDD storage is disabled by default to avoid unnecessary storage overhead
  • When enabled, historical data is stored at the beginning of each epoch
  • Database is created at spdd_db/ path (gitignored)

Storage Estimation Per Epoch

Data Structure

Per delegation record:

  • Key: 64 bytes (8 epoch + 28 pool_id + 28 stake_key)
  • Value: 8 bytes (stake amount)
  • Raw total: 72 bytes per delegation

Cardano Network Statistics (current mainnet):

  • Active stake addresses: ~1.3 million
  • Active stake pools: ~3,000
  • Average delegations per pool: ~430

Estimated on-disk size per epoch:
≈ 115-125 MB per epoch
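A quick back-of-envelope check of that figure, using the record layout and the assumed network statistics above:

```rust
fn main() {
    // Assumed figures from this PR description, not measured values.
    let delegations: u64 = 1_300_000; // active stake addresses
    let key_bytes: u64 = 8 + 28 + 28; // epoch + pool_id + stake_key
    let value_bytes: u64 = 8;         // stake amount
    let raw = delegations * (key_bytes + value_bytes);
    println!("raw payload: {:.1} MB per epoch", raw as f64 / 1e6);
    // ~93.6 MB of raw key/value payload; on-disk structures in the
    // embedded store plausibly push this to the ~115-125 MB quoted above.
}
```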

Long-term Storage Projections

| Time Period | Epochs | Storage Required |
|-------------|--------|------------------|
| 1 month     | ~6     | ~720 MB          |
| 6 months    | ~36    | ~4.3 GB          |
| 1 year      | ~73    | ~8.8 GB          |
| 2 years     | ~146   | ~17.6 GB         |
| 5 years     | ~365   | ~44 GB           |

…counts state module

- Added SPDDStore to store Stake Pool Delegation Distribution data using fjall.
- Introduced methods to store and query SPDD by epoch and pool ID.
- Updated AccountsState to handle new SPDD queries.
- Enhanced REST API to support fetching SPDD data by epoch and pool.
- Added configuration option to enable SPDD history storage.
- Add serialize as functions for Bech32.
- Included tests for SPDDStore functionality.
…nitialization

- Removed existing data deletion logic from SPDDStore initialization.
@golddydev golddydev marked this pull request as ready for review October 14, 2025 07:24
- Introduced  query to  enum.
- Use network from  instead of hard-coded, in epochs spdd endpoints
- Introduced a new configuration option for SPDD retention epochs in accounts state.
- Updated SPDDStore to support retention logic, allowing automatic pruning of old epochs.
- Enhanced logging to provide information on retention settings and pruning actions.
- Added tests to verify the pruning functionality for stored epochs.
- Updated SPDDStore to track the latest stored epoch and added logic to check if an epoch is stored based on retention settings.
- Modified AccountsState to utilize the new epoch checking logic when querying SPDD data.
- Adjusted omnibus.toml to enable SPDD history storage and set retention epochs for better management of stored data.
@lowhung lowhung left a comment
Had some questions re: comments being left in the code. Would be curious to hear what others think about this too.

- Refactored  to streamline epoch data management, including methods for checking epoch completion and removing epoch data.
- Updated query methods to utilize the new  type and improved error handling for incomplete epochs.
- Enhanced test cases to reflect changes in data structures and ensure correctness of epoch operations.
- Introduced a new type  in  for better clarity.

sandtreader commented Oct 15, 2025

Overall comment - the description says it's a new SPDDStore module (which is ideal) but there are changes to AccountsState to do it. I'm worried AccountsState is already too big, so if it could be moved out into an SPDDStore which just tracks the SPDD output from AccountsState, that would be better.

(Actually just checked, we already have an SPDDState module which tracks this - this needs to go as an extension to that)

@whankinsiv

Hi @sandtreader. Golddy and I discussed this; the main reason we placed per-pool SPDD in accounts_state was to avoid passing the full Snapshot struct through the message bus to spdd_state. I'm not opposed to moving it, but wanted to avoid unnecessary overhead.

@sandtreader sandtreader left a comment

Needs to move to SPDDState as per comment

@sandtreader

> Hi @sandtreader. Golddy and I discussed this; the main reason we placed per-pool SPDD in accounts_state was to avoid passing the full Snapshot struct through the message bus to spdd_state. I'm not opposed to moving it, but wanted to avoid unnecessary overhead.

Ah, does it go down into individual delegators? Of course it does, that's why it's so huge. Understood - apologies and objection withdrawn!

@sandtreader sandtreader self-requested a review October 15, 2025 15:40
@golddydev golddydev requested a review from whankinsiv October 16, 2025 06:00
@whankinsiv whankinsiv left a comment

Looks great! Merging.

@whankinsiv whankinsiv merged commit de1b541 into main Oct 16, 2025
2 checks passed