feat(mempool): persist predictions and reconcile against confirmed blocks #131

@Pablosinyores

Phase goal: persist mempool predictions, reconcile against confirmed blocks

PR #118 wired live mempool decoding and analytical V2/V3/Balancer post-state simulation. Predictions today are emitted as Prometheus counters + JSON dumps; there is no relational record we can SQL-query to ask "for each pending tx we decoded, did it land where we expected and in the order we expected?"

This issue adds that record + a reconciler that closes the loop against confirmed blocks. Net result: SQL-driven measurement of analytical-sim accuracy on real mainnet traffic, no submission risk.

Out of scope

Schema additions (migration 0002_mempool_predictions.sql)

CREATE TABLE mempool_predictions (
    prediction_id              UUID PRIMARY KEY,
    decoded_at                 TIMESTAMPTZ NOT NULL,          -- client-set
    pending_tx_hash            BYTEA       NOT NULL UNIQUE,
    router_address             BYTEA       NOT NULL,
    protocol                   TEXT        NOT NULL,           -- uni_v2 / sushi / uni_v3 / curve / balancer
    token_in                   BYTEA       NOT NULL,
    token_out                  BYTEA       NOT NULL,
    amount_in                  NUMERIC(78,0) NOT NULL,
    pool_address               BYTEA,                          -- NULL when registry miss
    predicted_target_block     BIGINT      NOT NULL,           -- current_head + 1 at decode time
    predicted_post_state       JSONB       NOT NULL,           -- shape varies by protocol
    profit_factor_predicted    DOUBLE PRECISION,               -- if the post-state scan found a cycle
    detection_lead_ms          BIGINT,                         -- decoded_at - earliest builder-side timestamp
    engine_git_sha             TEXT
);
-- pending_tx_hash needs no separate index: the UNIQUE constraint above already creates one
CREATE INDEX ON mempool_predictions (predicted_target_block);
CREATE INDEX ON mempool_predictions (decoded_at DESC);

CREATE TABLE mempool_reconciliation (
    prediction_id          UUID PRIMARY KEY REFERENCES mempool_predictions(prediction_id) ON DELETE CASCADE,
    resolution_ts          TIMESTAMPTZ NOT NULL,
    outcome                TEXT NOT NULL CHECK (outcome IN ('confirmed','dropped','replaced','still_pending')),
    actual_target_block    BIGINT,                              -- NULL for dropped
    actual_tx_index        INT,                                 -- position within the block
    block_delta            INT,                                 -- actual - predicted (negative = earlier than predicted)
    ordering_correct       BOOLEAN,                             -- did our predicted tx-index match the actual ordering bucket
    pool_path_correct      BOOLEAN,                             -- did the swap hit the pool we said it would
    replaced_by_tx_hash    BYTEA,
    failure_reason         TEXT
);
CREATE INDEX ON mempool_reconciliation (actual_target_block);
CREATE INDEX ON mempool_reconciliation (outcome);
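
The amount_in column is declared NUMERIC(78,0), presumably so that any EVM uint256 amount fits without truncation. A minimal Go sketch that checks this bound follows; maxUint256 and fitsNumeric78 are hypothetical helper names, not part of the codebase.

```go
package main

import (
	"fmt"
	"math/big"
)

// maxUint256 returns 2^256 - 1, the largest value an EVM uint256 can hold.
func maxUint256() *big.Int {
	one := big.NewInt(1)
	v := new(big.Int).Lsh(one, 256)
	return v.Sub(v, one)
}

// fitsNumeric78 reports whether v is non-negative and has at most 78 decimal
// digits, i.e. whether it can be stored in a NUMERIC(78,0) column.
// Hypothetical helper for illustration only.
func fitsNumeric78(v *big.Int) bool {
	return v.Sign() >= 0 && len(v.String()) <= 78
}

func main() {
	m := maxUint256()
	fmt.Println(len(m.String()), fitsNumeric78(m)) // prints "78 true"
}
```

A writer could run a check like this before insert to reject malformed amounts early instead of relying on a Postgres error.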

Wiring

Rust — crates/grpc-server/src/mempool_pipeline.rs: after try_post_state_scan runs, emit a MempoolPrediction event over a new bounded channel. New MempoolPredictionWriter (sibling to the existing Ledger trait pattern) inserts rows. Lockstep with the existing MEMPOOL_TRACKING=1 gate — no DB writes when the flag is off.

Go — cmd/monitor/reconciler.go (new): consumes new-head notifications without opening a second WS subscription. The Rust engine, which already subscribes to newHeads over the existing WS provider, forwards BlockConfirmed{block_number, block_hash} events over the existing gRPC stream. For each new block, the reconciler then:

  1. eth_getBlockByHash(block_hash, full_txs=false) → tx hash list with positions
  2. For each tx hash, look up mempool_predictions WHERE pending_tx_hash = ?
  3. If found: compute block_delta, ordering_correct (predicted_index within ± 2 of the actual tx index), and pool_path_correct (a swap event in the tx receipt was emitted by our pool_address), then insert a mempool_reconciliation row
  4. For predictions in mempool_predictions with predicted_target_block + 12 < new_head and no reconciliation row: insert outcome='dropped'
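
The per-block steps above can be sketched in Go roughly as follows. This is an illustration under assumptions, not the planned implementation: the Prediction and Result types and the reconcile function are hypothetical, PredictedIndex has no corresponding column in the schema in this issue, the loop is inverted to resolve one prediction at a time, and pool_path_correct is left out because it needs receipt logs.

```go
package main

import "fmt"

// Prediction holds the subset of a mempool_predictions row used here.
// PredictedIndex is an assumption: the schema in this issue has no such column.
type Prediction struct {
	TxHash               string
	PredictedTargetBlock int64
	PredictedIndex       int
}

// Result holds the subset of a mempool_reconciliation row computed here.
type Result struct {
	Outcome         string
	BlockDelta      int64 // actual - predicted
	ActualTxIndex   int
	OrderingCorrect bool
}

const (
	orderingTolerance = 2  // "predicted_index ± 2 of actual"
	dropAfterBlocks   = 12 // "predicted_target_block + 12 < new_head"
)

// reconcile checks one prediction against one confirmed block. It returns
// resolved=false when the tx is absent and the drop deadline has not passed.
func reconcile(p Prediction, blockNumber int64, txHashes []string) (res Result, resolved bool) {
	for i, h := range txHashes {
		if h != p.TxHash {
			continue
		}
		d := i - p.PredictedIndex
		if d < 0 {
			d = -d
		}
		return Result{
			Outcome:         "confirmed",
			BlockDelta:      blockNumber - p.PredictedTargetBlock,
			ActualTxIndex:   i,
			OrderingCorrect: d <= orderingTolerance,
		}, true
	}
	if p.PredictedTargetBlock+dropAfterBlocks < blockNumber {
		return Result{Outcome: "dropped"}, true
	}
	return Result{}, false
}

func main() {
	p := Prediction{TxHash: "0xabc", PredictedTargetBlock: 100, PredictedIndex: 3}
	fmt.Println(reconcile(p, 101, []string{"0x1", "0x2", "0xabc"}))
	fmt.Println(reconcile(p, 113, nil))
}
```

In the real reconciler the lookup would run the other way (block tx hashes probed against the pending_tx_hash index), but the outcome rules are the same.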

Metrics

| Metric | Type | Labels |
| --- | --- | --- |
| aether_mempool_predictions_persisted_total | counter | protocol |
| aether_mempool_reconciled_total | counter | outcome (confirmed/dropped/replaced) |
| aether_mempool_block_accuracy | gauge (1h window) | none (ratio confirmed-where-predicted / confirmed) |
| aether_mempool_pool_path_accuracy | gauge (1h window) | protocol |
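
One way the aether_mempool_block_accuracy ratio could be computed over the 1h window is sketched below. It assumes "confirmed-where-predicted" means block_delta = 0; the ConfirmedRow type and blockAccuracy function are hypothetical names, and timestamps are plain Unix seconds to keep the example small.

```go
package main

import "fmt"

// ConfirmedRow is a hypothetical view of a mempool_reconciliation row
// whose outcome is 'confirmed'.
type ConfirmedRow struct {
	ResolutionUnix int64 // resolution_ts as Unix seconds
	BlockDelta     int64 // block_delta column
}

// blockAccuracy returns (rows with BlockDelta == 0) / (all rows) among rows
// resolved in the half-open window (now-window, now]. ok is false when the
// window holds no confirmed rows, so the caller can skip updating the gauge.
func blockAccuracy(rows []ConfirmedRow, nowUnix, windowSecs int64) (ratio float64, ok bool) {
	var hits, total int
	for _, r := range rows {
		if r.ResolutionUnix <= nowUnix-windowSecs || r.ResolutionUnix > nowUnix {
			continue // outside the rolling window
		}
		total++
		if r.BlockDelta == 0 {
			hits++
		}
	}
	if total == 0 {
		return 0, false
	}
	return float64(hits) / float64(total), true
}

func main() {
	rows := []ConfirmedRow{
		{ResolutionUnix: 9000, BlockDelta: 0},
		{ResolutionUnix: 9500, BlockDelta: 1},
		{ResolutionUnix: 1000, BlockDelta: 0}, // older than the window, ignored
	}
	fmt.Println(blockAccuracy(rows, 10000, 3600)) // prints "0.5 true"
}
```

Returning ok=false for an empty window avoids publishing a misleading 0 when nothing was confirmed in the last hour.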

Test plan

  • Unit: schema round-trip in internal/db/ledger_test.go style — insert prediction, insert reconciliation, query the join.
  • Integration: anvil fork test — replay a known historical block, decode its pending pool through the predictor, assert the reconciliation row's ordering_correct = true and block_delta = 0.
  • Live 30-min mainnet soak with MEMPOOL_TRACKING=1 and Postgres reachable: expect aether_mempool_predictions_persisted_total > 0, aether_mempool_reconciled_total{outcome="confirmed"} > 0, accuracy gauges populated.

Acceptance criteria

  • Migration 0002_mempool_predictions.sql lands; make migrate-up applies cleanly on a fresh PG14
  • Rust writer persists one row per decoded mempool swap (gated by MEMPOOL_TRACKING=1 + MEMPOOL_LEDGER_DSN env)
  • Go reconciler resolves every persisted prediction within predicted_target_block + 12 blocks
  • 30-min live mainnet run: SELECT outcome, COUNT(*) FROM mempool_reconciliation GROUP BY outcome returns a non-zero count for at least the confirmed and dropped outcomes
  • Grafana panel: block-accuracy time series over a rolling 1h window

Estimated scope

~700 lines new Rust + ~500 lines new Go + 80-line migration + dashboards. Two PRs:

  1. Schema + Rust writer + gRPC event — lands the predictor → DB plumbing
  2. Go reconciler + dashboard — closes the loop
