feat(observability): mempool dashboard panels for reconciler + scorer#146

Merged
Pablosinyores merged 8 commits into `develop` from `feat/grafana-mempool-panels` on May 20, 2026

@Pablosinyores (Owner)

Summary

Extends `deploy/docker/grafana/dashboards/mempool.json` from 7 panels (PR #118 / #128 engine-side scaffold) to 22 panels covering the mempool reconciler (PR #134) and profitability scorer (PR #135 / #137). Adds the matching Prometheus scrape jobs so the panels actually have data when the obs stack is running.

Stacks on PR #145 (`feat/engine-v3-graph-reserves`).

What changed

`deploy/docker/prometheus.yml` — two new scrape jobs:

  • `aether-host-reconciler` → `host.docker.internal:9094` (Go reconciler default).
  • `aether-host-scorer` → `host.docker.internal:9095` + `:9097` (Rust scorer; 9095 is the default, 9097 is what soak ops uses via `PROFIT_SCORER_METRICS_ADDR`).

`deploy/docker/grafana/dashboards/mempool.json` — new panels organised into two collapsible rows:

Reconciler (PR #134)

| Panel | Query |
| --- | --- |
| Block accuracy (Δ ≤ 0) stat | `100 * sum(rate(aether_mempool_block_delta_bucket{le="0"}[1h])) / sum(rate(aether_mempool_block_delta_count[1h]))` |
| Pool-path accuracy stat | `sum(rate(aether_mempool_pool_path_total{correct="true"}[1h])) / sum(rate(aether_mempool_pool_path_total[1h]))` |
| Reconciler queue depth | `aether_mempool_reconciler_queue_depth` |
| Reconciler drops (5m rate) | `rate(aether_mempool_reconciler_drops_total[5m])` |
| Block-delta quantiles | `histogram_quantile(0.5/0.9, …)` on `_bucket` |
| Reconciliation outcomes pie | `sum by (outcome) (aether_mempool_reconciled_total)` |
| Reconciler error rates | per-source rate timeseries (header / lookup / receipt) |
| Reconciler write latency | p50 / p95 over the latency histogram |

Scorer (PR #135 / #137)

| Panel | Query |
| --- | --- |
| Decision breakdown pie | `sum by (decision) (aether_mempool_profit_scored_total)` |
| Scored rate by decision | `sum by (decision) (rate(aether_mempool_profit_scored_total[5m]))` |
| Scorer queue depth | `aether_mempool_profit_writer_queue_depth` |
| Scorer drops (5m rate) | `rate(aether_mempool_profit_writer_drops_total[5m])` |
| Scorer write latency | p50 / p95 split by `result` label |

Out of scope (deliberately)

Validation

  • `python3 -m json.tool mempool.json` parses cleanly. Panel count: 22 (7 existing engine panels + 13 new + 2 row dividers). Panel IDs: 1, 2, 5, 6, 8, 9, 10, 100, 11–18, 200, 19–23.
  • `yaml.safe_load(prometheus.yml)` parses cleanly.
  • Every PromQL expression in the new panels references a metric name observed live on the running engine (`localhost:9092`) or reconciler (`localhost:9094`) endpoints, or declared in `crates/grpc-server/src/profitability_writer.rs` for the scorer-side panels.

Live render not exercised

The local docker-compose obs stack (prom + grafana + alertmanager) is not currently running on this host (only `aether-postgres` is up). Bringing the stack up and pointing it at `host.docker.internal:9094` and `:9097` will render every panel against the real running services without further changes.

Adds the Rust writer half of issue #131. Every decoded pending-tx swap
whose post-state simulation succeeded now lands as a `mempool_predictions`
row, gated on `MEMPOOL_LEDGER_DSN` so unset = no DB writes, no behaviour
change.

- migrations/0003_mempool_predictions.sql: prediction table with the
  schema from #131. Distinct DSN from the trade ledger so the two
  ledgers are independently enable-able.
- crates/grpc-server/src/mempool_writer.rs: `MempoolPredictionSink`
  trait with `NoopMempoolSink` + `PgMempoolWriter` (sibling pattern to
  `aether_common::db::PgLedger` — bounded mpsc + dedicated writer task
  + saturation-drops + Prometheus surface).
- crates/grpc-server/src/mempool_pipeline.rs: `SimContext` carries the
  sink; `try_post_state_scan` builds a prediction after computing the
  post-state regardless of cycle profitability (the reconciler in #131
  Go half needs the full decoded-swap population).
- crates/grpc-server/src/main.rs: reads `MEMPOOL_LEDGER_DSN` +
  `AETHER_GIT_SHA`, wires the sink into the mempool path.
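
The "bounded mpsc + dedicated writer task + saturation-drops" pattern named above can be sketched with the standard library alone. This is a minimal illustration, not the `PgMempoolWriter` code: `BoundedSink`, `Prediction`, and `record` are hypothetical names, and `std::sync::mpsc::sync_channel` stands in for whatever async channel the real writer uses.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};
use std::sync::Arc;

/// Hypothetical payload; the real prediction row carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Prediction {
    pub tx_hash: String,
}

/// Producer half of the sink. It never blocks the caller.
pub struct BoundedSink {
    tx: SyncSender<Prediction>,
    drops: Arc<AtomicU64>,
}

impl BoundedSink {
    /// Returns the sink plus the receiver a dedicated writer task would drain.
    pub fn new(capacity: usize) -> (Self, Receiver<Prediction>) {
        let (tx, rx) = sync_channel(capacity);
        let sink = Self { tx, drops: Arc::new(AtomicU64::new(0)) };
        (sink, rx)
    }

    /// Enqueue without blocking. Returns false when the row was dropped
    /// because the queue was full or the writer has gone away.
    pub fn record(&self, p: Prediction) -> bool {
        match self.tx.try_send(p) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => {
                // This counter is what a drops_total metric would expose.
                self.drops.fetch_add(1, Ordering::Relaxed);
                false
            }
        }
    }

    /// Number of rows dropped so far.
    pub fn drops(&self) -> u64 {
        self.drops.load(Ordering::Relaxed)
    }
}
```

The design choice being illustrated is that a full queue costs a counter increment rather than back-pressure on the mempool hot path; the writer side would simply loop on the receiver and issue inserts.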

Metrics: `aether_mempool_predictions_persisted_total{protocol}`,
`aether_mempool_writer_{drops_total, queue_depth, write_latency_ms}`.

Followups in this phase:
- PR-2: Go reconciler against confirmed blocks (#131 second half).
- PR-3: aether-profit-scorer binary writing realized P&L (#132).

Closes the loop on PR #133's persisted predictions: subscribes to
newHeads, matches landed tx hashes against the predictions table, and
writes one mempool_reconciliation row per prediction once the outcome is
known. The two tables together answer "did the tx land where we said it
would, in the order we said it would, hitting the pool we said it
would?" — entirely in SQL.

- migrations/0004_mempool_reconciliation.sql: reconciliation table with
  outcome CHECK + cascade FK to mempool_predictions, both indexes from
  issue #131. Separate from PR-1's migration so each PR's schema move
  is reviewable in isolation.
- internal/db/mempool_reconciliation_pg.go: sibling pattern to PgLedger —
  pgxpool, bounded channel, dedicated writer goroutine. Provides:
    * LookupPredictionByTxHash (sync, hot-path on per-block tx loop)
    * InsertReconciliation (fire-and-forget)
    * MarkStaleAsDropped (batch INSERT … SELECT for the 12-block window)
- internal/db/mempool_reconciliation_metrics.go: aether_mempool_reconciled_total
  {outcome}, plus writer-internal drops/queue_depth/write_latency.
- cmd/reconciler/main.go: standalone aether-reconciler binary. Two loops:
    * newHeads → BlockByHash → per-tx prediction lookup → receipt fetch
      for pool_path_correct → outcome=confirmed insert
    * Every 6s: MarkStaleAsDropped(currentHead) for predictions where
      predicted_target_block + 12 ≤ head
- internal/db/mempool_reconciliation_test.go: pure unit tests for the
  outcome constants + StaleConfirmationWindow + metric registration,
  plus two integration tests gated on MEMPOOL_LEDGER_TEST_DSN that
  exercise the full SQL round-trip (insert prediction → lookup →
  insert reconciliation → SELECT join).
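
The stale-prediction rule above (`predicted_target_block + 12 ≤ head`) is small enough to state as code. The sketch below is written in Rust rather than the reconciler's Go, and the constant and function names are hypothetical; only the 12-block window and the comparison come from the commit message.

```rust
/// Blocks a prediction may trail the chain head before it is written off.
/// The value 12 is from the text above; the Rust name is an assumption.
pub const STALE_CONFIRMATION_WINDOW: u64 = 12;

/// True when `predicted_target_block + 12 <= head`.
/// An overflowing deadline can never be reached, so it is not stale.
pub fn is_stale(predicted_target_block: u64, head: u64) -> bool {
    predicted_target_block
        .checked_add(STALE_CONFIRMATION_WINDOW)
        .map_or(false, |deadline| deadline <= head)
}

/// Filter the target blocks that a periodic sweep at `head` would mark dropped.
pub fn stale_targets(predicted_targets: &[u64], head: u64) -> Vec<u64> {
    predicted_targets
        .iter()
        .copied()
        .filter(|&target| is_stale(target, head))
        .collect()
}
```

In the real reconciler this predicate lives in SQL (the batch `INSERT … SELECT`), so the function is only a way to pin the boundary: a prediction targeting block 100 becomes stale at head 112, not 111.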

Metrics: aether_mempool_reconciled_total{outcome}, aether_mempool_block_delta
(histogram), aether_mempool_pool_path_total{protocol,correct}, plus the
in-process counter family.

Follow-up:
- PR-3 adds the realized-profit scorer (#132).

Closes the value loop on PR #133 (predictions) + PR #134 (reconciliation)
by computing what our analytical arb cycle would have realised against
the actual post-state of the pool at the block where the victim swap
landed. The headline answer is `SUM(net_profit_wei) WHERE
decision='profitable'` over the soak window.

- migrations/0005_mempool_profitability.sql: profitability table
  with cycle_path JSONB, realized_profit_wei + realized_profit_eth +
  gas_estimate_wei + net_profit_wei, decision CHECK + cascade FK to
  mempool_predictions. Renumbered from #132's literal `0003` because
  0001-0004 are already taken on develop after PRs #133 and #134.
- crates/grpc-server/src/profitability_writer.rs: sibling of the
  mempool_writer module from PR #133 — bounded mpsc, dedicated writer
  task, sqlx::PgPool, drop-on-saturation. Adds NewProfitabilityScore
  payload, ProfitabilitySink trait, NoopSink, PgProfitabilityWriter,
  ProfitabilityWriterMetrics. Provides fetch_unscored_confirmed for
  the scoring loop's polling read.
- crates/grpc-server/src/bin/aether_profit_scorer.rs: new
  aether-profit-scorer binary. Bootstrap loads pools.toml and fetches
  reserves for every supported pool at the latest block to build a
  reference PriceGraph + TokenIndex. Poll loop every 30 s SELECTs
  confirmed-but-unscored predictions; for each, fetches the affected
  pool's reserves at actual_target_block (one eth_call), clones the
  reference graph, overwrites the affected edge, runs
  BellmanFord::detect_from_affected, optimises the best cycle through
  the same ternary-search the engine uses, and INSERTs a row with the
  computed decision. Inlines a few helpers (fetch_pool_state_at,
  build_graph, sol! getReserves/slot0) deliberately duplicated from
  aether_replay.rs — extracting them into a shared module would touch
  the merged 2200-line replay file and inflate this PR's review
  burden. TODO note in the module docstring for the post-phase
  deduplication.
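
For readers unfamiliar with the ternary-search step mentioned above, here is a minimal sketch of maximising a unimodal profit curve over the input size. It is not the engine's optimiser: `ternary_search_max` and `toy_round_trip_profit` are hypothetical names, and the two-pool reserves are made-up numbers chosen only so the curve has an interior maximum.

```rust
/// Ternary search for the maximiser of a unimodal function on [lo, hi].
pub fn ternary_search_max<F: Fn(f64) -> f64>(f: F, mut lo: f64, mut hi: f64, iters: u32) -> f64 {
    for _ in 0..iters {
        let m1 = lo + (hi - lo) / 3.0;
        let m2 = hi - (hi - lo) / 3.0;
        if f(m1) < f(m2) {
            // The maximum cannot lie left of m1.
            lo = m1;
        } else {
            // The maximum cannot lie right of m2.
            hi = m2;
        }
    }
    (lo + hi) / 2.0
}

/// Constant-product output with a 0.3% fee, in f64 (illustrative only).
fn v2_out(amount_in: f64, reserve_in: f64, reserve_out: f64) -> f64 {
    let in_with_fee = amount_in * 0.997;
    reserve_out * in_with_fee / (reserve_in + in_with_fee)
}

/// Toy two-hop round trip A -> B -> A across two mispriced pools.
/// Reserves are invented for the example.
pub fn toy_round_trip_profit(amount_in: f64) -> f64 {
    let b = v2_out(amount_in, 1000.0, 1100.0);
    v2_out(b, 1000.0, 1000.0) - amount_in
}
```

Calling `ternary_search_max(toy_round_trip_profit, 0.0, 500.0, 100)` converges on the input that maximises the toy profit. The search assumes the curve is unimodal on the interval, which holds for this constant-product round trip.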

Metrics:
  aether_mempool_profit_scored_total{decision},
  aether_mempool_profit_writer_drops_total,
  aether_mempool_profit_writer_queue_depth,
  aether_mempool_profit_writer_write_latency_ms{result}.

The headline gauges named in issue #132 (`net_profit_eth_sum_24h` etc.)
are rendered Grafana-side from rate(realized_profit_wei[24h]) rather
than as in-process metrics, matching the same PromQL-vs-in-process
trade-off PR-2 used for accuracy gauges. Dashboard JSON update
deferred to the same follow-up that adds the panels.

The scorer's ternary-search optimiser computes hop output entirely in
f64. At mainnet pool scale (USDC pools hold ~1e14 base units, WETH pools
~1e22) the f64 mantissa loses ulps and overstates gross output by
amounts that fabricate ETH-scale ghost profit. The PR #135 soak surfaced
this as one 5.29 ETH USDC/WETH/DAI triangle; the current re-soak surfaced
eight rows totalling 481B ETH worth of ghost net profit — same root cause,
different cycle shape (degenerate self-loops with massive reserve mismatch).
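
To make the precision claim concrete: f64 represents integers exactly only up to 2^53 (about 9.007e15), so quantities at WETH-reserve scale, and the products formed inside the swap formula, are rounded. The sketch below only demonstrates the spacing of f64 values at these magnitudes; the helper names are hypothetical and it does not reproduce the scorer's actual error path.

```rust
/// Gap between `x` and the next representable f64 above it.
/// Intended for finite, positive `x`.
pub fn ulp_above(x: f64) -> f64 {
    f64::from_bits(x.to_bits() + 1) - x
}

/// True when adding a single base unit to `x` is lost to rounding.
pub fn absorbs_one_unit(x: f64) -> bool {
    x + 1.0 == x
}

/// The same increment in exact integer arithmetic, for contrast.
pub fn exact_increment(x: u128) -> u128 {
    x + 1 - x
}
```

At 1e22 the spacing between adjacent f64 values is 2^21 base units, so single-wei differences vanish, whereas integer arithmetic keeps them. A reserve of 1e14 is still exact on its own; it is the intermediate products in the swap formula that cross the 2^53 boundary.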

Two-layer fix in `score_one`:

1. `verify_cycle_u256` re-walks every V2 hop in the optimiser's chosen
   cycle with exact `uniswap_v2_get_amount_out` U256 math at the same
   `running_states` reserves the optimiser saw, threading a local
   per-pool reserve copy so multi-hop cycles that revisit the same pool
   (Bellman-Ford self-loops) see hop N+1 reserves shifted by hop N's
   swap. Without the local copy, A→B→A would see pre-swap reserves on
   both legs and "regenerate" input, producing the same precision
   signature in U256 as f64. Cycles where every hop is V2/Sushi return
   `Some(gross_wei)`; `gross < input` ⇒ `DECISION_REVERTED`, otherwise
   exact `net = gross − input − gas` drives the decision.

2. When the verifier returns `None` (V3 hop, missing pool state, drained
   pool) the score falls back to the f64 optimiser's number — but capped:
   any f64-only verdict above `MAX_PLAUSIBLE_F64_NET_WEI` (1 ETH worth)
   is downgraded to `DECISION_REVERTED` because a 1+ ETH arb on mainnet
   would be captured intra-block by faster searchers and never reach our
   scorer. Sub-ETH V3 arbs pass through unchanged.
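
The first layer of the fix can be sketched as follows. This is an illustration under stated assumptions, not `verify_cycle_u256` itself: it uses `u128` with checked arithmetic where the real code uses U256, it hard-codes the 0.3% V2 fee, and `Hop`, `walk_cycle`, and `decide` are hypothetical names. The mapping of a positive net to `profitable` is also an assumption; the source only says that `gross < input` means reverted and that the exact net drives the decision.

```rust
use std::collections::HashMap;

/// One V2-style hop through `pool`, selling token0 when `zero_for_one`.
#[derive(Clone, Copy)]
pub struct Hop {
    pub pool: u32,
    pub zero_for_one: bool,
}

/// Exact constant-product output with the 0.3% fee (997/1000).
/// Returns None on zero input or reserves, or on u128 overflow.
pub fn v2_amount_out(amount_in: u128, reserve_in: u128, reserve_out: u128) -> Option<u128> {
    if amount_in == 0 || reserve_in == 0 || reserve_out == 0 {
        return None;
    }
    let in_with_fee = amount_in.checked_mul(997)?;
    let numerator = in_with_fee.checked_mul(reserve_out)?;
    let denominator = reserve_in.checked_mul(1000)?.checked_add(in_with_fee)?;
    Some(numerator / denominator)
}

/// Re-walk a cycle, updating a local reserve copy after every hop so a pool
/// visited twice sees the reserves left behind by the earlier hop.
pub fn walk_cycle(
    hops: &[Hop],
    reserves: &HashMap<u32, (u128, u128)>,
    input: u128,
) -> Option<u128> {
    let mut local = reserves.clone();
    let mut amount = input;
    for hop in hops {
        let (r0, r1) = *local.get(&hop.pool)?;
        let (r_in, r_out) = if hop.zero_for_one { (r0, r1) } else { (r1, r0) };
        let out = v2_amount_out(amount, r_in, r_out)?;
        // `out` is always strictly below `r_out`, so the subtraction is safe.
        let (new_in, new_out) = (r_in.checked_add(amount)?, r_out - out);
        let updated = if hop.zero_for_one { (new_in, new_out) } else { (new_out, new_in) };
        local.insert(hop.pool, updated);
        amount = out;
    }
    Some(amount)
}

/// Decision from an exact gross amount (the net > 0 threshold is assumed).
pub fn decide(gross: u128, input: u128, gas: u128) -> &'static str {
    if gross < input {
        "reverted"
    } else if gross - input > gas {
        "profitable"
    } else {
        "unprofitable"
    }
}
```

With a single pool at reserves (1000, 1000), walking A→B→A with 100 units in returns less than was put in, because the second leg sees the post-swap reserves and both legs pay the fee; `decide` then classifies the cycle as reverted.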

`OptimiserSuccess` now exposes `optimal_input_wei` so the verifier can
re-walk at the same input the optimiser converged on. Adds five unit
tests covering: `uniswap_v2_get_amount_out` against on-chain math,
`u256_to_i128_saturating` overflow handling, verifier inconclusivity on
V3 hops, verifier loss on a balanced triangle, and verifier
reserve-evolution on self-loops across four orders of input magnitude.

Soak proof (29-row backlog re-scored against the live DB):

  decision     | rows | sum_net_eth
  -------------+------+-------------
  no_path      |   58 | 0.00000000
  reverted     |    8 | 641_531B (f64 noise, gated below the floor)
  profitable   |    0 |
  unprofitable |    0 |

vs the broken baseline (pre-fix, same data, same backlog):

  decision   | rows | sum_net_eth
  -----------+------+-------------
  no_path    |   46 | 0.00000000
  profitable |    8 | 481_148_577_928 ETH ghost

`SELECT SUM(net_profit_wei) WHERE decision='profitable'` is now 0 ETH;
the eight precision-bias rows land in `reverted` where the dashboard
explicitly excludes them from realised P&L.

Closes #132 (precision-fix portion).

The scorer's pool registry was the static `config/pools.toml` only,
but the engine's runtime pair-index extends past that every time the
mempool decoder spots a new pool. Pre-fix soaks showed ~88% of confirmed
predictions resolved as `decision='no_path'` — not because the cycle was
unreachable in the engine's view, but because the scorer's narrower
registry couldn't see the pool.

`load_predicted_pools` queries
`SELECT DISTINCT ON (pool_address) pool_address, protocol, token_in, token_out
 FROM mempool_predictions WHERE pool_address IS NOT NULL`
and folds the result into the LoadedPool registry on bootstrap and on
every `GRAPH_REFRESH_INTERVAL` tick. Canonical (token0, token1) is
derived from `min(token_in, token_out)` / `max(token_in, token_out)` —
direction-agnostic V2/V3 invariant. fee_bps falls back to
`DEFAULT_V2_FEE_BPS` (30) for Uni V2 / Sushi and `DEFAULT_V3_FEE_BPS`
(5) for V3. V3's actual per-pool fee comes from `pool.fee()` and lives
in (1, 5, 30, 100) bps; reading it would double bootstrap fan-out and
the U256 verifier ignores V3 fee anyway, so the default is good enough
for the f64 rate weight on the graph edge.

`MAX_DB_PREDICTED_POOLS = 256` caps the augmentation so a runaway
engine writing thousands of bogus addresses can't blow the bootstrap's
`eth_call` budget; the `SELECT ... ORDER BY pool_address LIMIT $1`
keeps the truncation deterministic across restarts.

Protocol-string parser `parse_db_protocol` is intentionally narrow:
only `uni_v2`, `uni_v3`, `sushi` map to a `ProtocolType`. Balancer /
Curve / Bancor are valid engine protocols but the scorer can't compute
their reserves yet — refusing them here keeps an unsupported pool from
sneaking in with wrong fee_bps and nonexistent state.
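
The registry rules described above (narrow protocol parsing, default fees, canonical token ordering, and the 256-pool cap) can be sketched together. The constant names, their values, and `parse_db_protocol` come from the commit message; the `ProtocolType` variant names, `default_fee_bps`, `canonical_pair`, and the 20-byte address representation are assumptions made for the example.

```rust
/// Variant names are hypothetical; only the three short-form strings are from the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProtocolType {
    UniswapV2,
    UniswapV3,
    SushiSwap,
}

pub const DEFAULT_V2_FEE_BPS: u32 = 30;
pub const DEFAULT_V3_FEE_BPS: u32 = 5;
pub const MAX_DB_PREDICTED_POOLS: usize = 256;

/// Intentionally narrow: anything other than the three short forms is refused.
pub fn parse_db_protocol(s: &str) -> Option<ProtocolType> {
    match s {
        "uni_v2" => Some(ProtocolType::UniswapV2),
        "uni_v3" => Some(ProtocolType::UniswapV3),
        "sushi" => Some(ProtocolType::SushiSwap),
        _ => None,
    }
}

/// Fallback fee used when the per-pool fee is not read from chain.
pub fn default_fee_bps(protocol: ProtocolType) -> u32 {
    match protocol {
        ProtocolType::UniswapV2 | ProtocolType::SushiSwap => DEFAULT_V2_FEE_BPS,
        ProtocolType::UniswapV3 => DEFAULT_V3_FEE_BPS,
    }
}

/// Direction-agnostic (token0, token1): the numerically smaller address first.
/// Byte arrays compare lexicographically, which matches big-endian numeric order.
pub fn canonical_pair(token_in: [u8; 20], token_out: [u8; 20]) -> ([u8; 20], [u8; 20]) {
    if token_in <= token_out {
        (token_in, token_out)
    } else {
        (token_out, token_in)
    }
}

/// Keep at most `MAX_DB_PREDICTED_POOLS` rows from an already-sorted list.
pub fn cap_pools<T>(mut rows: Vec<T>) -> Vec<T> {
    rows.truncate(MAX_DB_PREDICTED_POOLS);
    rows
}
```

Because the query orders by `pool_address` before the limit is applied, truncating a sorted list as `cap_pools` does yields the same subset on every restart.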

Soak proof (82-row backlog re-scored, scorer running from this branch
HEAD against the live DB):

  decision    | rows
  ------------+------
   reverted   |  82
   no_path    |   0
   profitable |   0

vs the immediate pre-PR-5 baseline (same DB, scorer from #136 HEAD):

  decision    | rows
  ------------+------
   no_path    |  72
   reverted   |   9

`decision='no_path'` dropped from 89% of rows to 0%; every confirmed
prediction now reaches the verifier pipeline. The fact that they all
land in `reverted` is PR-4's absurdity floor doing its job on V3-heavy
cycles — that's expected and correct, not a regression.

Tests cover `parse_db_protocol` short-form mapping (incl. negative
cases for long-form names the config uses), the V2/V3 default fee
constants, and the `MAX_DB_PREDICTED_POOLS` ceiling.

Closes #132 (pool-source-narrowness portion).

Before this commit every confirmed mempool prediction whose best cycle
touched a Uniswap V3 hop landed in `decision=reverted`. `verify_cycle_u256`
short-circuited to `None` on the first V3 hop and the 1 ETH absurdity
floor then caught the rate-only f64 verdict as precision bias — correct
behaviour, but it meant the dashboard never saw real sub-ETH V3 arbs.

This adds `verify_cycle_revm`: for cycles with at least one V3 hop the
scorer deploys AetherExecutor and runs `executeArb` inside a pure-revm
fork pinned to the scorer's reference block, then measures the ERC20
balance delta on SIM_OWNER as gross profit. V2-only cycles keep the
existing U256 fast path unchanged. Cycles the revm path cannot resolve
(unknown profit token, Curve/Balancer/Bancor hop, build failure) fall
through to the unchanged f64 absurdity-floor fallback.

Implementation:
- `EvmSimulator::deploy_and_simulate_with_erc20_profit` — two sequential
  `transact` calls on one revm Context. CREATE produces the executor
  address; CREATE's state diff is committed into the CacheDB so the CALL
  sees the deployed runtime bytecode. Pre/post balance diff observable
  via revm's returned state map.
- Scorer loads `contracts/out/AetherExecutor.sol/AetherExecutor.json`
  init bytecode once at boot via `--executor-artifact` (optional;
  scorer keeps current behaviour if absent).
- Per-token balance-slot table (WETH=3, USDC=9, DAI=2, USDT=2) keyed by
  the cycle's starting token. Unknown tokens cause `verify_cycle_revm`
  to return `None` and fall through to f64.
- `is_v3_touching_cycle` cheaply classifies each cycle before routing.
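
The routing described in this commit can be summarised in a short sketch. It is a simplification under assumptions: the `Protocol` and `Route` enums are hypothetical, tokens are keyed by symbol here whereas the real table is presumably keyed by address, and only `is_v3_touching_cycle` and the four slot numbers are taken from the text.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Protocol {
    UniV2,
    Sushi,
    UniV3,
    Curve,
    Balancer,
    Bancor,
}

/// Which verifier a cycle is handed to (names are illustrative).
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    U256Walker,
    Revm { balance_slot: u64 },
    F64Fallback,
}

/// Cheap classification: does any hop go through a V3 pool?
pub fn is_v3_touching_cycle(hops: &[Protocol]) -> bool {
    hops.iter().any(|p| *p == Protocol::UniV3)
}

/// Per-token ERC20 balance mapping slot, using the values listed above.
pub fn balance_slot(symbol: &str) -> Option<u64> {
    match symbol {
        "WETH" => Some(3),
        "USDC" => Some(9),
        "DAI" | "USDT" => Some(2),
        _ => None,
    }
}

/// Pick a verifier for a cycle that starts (and ends) in `start_token`.
pub fn route(hops: &[Protocol], start_token: &str) -> Route {
    let unsupported = hops
        .iter()
        .any(|p| matches!(p, Protocol::Curve | Protocol::Balancer | Protocol::Bancor));
    if unsupported {
        return Route::F64Fallback;
    }
    if !is_v3_touching_cycle(hops) {
        return Route::U256Walker;
    }
    match balance_slot(start_token) {
        Some(slot) => Route::Revm { balance_slot: slot },
        None => Route::F64Fallback,
    }
}
```

A V2-only cycle keeps the fast path, a V3-touching cycle with a known profit token goes to the revm simulation, and everything else falls through to the f64 fallback with its absurdity floor.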

Proof:
- `cargo clippy --workspace --all-targets -- -D warnings` clean.
- `cargo test --workspace --lib --bins` 26/26 scorer tests + 32/32
  simulator tests pass. New tests cover the V2/V3 routing decision,
  Curve/Balancer rejection in `build_steps`, decision mapping for all
  three RevmVerdict outcomes, and balance-slot lookup.

… on reserves_zero

Symptom: zero V3 mempool predictions in the database over a 5-day soak
window despite the engine successfully decoding 71 V3 swaps (10 of
which passed the registry filter — USDT/WETH via UniswapV3 SwapRouter,
pools we cover). Metric proof:
`aether_pending_arb_sim_skipped_total{reason="reserves_zero"} 10` —
exact match against the 10 FILTER PASSes that vanished without writing
a prediction.

Root cause: V3 graph edges were created with their weight populated
but `reserve_in = reserve_out = 0.0`. Two call sites in
`crates/grpc-server/src/engine.rs`:

  * V3 bootstrap branch (`bootstrap_pools` -> `ReserveResult::V3`)
  * V3 live-update handler (`PoolEvent::V3Update`)

Both used `graph.add_edge(weight = price * fee, ...)` which only
touches weight + liquidity. The V2 path next door uses
`update_edge_from_reserves(r0, r1, fee)` which populates both reserves
AND the weight, which is why V2 mempool predictions worked end-to-end.
The mempool post-state pipeline's
`try_post_state_scan` then explicitly guards against zero reserves:

    if edge_fwd.reserve_in <= 0.0 || edge_fwd.reserve_out <= 0.0 {
        metrics.inc_pending_arb_sim_skipped("reserves_zero");
        return;
    }

so every V3 swap was dropped before reaching `predict_post_state`.

Fix: after each pair of `add_edge` calls in the V3 branches, also call
`update_edge_from_reserves` with the synthetic `(1.0, spot_price)`
pair. Convention matches the scorer's `state_to_graph_reserves` V3
branch and the docstring on `mempool_pipeline::unified_to_post_reserves`
("V3 uses a synthetic `(1.0, spot_price)` pair so Bellman-Ford treats
the two families identically").

The fix is purely additive — `add_edge` keeps creating the edge and
setting weight; `update_edge_from_reserves` then populates reserves on
the existing edge (which is a no-op-if-missing on its own, hence the
pairing). The weight derived from `(1.0, price) * fee` equals the
weight `add_edge` writes (`-ln(price * fee)`), so the two paths agree.
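
The agreement between the two write paths can be checked with a small model. The sketch below is a stand-in, not the engine's graph API: the real `add_edge` and `update_edge_from_reserves` operate on the graph by token index, whereas these simplified versions take and return a bare `Edge`. Treating `fee` as a multiplicative factor (for example 0.9995) is an assumption.

```rust
/// Minimal edge model: a Bellman-Ford weight plus the reserves the
/// mempool pipeline guards on.
#[derive(Debug, Clone, Copy)]
pub struct Edge {
    pub weight: f64,
    pub reserve_in: f64,
    pub reserve_out: f64,
}

/// Spot price from a Q64.96 square-root price: (sqrt_price_x96 / 2^96)^2.
pub fn v3_spot_price(sqrt_price_x96: f64) -> f64 {
    let ratio = sqrt_price_x96 / 2f64.powi(96);
    ratio * ratio
}

/// Weight-only path: reserves stay at zero, which is the bug described above.
pub fn add_edge(price: f64, fee: f64) -> Edge {
    Edge { weight: -(price * fee).ln(), reserve_in: 0.0, reserve_out: 0.0 }
}

/// Reserve path: sets both reserves and derives the weight from their ratio.
pub fn update_edge_from_reserves(edge: &mut Edge, reserve_in: f64, reserve_out: f64, fee: f64) {
    edge.reserve_in = reserve_in;
    edge.reserve_out = reserve_out;
    edge.weight = -((reserve_out / reserve_in) * fee).ln();
}

/// The zero-reserve guard from the mempool pipeline, as a predicate.
pub fn would_skip(edge: &Edge) -> bool {
    edge.reserve_in <= 0.0 || edge.reserve_out <= 0.0
}
```

Seeding the edge with the synthetic `(1.0, spot_price)` pair leaves the weight unchanged, since `spot_price / 1.0 * fee` is the same product `add_edge` used, while the guard stops firing.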

Proof:
- new test `test_v3_update_seeds_synthetic_reserves` asserts
  `reserve_in == 1.0` and `reserve_out == price` on both forward and
  reverse edges after a V3Update event with `sqrt_price_x96 = 2 * 2^96`
  (price = 4.0).
- `cargo clippy --workspace --all-targets -- -D warnings` clean.
- `cargo test --workspace --lib --bins` all green (incl. existing
  `test_v3_update_updates_graph`).

Unblocks PR #144 (revm V3 verifier in scorer) from "tested only in
unit tests" to "exercised on organic mainnet V3 mempool traffic" once
the engine is restarted with this build.

Extends the existing `aether-mempool` Grafana dashboard with 13 new
panels (plus two row dividers) covering PR #134's reconciler accuracy
gauges, PR #135's scoring throughput + writer health, and PR #137's
DB-augmented pool registry impact. Existing engine-side panels (PRs
#118 / #128) are untouched.

Adds the matching Prometheus scrape jobs the panels query against:

- `aether-host-reconciler` → `host.docker.internal:9094` (the Go
  reconciler binary, default port per `cmd/reconciler/main.go`).
- `aether-host-scorer` → `host.docker.internal:9095` and `:9097`. The
  Rust scorer defaults to 9095; soak ops override via
  `PROFIT_SCORER_METRICS_ADDR=:9097`. Listing both targets lets the
  scrape pick up whichever is in use without an additional config
  swap.

### Panel additions

PR #134 — Reconciler
- Block accuracy (Δ ≤ 0) stat — `aether_mempool_block_delta_bucket{le="0"}`
- Pool-path accuracy (1h) stat — `aether_mempool_pool_path_total{correct="true"}`
- Reconciler queue depth — `aether_mempool_reconciler_queue_depth`
- Reconciler drops (5m rate) — `aether_mempool_reconciler_drops_total`
- Block-delta quantiles (p50 / p90) timeseries
- Reconciliation outcomes pie — `aether_mempool_reconciled_total{outcome}`
- Reconciler error rates by source (header / lookup / receipt)
- Reconciler write latency (p50 / p95)

PR #135 / #137 — Scorer
- Decision breakdown pie — `aether_mempool_profit_scored_total{decision}`
- Scored rate by decision (5m) timeseries
- Scorer queue depth
- Scorer drops (5m rate)
- Scorer write latency (p50 / p95) split by result label

### Deliberately out of scope

- **PR #136 reverted-by-floor sub-counter**: the
  `aether_mempool_profit_scored_total` counter currently coalesces the
  absurdity-floor reverts with the U256-walker reverts (and after PR
  #144, the revm V3 verifier's reverts too) under a single
  `decision="reverted"` label. Splitting them requires a new label on
  the counter — a code change deliberately deferred. The decision-pie
  + scored-rate panels already plot the merged total.
- **PR #135 net_profit_eth_sum_24h**: the scorer does not expose a
  per-decision net-profit gauge; that figure lives only in the
  `mempool_profitability` table. Surfacing it would need either a new
  Prom metric or a Postgres datasource — both are their own follow-up
  decisions.
- **PR #135 top-10 unscored confirmed table**: requires a Postgres
  datasource that this dashboard intentionally does not add.
- **PR #137 added_from_db gauge**: the scorer logs this value as a
  tracing field on each registry-refresh tick but does not emit a
  matching metric. Cheap to add, but pure-Prom dashboard scope says
  leave it out of this PR.

### Validation

- `python3 -m json.tool mempool.json` parses cleanly (22 panels: 7
  existing engine panels + 13 new + 2 row dividers).
- `yaml.safe_load` on `prometheus.yml` (run under `python3`) parses cleanly.
- Every PromQL expression references a metric name observed live on
  the running engine (`localhost:9092`) or reconciler (`localhost:9094`)
  endpoints, or declared in `crates/grpc-server/src/profitability_writer.rs`
  for the scorer-side panels.
- Live render skipped: the local docker-compose obs stack
  (prom + grafana + alertmanager) is not currently running. Running
  the stack and pointing it at the three host scrape targets will
  render every panel against the real running services.
@vercel

vercel Bot commented May 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| aether | Ready | Preview, Comment | May 20, 2026 8:26am |
| aether-63xv | Ready | Preview, Comment | May 20, 2026 8:26am |

@Pablosinyores Pablosinyores changed the base branch from feat/engine-v3-graph-reserves to develop May 20, 2026 12:15
@Pablosinyores Pablosinyores merged commit 37bf2e4 into develop May 20, 2026
3 checks passed
Pablosinyores added a commit that referenced this pull request May 20, 2026
aether_mempool_profit_scored_total used to be labelled only by
decision, so the dashboard could see "10 reverted" but not whether
they came from the V2 U256 walker, the f64 absurdity floor, the V3
revm verifier, or organic revm reverts. Add a `reason` sub-label
distinguishing those code paths.

Five wire labels (pinned by unit test against the constants):
- n/a              non-reverted decisions, no_path, or any path with
                   no sub-source worth distinguishing
- u256_walker      V2-only exact-U256 walker reached a verdict
                   (PR #136 path)
- absurdity_floor  f64 fallback above MAX_PLAUSIBLE_F64_NET_WEI (1 ETH)
                   downgraded to reverted (PR #136 path)
- revm_verdict     V3-touching revm sim ran to completion with a
                   non-reverting verdict (PR #144 path)
- revm_revert      V3-touching revm sim explicitly reverted/halted
                   (PR #144 path)

The reason is Prometheus-only and NOT persisted to the
mempool_profitability table — the migration's CHECK constraint only
covers decision, and adding a reason column would force every
existing row to back-fill. `NewProfitabilityScore.reason` skips the
DB insert path; it only flows into the metric label.

revm_verdict_to_decision and f64_fallback_verdict now return a
4-tuple (net, realised, decision, reason). no_path_outcome carries
REASON_NA. The aggregating let in score_one destructures
(net, realised, decision, reason) and threads reason into
ScoreOutcome.
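
A minimal sketch of how the wire labels and the f64 fallback could fit together follows. `REASON_NA` and `MAX_PLAUSIBLE_F64_NET_WEI` are named in this commit message; the other constant names, the `i128` type, the reduced `(decision, reason)` return shape, and the `series` helper are assumptions for illustration, and the real helpers return the full 4-tuple.

```rust
pub const REASON_NA: &str = "n/a";
pub const REASON_U256_WALKER: &str = "u256_walker";
pub const REASON_ABSURDITY_FLOOR: &str = "absurdity_floor";
pub const REASON_REVM_VERDICT: &str = "revm_verdict";
pub const REASON_REVM_REVERT: &str = "revm_revert";

/// 1 ETH in wei; f64-only verdicts above this are treated as precision noise.
pub const MAX_PLAUSIBLE_F64_NET_WEI: i128 = 1_000_000_000_000_000_000;

/// (decision, reason) for a verdict that only the f64 optimiser produced.
pub fn f64_fallback_labels(net_wei: i128) -> (&'static str, &'static str) {
    if net_wei > MAX_PLAUSIBLE_F64_NET_WEI {
        ("reverted", REASON_ABSURDITY_FLOOR)
    } else if net_wei > 0 {
        ("profitable", REASON_NA)
    } else {
        ("unprofitable", REASON_NA)
    }
}

/// Render the Prometheus series name for one (decision, reason) pair.
pub fn series(decision: &str, reason: &str) -> String {
    format!(
        "aether_mempool_profit_scored_total{{decision=\"{}\",reason=\"{}\"}}",
        decision, reason
    )
}
```

Pinning the label strings as constants is what lets a unit test such as `reason_constants_are_stable_wire_labels` catch an accidental rename before it silently splits a dashboard series.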

Dashboard panels in deploy/docker/grafana/dashboards/mempool.json
(panel IDs 19 and 20 from PR #146) updated to sum by
(decision, reason) and legend-format {{decision}} / {{reason}}.
Title and description updated to reflect the new dimension.

Stacks on feat/dedupe-replay-scorer-helpers (PR #148).

Verification:
- cargo clippy --workspace --all-targets -- -D warnings : clean
- cargo test --workspace --lib --bins : 528 passed, 0 failed
- new test: reason_constants_are_stable_wire_labels
- existing verdict-helper tests updated to assert reason value
- python3 -m json.tool mempool.json : parses cleanly