Skip to content

refactor(grpc-server): extract shared historical helpers#148

Merged
Pablosinyores merged 1 commit into
developfrom
feat/dedupe-replay-scorer-helpers
May 20, 2026
Merged

refactor(grpc-server): extract shared historical helpers#148
Pablosinyores merged 1 commit into
developfrom
feat/dedupe-replay-scorer-helpers

Conversation

@Pablosinyores

Copy link
Copy Markdown
Owner

Summary

`aether-replay` (2220 LOC) and `aether-profit-scorer` (2085 LOC) carried
duplicate inline copies of every helper used to load pool registries,
fetch reserves at a fixed block, and convert U256 reserves into f64
graph weights. This PR moves the canonical versions into
`crates/grpc-server/src/historical.rs` and has both binaries depend on
it. The scorer's source even had an explicit `//! Inlined helpers ...
Follow-up: deduplicate after the mempool phase lands.` block-doc — this
PR is that follow-up.

Stacks on PR #146 (`feat/grafana-mempool-panels`).

What moved into `historical.rs`

  • `sol! { getReserves, slot0 }` block + the emitted `getReservesCall` /
    `slot0Call` types (re-exported `pub` so replay's anvil-driven
    `fetch_one_state_latest` and `fetch_all_states_latest` still link)
  • `enum PoolState { V2, V3 }`
  • `struct LoadedPool`, `struct PoolsConfig`, `struct PoolEntry`
  • `pub const Q96`
  • `fn parse_protocol` — long-form `pools.toml` strings (`uniswap_v2`,
    `sushiswap`, `uniswap_v3`, `curve`, `balancer_v2`, `bancor_v3`)
  • `fn load_pools` — TOML loader, filters to V2/Sushi/V3
  • `async fn fetch_pool_state_at` — `Result<Option>`. Replay's
    swallow-RPC-error-as-None call site adapts with `.ok().flatten()` to
    preserve observable behaviour.
  • `fn u256_to_f64` — saturating limb-wise truncation
  • `fn uniswap_v2_get_amount_out` — exact U256 math, `saturating_sub` on
    `(10_000 - fee_bps)` (the previous replay variant used straight
    subtraction; for any real fee_bps (≤ 100 bps) the two produce identical
    output).
  • `fn load_executor_init_bytecode` — parses `bytecode/object` out of
    forge's AetherExecutor.json

What stays binary-local

Scorer-only:

Replay-only:

  • `default_pool_set`, `token_label` (display helpers)
  • `build_graph` (full graph assembly — replay needs it per-tx-window;
    scorer uses `state_to_graph_reserves` against an edge-clone instead)
  • `AnvilHandle`, `spawn_anvil`, `fetch_one_state_latest`,
    `fetch_all_states_latest` (anvil-fork lifecycle)
  • `optimize_cycle_input`, `estimate_opp` (intra-block optimisation)
  • `run_batch_replay`, `run_full_block_replay`, `seed_sender_state`,
    `sim_arb_with_evm_simulator`, `deploy_executor_on_anvil`,
    `build_steps_from_cycle` (replay async pipeline)

Diff stats

```
crates/grpc-server/src/bin/aether_profit_scorer.rs | 188 ++-------------------
crates/grpc-server/src/bin/aether_replay.rs | 178 ++-----------------
crates/grpc-server/src/historical.rs | 319 ++++++++++++++++++++++
crates/grpc-server/src/lib.rs | 1 +
4 files changed, 341 insertions(+), 345 deletions(-)
```

Net: -4 LOC, one less copy of each helper to keep in sync.

Verification

```
cargo clippy --workspace --all-targets -- -D warnings ✓ clean
cargo test --workspace --lib --bins ✓ 527 passed, 0 failed
```

6 new `historical.rs` unit tests:

  • `parse_protocol_known_strings`
  • `u256_to_f64_small_values_exact`
  • `uniswap_v2_get_amount_out_canonical`
  • `uniswap_v2_get_amount_out_zero_inputs`
  • `load_pools_filters_unsupported_protocols`
  • `load_pools_handles_missing_pools_key`

Pre-existing scorer tests covering `uniswap_v2_get_amount_out` and
`PoolState` constructions are unchanged and still pass via the new
imports.

Empirical behaviour proof — simple-mode replay diff

`aether-replay` default mode (no `--full-block`, no anvil) against
mainnet block `24643151` with `--pools config/pools.toml` exercises
`load_pools` → `fetch_pool_state_at` → `u256_to_f64` →
`build_graph` → Bellman-Ford → `print_cycles`. `diff` of stdout before
vs. after the refactor:

```
7c7
< State fetched: 53/55 (4251 ms)

State fetched: 53/55 (3882 ms)
14c14
< Total time: 4989 ms


Total time: 4114 ms
```

Only the wall-clock timing strings differ — `Pools loaded`,
`State fetched` count, `Graph` topology, `Detection` count, top
`profit_factor` (`174745.3472%`), and `path`
(`DAI -> 0x9f8f72aa… -> WETH -> DAI`) are byte-identical.

Why the full-block `--csv` bar is intentionally NOT used

The original handoff suggested byte-identical `--full-block --csv`
output as the proof bar. Empirically that metric is inherently
nondeterministic
, even between two consecutive runs of the SAME
binary against the same block:

Run Binary CSV hash
baseline pre-refactor `a4bbccc5…`
after-1 this PR `198a6167…`
after-2 this PR `a2c0b4c8…`

`diff after-1 after-2` produces ~60 differing lines. Two factors:

  1. Bellman-Ford cycle-entry randomness. When the detector finds
    multiple negative cycles at the same tx-step, the "top" cycle picked
    for CSV emission depends on internal HashMap iteration order, which
    Rust randomises per-process. Same graph → different "top" each run.
  2. Anvil fork + RPC timing variance. Each tx replay re-fetches pool
    state through the forked Anvil; under any RPC slowness or
    connection-reset (which happened in one of the runs above), the
    per-tx state at detection-time can shift, changing the cycle counts.

The simple-mode diff above gives a clean signal because there's exactly
one detected cycle and no anvil in the loop.

Out of scope (tracked)

  • `build_steps_from_cycle` (replay async) vs `build_steps_from_cycle_sync`
    (scorer sync) — same shape, different exec model. Folding requires
    changing replay's signature; defer to PR-8b.
  • `sim_arb_with_evm_simulator` (replay anvil-driven) vs
    `verify_cycle_revm` (scorer pure-revm CREATE-then-CALL) — different
    deployment models. Some shared infrastructure (`RpcForkedState`
    construction, balance-slot table) could fold; tracked.
  • `build_graph` (replay full assembly) vs `state_to_graph_reserves`
    (scorer per-edge update) — different abstractions; tracked.

@vercel

vercel Bot commented May 20, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
aether Ready Ready Preview, Comment May 20, 2026 12:18pm
aether-63xv Ready Ready Preview, Comment May 20, 2026 12:18pm

@Pablosinyores Pablosinyores changed the base branch from feat/grafana-mempool-panels to develop May 20, 2026 12:16
aether-replay and aether-profit-scorer carried duplicate inline copies
of every helper used to load pool registries, fetch reserves at a fixed
block, and convert U256 reserves into f64 graph weights. Move the
canonical versions into crates/grpc-server/src/historical.rs and have
both binaries depend on it.

Shared surface:
- sol! getReserves / slot0 + the emitted getReservesCall / slot0Call
- PoolState, LoadedPool, PoolsConfig, PoolEntry, Q96
- parse_protocol, load_pools (long-form pools.toml strings)
- fetch_pool_state_at (V2 reserves / V3 sqrtPriceX96 at a BlockId)
- u256_to_f64, uniswap_v2_get_amount_out
- load_executor_init_bytecode (forge AetherExecutor.json reader)

Replay's swallow-RPC-error-as-None call site adapts with .ok().flatten()
to preserve observable behaviour against the new
Result<Option<PoolState>> return shape. Scorer-only helpers stay where
they are: parse_db_protocol, load_predicted_pools, verify_cycle_u256,
verify_cycle_revm, build_steps_from_cycle_sync, is_v3_touching_cycle,
balance_slot_for_token, state_to_graph_reserves.

Diff: -345 / +22 across the two binaries; +319 in historical.rs (which
includes 6 unit tests on the extracted helpers). Net: -4 LOC, one less
copy of each helper to keep in sync.

Verification:
- cargo clippy --workspace --all-targets -- -D warnings : clean
- cargo test --workspace --lib --bins : 527 passed, 0 failed
- 6 new historical.rs unit tests covering parse_protocol /
  u256_to_f64 / uniswap_v2_get_amount_out / load_pools
- aether-replay simple-mode (default) against block 24643151 : output
  byte-identical to pre-refactor baseline except wall-clock timing
  (state-fetch 4251 vs 3882 ms; total 4989 vs 4114 ms — same RPC,
  network variance). Pool counts (55 loaded), graph topology (21 tokens,
  106 edges), detected cycle (DAI -> 0x9f8f72aa... -> WETH -> DAI),
  profit_factor (174745.3472%) all match exactly.

Full-block --csv run is intentionally NOT used as the proof bar: the
metric is inherently nondeterministic across runs of the same binary.
Bellman-Ford cycle-entry selection iterates a HashMap when multiple
negative cycles exist for the same tx-step, so the "top" cycle (and
hence the CSV row) varies between runs. Two consecutive runs of the
post-refactor binary against the same block produced CSVs with hashes
198a6167... and a2c0b4c8... — full diff. The pre-refactor baseline
(a4bbccc5...) is just one more sample from the same nondeterministic
distribution.

Stacks on feat/grafana-mempool-panels.
@Pablosinyores Pablosinyores force-pushed the feat/dedupe-replay-scorer-helpers branch from b704f5b to f2cb65c Compare May 20, 2026 12:16
@Pablosinyores Pablosinyores merged commit 1f5bc4d into develop May 20, 2026
1 of 3 checks passed
Pablosinyores added a commit that referenced this pull request May 20, 2026
aether_mempool_profit_scored_total used to be labelled only by
decision, so the dashboard could see "10 reverted" but not whether
they came from the V2 U256 walker, the f64 absurdity floor, the V3
revm verifier, or organic revm reverts. Add a `reason` sub-label
distinguishing those code paths.

Five wire labels (pinned by unit test against the constants):
- n/a              non-reverted decisions, no_path, or any path with
                   no sub-source worth distinguishing
- u256_walker      V2-only exact-U256 walker reached a verdict
                   (PR #136 path)
- absurdity_floor  f64 fallback above MAX_PLAUSIBLE_F64_NET_WEI (1 ETH)
                   downgraded to reverted (PR #136 path)
- revm_verdict     V3-touching revm sim ran to completion with a
                   non-reverting verdict (PR #144 path)
- revm_revert      V3-touching revm sim explicitly reverted/halted
                   (PR #144 path)

The reason is Prometheus-only and NOT persisted to the
mempool_profitability table — the migration's CHECK constraint only
covers decision, and adding a reason column would force every
existing row to back-fill. `NewProfitabilityScore.reason` skips the
DB insert path; it only flows into the metric label.

revm_verdict_to_decision and f64_fallback_verdict now return a
4-tuple (net, realised, decision, reason). no_path_outcome carries
REASON_NA. The aggregating let in score_one destructures
(net, realised, decision, reason) and threads reason into
ScoreOutcome.

Dashboard panels in deploy/docker/grafana/dashboards/mempool.json
(panel IDs 19 and 20 from PR #146) updated to sum by
(decision, reason) and legend-format {{decision}} / {{reason}}.
Title and description updated to reflect the new dimension.

Stacks on feat/dedupe-replay-scorer-helpers (PR #148).

Verification:
- cargo clippy --workspace --all-targets -- -D warnings : clean
- cargo test --workspace --lib --bins : 528 passed, 0 failed
- new test: reason_constants_are_stable_wire_labels
- existing verdict-helper tests updated to assert reason value
- python3 -m json.tool mempool.json : parses cleanly
Pablosinyores added a commit that referenced this pull request May 20, 2026
aether_mempool_profit_scored_total used to be labelled only by
decision, so the dashboard could see "10 reverted" but not whether
they came from the V2 U256 walker, the f64 absurdity floor, the V3
revm verifier, or organic revm reverts. Add a `reason` sub-label
distinguishing those code paths.

Five wire labels (pinned by unit test against the constants):
- n/a              non-reverted decisions, no_path, or any path with
                   no sub-source worth distinguishing
- u256_walker      V2-only exact-U256 walker reached a verdict
                   (PR #136 path)
- absurdity_floor  f64 fallback above MAX_PLAUSIBLE_F64_NET_WEI (1 ETH)
                   downgraded to reverted (PR #136 path)
- revm_verdict     V3-touching revm sim ran to completion with a
                   non-reverting verdict (PR #144 path)
- revm_revert      V3-touching revm sim explicitly reverted/halted
                   (PR #144 path)

The reason is Prometheus-only and NOT persisted to the
mempool_profitability table — the migration's CHECK constraint only
covers decision, and adding a reason column would force every
existing row to back-fill. `NewProfitabilityScore.reason` skips the
DB insert path; it only flows into the metric label.

revm_verdict_to_decision and f64_fallback_verdict now return a
4-tuple (net, realised, decision, reason). no_path_outcome carries
REASON_NA. The aggregating let in score_one destructures
(net, realised, decision, reason) and threads reason into
ScoreOutcome.

Dashboard panels in deploy/docker/grafana/dashboards/mempool.json
(panel IDs 19 and 20 from PR #146) updated to sum by
(decision, reason) and legend-format {{decision}} / {{reason}}.
Title and description updated to reflect the new dimension.

Stacks on feat/dedupe-replay-scorer-helpers (PR #148).

Verification:
- cargo clippy --workspace --all-targets -- -D warnings : clean
- cargo test --workspace --lib --bins : 528 passed, 0 failed
- new test: reason_constants_are_stable_wire_labels
- existing verdict-helper tests updated to assert reason value
- python3 -m json.tool mempool.json : parses cleanly
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant