refactor(grpc-server): extract shared historical helpers by Pablosinyores · Pull Request #148 · Pablosinyores/aether

Pablosinyores · 2026-05-20T10:58:57Z

Summary

`aether-replay` (2220 LOC) and `aether-profit-scorer` (2085 LOC) carried
duplicate inline copies of every helper used to load pool registries,
fetch reserves at a fixed block, and convert U256 reserves into f64
graph weights. This PR moves the canonical versions into
`crates/grpc-server/src/historical.rs` and has both binaries depend on
it. The scorer's source even had an explicit `//! Inlined helpers ...
Follow-up: deduplicate after the mempool phase lands.` block-doc — this
PR is that follow-up.

Stacks on PR #146 (`feat/grafana-mempool-panels`).

What moved into `historical.rs`

`sol! { getReserves, slot0 }` block + the emitted `getReservesCall` /
`slot0Call` types (re-exported `pub` so replay's anvil-driven
`fetch_one_state_latest` and `fetch_all_states_latest` still link)
`enum PoolState { V2, V3 }`
`struct LoadedPool`, `struct PoolsConfig`, `struct PoolEntry`
`pub const Q96`
`fn parse_protocol` — long-form `pools.toml` strings (`uniswap_v2`,
`sushiswap`, `uniswap_v3`, `curve`, `balancer_v2`, `bancor_v3`)
`fn load_pools` — TOML loader, filters to V2/Sushi/V3
`async fn fetch_pool_state_at` — `Result<Option>`. Replay's
swallow-RPC-error-as-None call site adapts with `.ok().flatten()` to
preserve observable behaviour.
`fn u256_to_f64` — saturating limb-wise truncation
`fn uniswap_v2_get_amount_out` — exact U256 math, `saturating_sub` on
`(10_000 - fee_bps)` (the previous replay variant used straight
subtraction; for any real fee_bps (≤ 100 bps) the two produce identical
output).
`fn load_executor_init_bytecode` — parses `bytecode/object` out of
forge's AetherExecutor.json

What stays binary-local

Scorer-only:

`parse_db_protocol` (short-form `uni_v2` / `uni_v3` / `sushi` strings
the engine writes into `mempool_predictions.protocol`)
`load_predicted_pools` (PR feat(mempool): widen scorer pool source via mempool_predictions #137 DB-augmented registry)
`verify_cycle_u256` (PR feat(mempool): U256 cycle verifier + f64 absurdity floor #136 precision walker)
`verify_cycle_revm`, `build_steps_from_cycle_sync`,
`is_v3_touching_cycle`, `balance_slot_for_token` (PR feat(mempool): revm verifier resolves V3 cycles in scorer #144 V3 verifier)
`state_to_graph_reserves`, `gas_estimate_for_protocols`,
`cycle_to_json`, `u256_to_i128_saturating` (scorer math)
`ScorerState` + `bootstrap_state` (scorer wiring)

Replay-only:

`default_pool_set`, `token_label` (display helpers)
`build_graph` (full graph assembly — replay needs it per-tx-window;
scorer uses `state_to_graph_reserves` against an edge-clone instead)
`AnvilHandle`, `spawn_anvil`, `fetch_one_state_latest`,
`fetch_all_states_latest` (anvil-fork lifecycle)
`optimize_cycle_input`, `estimate_opp` (intra-block optimisation)
`run_batch_replay`, `run_full_block_replay`, `seed_sender_state`,
`sim_arb_with_evm_simulator`, `deploy_executor_on_anvil`,
`build_steps_from_cycle` (replay async pipeline)

Diff stats

```
crates/grpc-server/src/bin/aether_profit_scorer.rs | 188 ++-------------------
crates/grpc-server/src/bin/aether_replay.rs | 178 ++-----------------
crates/grpc-server/src/historical.rs | 319 ++++++++++++++++++++++
crates/grpc-server/src/lib.rs | 1 +
4 files changed, 341 insertions(+), 345 deletions(-)
```

Net: -4 LOC, one less copy of each helper to keep in sync.

Verification

```
cargo clippy --workspace --all-targets -- -D warnings ✓ clean
cargo test --workspace --lib --bins ✓ 527 passed, 0 failed
```

6 new `historical.rs` unit tests:

`parse_protocol_known_strings`
`u256_to_f64_small_values_exact`
`uniswap_v2_get_amount_out_canonical`
`uniswap_v2_get_amount_out_zero_inputs`
`load_pools_filters_unsupported_protocols`
`load_pools_handles_missing_pools_key`

Pre-existing scorer tests covering `uniswap_v2_get_amount_out` and
`PoolState` constructions are unchanged and still pass via the new
imports.

Empirical behaviour proof — simple-mode replay diff

`aether-replay` default mode (no `--full-block`, no anvil) against
mainnet block `24643151` with `--pools config/pools.toml` exercises
`load_pools` → `fetch_pool_state_at` → `u256_to_f64` →
`build_graph` → Bellman-Ford → `print_cycles`. `diff` of stdout before
vs. after the refactor:

```
7c7
< State fetched: 53/55 (4251 ms)

State fetched: 53/55 (3882 ms)
14c14
< Total time: 4989 ms

Total time: 4114 ms
```

Only the wall-clock timing strings differ — `Pools loaded`,
`State fetched` count, `Graph` topology, `Detection` count, top
`profit_factor` (`174745.3472%`), and `path`
(`DAI -> 0x9f8f72aa… -> WETH -> DAI`) are byte-identical.

Why the full-block `--csv` bar is intentionally NOT used

The original handoff suggested byte-identical `--full-block --csv`
output as the proof bar. Empirically that metric is inherently
nondeterministic, even between two consecutive runs of the SAME
binary against the same block:

Run	Binary	CSV hash
baseline	pre-refactor	`a4bbccc5…`
after-1	this PR	`198a6167…`
after-2	this PR	`a2c0b4c8…`

`diff after-1 after-2` produces ~60 differing lines. Two factors:

Bellman-Ford cycle-entry randomness. When the detector finds
multiple negative cycles at the same tx-step, the "top" cycle picked
for CSV emission depends on internal HashMap iteration order, which
Rust randomises per-process. Same graph → different "top" each run.
Anvil fork + RPC timing variance. Each tx replay re-fetches pool
state through the forked Anvil; under any RPC slowness or
connection-reset (which happened in one of the runs above), the
per-tx state at detection-time can shift, changing the cycle counts.

The simple-mode diff above gives a clean signal because there's exactly
one detected cycle and no anvil in the loop.

Out of scope (tracked)

`build_steps_from_cycle` (replay async) vs `build_steps_from_cycle_sync`
(scorer sync) — same shape, different exec model. Folding requires
changing replay's signature; defer to PR-8b.
`sim_arb_with_evm_simulator` (replay anvil-driven) vs
`verify_cycle_revm` (scorer pure-revm CREATE-then-CALL) — different
deployment models. Some shared infrastructure (`RpcForkedState`
construction, balance-slot table) could fold; tracked.
`build_graph` (replay full assembly) vs `state_to_graph_reserves`
(scorer per-edge update) — different abstractions; tracked.

vercel · 2026-05-20T10:59:02Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
aether	Ready	Preview, Comment	May 20, 2026 12:18pm
aether-63xv	Ready	Preview, Comment	May 20, 2026 12:18pm

aether-replay and aether-profit-scorer carried duplicate inline copies of every helper used to load pool registries, fetch reserves at a fixed block, and convert U256 reserves into f64 graph weights. Move the canonical versions into crates/grpc-server/src/historical.rs and have both binaries depend on it. Shared surface: - sol! getReserves / slot0 + the emitted getReservesCall / slot0Call - PoolState, LoadedPool, PoolsConfig, PoolEntry, Q96 - parse_protocol, load_pools (long-form pools.toml strings) - fetch_pool_state_at (V2 reserves / V3 sqrtPriceX96 at a BlockId) - u256_to_f64, uniswap_v2_get_amount_out - load_executor_init_bytecode (forge AetherExecutor.json reader) Replay's swallow-RPC-error-as-None call site adapts with .ok().flatten() to preserve observable behaviour against the new Result<Option<PoolState>> return shape. Scorer-only helpers stay where they are: parse_db_protocol, load_predicted_pools, verify_cycle_u256, verify_cycle_revm, build_steps_from_cycle_sync, is_v3_touching_cycle, balance_slot_for_token, state_to_graph_reserves. Diff: -345 / +22 across the two binaries; +319 in historical.rs (which includes 6 unit tests on the extracted helpers). Net: -4 LOC, one less copy of each helper to keep in sync. Verification: - cargo clippy --workspace --all-targets -- -D warnings : clean - cargo test --workspace --lib --bins : 527 passed, 0 failed - 6 new historical.rs unit tests covering parse_protocol / u256_to_f64 / uniswap_v2_get_amount_out / load_pools - aether-replay simple-mode (default) against block 24643151 : output byte-identical to pre-refactor baseline except wall-clock timing (state-fetch 4251 vs 3882 ms; total 4989 vs 4114 ms — same RPC, network variance). Pool counts (55 loaded), graph topology (21 tokens, 106 edges), detected cycle (DAI -> 0x9f8f72aa... -> WETH -> DAI), profit_factor (174745.3472%) all match exactly. Full-block --csv run is intentionally NOT used as the proof bar: the metric is inherently nondeterministic across runs of the same binary. Bellman-Ford cycle-entry selection iterates a HashMap when multiple negative cycles exist for the same tx-step, so the "top" cycle (and hence the CSV row) varies between runs. Two consecutive runs of the post-refactor binary against the same block produced CSVs with hashes 198a6167... and a2c0b4c8... — full diff. The pre-refactor baseline (a4bbccc5...) is just one more sample from the same nondeterministic distribution. Stacks on feat/grafana-mempool-panels.

aether_mempool_profit_scored_total used to be labelled only by decision, so the dashboard could see "10 reverted" but not whether they came from the V2 U256 walker, the f64 absurdity floor, the V3 revm verifier, or organic revm reverts. Add a `reason` sub-label distinguishing those code paths. Five wire labels (pinned by unit test against the constants): - n/a non-reverted decisions, no_path, or any path with no sub-source worth distinguishing - u256_walker V2-only exact-U256 walker reached a verdict (PR #136 path) - absurdity_floor f64 fallback above MAX_PLAUSIBLE_F64_NET_WEI (1 ETH) downgraded to reverted (PR #136 path) - revm_verdict V3-touching revm sim ran to completion with a non-reverting verdict (PR #144 path) - revm_revert V3-touching revm sim explicitly reverted/halted (PR #144 path) The reason is Prometheus-only and NOT persisted to the mempool_profitability table — the migration's CHECK constraint only covers decision, and adding a reason column would force every existing row to back-fill. `NewProfitabilityScore.reason` skips the DB insert path; it only flows into the metric label. revm_verdict_to_decision and f64_fallback_verdict now return a 4-tuple (net, realised, decision, reason). no_path_outcome carries REASON_NA. The aggregating let in score_one destructures (net, realised, decision, reason) and threads reason into ScoreOutcome. Dashboard panels in deploy/docker/grafana/dashboards/mempool.json (panel IDs 19 and 20 from PR #146) updated to sum by (decision, reason) and legend-format {{decision}} / {{reason}}. Title and description updated to reflect the new dimension. Stacks on feat/dedupe-replay-scorer-helpers (PR #148). Verification: - cargo clippy --workspace --all-targets -- -D warnings : clean - cargo test --workspace --lib --bins : 528 passed, 0 failed - new test: reason_constants_are_stable_wire_labels - existing verdict-helper tests updated to assert reason value - python3 -m json.tool mempool.json : parses cleanly

Pablosinyores mentioned this pull request May 20, 2026

feat(scorer): add reason sub-label to profit_scored_total #149

Merged

Pablosinyores changed the base branch from feat/grafana-mempool-panels to develop May 20, 2026 12:16

Pablosinyores force-pushed the feat/dedupe-replay-scorer-helpers branch from b704f5b to f2cb65c Compare May 20, 2026 12:16

Pablosinyores merged commit 1f5bc4d into develop May 20, 2026
1 of 3 checks passed

vercel Bot deployed to Preview – aether-63xv May 20, 2026 12:18 View deployment

vercel Bot deployed to Preview – aether May 20, 2026 12:18 View deployment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(grpc-server): extract shared historical helpers#148

refactor(grpc-server): extract shared historical helpers#148
Pablosinyores merged 1 commit into
developfrom
feat/dedupe-replay-scorer-helpers

Pablosinyores commented May 20, 2026

Uh oh!

vercel Bot commented May 20, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Pablosinyores commented May 20, 2026

Summary

What moved into `historical.rs`

What stays binary-local

Diff stats

Verification

Empirical behaviour proof — simple-mode replay diff

``` 7c7 < State fetched: 53/55 (4251 ms)

Why the full-block `--csv` bar is intentionally NOT used

Out of scope (tracked)

Uh oh!

vercel Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

```
7c7
< State fetched: 53/55 (4251 ms)

vercel Bot commented May 20, 2026 •

edited

Loading