refactor(grpc-server): extract shared historical helpers#148
Merged
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
aether-replay and aether-profit-scorer carried duplicate inline copies of every helper used to load pool registries, fetch reserves at a fixed block, and convert U256 reserves into f64 graph weights. Move the canonical versions into crates/grpc-server/src/historical.rs and have both binaries depend on it. Shared surface: - sol! getReserves / slot0 + the emitted getReservesCall / slot0Call - PoolState, LoadedPool, PoolsConfig, PoolEntry, Q96 - parse_protocol, load_pools (long-form pools.toml strings) - fetch_pool_state_at (V2 reserves / V3 sqrtPriceX96 at a BlockId) - u256_to_f64, uniswap_v2_get_amount_out - load_executor_init_bytecode (forge AetherExecutor.json reader) Replay's swallow-RPC-error-as-None call site adapts with .ok().flatten() to preserve observable behaviour against the new Result<Option<PoolState>> return shape. Scorer-only helpers stay where they are: parse_db_protocol, load_predicted_pools, verify_cycle_u256, verify_cycle_revm, build_steps_from_cycle_sync, is_v3_touching_cycle, balance_slot_for_token, state_to_graph_reserves. Diff: -345 / +22 across the two binaries; +319 in historical.rs (which includes 6 unit tests on the extracted helpers). Net: -4 LOC, one less copy of each helper to keep in sync. Verification: - cargo clippy --workspace --all-targets -- -D warnings : clean - cargo test --workspace --lib --bins : 527 passed, 0 failed - 6 new historical.rs unit tests covering parse_protocol / u256_to_f64 / uniswap_v2_get_amount_out / load_pools - aether-replay simple-mode (default) against block 24643151 : output byte-identical to pre-refactor baseline except wall-clock timing (state-fetch 4251 vs 3882 ms; total 4989 vs 4114 ms — same RPC, network variance). Pool counts (55 loaded), graph topology (21 tokens, 106 edges), detected cycle (DAI -> 0x9f8f72aa... -> WETH -> DAI), profit_factor (174745.3472%) all match exactly. Full-block --csv run is intentionally NOT used as the proof bar: the metric is inherently nondeterministic across runs of the same binary. Bellman-Ford cycle-entry selection iterates a HashMap when multiple negative cycles exist for the same tx-step, so the "top" cycle (and hence the CSV row) varies between runs. Two consecutive runs of the post-refactor binary against the same block produced CSVs with hashes 198a6167... and a2c0b4c8... — full diff. The pre-refactor baseline (a4bbccc5...) is just one more sample from the same nondeterministic distribution. Stacks on feat/grafana-mempool-panels.
b704f5b to
f2cb65c
Compare
Pablosinyores
added a commit
that referenced
this pull request
May 20, 2026
aether_mempool_profit_scored_total used to be labelled only by
decision, so the dashboard could see "10 reverted" but not whether
they came from the V2 U256 walker, the f64 absurdity floor, the V3
revm verifier, or organic revm reverts. Add a `reason` sub-label
distinguishing those code paths.
Five wire labels (pinned by unit test against the constants):
- n/a non-reverted decisions, no_path, or any path with
no sub-source worth distinguishing
- u256_walker V2-only exact-U256 walker reached a verdict
(PR #136 path)
- absurdity_floor f64 fallback above MAX_PLAUSIBLE_F64_NET_WEI (1 ETH)
downgraded to reverted (PR #136 path)
- revm_verdict V3-touching revm sim ran to completion with a
non-reverting verdict (PR #144 path)
- revm_revert V3-touching revm sim explicitly reverted/halted
(PR #144 path)
The reason is Prometheus-only and NOT persisted to the
mempool_profitability table — the migration's CHECK constraint only
covers decision, and adding a reason column would force every
existing row to back-fill. `NewProfitabilityScore.reason` skips the
DB insert path; it only flows into the metric label.
revm_verdict_to_decision and f64_fallback_verdict now return a
4-tuple (net, realised, decision, reason). no_path_outcome carries
REASON_NA. The aggregating let in score_one destructures
(net, realised, decision, reason) and threads reason into
ScoreOutcome.
Dashboard panels in deploy/docker/grafana/dashboards/mempool.json
(panel IDs 19 and 20 from PR #146) updated to sum by
(decision, reason) and legend-format {{decision}} / {{reason}}.
Title and description updated to reflect the new dimension.
Stacks on feat/dedupe-replay-scorer-helpers (PR #148).
Verification:
- cargo clippy --workspace --all-targets -- -D warnings : clean
- cargo test --workspace --lib --bins : 528 passed, 0 failed
- new test: reason_constants_are_stable_wire_labels
- existing verdict-helper tests updated to assert reason value
- python3 -m json.tool mempool.json : parses cleanly
Pablosinyores
added a commit
that referenced
this pull request
May 20, 2026
aether_mempool_profit_scored_total used to be labelled only by
decision, so the dashboard could see "10 reverted" but not whether
they came from the V2 U256 walker, the f64 absurdity floor, the V3
revm verifier, or organic revm reverts. Add a `reason` sub-label
distinguishing those code paths.
Five wire labels (pinned by unit test against the constants):
- n/a non-reverted decisions, no_path, or any path with
no sub-source worth distinguishing
- u256_walker V2-only exact-U256 walker reached a verdict
(PR #136 path)
- absurdity_floor f64 fallback above MAX_PLAUSIBLE_F64_NET_WEI (1 ETH)
downgraded to reverted (PR #136 path)
- revm_verdict V3-touching revm sim ran to completion with a
non-reverting verdict (PR #144 path)
- revm_revert V3-touching revm sim explicitly reverted/halted
(PR #144 path)
The reason is Prometheus-only and NOT persisted to the
mempool_profitability table — the migration's CHECK constraint only
covers decision, and adding a reason column would force every
existing row to back-fill. `NewProfitabilityScore.reason` skips the
DB insert path; it only flows into the metric label.
revm_verdict_to_decision and f64_fallback_verdict now return a
4-tuple (net, realised, decision, reason). no_path_outcome carries
REASON_NA. The aggregating let in score_one destructures
(net, realised, decision, reason) and threads reason into
ScoreOutcome.
Dashboard panels in deploy/docker/grafana/dashboards/mempool.json
(panel IDs 19 and 20 from PR #146) updated to sum by
(decision, reason) and legend-format {{decision}} / {{reason}}.
Title and description updated to reflect the new dimension.
Stacks on feat/dedupe-replay-scorer-helpers (PR #148).
Verification:
- cargo clippy --workspace --all-targets -- -D warnings : clean
- cargo test --workspace --lib --bins : 528 passed, 0 failed
- new test: reason_constants_are_stable_wire_labels
- existing verdict-helper tests updated to assert reason value
- python3 -m json.tool mempool.json : parses cleanly
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
`aether-replay` (2220 LOC) and `aether-profit-scorer` (2085 LOC) carried
duplicate inline copies of every helper used to load pool registries,
fetch reserves at a fixed block, and convert U256 reserves into f64
graph weights. This PR moves the canonical versions into
`crates/grpc-server/src/historical.rs` and has both binaries depend on
it. The scorer's source even had an explicit `//! Inlined helpers ...
Follow-up: deduplicate after the mempool phase lands.` block-doc — this
PR is that follow-up.
Stacks on PR #146 (`feat/grafana-mempool-panels`).
What moved into `historical.rs`
`slot0Call` types (re-exported `pub` so replay's anvil-driven
`fetch_one_state_latest` and `fetch_all_states_latest` still link)
`sushiswap`, `uniswap_v3`, `curve`, `balancer_v2`, `bancor_v3`)
swallow-RPC-error-as-None call site adapts with `.ok().flatten()` to
preserve observable behaviour.
`(10_000 - fee_bps)` (the previous replay variant used straight
subtraction; for any real fee_bps (≤ 100 bps) the two produce identical
output).
forge's AetherExecutor.json
What stays binary-local
Scorer-only:
the engine writes into `mempool_predictions.protocol`)
`is_v3_touching_cycle`, `balance_slot_for_token` (PR feat(mempool): revm verifier resolves V3 cycles in scorer #144 V3 verifier)
`cycle_to_json`, `u256_to_i128_saturating` (scorer math)
Replay-only:
scorer uses `state_to_graph_reserves` against an edge-clone instead)
`fetch_all_states_latest` (anvil-fork lifecycle)
`sim_arb_with_evm_simulator`, `deploy_executor_on_anvil`,
`build_steps_from_cycle` (replay async pipeline)
Diff stats
```
crates/grpc-server/src/bin/aether_profit_scorer.rs | 188 ++-------------------
crates/grpc-server/src/bin/aether_replay.rs | 178 ++-----------------
crates/grpc-server/src/historical.rs | 319 ++++++++++++++++++++++
crates/grpc-server/src/lib.rs | 1 +
4 files changed, 341 insertions(+), 345 deletions(-)
```
Net: -4 LOC, one less copy of each helper to keep in sync.
Verification
```
cargo clippy --workspace --all-targets -- -D warnings ✓ clean
cargo test --workspace --lib --bins ✓ 527 passed, 0 failed
```
6 new `historical.rs` unit tests:
Pre-existing scorer tests covering `uniswap_v2_get_amount_out` and
`PoolState` constructions are unchanged and still pass via the new
imports.
Empirical behaviour proof — simple-mode replay diff
`aether-replay` default mode (no `--full-block`, no anvil) against
mainnet block `24643151` with `--pools config/pools.toml` exercises
`load_pools` → `fetch_pool_state_at` → `u256_to_f64` →
`build_graph` → Bellman-Ford → `print_cycles`. `diff` of stdout before
vs. after the refactor:
```
7c7
< State fetched: 53/55 (4251 ms)
Only the wall-clock timing strings differ — `Pools loaded`,
`State fetched` count, `Graph` topology, `Detection` count, top
`profit_factor` (`174745.3472%`), and `path`
(`DAI -> 0x9f8f72aa… -> WETH -> DAI`) are byte-identical.
Why the full-block `--csv` bar is intentionally NOT used
The original handoff suggested byte-identical `--full-block --csv`
output as the proof bar. Empirically that metric is inherently
nondeterministic, even between two consecutive runs of the SAME
binary against the same block:
`diff after-1 after-2` produces ~60 differing lines. Two factors:
multiple negative cycles at the same tx-step, the "top" cycle picked
for CSV emission depends on internal HashMap iteration order, which
Rust randomises per-process. Same graph → different "top" each run.
state through the forked Anvil; under any RPC slowness or
connection-reset (which happened in one of the runs above), the
per-tx state at detection-time can shift, changing the cycle counts.
The simple-mode diff above gives a clean signal because there's exactly
one detected cycle and no anvil in the loop.
Out of scope (tracked)
(scorer sync) — same shape, different exec model. Folding requires
changing replay's signature; defer to PR-8b.
`verify_cycle_revm` (scorer pure-revm CREATE-then-CALL) — different
deployment models. Some shared infrastructure (`RpcForkedState`
construction, balance-slot table) could fold; tracked.
(scorer per-edge update) — different abstractions; tracked.