Context
Aether's current detection pipeline reacts to confirmed Sync / Swap events — i.e. it sees pool moves only after the block lands. To compete in live MEV (backrun-class arbitrage), we need to see pending victim swaps before the next block, simulate the post-tx state, and detect arbs against that state.
This issue is the free-tier testing scaffold that validates the entire plumbing stack — pending-tx subscription, calldata decoding, post-state simulation, and metrics — without paying for a private mempool feed yet. Once the scaffold proves out (we observe pending DEX swaps in real time and can run Bellman-Ford on the simulated post-state), a separate issue will scope the paid-feed integration (Chainbound Fiber / bloXroute / self-hosted Reth sentries — see the standalone deep-dive report for that).
What this issue ships
A log-only, zero-execution-risk pipeline:
    Alchemy WS pending stream ──┐
                                ├─► PendingTxDecoder ─► PostStateSim ─► Detector ─► aether_pending_arb_candidates_total
    Flashbots MEV-Share SSE ────┘                                                   (count only; no bundle build, no submission)
No backrun bundle is constructed. No bundle is submitted. The output of this issue is metrics + structured logs that prove every stage of the pipeline works, so the paid-feed swap is a one-line config change later.
Scope
Phase A — Pending-tx ingestion (Rust)
- New crates/ingestion/src/mempool.rs with a PendingTxStream trait + an AlchemyPendingStream impl.
- Subscribe via eth_subscribe with alchemy_pendingTransactions and toAddress filtered to UniV2 Router02, UniV3 SwapRouter, UniV3 SwapRouter02, SushiSwap Router, Curve registry routers, Balancer Vault, 1inch AggregationRouter.
- A tokio::sync::broadcast channel pending_tx_tx, so multiple sources (Alchemy now; the future Fiber stream and sentry later) can fan in and multiple consumers (the decoder first) can fan out without re-subscribing. Note that the broadcast channel is bounded and not lock-free, so slow consumers will see lag errors rather than back-pressure.
- Reuses the existing multi-node node_pool.rs reconnect / health-state-machine logic.
- Per-source dedup keyed on tx hash.
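The per-source dedup step could look roughly like the sketch below: a bounded "seen" set that evicts the oldest hash first so memory stays flat. SeenCache, insert_if_new, and the capacity parameter are hypothetical names for illustration, not existing Aether types; one instance per source is assumed.

```rust
use std::collections::{HashSet, VecDeque};

/// 32-byte transaction hash.
type TxHash = [u8; 32];

/// Hypothetical bounded dedup cache: remembers the most recent `capacity`
/// hashes and evicts the oldest one first.
struct SeenCache {
    capacity: usize,
    order: VecDeque<TxHash>,
    seen: HashSet<TxHash>,
}

impl SeenCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        SeenCache {
            capacity,
            order: VecDeque::with_capacity(capacity),
            seen: HashSet::with_capacity(capacity),
        }
    }

    /// Returns true the first time a hash is observed, false on repeats
    /// that are still inside the retention window.
    fn insert_if_new(&mut self, hash: TxHash) -> bool {
        if !self.seen.insert(hash) {
            return false;
        }
        // Evict the oldest entry once the window is full.
        if self.order.len() == self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.seen.remove(&oldest);
            }
        }
        self.order.push_back(hash);
        true
    }
}
```

A stream implementation would call insert_if_new on every incoming hash and forward the transaction to pending_tx_tx only when it returns true. A hash that has already been evicted is treated as new again, so the capacity should comfortably exceed the number of pending txs seen between blocks.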
Phase B — Calldata decoder (Rust)
- New crates/pools/src/router_decoder.rs with alloy::sol! ABIs for the swap selectors of the 7 routers above.
- Decode each pending tx → (pool_address, token_in, token_out, amount_in, deadline).
- Drop tx if pool is not in the registry; emit aether_pending_dex_tx_total{router, pool, decoded}.
- Emit a decode_failure counter so we can see the long tail of unsupported router shapes (1inch v6 multi-step, Balancer batch swaps, etc.).
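To make the decode step concrete, here is a minimal hand-rolled sketch for a single shape: UniV2 Router02 swapExactTokensForTokens(uint256,uint256,address[],address,uint256), selector 0x38ed1739. The real decoder is specified to use alloy::sol!; this version uses only the standard library so it stays self-contained. DecodedSwap and decode_v2_swap are hypothetical names, and resolving pool_address from the token pair (a registry lookup) is deliberately left out.

```rust
use std::convert::{TryFrom, TryInto};

/// Selector of `swapExactTokensForTokens(uint256,uint256,address[],address,uint256)`.
const SWAP_EXACT_TOKENS_FOR_TOKENS: [u8; 4] = [0x38, 0xed, 0x17, 0x39];

/// Hypothetical decoder output (pool lookup omitted).
#[derive(Debug, PartialEq)]
struct DecodedSwap {
    token_in: [u8; 20],
    token_out: [u8; 20],
    amount_in: u128,
    deadline: u64,
}

/// Returns the 32-byte ABI word at byte offset `at` of the argument area.
fn word(args: &[u8], at: usize) -> Option<&[u8]> {
    args.get(at..at.checked_add(32)?)
}

/// Interprets a 32-byte word as an integer; rejects values above 128 bits.
fn as_u128(w: &[u8]) -> Option<u128> {
    if w[..16].iter().any(|b| *b != 0) {
        return None;
    }
    Some(u128::from_be_bytes(w[16..32].try_into().ok()?))
}

/// Takes the low 20 bytes of a 32-byte word as an address.
fn as_address(w: &[u8]) -> Option<[u8; 20]> {
    w[12..32].try_into().ok()
}

/// Decodes one calldata shape; any mismatch or truncation yields None,
/// which the caller would count as a decode failure.
fn decode_v2_swap(calldata: &[u8]) -> Option<DecodedSwap> {
    if calldata.get(..4)? != &SWAP_EXACT_TOKENS_FOR_TOKENS[..] {
        return None;
    }
    let args = &calldata[4..];
    // Head layout: amountIn, amountOutMin, offset(path), to, deadline.
    let amount_in = as_u128(word(args, 0)?)?;
    let path_offset = usize::try_from(as_u128(word(args, 64)?)?).ok()?;
    let deadline = u64::try_from(as_u128(word(args, 128)?)?).ok()?;
    // Dynamic tail: path length followed by one word per address.
    let path_len = usize::try_from(as_u128(word(args, path_offset)?)?).ok()?;
    if path_len < 2 {
        return None;
    }
    let first = path_offset.checked_add(32)?;
    let last = first.checked_add((path_len - 1).checked_mul(32)?)?;
    Some(DecodedSwap {
        token_in: as_address(word(args, first)?)?,
        token_out: as_address(word(args, last)?)?,
        amount_in,
        deadline,
    })
}
```

For multi-hop paths this only reports the first and last token; a fuller decoder would emit one entry per hop so every affected pool can be looked up in the registry.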
Phase C — Post-state simulation (Rust)
- Extend crates/simulator with simulate_pending_then_detect:
  - Fork latest block via existing revm CacheDB + EthersDB path.
  - Apply the pending tx in the forked EVM.
  - Read post-state pool reserves for every affected pool.
  - Run Bellman-Ford on the post-state subgraph (reuse the existing BellmanFord from crates/detector).
- Emit aether_pending_arb_candidates_total{router, profit_bucket} and structured log MEMPOOL ARB CANDIDATE with arb_id, victim_tx_hash, hops, gross_profit_wei, sim_us.
- Strictly log-only. Do NOT publish to the gRPC arb stream: the Go executor must remain unaware in this phase to avoid accidental live submission.
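The simulate-then-detect idea can be illustrated without revm. The sketch below replaces the forked-EVM step with the closed-form UniV2 constant-product update (0.3% fee) and then runs a plain Bellman-Ford negative-cycle check over weights -ln(rate). It is a stand-in under those assumptions, not the existing BellmanFord from crates/detector; Edge, apply_v2_swap, pool_edges, and has_arbitrage_cycle are hypothetical names, and f64 is used for brevity where production code would use exact integer math.

```rust
/// Directed edge: swapping token `from` into token `to` yields `rate`
/// units out per unit in (marginal, after fee).
struct Edge {
    from: usize,
    to: usize,
    rate: f64,
}

/// Constant-product swap with a 0.3% fee. Returns the pool reserves
/// (input side, output side) after `amount_in` has been swapped in.
fn apply_v2_swap(reserve_in: f64, reserve_out: f64, amount_in: f64) -> (f64, f64) {
    let in_with_fee = amount_in * 0.997;
    let amount_out = in_with_fee * reserve_out / (reserve_in + in_with_fee);
    (reserve_in + amount_in, reserve_out - amount_out)
}

/// Both directed edges for a pool holding `r0` of token `t0` and `r1` of `t1`.
fn pool_edges(t0: usize, t1: usize, r0: f64, r1: f64) -> Vec<Edge> {
    vec![
        Edge { from: t0, to: t1, rate: 0.997 * r1 / r0 },
        Edge { from: t1, to: t0, rate: 0.997 * r0 / r1 },
    ]
}

/// Bellman-Ford on weights w = -ln(rate). A cycle whose rates multiply to
/// more than 1 has negative total weight, i.e. it is an arbitrage loop.
fn has_arbitrage_cycle(n_tokens: usize, edges: &[Edge]) -> bool {
    const EPS: f64 = 1e-12; // guards against floating-point noise
    // All-zero start behaves like a virtual source linked to every token,
    // so cycles anywhere in the graph are found.
    let mut dist = vec![0.0_f64; n_tokens];
    for _ in 1..n_tokens {
        let mut changed = false;
        for e in edges {
            let candidate = dist[e.from] - e.rate.ln();
            if candidate < dist[e.to] - EPS {
                dist[e.to] = candidate;
                changed = true;
            }
        }
        if !changed {
            return false; // converged early: no negative cycle
        }
    }
    // If any edge still relaxes after n-1 passes, a negative cycle exists.
    edges
        .iter()
        .any(|e| dist[e.from] - e.rate.ln() < dist[e.to] - EPS)
}
```

With three balanced pools (WETH/USDC, USDC/DAI, WETH/DAI) the check reports no cycle; after a large victim swap skews one pool's reserves via apply_v2_swap, rebuilding that pool's edges makes the triangle profitable and the check fires. That is the condition the candidate counter is meant to count.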
Phase D — MEV-Share SSE consumer (Go)
- New cmd/monitor/mev_share.go consuming https://mev-share.flashbots.net SSE.
- Decode hints (tx_hash, function_selector, optional calldata, optional logs).
- Emit aether_mev_share_hints_total{has_calldata, has_logs}.
- Cross-check: when a hint and an Alchemy pending tx point at the same tx_hash, record the first-seen latency delta in aether_mempool_first_seen_delta_ms{source} so we have data on which signal is faster, per source.
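The cross-check logic is language-agnostic; it is sketched here in Rust, the language of the rest of the pipeline, even though the consumer itself is specified in Go. FirstSeen, Source, and observe are hypothetical names, timestamps are assumed to be milliseconds from a shared clock, and eviction of unmatched hashes is omitted.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Source {
    Alchemy,
    MevShare,
}

/// Per-hash state: waiting for the second source, or already paired.
#[derive(Clone, Copy)]
enum Entry {
    Pending(Source, u64),
    Paired,
}

/// Hypothetical tracker for first-seen deltas between the two sources.
struct FirstSeen {
    seen: HashMap<[u8; 32], Entry>,
}

impl FirstSeen {
    fn new() -> Self {
        FirstSeen { seen: HashMap::new() }
    }

    /// Records an observation. Returns `Some((faster_source, delta_ms))`
    /// exactly once per hash, when the second source reports it.
    fn observe(&mut self, hash: [u8; 32], source: Source, at_ms: u64) -> Option<(Source, u64)> {
        match self.seen.get(&hash).copied() {
            None => {
                self.seen.insert(hash, Entry::Pending(source, at_ms));
                None
            }
            Some(Entry::Pending(first, first_ms)) if first != source => {
                self.seen.insert(hash, Entry::Paired);
                Some((first, at_ms.saturating_sub(first_ms)))
            }
            // Repeat from the same source, or hash already paired.
            Some(_) => None,
        }
    }
}
```

The returned source would become the source label on aether_mempool_first_seen_delta_ms and the delta the observed value. A production version would also need a TTL so hashes seen by only one source do not accumulate.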
Phase E — Metrics + Grafana panel
- New panel on the existing observability dashboard (or a sub-dashboard Mempool — testing) showing:
  - rate(aether_pending_dex_tx_total[1m]) per router
  - rate(aether_pending_arb_candidates_total[1m]) per profit bucket
  - rate(aether_mev_share_hints_total[1m])
  - First-seen latency histogram (Alchemy vs MEV-Share)
  - Decoder failure rate
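The per-profit-bucket panel depends on profit_bucket being a small, fixed label set. A minimal sketch of such a mapping follows; the bucket boundaries and label strings are assumptions chosen for illustration, since the issue does not define them.

```rust
const WEI_PER_ETH: u128 = 1_000_000_000_000_000_000;

/// Maps a gross profit in wei to a fixed label so the metric keeps a
/// bounded number of series. Boundaries here are hypothetical.
fn profit_bucket(gross_profit_wei: u128) -> &'static str {
    const MILLI_ETH: u128 = WEI_PER_ETH / 1_000;
    if gross_profit_wei < MILLI_ETH {
        "lt_0.001_eth"
    } else if gross_profit_wei < 10 * MILLI_ETH {
        "0.001_to_0.01_eth"
    } else if gross_profit_wei < 100 * MILLI_ETH {
        "0.01_to_0.1_eth"
    } else {
        "gte_0.1_eth"
    }
}
```

Keeping the raw gross_profit_wei in the structured log and only the bucket in the metric label avoids unbounded label cardinality.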
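Acceptance Criteria
- aether-rust subscribes to Alchemy alchemy_pendingTransactions WS at startup when MEMPOOL_TRACKING=1 is set; binary boots identically when unset (zero behaviour change for current users)
- simulate_pending_then_detect produces a non-zero aether_pending_arb_candidates_total over a 1-hour run (proves the post-state path works)
- cmd/monitor emits MEV-Share hint counts and first-seen latency deltas vs Alchemy
- cargo test --workspace --release, cargo clippy --workspace --all-targets -- -D warnings, go test ./... -race -count=1 all clean
- MEMPOOL_TRACKING=1 env-var contract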
Out of scope (separate follow-up issues)
- Paid mempool feeds: Chainbound Fiber gRPC, bloXroute Mempool Tx, Merkle searcher API. Each gets its own issue scoped against the PendingTxStream trait introduced in Phase A.
- Self-hosted Reth sentry mesh at NY5 / AMS / SGP. Infra workstream, separate budget approval.
- Backrun bundle construction: [victim_raw_tx, arb_tx, tip_tx] with revertingTxHashes, plus signing-flow updates in cmd/executor/bundle.go. Tracked separately to keep this scaffold log-only.
- MEV Blocker / CoW DAO searcher bid integration — orderflow auction surface, separate Go-side workstream.
- JIT-LP detection for UniV3 — different code path entirely.
- Sandwich-class strategies — explicit team policy decision required before any code lands.
Acceptance metrics that justify moving to paid feeds
When this issue is merged and we have a week of staging data, the team should review:
- Coverage gap: pending txs we observe vs the public mempool tx-hash dump (mempool-dumpster). Anything <80% means partial-view risk and justifies Fiber / sentries.
- Decode hit rate: <90% across the 7 routers means the decoder needs more shapes before paid feeds add value.
- Arb-candidate rate: pending-arb candidates / minute. <0.1/min means the simulation path is too slow or the detection threshold is wrong; fix before scaling cost.
- First-seen latency vs MEV-Share: median first-seen delta of the Alchemy WS stream relative to MEV-Share. A median above 150ms is the line at which paid feeds pay for themselves.
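The four review thresholds can be written down as a small check so the weekly review is mechanical. This is a sketch only: StagingWeek and review are hypothetical names, the inputs are assumed to be pre-aggregated from a week of staging data, and the thresholds are the ones listed above.

```rust
/// Hypothetical one-week summary of staging metrics.
struct StagingWeek {
    coverage: f64,                   // observed txs / mempool-dumpster txs, 0..1
    decode_hit_rate: f64,            // decoded / seen across the 7 routers, 0..1
    candidates_per_min: f64,         // pending-arb candidates per minute
    median_first_seen_delta_ms: f64, // Alchemy WS vs MEV-Share
}

/// Returns one finding per threshold that was crossed.
fn review(w: &StagingWeek) -> Vec<&'static str> {
    let mut findings = Vec::new();
    if w.coverage < 0.80 {
        findings.push("coverage gap: partial-view risk, justifies Fiber / sentries");
    }
    if w.decode_hit_rate < 0.90 {
        findings.push("decode hit rate: decoder needs more shapes before paid feeds");
    }
    if w.candidates_per_min < 0.1 {
        findings.push("arb-candidate rate: simulation too slow or threshold wrong");
    }
    if w.median_first_seen_delta_ms > 150.0 {
        findings.push("first-seen latency: paid feeds pay for themselves");
    }
    findings
}
```

An empty result means none of the listed thresholds was crossed for that week.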
Related
- Mempool source comparison report (in-flight, separate doc)
- crates/ingestion/src/node_pool.rs: existing multi-node pool we extend
- crates/simulator/src/lib.rs: fork-mode simulator we add the new entry point to
- CLAUDE.md "Hot Path" section — the new pipeline targets the same <15ms end-to-end budget
Epoch
E5 — live MEV detection. Unblocks paid-feed integration, backrun bundle construction, and every downstream live-execution feature.