[E5] Live mempool tracking — Phase 0/1 testing scaffold (Alchemy WS + MEV-Share SSE + pending-tx decoder + post-state sim) #117

## Context

Aether's current detection pipeline reacts to **confirmed** `Sync` / `Swap` events — i.e. it sees pool moves only after the block lands. To compete in live MEV (backrun-class arbitrage), we need to see pending victim swaps **before** the next block, simulate the post-tx state, and detect arbs against that state.

This issue is the **free-tier testing scaffold** that validates the entire plumbing stack — pending-tx subscription, calldata decoding, post-state simulation, and metrics — **without paying for a private mempool feed yet**. Once the scaffold proves out (we observe pending DEX swaps in real time and can run Bellman-Ford on the simulated post-state), a separate issue will scope the paid-feed integration (Chainbound Fiber / bloXroute / self-hosted Reth sentries — see the standalone deep-dive report for that).

## What this issue ships

A log-only, zero-execution-risk pipeline:

```
Alchemy WS pending stream ──┐
                            ├─► PendingTxDecoder ─► PostStateSim ─► Detector ─► aether_pending_arb_candidates_total
Flashbots MEV-Share SSE ────┘                                                    (count only — no bundle build, no submission)
```

No backrun bundle is constructed. No bundle is submitted. The output of this issue is metrics + structured logs that prove every stage of the pipeline works, so the paid-feed swap is a one-line config change later.

## Scope

### Phase A — Pending-tx ingestion (Rust)

- New `crates/ingestion/src/mempool.rs` with a `PendingTxStream` trait + an `AlchemyPendingStream` impl.
- Subscribe via `eth_subscribe` with `alchemy_pendingTransactions` and `toAddress` filtered to UniV2 Router02, UniV3 SwapRouter, UniV3 SwapRouter02, SushiSwap Router, Curve registry routers, Balancer Vault, 1inch AggregationRouter.
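- `tokio::sync::broadcast` channel `pending_tx_tx` so multiple producers (Alchemy now, the future Fiber stream and sentry later) can fan in and multiple consumers (starting with the decoder) can fan out without re-subscribing. The channel is bounded: a slow consumer gets `RecvError::Lagged` and skips ahead instead of stalling the stream.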
- Reuses the existing multi-node `node_pool.rs` reconnect / health-state-machine logic.
- Per-source dedup keyed on tx hash.
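
A minimal, std-only sketch of the Phase A shape, assuming a synchronous trait for brevity (the real `PendingTxStream` would be async and feed the broadcast channel). The `PendingTx` fields, the `Dedup` type, its eviction policy, and `drain_unique` are hypothetical names chosen for illustration, not existing Aether code.

```rust
use std::collections::{HashSet, VecDeque};

/// 32-byte transaction hash.
pub type TxHash = [u8; 32];

/// Minimal pending transaction as the decoder would receive it (hypothetical shape).
#[derive(Clone, Debug, PartialEq)]
pub struct PendingTx {
    pub hash: TxHash,
    pub to: [u8; 20],
    pub input: Vec<u8>,
}

/// Source-agnostic interface: Alchemy today, a paid feed or sentry later.
pub trait PendingTxStream {
    /// Label used in metrics, e.g. "alchemy".
    fn source(&self) -> &'static str;
    /// Next pending tx, or None when the stream is exhausted or disconnected.
    fn next_tx(&mut self) -> Option<PendingTx>;
}

/// Bounded first-seen filter keyed on tx hash; evicts the oldest hash when full.
pub struct Dedup {
    seen: HashSet<TxHash>,
    order: VecDeque<TxHash>,
    capacity: usize,
}

impl Dedup {
    pub fn new(capacity: usize) -> Self {
        Self { seen: HashSet::new(), order: VecDeque::new(), capacity }
    }

    /// Returns true the first time a hash is observed, false for repeats.
    pub fn first_seen(&mut self, hash: TxHash) -> bool {
        if !self.seen.insert(hash) {
            return false;
        }
        self.order.push_back(hash);
        if self.order.len() > self.capacity {
            if let Some(old) = self.order.pop_front() {
                self.seen.remove(&old);
            }
        }
        true
    }
}

/// Drain a stream, forwarding only transactions whose hash was not seen before.
pub fn drain_unique(stream: &mut dyn PendingTxStream, dedup: &mut Dedup) -> Vec<PendingTx> {
    let mut out = Vec::new();
    while let Some(tx) = stream.next_tx() {
        if dedup.first_seen(tx.hash) {
            out.push(tx);
        }
    }
    out
}
```

Bounding the set keeps memory flat under a sustained pending-tx stream; the trade-off is that a hash evicted from the window would be treated as new if it reappeared.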

### Phase B — Calldata decoder (Rust)

- New `crates/pools/src/router_decoder.rs` with `alloy::sol!` ABIs for the swap functions of the 7 routers above.
- Decode each pending tx → `(pool_address, token_in, token_out, amount_in, deadline)`.
- Drop tx if pool is not in the registry; emit `aether_pending_dex_tx_total{router, pool, decoded}`.
- Emit a `decode_failure` counter so we can see the long tail of unsupported router shapes (1inch v6 multi-step, Balancer batch swaps, etc.).
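
To make the decode step concrete, here is a hand-rolled, std-only sketch for a single selector, UniV2 `swapExactTokensForTokens(uint256,uint256,address[],address,uint256)` (selector `0x38ed1739`). The real decoder is expected to use `alloy::sol!` as stated above; `DecodedSwap` and `decode_swap` are hypothetical names, and resolving `pool_address` from the token pair (a registry lookup) is left out.

```rust
use std::convert::{TryFrom, TryInto};

/// Selector of `swapExactTokensForTokens(uint256,uint256,address[],address,uint256)`.
const SWAP_EXACT_TOKENS_FOR_TOKENS: [u8; 4] = [0x38, 0xed, 0x17, 0x39];

#[derive(Debug, PartialEq)]
pub struct DecodedSwap {
    pub token_in: [u8; 20],
    pub token_out: [u8; 20],
    pub amount_in: u128,
    pub deadline: u64,
}

/// Return the 32-byte ABI word at `index` (0-based), or None if out of bounds.
fn word(data: &[u8], index: usize) -> Option<&[u8]> {
    let start = index.checked_mul(32)?;
    data.get(start..start.checked_add(32)?)
}

/// Interpret a 32-byte word as an unsigned integer, rejecting values wider than 128 bits.
fn as_u128(w: &[u8]) -> Option<u128> {
    if w[..16].iter().any(|b| *b != 0) {
        return None;
    }
    Some(u128::from_be_bytes(w[16..].try_into().ok()?))
}

/// Interpret a 32-byte word as an address (its low 20 bytes).
fn as_address(w: &[u8]) -> Option<[u8; 20]> {
    w[12..].try_into().ok()
}

/// Decode one swap shape; any other selector or malformed input yields None,
/// which the caller would count as a decode failure.
pub fn decode_swap(calldata: &[u8]) -> Option<DecodedSwap> {
    let (selector, args) = (calldata.get(..4)?, calldata.get(4..)?);
    if selector != &SWAP_EXACT_TOKENS_FOR_TOKENS[..] {
        return None;
    }
    // Head layout: amountIn, amountOutMin, offset(path), to, deadline.
    let amount_in = as_u128(word(args, 0)?)?;
    let path_offset = usize::try_from(as_u128(word(args, 2)?)?).ok()?;
    let deadline = u64::try_from(as_u128(word(args, 4)?)?).ok()?;
    // Dynamic array: one length word followed by `len` address words.
    let path = args.get(path_offset..)?;
    let len = usize::try_from(as_u128(word(path, 0)?)?).ok()?;
    if len < 2 {
        return None;
    }
    Some(DecodedSwap {
        token_in: as_address(word(path, 1)?)?,
        token_out: as_address(word(path, len)?)?,
        amount_in,
        deadline,
    })
}
```

Every out-of-bounds read returns `None` rather than panicking, which matters because pending calldata is untrusted input.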

### Phase C — Post-state simulation (Rust)

- Extend `crates/simulator` with `simulate_pending_then_detect`:
  1. Fork latest block via existing revm `CacheDB` + `EthersDB` path.
  2. Apply the pending tx in the forked EVM.
  3. Read post-state pool reserves for every affected pool.
  4. Run Bellman-Ford on the post-state subgraph (reuse the existing `BellmanFord` from `crates/detector`).
- Emit `aether_pending_arb_candidates_total{router, profit_bucket}` and structured log `MEMPOOL ARB CANDIDATE` with `arb_id, victim_tx_hash, hops, gross_profit_wei, sim_us`.
- **Strictly log-only**. Do NOT publish to the gRPC arb stream — the Go executor must remain unaware in this phase to avoid accidental live submission.
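
The four steps above can be illustrated without revm for the simplest case. This sketch swaps the forked-EVM execution (steps 1 to 3) for the closed-form Uniswap V2 constant-product formula with a 0.3% fee, then runs a Bellman-Ford negative-cycle check on `-ln(rate)` edge weights. It is an assumption-laden stand-in, not the planned `simulate_pending_then_detect`: `Pool`, `edges`, and `has_negative_cycle` are hypothetical names, tokens are plain indices, and rates are marginal (infinitesimal trade size).

```rust
/// Constant-product pool; tokens are indices into a hypothetical token table.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Pool {
    pub token0: usize,
    pub token1: usize,
    pub reserve0: u128,
    pub reserve1: u128,
}

impl Pool {
    /// Post-swap reserves after selling `amount_in` of token0 (V2 formula, 0.3% fee).
    /// Returns None on arithmetic overflow or an empty pool.
    pub fn after_swap0_for_1(&self, amount_in: u128) -> Option<Pool> {
        let in_with_fee = amount_in.checked_mul(997)?;
        let numerator = in_with_fee.checked_mul(self.reserve1)?;
        let denominator = self.reserve0.checked_mul(1000)?.checked_add(in_with_fee)?;
        let amount_out = numerator.checked_div(denominator)?;
        Some(Pool {
            reserve0: self.reserve0.checked_add(amount_in)?,
            reserve1: self.reserve1.checked_sub(amount_out)?,
            ..*self
        })
    }
}

/// Two directed edges per pool, weighted by -ln(marginal rate after fee).
pub fn edges(pools: &[Pool]) -> Vec<(usize, usize, f64)> {
    let mut out = Vec::new();
    for p in pools {
        if p.reserve0 == 0 || p.reserve1 == 0 {
            continue; // no defined price for an empty side
        }
        let (r0, r1) = (p.reserve0 as f64, p.reserve1 as f64);
        out.push((p.token0, p.token1, -(0.997 * r1 / r0).ln()));
        out.push((p.token1, p.token0, -(0.997 * r0 / r1).ln()));
    }
    out
}

/// Bellman-Ford with all distances starting at 0 (implicit virtual source).
/// A relaxation that still succeeds in round `n` implies a negative cycle,
/// i.e. a token loop whose product of rates exceeds 1.
pub fn has_negative_cycle(n: usize, edges: &[(usize, usize, f64)]) -> bool {
    let mut dist = vec![0.0_f64; n];
    for round in 0..n {
        let mut changed = false;
        for &(u, v, w) in edges {
            if dist[u] + w < dist[v] - 1e-12 {
                dist[v] = dist[u] + w;
                changed = true;
            }
        }
        if !changed {
            return false;
        }
        if round == n - 1 {
            return true;
        }
    }
    false
}
```

With two identical pools there is no cycle before the victim swap, because each hop loses the fee. After a large swap skews one pool, the loop through the skewed pool and back through the untouched one has a rate product above 1, and that is the candidate this phase would log.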

### Phase D — MEV-Share SSE consumer (Go)

- New `cmd/monitor/mev_share.go` consuming `https://mev-share.flashbots.net` SSE.
- Decode hints (`tx_hash`, `function_selector`, optional `calldata`, optional `logs`).
- Emit `aether_mev_share_hints_total{has_calldata, has_logs}`.
- Cross-check: when a hint and an Alchemy pending tx carry the same `tx_hash`, record the first-seen latency delta in `aether_mempool_first_seen_delta_ms{source}` so we have data on which signal arrives first, per source.
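
The cross-check logic is small enough to sketch. The version below is written in Rust with std only, although the monitor itself is Go; the bookkeeping would carry over directly. `Source`, `FirstSeen`, and `observe` are hypothetical names, timestamps are passed in as milliseconds so the logic stays deterministic, and TTL eviction of unmatched hashes (needed in any long-running process) is omitted.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Source {
    Alchemy,
    MevShare,
}

/// Remembers which source reported each tx hash first, and when.
pub struct FirstSeen {
    seen: HashMap<[u8; 32], (Source, u64)>,
}

impl FirstSeen {
    pub fn new() -> Self {
        Self { seen: HashMap::new() }
    }

    /// Record an observation at `now_ms`.
    /// Returns Some((first_source, delta_ms)) when the *other* source had
    /// already reported this hash; the entry is then dropped.
    pub fn observe(&mut self, hash: [u8; 32], source: Source, now_ms: u64) -> Option<(Source, u64)> {
        let prior = self.seen.get(&hash).copied();
        match prior {
            None => {
                self.seen.insert(hash, (source, now_ms));
                None
            }
            // Same source again: a duplicate, not a cross-source match.
            Some((first, _)) if first == source => None,
            Some((first, t0)) => {
                self.seen.remove(&hash);
                Some((first, now_ms.saturating_sub(t0)))
            }
        }
    }
}
```

The returned pair maps onto the metric: the first element would become the `source` label (whichever feed won) and the second the observed value in milliseconds.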

### Phase E — Metrics + Grafana panel

- New panel on the existing observability dashboard (or a sub-dashboard `Mempool — testing`) showing:
  - `rate(aether_pending_dex_tx_total[1m])` per router
  - `rate(aether_pending_arb_candidates_total[1m])` per profit bucket
  - `rate(aether_mev_share_hints_total[1m])`
  - First-seen latency histogram (Alchemy vs MEV-Share)
  - Decoder failure rate

## Acceptance Criteria

- [ ] `aether-rust` subscribes to Alchemy `alchemy_pendingTransactions` WS at startup when `MEMPOOL_TRACKING=1` is set; binary boots identically when unset (zero behaviour change for current users)
- [ ] At least 95% of pending txs against the 7 supported routers decode cleanly (verified over a 1-hour staging run)
- [ ] `simulate_pending_then_detect` produces a non-zero `aether_pending_arb_candidates_total` over a 1-hour run (proves the post-state path works)
- [ ] `cmd/monitor` emits MEV-Share hint counts and first-seen latency deltas vs Alchemy
- [ ] Grafana panel renders all five charts with live data
- [ ] No bundle is constructed or submitted by the Go executor in any code path activated by this issue
- [ ] `cargo test --workspace --release`, `cargo clippy --workspace --all-targets -- -D warnings`, `go test ./... -race -count=1` all clean
- [ ] Public-facing docs / README updated with the `MEMPOOL_TRACKING=1` env-var contract
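
One way to pin down the `MEMPOOL_TRACKING=1` contract from the first criterion is a pure parsing function plus a thin environment wrapper. This is a sketch under one assumption the issue does not settle: only the exact value `1` enables tracking, so `true`, `0`, and the empty string all leave the current behaviour untouched. Both function names are hypothetical.

```rust
use std::env;

/// Pure parsing step: only the exact value "1" enables mempool tracking.
/// (Assumption: other truthy spellings such as "true" are NOT accepted.)
pub fn parse_mempool_tracking(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Reads MEMPOOL_TRACKING from the process environment.
/// Unset or non-UTF-8 values are treated as disabled.
pub fn mempool_tracking_enabled() -> bool {
    parse_mempool_tracking(env::var("MEMPOOL_TRACKING").ok().as_deref())
}
```

Keeping the parse step separate from the environment read makes the "boots identically when unset" criterion testable without mutating process-wide state.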

## Out of scope (separate follow-up issues)

- **Paid mempool feeds** — Chainbound Fiber gRPC, bloXroute Mempool Tx, Merkle searcher API. Each gets its own issue scoped against the `PendingTxStream` trait introduced in Phase A.
- **Self-hosted Reth sentry mesh** at NY5 / AMS / SGP. Infra workstream, separate budget approval.
- **Backrun bundle construction** — `[victim_raw_tx, arb_tx, tip_tx]` with `revertingTxHashes`, plus signing-flow updates in `cmd/executor/bundle.go`. Tracked separately to keep this scaffold log-only.
- **MEV Blocker / CoW DAO searcher bid integration** — orderflow auction surface, separate Go-side workstream.
- **JIT-LP detection** for UniV3 — different code path entirely.
- **Sandwich-class strategies** — explicit team policy decision required before any code lands.

## Acceptance metrics that justify moving to paid feeds

When this issue is merged and we have a week of staging data, the team should review:

1. **Coverage gap**: pending txs we observe vs the public mempool tx-hash dump (mempool-dumpster). Anything <80% means partial-view risk and justifies Fiber / sentries.
2. **Decode hit rate**: <90% across the 7 routers means the decoder needs more shapes before paid feeds add value.
3. **Arb-candidate rate**: pending-arb candidates / minute. <0.1/min means the simulation path is too slow or the detection threshold is wrong; fix before scaling cost.
4. **First-seen latency vs MEV-Share**: median first-seen delta between the Alchemy WS stream and the MEV-Share hint for the same tx hash. A median above 150 ms is the line at which paid feeds pay for themselves.
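
The four thresholds above can be written down as a single check over one week of staging data. The sketch below encodes the stated cut-offs only (80% coverage, 90% decode hit rate, 0.1 candidates/min, 150 ms median delta); the struct, field names, and `Finding` variants are hypothetical, and how each input is measured is outside its scope.

```rust
/// One week of staging observations (hypothetical field names; ratios are 0.0..=1.0).
pub struct StagingWeek {
    pub coverage: f64,
    pub decode_hit_rate: f64,
    pub candidates_per_min: f64,
    pub median_first_seen_delta_ms: f64,
}

#[derive(Debug, PartialEq)]
pub enum Finding {
    /// Coverage below 80%: partial-view risk, argues for Fiber or sentries.
    PartialView,
    /// Decode hit rate below 90%: decoder needs more shapes first.
    DecoderNeedsShapes,
    /// Fewer than 0.1 candidates/min: sim too slow or threshold wrong.
    SimOrThresholdSuspect,
    /// Median delta above 150 ms: paid feeds pay for themselves.
    PaidFeedPaysOff,
}

/// Apply the four review thresholds in the order they are listed.
pub fn review(w: &StagingWeek) -> Vec<Finding> {
    let mut out = Vec::new();
    if w.coverage < 0.80 {
        out.push(Finding::PartialView);
    }
    if w.decode_hit_rate < 0.90 {
        out.push(Finding::DecoderNeedsShapes);
    }
    if w.candidates_per_min < 0.1 {
        out.push(Finding::SimOrThresholdSuspect);
    }
    if w.median_first_seen_delta_ms > 150.0 {
        out.push(Finding::PaidFeedPaysOff);
    }
    out
}
```

Note that the findings pull in different directions: `PartialView` and `PaidFeedPaysOff` argue for paying, while `DecoderNeedsShapes` and `SimOrThresholdSuspect` say to fix the scaffold first.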

## Related

- Mempool source comparison report (in-flight, separate doc)
- `crates/ingestion/src/node_pool.rs` — existing multi-node pool we extend
- `crates/simulator/src/lib.rs` — fork-mode simulator we add the new entry point to
- CLAUDE.md "Hot Path" section — the new pipeline targets the same <15ms end-to-end budget

## Epoch

E5 — live MEV detection. Unblocks paid-feed integration, backrun bundle construction, and every downstream live-execution feature.
