feat(bench): add attack_replay benchmark#299

Open
vincent-k2026 wants to merge 1 commit into main from krabat/bench/attack-replay

Summary

Adds attack_replay, a hermetic regression benchmark that replays a real MegaETH mainnet attack contract deployment through MegaEvm.

The fixture is a self-contained ~64 KB JSON snapshot captured via debug_traceCall + prestateTracer (diffMode=false) at a fixed block:

  • tx: caller / nonce / gas / value / 17 KB initcode / chain_id
  • prestate (11 accounts): 4 proxy contracts at 0x4200..., 1 ERC-20 with code + 10 storage slots, the caller, and 5 supporting storage contracts
  • block env: number / timestamp / basefee / gas_limit / beneficiary / mix_hash
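
As a rough illustration of the snapshot layout listed above, the fixture could be modelled with plain Rust types along these lines. This is only a sketch: the struct names, field types, and the `sample_fixture` helper are hypothetical and not taken from the PR, and the real bench presumably deserializes the JSON with proper address and 256-bit integer types.

```rust
use std::collections::BTreeMap;

/// Transaction fields named in the PR description. Types are assumptions.
#[allow(dead_code)]
struct TxFixture {
    caller: String,    // hex address
    nonce: u64,
    gas: u64,
    value: u128,       // the real value is a 256-bit integer
    initcode: Vec<u8>, // about 17 KB in the real fixture
    chain_id: u64,
}

/// One prestate account: code plus its storage slots.
#[allow(dead_code)]
struct AccountFixture {
    code: Vec<u8>,                     // empty for accounts without code
    storage: BTreeMap<String, String>, // slot -> value, both hex strings
}

/// Block environment fields named in the PR description.
#[allow(dead_code)]
struct BlockEnvFixture {
    number: u64,
    timestamp: u64,
    basefee: u64,
    gas_limit: u64,
    beneficiary: String,
    mix_hash: String,
}

/// Hypothetical top-level fixture: tx + prestate + block env.
#[allow(dead_code)]
struct Fixture {
    tx: TxFixture,
    prestate: BTreeMap<String, AccountFixture>, // address -> account
    block: BlockEnvFixture,
}

impl Fixture {
    /// Number of accounts in the prestate (11 in the real fixture).
    fn account_count(&self) -> usize {
        self.prestate.len()
    }

    /// Total storage slots across all prestate accounts.
    fn total_storage_slots(&self) -> usize {
        self.prestate.values().map(|a| a.storage.len()).sum()
    }
}

/// Tiny made-up fixture, only for demonstrating the shape.
fn sample_fixture() -> Fixture {
    let mut token_storage = BTreeMap::new();
    token_storage.insert("0x00".to_string(), "0x01".to_string());

    let mut prestate = BTreeMap::new();
    prestate.insert(
        "0xtoken".to_string(),
        AccountFixture { code: vec![0x60, 0x00], storage: token_storage },
    );
    prestate.insert(
        "0xcaller".to_string(),
        AccountFixture { code: Vec::new(), storage: BTreeMap::new() },
    );

    Fixture {
        tx: TxFixture {
            caller: "0xcaller".to_string(),
            nonce: 0,
            gas: 1_000_000,
            value: 0,
            initcode: vec![0x00],
            chain_id: 1,
        },
        prestate,
        block: BlockEnvFixture {
            number: 1,
            timestamp: 0,
            basefee: 0,
            gas_limit: 30_000_000,
            beneficiary: "0xbeneficiary".to_string(),
            mix_hash: "0x00".to_string(),
        },
    }
}

fn main() {
    let f = sample_fixture();
    println!(
        "accounts: {}, storage slots: {}",
        f.account_count(),
        f.total_storage_slots()
    );
}
```

Counting accounts and slots this way mirrors the "accounts/slots touched" figures the bench reports in its sanity checks.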

Bench arms

Three arms on the same in-memory state:

| Arm | Typical wall time | Engine |
|---|---|---|
| attack_replay/equivalence | ~1.15 ms | MegaSpecId::EQUIVALENCE |
| attack_replay/mini_rex | ~35.9 ms | MegaSpecId::MINI_REX |
| attack_replay/pure_revm | ~1.10 ms | vanilla revm (Context::mainnet) baseline |

Both mega-evm specs execute the exact same 205,951 opcodes and deploy the same 582-byte runtime. The ~30x gap between EQUIVALENCE and MINI_REX isolates the cost of the multi-dimensional AdditionalLimit accounting (quadratic LOG / compute / storage / data / KV buckets) that MINI_REX enables.
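
To put the gap in per-opcode terms, the arithmetic can be sketched as below, using only the figures quoted above (approximate wall times and the 205,951-opcode count). The function names are illustrative. The result is a mean over all opcodes and assumes nothing about how the cost is distributed; in practice it is likely concentrated in the opcodes that touch the limit trackers.

```rust
/// Ratio of the slow arm's wall time to the fast arm's.
fn slowdown(slow_ms: f64, fast_ms: f64) -> f64 {
    slow_ms / fast_ms
}

/// Mean extra wall time per opcode, in nanoseconds, between two arms
/// that execute the same number of opcodes.
fn per_opcode_overhead_ns(slow_ms: f64, fast_ms: f64, opcodes: u64) -> f64 {
    (slow_ms - fast_ms) * 1_000_000.0 / opcodes as f64
}

fn main() {
    // Approximate figures from the PR description.
    let equivalence_ms = 1.15;
    let mini_rex_ms = 35.9;
    let opcodes: u64 = 205_951;

    // Roughly 31x slower overall.
    println!("slowdown: {:.1}x", slowdown(mini_rex_ms, equivalence_ms));
    // Roughly 170 ns of extra accounting per opcode on average.
    println!(
        "overhead: {:.0} ns/opcode",
        per_opcode_overhead_ns(mini_rex_ms, equivalence_ms, opcodes)
    );
}
```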

The pure_revm arm is a self-check: it should land near the EQUIVALENCE arm, confirming the bench is honest and not short-circuiting.

Why this bench

The numbers track production: the mini_rex arm (~35.9 ms) is close to the ~33 ms that the sequencer-monitor observed inside api.inspect(...) for this exact transaction. That makes the bench a stable, reproducible target for limit-tracker and hot-path optimizations (e.g. caching net_usage in FrameLimitTracker).

Sanity checks

Run before criterion warm-up:

  • Asserts the ExecutionResult::Success variant and reports the deployed code length, deployed address, and accounts/slots touched.
  • Counts opcode steps via a minimal OpcodeCounter inspector and asserts steps >= MIN_EXPECTED_OPCODE_STEPS (100,000). Any future setup mistake that silently short-circuits tx validation will fail the bench loudly instead of producing artificially fast numbers.
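
The step-count guard can be sketched as follows. This is a minimal stand-alone illustration, not the bench's code: in the real bench the counter would implement revm's `Inspector` trait, whose exact signature depends on the revm version, so `on_step` and `check_steps` here are hypothetical stand-ins. Only the `OpcodeCounter` name and the `MIN_EXPECTED_OPCODE_STEPS` value come from the description above.

```rust
/// Guard value from the PR description.
const MIN_EXPECTED_OPCODE_STEPS: u64 = 100_000;

/// Minimal step counter. `on_step` is a hypothetical stand-in for the
/// per-opcode hook an inspector would receive from the interpreter.
#[derive(Default)]
struct OpcodeCounter {
    steps: u64,
}

impl OpcodeCounter {
    fn on_step(&mut self) {
        self.steps += 1;
    }
}

/// Returns the step count if it clears the guard, otherwise an error
/// explaining that execution was probably short-circuited.
fn check_steps(steps: u64) -> Result<u64, String> {
    if steps >= MIN_EXPECTED_OPCODE_STEPS {
        Ok(steps)
    } else {
        Err(format!(
            "only {steps} opcode steps executed, expected at least {MIN_EXPECTED_OPCODE_STEPS}"
        ))
    }
}

fn main() {
    // Simulate a full run with the opcode count quoted in the PR.
    let mut counter = OpcodeCounter::default();
    for _ in 0..205_951u64 {
        counter.on_step();
    }
    check_steps(counter.steps).expect("sanity check failed before benchmarking");

    // A run that bails out during validation executes no opcodes and is rejected.
    assert!(check_steps(0).is_err());
}
```

Running the check once before criterion warm-up keeps the assertion out of the timed loop, so the guard costs nothing in the measured numbers.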

Test plan

  • cargo bench --bench attack_replay -p mega-evm --no-run: builds clean
  • cargo +nightly fmt --check -p mega-evm: clean
  • cargo clippy --bench attack_replay -p mega-evm: 0 warnings
  • cargo bench --bench attack_replay -p mega-evm -- --quick: runs, all sanity assertions pass, numbers as expected

Adds `attack_replay`, a hermetic regression bench that replays a real
MegaETH mainnet attack contract deployment through `MegaEvm`.

The fixture is a self-contained ~64 KB JSON snapshot captured via
`debug_traceCall` + `prestateTracer` (diffMode=false) at a fixed block:
  - tx (caller / nonce / gas / value / 17 KB initcode / chain_id)
  - prestate (11 accounts: 4 proxy contracts at 0x4200..., 1 ERC-20 with
    code + 10 storage slots, caller, and 5 supporting storage contracts)
  - block env (number / timestamp / basefee / gas_limit / beneficiary /
    mix_hash)

The bench produces three arms on the same in-memory state:

  attack_replay/equivalence   ~1.15 ms   (MegaSpecId::EQUIVALENCE)
  attack_replay/mini_rex      ~35.9 ms   (MegaSpecId::MINI_REX)
  attack_replay/pure_revm     ~1.10 ms   (vanilla revm baseline)

Both mega-evm specs execute the exact same 205,951 opcodes and deploy
the same 582-byte runtime. The ~30x gap between EQUIVALENCE and
MINI_REX isolates the cost of the multi-dimensional AdditionalLimit
accounting (quadratic LOG / compute / storage / data / KV buckets).
The pure_revm arm is a self-check: it should land near EQUIVALENCE,
confirming the bench is honest and not short-circuiting.

Numbers correlate directly with production: the MINI_REX arm matches
the sequencer-monitor's observed ~33 ms inside `api.inspect(...)` for
this tx, making the bench a stable target for any limit-tracker /
hot-path optimization.

Sanity checks run before criterion warm-up:
  - ExecutionResult variant + deployed code + accounts/slots touched
  - opcode step count via a minimal OpcodeCounter inspector, with a
    MIN_EXPECTED_OPCODE_STEPS guard so any future setup mistake that
    silently short-circuits validation fails the bench loudly instead
    of producing artificially fast numbers.

Run:

    cargo bench --bench attack_replay