Skip to content

Conversation

@jochem-brouwer
Copy link
Member

🗒️ Description

Part of #1755.
NOTE: this is built on top of #1734 (but can't figure out how to point this PR to that branch, maybe because it is on a different remote?)

🔗 Related Issues or PRs

#1755

✅ Checklist

TODOs:

  • Figure out how to setup VSCode in execution-specs, typing does not work in VSCode yet (as opposed to execution-spec-tests)

  • Cleanup tests

  • Test these tests (:sweat_smile:) - and verify these match the expected.

  • Tackle problems to run these tests on top of differently sized storages, Add extra storage read & write benchmark cases #1755 (comment)

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:

    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.

  • All: Considered adding an entry to CHANGELOG.md.

  • All: Considered updating the online docs in the ./docs/ directory.

  • All: Set appropriate labels for the changes (only maintainers can apply labels).

  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.

  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry the list.

  • Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Cute Animal Picture

Put a link to a cute animal picture inside the parenthesis-->

CPerezz and others added 2 commits November 3, 2025 15:57
Implements parametrized storage benchmarks to measure gas costs for different
storage state transitions:
- Cold writes (0 -> non-zero): ~20,000 gas per slot
- Warm updates (non-zero -> non-zero): ~2,900 gas per slot
- Storage clearing (non-zero -> 0): ~2,900 gas + refund

The implementation uses a single contract that accepts parameters via calldata,
pre-deployed with 500 filled slots to enable testing all transition types.
This provides accurate gas measurements with minimal loop overhead (~39 gas/slot).

Tests are marked as stateful to avoid gas exhaustion expectations and work
with the execution-specs testing framework.
Copy link
Collaborator

@LouisTsai-Csie LouisTsai-Csie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jochem-brouwer for this PR, please run tox -e static and resolve the linting issue! I only review test_vector_sload.py as the remaining files I already reviewed in CPerezz's PR.

Comment on lines +1 to +28
"""
abstract: Vector storage benchmark with single parametrized contract,
targeting SLOAD.

This parametrized test takes takes these arguments:
- The amount of slots to load
- The slot key incrementer

The final value is used in the test as boolean: if 0 is used,
the key is not incremented, and thus the same key is read each time.

Each test is also tested against these keys in the access list (or not).
This thus marks if the target slots are warm or cold.
"""

import pytest
from execution_testing import (
Account,
Alloc,
Block,
BlockchainTestFiller,
Bytecode,
Fork,
Op,
Storage,
Transaction,
While
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems duplicate as import & doc string below!

Comment on lines +98 to +105
bytecode += Op.ADD(Op.CALLDATALOAD(Op.PUSH0))

bytecode += Op.SWAP1

bytecode += Op.PUSH1(start_marker)

bytecode += Op.JUMPDEST # end_marker
bytecode += Op.STOP
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if i understand here correctly:

Initial stack here: [current_slot, entries_lest]

PUSH0 # [0, current_slot, entries_left]
CALLDATALOAD # [incrementer, current_slot, entries_left]
ADD # [current_slot + incrementer, entries_left]
SWAP1 # [entires_left, current_slot + incrementer]
PUSH1 marker # [marker, entries_left, current_slot+incrementer]
JUMPDEST # 
STOP

When executing to JUMPDEST, should we jump back to the start marker via JUMPI? But here seems missing a DUP somewhere. Or maybe there is missing some parameter: Op.ADD(Op.CALLDATALOAD(Op.PUSH0)) should be Op.ADD(Op.CALLDATALOAD(Op.PUSH0), N)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think the bytecode is missing both correct stack initialization and the JUMP instruction to complete the loop.

# for a way how I believe we can do this (using 7702 accounts with prefilled
# storage and then executing code on there, which we can change because its a 7702 account)
@pytest.mark.valid_from("Prague")
@pytest.mark.stateful # Mark as stateful instead of benchmark
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the tests under stateful folder will automatically inherit this label.


# Create transaction to call the contract
# Use a reasonable gas limit that covers the operation
gas_limit = 21000 + 10000 + (num_slots * 50000)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do not need to specify the gas limit for this transaction? By default it would be transaction gas limit cap or block gas limit in our framework

@LouisTsai-Csie LouisTsai-Csie added A-test-benchmark Area: Tests Benchmarks—Performance measurement (eg. `tests/benchmark/*`, `p/t/s/e/benchmark/*`) C-test Category: test P-high labels Nov 10, 2025
Copy link
Contributor

@spencer-tb spencer-tb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some small comments. Thanks.

sender=sender,
)

blockchain_test(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should use the benchmark_test format now! Please see here for usage and skip validation field.

storage_contract: Account(storage=expected_storage),
}

blockchain_test(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar here for benchmark_test


calldata = incrementer.to_bytes(32, "big") + num_slots.to_bytes(32, "big")

access_lists: List[AccessList] = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a missing import.

# storage and then executing code on there, which we can change because its a 7702 account)
@pytest.mark.valid_from("Prague")
@pytest.mark.stateful # Mark as stateful instead of benchmark
@pytest.mark.parametrize("num_slots", [1, 10, 50, 100, 200])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reduce the parameterization slightly and maybe drop the 10/50?

Comment on lines +98 to +105
bytecode += Op.ADD(Op.CALLDATALOAD(Op.PUSH0))

bytecode += Op.SWAP1

bytecode += Op.PUSH1(start_marker)

bytecode += Op.JUMPDEST # end_marker
bytecode += Op.STOP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think the bytecode is missing both correct stack initialization and the JUMP instruction to complete the loop.

Comment on lines +136 to +141
if storage_keys_set:
key = 0
for i in range(num_slots):
initial_storage[key] = 1
slots.add(key)
key += incrementer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When storage_keys_set=False and warm_slots=True, I think the test should make slots warm (via access list) even though they don't exist in storage, but it doesn't because slots is empty. As a result, I think the access list contains no storage keys, so the slots remain cold instead of being warmed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-test-benchmark Area: Tests Benchmarks—Performance measurement (eg. `tests/benchmark/*`, `p/t/s/e/benchmark/*`) C-test Category: test P-high

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants