
Fuzz: tokenizer, opcodes and engine#43

Open
chris-ricketts wants to merge 3 commits into master from chris/fuzz-everything


@chris-ricketts
Contributor

@chris-ricketts chris-ricketts commented Mar 31, 2026

Works towards #39

Fuzz Targets:

Future refactoring work:

  • Tweak world builders to increase coverage. Current coverage of the fuzz tests with the local cached corpus is 81% of pkg/arkade.
  • Refactor fuzz helpers. A lot of this is AI-generated and could be condensed while maintaining coverage.

@chris-ricketts chris-ricketts marked this pull request as draft March 31, 2026 13:29
@chris-ricketts chris-ricketts force-pushed the chris/fuzz-everything branch from 099bc89 to 5dc6b69 Compare March 31, 2026 15:29
@arkanaai

arkanaai Bot commented Mar 31, 2026

🔍 Arkana Review — introspector#43

Fuzz all the things — ArkadeScript tokenizer + opcode fuzzing

Adds 2 fuzz test files (210 lines) targeting critical ArkadeScript execution paths.

tokenizer_fuzz_test.go

  • Seeds from existing valid/invalid fixtures + handcrafted edge cases for all PUSHDATA variants (1-75 byte pushes, PUSHDATA1/2/4 with truncated/oversized payloads).
  • Fuzz body exercises Next(), Opcode(), Data(), ByteIndex(), OpcodePosition(), Script(), Err() — full tokenizer surface.
  • Key value: Catches panics from malformed scripts that could crash nodes during script parsing. This is a network-facing attack surface.
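The "errors, never panics" property on truncated or oversized pushes can be illustrated with a small stand-alone walker. This is a minimal sketch using the standard Bitcoin push encoding (0x01-0x4b direct pushes, then PUSHDATA1/2/4 with little-endian length prefixes); `walkScript` and `errTruncated` are hypothetical names and are not the tokenizer from this PR.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// errTruncated is a hypothetical error for a push that runs past the script end.
var errTruncated = errors.New("push exceeds script length")

// walkScript steps through every opcode and returns how many tokens were
// decoded. On malformed input it must return an error instead of panicking.
func walkScript(script []byte) (int, error) {
	tokens := 0
	for i := 0; i < len(script); {
		op := script[i]
		i++
		var n uint64 // payload length announced by the opcode
		switch {
		case op >= 0x01 && op <= 0x4b: // direct push of 1-75 bytes
			n = uint64(op)
		case op == 0x4c: // OP_PUSHDATA1: 1-byte length prefix
			if len(script)-i < 1 {
				return tokens, errTruncated
			}
			n = uint64(script[i])
			i++
		case op == 0x4d: // OP_PUSHDATA2: 2-byte little-endian length prefix
			if len(script)-i < 2 {
				return tokens, errTruncated
			}
			n = uint64(binary.LittleEndian.Uint16(script[i:]))
			i += 2
		case op == 0x4e: // OP_PUSHDATA4: 4-byte little-endian length prefix
			if len(script)-i < 4 {
				return tokens, errTruncated
			}
			n = uint64(binary.LittleEndian.Uint32(script[i:]))
			i += 4
		}
		// Compare as uint64 so an oversized length cannot overflow int.
		if n > uint64(len(script)-i) {
			return tokens, errTruncated
		}
		i += int(n)
		tokens++
	}
	return tokens, nil
}

func main() {
	seeds := [][]byte{
		{},                             // empty script
		{0x02, 0xaa, 0xbb},             // valid 2-byte push
		{0x4c, 0x05, 0x01},             // PUSHDATA1 announcing 5 bytes, 1 present
		{0x4e, 0xff, 0xff, 0xff, 0xff}, // PUSHDATA4 with an oversized length
	}
	for _, s := range seeds {
		n, err := walkScript(s)
		fmt.Printf("%x: tokens=%d err=%v\n", s, n, err)
	}
}
```

In a `_test.go` file the same byte slices would typically be registered with `f.Add` and the walker called from the `f.Fuzz` body.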

opcodes_fuzz_test.go

  • FuzzInspectInputOutpoint: Fuzzes index validation, outpoint hash/vout extraction. Seeds cover edge cases (negative indices, boundary values, 0/1/16-input txs).
  • FuzzInspectInputValue: Fuzzes value extraction from prevouts via MultiPrevOutFetcher. Verifies 8-byte LE encoding of output values.
  • Both correctly validate error-vs-success based on index bounds, then assert stack depth and values.
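The error-vs-success split on index bounds and the 8-byte little-endian encoding can be sketched in isolation. The function below is a hypothetical stand-in that works on a plain slice of values; the real opcode reads prevouts through a MultiPrevOutFetcher and pushes onto the VM stack.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// errIndexOutOfRange is a hypothetical error for an index outside [0, len).
var errIndexOutOfRange = errors.New("input index out of range")

// inspectInputValue returns the value of the prevout at index, encoded as
// 8 little-endian bytes, or an error if the index is out of bounds.
func inspectInputValue(prevoutValues []int64, index int64) ([]byte, error) {
	if index < 0 || index >= int64(len(prevoutValues)) {
		return nil, errIndexOutOfRange
	}
	out := make([]byte, 8)
	binary.LittleEndian.PutUint64(out, uint64(prevoutValues[index]))
	return out, nil
}

func main() {
	values := []int64{5000, 21}
	// Negative, valid, last valid, and first out-of-range index.
	for _, idx := range []int64{-1, 0, 1, 2} {
		enc, err := inspectInputValue(values, idx)
		fmt.Printf("index=%d value=%x err=%v\n", idx, enc, err)
	}
}
```

A fuzz body built on this shape decides the expected outcome from the index alone, then asserts either the error or the exact encoded bytes.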

Security observations:

  • These are exactly the right targets for fuzzing — introspection opcodes operate on untrusted transaction data and any panic/crash is a potential DoS vector for ASPs running ArkadeScript.
  • hashWithSalt helper deterministically generates unique hashes per input — good for reproducibility.
  • inputCount % 200 caps tx size to prevent OOM during fuzzing — sensible.
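A rough illustration of the two helpers mentioned above, under stated assumptions: the body of `hashWithSalt` is not shown in this conversation, so hashing the seed followed by one salt byte with SHA-256 is an assumed construction, and `cappedInputCount` is a hypothetical name for the `% 200` cap.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// maxInputs mirrors the "% 200" cap described in the review.
const maxInputs = 200

// hashWithSalt derives a deterministic 32-byte hash from a seed and a salt.
// Assumed construction: SHA-256 over seed || salt.
func hashWithSalt(seed []byte, salt byte) [32]byte {
	buf := make([]byte, 0, len(seed)+1)
	buf = append(buf, seed...)
	buf = append(buf, salt)
	return sha256.Sum256(buf)
}

// cappedInputCount maps an arbitrary fuzz-provided count into [0, maxInputs).
func cappedInputCount(raw uint16) int {
	return int(raw) % maxInputs
}

func main() {
	seed := []byte("fuzz-input")
	a := hashWithSalt(seed, 0)
	b := hashWithSalt(seed, 1)
	fmt.Printf("salt 0: %x\nsalt 1: %x\n", a[:4], b[:4])
	fmt.Println("capped count:", cappedInputCount(65535))
}
```

Using the input position as the salt gives each input a distinct outpoint hash while keeping every run of the same fuzz input reproducible.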

Suggestions for future iterations (non-blocking):

  • The PR body shows many opcodes still unchecked (SHA256*, arithmetic, CHECKSIG*, MERKLEBRANCH*, asset group ops). Consider prioritizing CHECKSIGFROMSTACK and the SHA256 family next — those have the most complex stack manipulation.
  • Whole-script execution fuzzing (random valid PSBT + witness) would be the highest-value target once individual opcodes are covered.

Strong foundation for the fuzzing infrastructure. LGTM 👍

@chris-ricketts chris-ricketts force-pushed the chris/fuzz-everything branch 2 times, most recently from a39ea74 to 94c468c Compare April 7, 2026 17:01
@arkanaai

arkanaai Bot commented Apr 8, 2026

Code Review — "Fuzz all the things" (PR #43)

Significant expansion since the last review. This is now a comprehensive fuzzing and property-based testing framework for the entire ArkadeScript opcode set. 3489 lines added across two fuzz targets, a complete rewrite of opcode_test.go, and 4 seed corpus entries from prior fuzzing runs.

Fuzzing Coverage

Tokenizer fuzzer (tokenizer_fuzz_test.go): Well-seeded from existing valid/invalid fixtures plus handcrafted edge cases for every PUSHDATA variant (truncated payloads, boundary sizes, oversized inputs). Exercises the full tokenizer API surface (Next, Opcode, Data, ByteIndex, OpcodePosition, Script, Err). This is the right target -- malformed scripts hitting the tokenizer are the first line of defense against DoS on ASP nodes.

Opcode fuzzer (opcode_fuzz_test.go): Unified FuzzOpcodes target that derives a consistent transaction world from a single fuzz input, then exercises every opcode three ways:

  1. Fresh VM -- isolates failures to individual opcodes
  2. Chained VM -- reuses state across opcodes to catch interaction bugs
  3. Serialized script -- packs opcode snippets into real scripts and runs them through Step(), catching tokenizer/dispatch interactions

The fuzzCaseBuilders dispatch table (indexCaseBuilder for inspect opcodes, pushDataCaseBuilder for data push opcodes, defaultCaseBuilder for everything else) is a clean pattern that ensures each opcode gets structurally appropriate fuzz input rather than pure random noise.
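The dispatch-table pattern can be sketched as a map from opcode to builder with a fallback. The builder names follow the review, but the signatures, the `fuzzCase` type, and the opcode value used for the inspect entry are assumptions made for illustration.

```go
package main

import "fmt"

const (
	opPushData1      byte = 0x4c // standard Bitcoin OP_PUSHDATA1
	opInspectExample byte = 0xf0 // hypothetical stand-in for an inspect opcode
)

// fuzzCase is a hypothetical container for the stack an opcode starts with.
type fuzzCase struct {
	kind  string
	stack [][]byte
}

type caseBuilder func(input []byte) fuzzCase

// indexCaseBuilder puts a single index byte on the stack.
func indexCaseBuilder(input []byte) fuzzCase {
	idx := byte(0)
	if len(input) > 0 {
		idx = input[0]
	}
	return fuzzCase{kind: "index", stack: [][]byte{{idx}}}
}

// pushDataCaseBuilder keeps a short payload (4 bytes here, arbitrarily).
func pushDataCaseBuilder(input []byte) fuzzCase {
	n := len(input)
	if n > 4 {
		n = 4
	}
	return fuzzCase{kind: "pushdata", stack: [][]byte{input[:n]}}
}

// defaultCaseBuilder passes the raw input through as one stack item.
func defaultCaseBuilder(input []byte) fuzzCase {
	return fuzzCase{kind: "default", stack: [][]byte{input}}
}

var fuzzCaseBuilders = map[byte]caseBuilder{
	opInspectExample: indexCaseBuilder,
	opPushData1:      pushDataCaseBuilder,
}

// builderFor returns the specialised builder, or the default one.
func builderFor(op byte) caseBuilder {
	if b, ok := fuzzCaseBuilders[op]; ok {
		return b
	}
	return defaultCaseBuilder
}

func main() {
	input := []byte{7, 8, 9, 10, 11}
	for _, op := range []byte{opInspectExample, opPushData1, 0x00} {
		c := builderFor(op)(input)
		fmt.Printf("op=0x%02x kind=%s stack=%x\n", op, c.kind, c.stack)
	}
}
```

The fallback keeps newly added opcodes fuzzable before anyone writes a dedicated builder for them.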

The seed corpus of 4 entries is small but these are likely regression cases found during development. The in-code seeds (f.Add([]byte{}), f.Add(make([]byte, 32))) provide minimal bootstrapping -- the fuzzer will expand from there.

Coverage gaps noted in PR description: Whole-script execution fuzzing (random valid PSBT + valid script + witness) is still TODO. This is the highest-value remaining target for catching end-to-end issues.

Property-Based Test Framework (opcode_test.go)

The rewrite from a simple disasm test to a full property-based spec framework is the most impactful part of this PR. Every opcode now has:

  • checkProperties -- an invariant checker used by both deterministic vectors and the fuzzer
  • validVectors / invalidVectors -- concrete test cases with expected outputs
  • disasm -- replaces the old monolithic disasm test

This is excellent test architecture. The property checkers serve as oracles for the fuzzer, meaning every fuzz-found input gets validated against the same invariants as the hand-written tests. Key observations:

  • Property checkers consistently verify alt-stack and condStack preservation, which is the right invariant for most opcodes.
  • Error paths check for specific txscript.ErrorCode values via requireScriptErrorCode/requireScriptErrorCodeIn, not just "any error". This catches error type regressions.
  • The skipped-branch fast path in executeOpcodeCase correctly handles conditional branch behavior -- verifying that non-executing branches are true no-ops.

Code Correctness

buildFuzzWorld: Constructs internally consistent tx/prevout/packet/PSBT state. The bias toward "valid enough to execute" is the right trade-off -- fully random inputs would spend most fuzzing time failing at setup rather than exercising opcode logic. fuzzMaxInOutCount = 200 prevents OOM while still allowing non-trivial transaction sizes.

buildFuzzAssetPacket: Gracefully falls back to a minimal valid asset group if packet construction fails, keeping asset opcodes exercisable. The deduplication via seenAssetIDs prevents duplicate asset ID panics.
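The deduplicate-then-fall-back behaviour described here can be sketched with a set keyed by asset ID. The `assetID` type, the function names, and the shape of the fallback are all hypothetical; the actual packet construction in the PR is not reproduced.

```go
package main

import "fmt"

// assetID is a hypothetical fixed-size identifier; arrays are valid map keys.
type assetID [32]byte

// dedupeAssetIDs keeps the first occurrence of each ID, preserving order.
func dedupeAssetIDs(candidates []assetID) []assetID {
	seenAssetIDs := make(map[assetID]struct{}, len(candidates))
	out := make([]assetID, 0, len(candidates))
	for _, id := range candidates {
		if _, dup := seenAssetIDs[id]; dup {
			continue
		}
		seenAssetIDs[id] = struct{}{}
		out = append(out, id)
	}
	return out
}

// assetIDsOrFallback returns the deduplicated IDs, or one fixed ID when
// nothing usable was derived, so asset opcodes always have a group to read.
func assetIDsOrFallback(candidates []assetID) []assetID {
	ids := dedupeAssetIDs(candidates)
	if len(ids) == 0 {
		return []assetID{{0x01}}
	}
	return ids
}

func main() {
	a, b := assetID{0xaa}, assetID{0xbb}
	fmt.Println(len(assetIDsOrFallback([]assetID{a, a, b}))) // duplicates dropped
	fmt.Println(len(assetIDsOrFallback(nil)))                // fallback used
}
```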

cloneEngineForExpectedResult: Deep-copies the VM state before opcode execution for before/after comparison. Copies tx, scripts, stacks, condStack, introspectorPacket, taprootCtx. This looks complete -- I don't see any shared mutable state that could leak between the "before" snapshot and the "after" engine.
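Why a deep copy matters for the before/after comparison can be shown with a reduced VM state. `vmState` below is a hypothetical struct with only the stacks; the real engine also carries the tx, scripts, introspectorPacket, and taprootCtx listed above.

```go
package main

import (
	"bytes"
	"fmt"
)

// vmState is a hypothetical, reduced view of the engine state.
type vmState struct {
	dataStack [][]byte
	altStack  [][]byte
	condStack []int
}

// cloneStack copies every item so the clone shares no backing arrays.
func cloneStack(s [][]byte) [][]byte {
	out := make([][]byte, len(s))
	for i, item := range s {
		out[i] = append([]byte(nil), item...)
	}
	return out
}

// clone returns a snapshot that later mutations of v cannot affect.
func (v *vmState) clone() *vmState {
	return &vmState{
		dataStack: cloneStack(v.dataStack),
		altStack:  cloneStack(v.altStack),
		condStack: append([]int(nil), v.condStack...),
	}
}

// stacksEqual compares two stacks item by item.
func stacksEqual(a, b [][]byte) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if !bytes.Equal(a[i], b[i]) {
			return false
		}
	}
	return true
}

func main() {
	vm := &vmState{altStack: [][]byte{{1, 2}}, condStack: []int{1}}
	before := vm.clone()

	vm.altStack[0][0] = 9 // simulate an opcode that wrongly touches the alt stack

	fmt.Println("alt stack preserved:", stacksEqual(before.altStack, vm.altStack))
}
```

With a shallow copy the snapshot would change along with the engine, and the preservation check would pass even when the opcode misbehaves.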

Minor nits (non-blocking):

  • saltedBytes appends the salt byte to the input and SHA256-hashes the result. This always produces 32 bytes of output regardless of input length, which means fuzzStructFromBytes will always deserialize from exactly 32 bytes. That's fine for the current struct sizes but worth noting if larger structs are added later.
  • deriveIndex case 3 (1<<31 - int64(indexSeed)) produces very large positive indices which will always be out of range. That's intentional -- it tests the boundary error path -- but a comment would help future readers.
  • pushDataPayloadLength for OP_PUSHDATA2 caps at 128 and OP_PUSHDATA4 at 196. These are well under the actual PUSHDATA size limits, presumably to keep serialized scripts compact. Worth a brief comment explaining the rationale.
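As an illustration of the kind of comment suggested for deriveIndex, here is a sketch of an index-derivation helper. Only the case 3 expression comes from the review; the signature, the seed type, and cases 0 to 2 are assumptions invented to make the example complete.

```go
package main

import "fmt"

// fuzzMaxInOutCount is the cap on inputs/outputs mentioned in the review.
const fuzzMaxInOutCount = 200

// deriveIndex maps fuzz bytes to an index that is sometimes valid and
// sometimes deliberately invalid. Cases 0-2 are illustrative assumptions.
func deriveIndex(mode uint8, indexSeed uint16, count int) int64 {
	switch mode % 4 {
	case 0:
		// In range whenever count > 0.
		if count == 0 {
			return 0
		}
		return int64(indexSeed) % int64(count)
	case 1:
		// First out-of-range index (off by one past the end).
		return int64(count)
	case 2:
		// Always negative.
		return -1 - int64(indexSeed)
	default:
		// Case 3: a very large positive index. With a 16-bit seed the result
		// is at least 1<<31 - 65535, far above fuzzMaxInOutCount, so it is
		// always out of range and exercises the upper-bound error path.
		return 1<<31 - int64(indexSeed)
	}
}

func main() {
	for mode := uint8(0); mode < 4; mode++ {
		idx := deriveIndex(mode, 10, 5)
		inRange := idx >= 0 && idx < 5
		fmt.Printf("mode=%d index=%d inRange=%v\n", mode, idx, inRange)
	}
	fmt.Println("cap:", fuzzMaxInOutCount)
}
```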

Security Relevance

This PR is security-positive:

  • All introspection opcodes (INSPECTINPUT*, INSPECTOUTPUT*, INSPECTNUMINPUTS, INSPECTNUMOUTPUTS) are fuzzed with both valid and out-of-bounds indices.
  • Asset introspection opcodes get the assetOpcodeFuzzChecker that allows only ErrInvalidStackOperation, ErrNumberTooBig, and ErrMinimalData -- any other error or panic would be caught.
  • The serialized script pass catches panics that could only manifest when opcodes interact through the tokenizer, which is exactly the kind of bug that targeted unit tests miss.
  • The 4 seed corpus entries suggest prior fuzzing runs already found and fixed issues -- these seeds serve as regression tests.

No protocol-critical concerns. This is test-only code that does not touch transaction signing, VTXO management, or round lifecycle.

Remaining TODO from PR description

  • Random valid PSBT with valid script + witness cannot cause panic
  • Known valid scripts with discrete unlocking conditions will not pass with random mutations to witness/tx state

These would be high-value additions for a follow-up PR.

Overall: Strong fuzzing infrastructure with well-designed property-based oracles. The coverage is comprehensive for individual opcode behavior and opcode-interaction bugs. The test architecture makes it easy to add coverage for new opcodes going forward.

@chris-ricketts
Contributor Author

Once #55 is merged, I'll rebase and add an opcode spec for unit tests and fuzzing.

- fuzz command: go test . -run=^$ -fuzz=FuzzArkadeScriptTokenizer -fuzztime=10m (run in pkg/arkade)
- total fuzz time: ~20m
- errors: none (PASS, no crashes/panics)
- total interesting cases: 63
@chris-ricketts chris-ricketts force-pushed the chris/fuzz-everything branch from 94c468c to 8cc9b8c Compare April 24, 2026 13:22
@chris-ricketts chris-ricketts changed the title Fuzz all the things Fuzz: tokenizer, opcodes and engine Apr 24, 2026
@chris-ricketts chris-ricketts marked this pull request as ready for review April 24, 2026 13:43