fix(chain): preserve last error when TxSettler exhausts retries #2532

Merged
OjusWiZard merged 1 commit into main from fix/txsettler-preserve-error-on-timeout
Jun 9, 2026

@OjusWiZard

Member

Problem

When TxSettler.transact() exhausts its retry/timeout budget, it raises:

ChainTimeoutError("Failed to send transaction after {retries} retries")

This drops the underlying RPC error that drove every retry. Callers that classify failures by inspecting the error string cannot see the cause, so a diagnosable failure (e.g. an insufficient-gas / insufficient-funds RPC rejection that was repriced on every attempt) surfaces as an opaque timeout.

Concrete impact

A downstream consumer maps gas/funds RPC rejections to a typed insufficient-funds error by matching the RPC error string. When the same condition happens to return a non-retryable wording, it propagates via ChainInteractionError (carrying the original RPC payload) and is classified correctly. But when every attempt returns a retryable wording (e.g. intrinsic gas too low: gas 0 → reprice), the loop runs to exhaustion and raises the bare ChainTimeoutError, whose message contains none of the original error — so the same root cause is misclassified and returned as a generic 500 instead of a typed 4xx.
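The misclassification can be illustrated with a minimal sketch of a string-matching classifier. The function name `classify`, the marker strings, and the status codes below are assumptions made for illustration; the actual downstream consumer's code is not shown in this PR.

```python
# Hypothetical downstream classifier. The marker strings and status codes
# are illustrative assumptions, not the consumer's real implementation.
FUNDS_MARKERS = ("intrinsic gas too low", "insufficient funds")


def classify(error_message: str) -> int:
    """Map an error message to an HTTP-style status by substring match."""
    lowered = error_message.lower()
    if any(marker in lowered for marker in FUNDS_MARKERS):
        return 400  # stands in for the typed 4xx response
    return 500  # generic failure


# Message raised on exhaustion before the fix: no trace of the RPC error.
bare = "Failed to send transaction after 60 retries"

# Message raised after the fix: the RPC payload is appended.
enriched = (
    bare
    + ": {'code': -32000, 'message': 'intrinsic gas too low: gas 0, minimum needed 24796'}"
)

bare_status = classify(bare)          # no marker present, falls through
enriched_status = classify(enriched)  # marker found in the appended suffix
```

With the bare message the classifier has nothing to match on, which is why the same root cause ends up as a generic 500.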

Fix

  • Capture the last caught exception in last_error inside the retry loop.
  • On retry/timeout exhaustion, append it to the ChainTimeoutError message and chain it as the cause (raise ChainTimeoutError(message) from last_error).
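
The two steps above can be sketched roughly as follows. This is a simplified stand-in, not the code in autonomy/chain/tx.py: `transact_with_retries`, its `send` parameter, and the locally defined `ChainTimeoutError` are assumptions so the example runs on its own, and the real loop also handles deadlines and repricing.

```python
from typing import Callable, Optional


class ChainTimeoutError(Exception):
    """Local stand-in for the library's timeout error, so the sketch runs alone."""


def transact_with_retries(send: Callable[[], str], retries: int) -> str:
    """Call `send` up to `retries` times, remembering the last failure."""
    last_error: Optional[Exception] = None
    for _ in range(retries):
        try:
            return send()
        except Exception as e:  # assumption: the real code filters retryable errors
            last_error = e  # keep the most recent underlying error
            continue
    message = f"Failed to send transaction after {retries} retries"
    if last_error is not None:
        message += f": {last_error}"
    # `from last_error` sets __cause__; with None it simply suppresses context.
    raise ChainTimeoutError(message) from last_error
```

Appending to the message keeps string-matching callers working, while `__cause__` gives callers that inspect the exception object a structured handle on the original error.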

Before:

Failed to send transaction after 60 retries

After:

Failed to send transaction after 60 retries: {'code': -32000, 'message': 'intrinsic gas too low: gas 0, minimum needed 24796'}

…with __cause__ set for a full traceback.

The success path, signature, and return type are unchanged. The existing test_same_tx_sent_twice assertion uses re.search on the prefix, which the appended suffix preserves.
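
Why a prefix search survives the appended suffix can be shown in a few lines. The pattern below is a hypothetical one; the regex actually used by test_same_tx_sent_twice is not quoted in this PR.

```python
import re

# Hypothetical pattern; the real test's regex is not shown in this PR.
PREFIX = r"Failed to send transaction after \d+ retries"

before = "Failed to send transaction after 60 retries"
after = (
    before
    + ": {'code': -32000, 'message': 'intrinsic gas too low: gas 0, minimum needed 24796'}"
)

# re.search looks for the pattern anywhere, so the suffix does not matter.
matches_before = re.search(PREFIX, before) is not None
matches_after = re.search(PREFIX, after) is not None

# A full-string match would have broken, since the message is now longer.
anchored_after = re.fullmatch(PREFIX, after) is not None
```

An assertion that required the whole message to match the pattern would have needed updating; a search on the prefix does not.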

Tests

  • Added test_timeout_preserves_last_error: drives the reprice→exhaustion path and asserts the timeout message includes the underlying error and that __cause__ is set.
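
A test of this shape might look roughly like the sketch below. It is not the test added in tests/test_autonomy/test_chain/test_tx.py: `settle` is a simplified stand-in for the loop under test, `ChainTimeoutError` is defined locally, and patching `time.sleep` here is only a speed optimization.

```python
import time
from unittest import mock


class ChainTimeoutError(Exception):
    """Local stand-in for the library's timeout error."""


def settle(send, retries, sleep=1.0):
    """Simplified stand-in for the retry loop under test."""
    last_error = None
    for _ in range(retries):
        try:
            return send()
        except Exception as e:
            last_error = e
            time.sleep(sleep)  # back off before the next attempt
    raise ChainTimeoutError(
        f"Failed to send transaction after {retries} retries: {last_error}"
    ) from last_error


def test_timeout_preserves_last_error():
    rpc_error = ValueError({"code": -32000, "message": "intrinsic gas too low"})
    send = mock.Mock(side_effect=rpc_error)  # every attempt raises
    with mock.patch("time.sleep") as fake_sleep:  # keeps the test fast
        try:
            settle(send, retries=3)
        except ChainTimeoutError as exc:
            caught = exc
        else:
            raise AssertionError("expected ChainTimeoutError")
    assert "intrinsic gas too low" in str(caught)
    assert caught.__cause__ is rpc_error
    assert send.call_count == 3
    assert fake_sleep.call_count == 3
```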

Linters

black ✓ · isort ✓ · flake8 ✓ · darglint ✓ · mypy (autonomy/chain/tx.py) ✓ — no new issues introduced.

TxSettler.transact() raised a bare "Failed to send transaction after N
retries" on retry exhaustion, dropping the underlying RPC error that
drove every retry. Callers that classify failures by inspecting the
error string (e.g. mapping a gas/funds RPC rejection to a typed
insufficient-funds error) never saw the cause, so a diagnosable
insufficient-gas failure surfaced as an opaque timeout.

Capture the last caught exception and include it in the
ChainTimeoutError message and as the exception cause
(`raise ... from last_error`).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@OjusWiZard OjusWiZard force-pushed the fix/txsettler-preserve-error-on-timeout branch from aad78e9 to 0dfcaa1 Compare June 8, 2026 19:53
Comment thread tests/test_autonomy/test_chain/test_tx.py

@jmoreira-valory jmoreira-valory left a comment

Contributor

The fix is tight and well-scoped: 9 lines in tx.py + copyright bump + a 42-line test, touching exactly 3 files. Nothing extraneous.

Key points verified:

  • raise ChainTimeoutError(...) from last_error is valid even when last_error is None on the deadline path: raise X from None is legal exception-chaining syntax (PEP 3134, with the from None form added by PEP 409). It suppresses the implicit context, which is intentional here, so the caller gets a clean timeout error with no misleading implicit chain when no retry ever ran.
  • The reprice branch correctly assigns last_error = e before continue, so the final timeout always carries the most recent underlying error rather than None or a stale one.
  • The existing test_timeout test already covers deadline-driven exhaustion with zero exceptions raised; the new test_timeout_preserves_last_error closes the gap for the case where every attempt raises.
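
The two chaining cases discussed above can be checked with a small standalone sketch. `raise_timeout` and `capture` are hypothetical helpers and `ChainTimeoutError` is defined locally; only the `raise ... from ...` semantics are standard Python.

```python
import traceback


class ChainTimeoutError(Exception):
    """Local stand-in for the library's timeout error."""


def raise_timeout(last_error):
    # `last_error` may be an exception instance or None; both are legal.
    raise ChainTimeoutError("Failed to send transaction") from last_error


def capture(last_error):
    """Raise the timeout and hand back the exception object for inspection."""
    try:
        raise_timeout(last_error)
    except ChainTimeoutError as exc:
        return exc


rpc_error = ValueError("intrinsic gas too low")
with_cause = capture(rpc_error)   # every attempt raised
without_cause = capture(None)     # deadline hit, nothing was caught

# The rendered traceback includes the underlying error when a cause is set.
rendered = "".join(
    traceback.format_exception(
        type(with_cause), with_cause, with_cause.__traceback__
    )
)
```

In the first case `__cause__` is the RPC error and the rendered traceback contains the "direct cause" section; in the second `__cause__` is None and the implicit context is suppressed.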

One NIT filed (non-blocking): the time.sleep mock comment reads as load-bearing for correctness when it is actually just a speed optimization. Cosmetic only.

LGTM.

@OjusWiZard OjusWiZard merged commit 85d928f into main Jun 9, 2026
22 checks passed
@OjusWiZard OjusWiZard deleted the fix/txsettler-preserve-error-on-timeout branch June 9, 2026 11:18

3 participants