Skip to content

feat: client-supplied root TX ID and path hint headers#628

Merged
djwhitt merged 8 commits intodevelopfrom
PE-8987-root-tx-hint-headers
Mar 7, 2026
Merged

feat: client-supplied root TX ID and path hint headers#628
djwhitt merged 8 commits intodevelopfrom
PE-8987-root-tx-hint-headers

Conversation

@djwhitt
Copy link
Collaborator

@djwhitt djwhitt commented Mar 6, 2026

Summary

  • Adds support for X-AR-IO-Root-Transaction-Id and X-AR-IO-Root-Path request headers that let clients supply root TX ID and nesting path hints
  • Hints are tried first in RootParentDataSource.getData() before server-side index lookups; failures silently fall through to normal resolution
  • Includes tools/fetch-with-hint CLI tool that resolves hints via GraphQL bundledIn traversal

Closes PE-8987

Test plan

  • Unit tests for hint header parsing in getRequestAttributes (valid/invalid TX IDs, paths, absent headers)
  • Unit tests for hint resolution flow (success skips normal traversal, null/error falls through, path uses getDataItemOffsetWithPath, offsets cached)
  • yarn lint:check passes
  • Full test suite passes
  • Manual: start gateway, use tools/fetch-with-hint <data-item-id> --verbose to verify end-to-end
  • Manual: send request with invalid hint headers, verify normal resolution still works

🤖 Generated with Claude Code

Allow clients to supply X-AR-IO-Root-Transaction-Id and X-AR-IO-Root-Path
request headers to bypass server-side index lookups. Hints are tried first
in RootParentDataSource; failures silently fall through to normal resolution.

Includes fetch-with-hint CLI tool that resolves hints via GraphQL and
fetches data items with the hint headers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@djwhitt
Copy link
Collaborator Author

djwhitt commented Mar 6, 2026

@coderabbitai Please review this.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 6, 2026

@djwhitt Sure! I'll start reviewing the PR now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds three new client-supplied root hint headers and types, a hint-first fast-path in RootParentDataSource.getData with unified region calculation and conditional caching, exposes ANS-104 header parsing, a CLI tool to emit the new headers when fetching raw items, and extensive tests for hint behaviors.

Changes

Cohort / File(s) Summary
Header names
src/constants.ts
Added header mappings: X-AR-IO-Root-Path, X-AR-IO-Root-Item-Offset, X-AR-IO-Root-Item-Size.
Types
src/types.d.ts
Extended RequestAttributes with rootTransactionIdHint?, rootPathHint?, rootByteHint?.
Request parsing / handlers
src/routes/data/handlers.ts, src/routes/data/handlers.test.ts
Parse/validate new headers into request attributes (rootTransactionIdHint, rootPathHint, rootByteHint); added tests for valid/invalid and edge cases.
Fast-path resolution & caching
src/data/root-parent-data-source.ts, src/data/root-parent-data-source.test.ts
Introduced hint-first getData flow (direct offset and path/bundle hints), calculateFinalRegion, tryCacheAttributes, adjusted caching to avoid overwriting precomputed attributes; expanded tests for hints, caching, propagation, and edge cases.
ANS-104 parsing
src/data/ans104-offset-source.ts
Made parseDataItemHeader public and extended its return with computed id (sha256 of signature) for downstream offset/header parsing.
CLI tool & docs
tools/fetch-with-hint, tools/lib/fetch-with-hint.ts, tools/README.md
New CLI to resolve root/path via GraphQL or use direct offset hints, emits new X-AR-IO-Root-* headers and streams gateway/raw responses; docs/usage added.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Handler as Server/Handler
    participant RPDS as RootParentDataSource
    participant Ans104 as Ans104OffsetSource
    participant Store as DataAttributesStore
    participant Gateway as Gateway/RawFetch

    Client->>Handler: GET /raw/{id} + X-AR-IO-Root-* headers
    Handler->>Handler: parseRequestAttributes()
    Handler->>RPDS: getData(attributes)

    alt Hint-based fast-path
        RPDS->>Ans104: parse header / resolve offsets (direct or via path)
        Ans104-->>RPDS: offsets, headerSize, contentType, id
        RPDS->>Store: tryCacheAttributes(context: hint-derived)
        RPDS->>Gateway: GET raw/{bundleId}?region=calculated
        Gateway-->>RPDS: data + contentType
        RPDS-->>Handler: return data (propagated contentType)
    else Hints absent/fail (legacy)
        RPDS->>RPDS: traverseToRootUsingAttributes() (legacy traversal)
        RPDS->>Store: setDataAttributes(...) (if discovered)
        RPDS->>Gateway: GET via computed offsets
        Gateway-->>RPDS: data
        RPDS-->>Handler: return data
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • arielmelendez
  • dtfiedler
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the primary change: adding support for client-supplied root transaction ID and path hint headers to enable faster data retrieval.
Description check ✅ Passed The description is directly related to the changeset, explaining the new hint headers, their integration into the resolution flow, and the accompanying CLI tool with test coverage.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch PE-8987-root-tx-hint-headers

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
src/routes/data/handlers.ts (1)

510-543: Document the new hint-header contract on getRequestAttributes().

This export now owns the parsing and silent-drop behavior for X-AR-IO-Root-Transaction-Id and X-AR-IO-Root-Path, but callers still have to infer that contract from implementation and tests.

As per coding guidelines, **/*.{ts,tsx,js}: Add or improve TSDoc comments when modifying code to enhance documentation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routes/data/handlers.ts` around lines 510 - 543, Add TSDoc to document
the new hint-header contract on getRequestAttributes: explain that
getRequestAttributes parses X-AR-IO-Root-Transaction-Id and X-AR-IO-Root-Path
from request headers, validates them with isValidTxId, silently drops invalid
values, and returns optional rootTransactionIdHint and rootPathHint properties;
mention header names (headerNames.rootTransactionId, headerNames.rootPath), the
formats accepted (single TX ID and comma-separated TX IDs), and that callers
should not rely on errors for malformed hints but check for presence of the
returned hint fields instead.
src/data/root-parent-data-source.ts (1)

271-403: Add TSDoc for the hint fast path.

This block changes getData()'s contract in a non-obvious way: client hints now take precedence, may write cached offsets, may override sourceContentType, and silently fall back on lookup failure. Please document that behavior on getData or the class so callers know when hints are used and when they are ignored.

As per coding guidelines, **/*.{ts,tsx,js}: Add or improve TSDoc comments when modifying code to enhance documentation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/root-parent-data-source.ts` around lines 271 - 403, Add/update TSDoc
on the getData method (or the containing class) to clearly document the hint
fast-path behavior: that requestAttributes.rootTransactionIdHint (and optional
rootPathHint) are tried first and take precedence over normal lookup, that
successful hint resolution will cache offsets via
dataAttributesStore.setDataAttributes, may override the returned
sourceContentType with the bundle's contentType, and that failures in hint
resolution silently fall back to the normal resolution path (including logging
but not throwing). Reference the hint-related calls in the doc
(ans104OffsetSource.getDataItemOffset / getDataItemOffsetWithPath,
dataAttributesStore.setDataAttributes, and the final dataSource.getData call) so
callers know when hints are used, when they are ignored, and the implications
for caching and content-type.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/data/root-parent-data-source.ts`:
- Around line 274-401: The current broad try/catch around hint resolution
catches not only the ans104 offset lookup but also later logic (region bounds
checks and this.dataSource.getData), causing real errors to be swallowed; narrow
the try so only the calls to this.ans104OffsetSource.getDataItemOffsetWithPath /
getDataItemOffset (the lookup that produces bundleParseResult) are wrapped and
caught, logging a hint-miss and leaving bundleParseResult null, while letting
any errors after bundleParseResult is set (region calculations,
this.dataAttributesStore.setDataAttributes, and the this.dataSource.getData
call) propagate normally so they aren’t converted into silent hint-miss
fallbacks.
- Around line 348-365: The code in root-parent-data-source.ts that builds
finalRegion (variables region, offset, finalRegion) incorrectly leaves
finalRegion.size as offset.size when region.offset is provided but region.size
is omitted, causing reads past the item; change the logic so that when
region.offset !== undefined and region.size is undefined you set
finalRegion.size to the remaining bytes (offset.size - region.offset), ensuring
it is clamped >= 0, and keep the existing checks that throw when region.offset
>= offset.size and adjust requestedEnd when both offset and size are provided.

In `@src/routes/data/handlers.ts`:
- Around line 519-529: The handler currently accepts rootPathHint whenever all
segments are valid TX IDs without checking rootTransactionIdHint, which allows
conflicting hints; update the parsing logic (around rawRootPathHint /
rootPathHint and using isValidTxId) to detect when rootTransactionIdHint is
present and the inferred root from rootPathHint (e.g., its first TX ID) does not
match rootTransactionIdHint, and in that case reject the request at the boundary
(return a 400 error) or clear rootPathHint so the code falls back consistently;
ensure you reference rawRootPathHint, rootPathHint, rootTransactionIdHint and
isValidTxId when adding the guard.

In `@tools/lib/fetch-with-hint.ts`:
- Around line 44-67: resolveRootPath is adding the leaf dataItemId into the
chain, producing paths like [root, dataItemId]; change the traversal so you
append the parent/bundle id (the next hop) to chain rather than the currentId
(leaf) before moving upward. Specifically, in the loop that reads bundleId and
updates currentId, push bundleId (or the value you will set currentId to) into
chain instead of pushing the starting dataItemId, so the final path built from
root and chain.reverse() becomes [root, ..., immediateParent]; apply the same
change to the equivalent logic around lines 70-74.

---

Nitpick comments:
In `@src/data/root-parent-data-source.ts`:
- Around line 271-403: Add/update TSDoc on the getData method (or the containing
class) to clearly document the hint fast-path behavior: that
requestAttributes.rootTransactionIdHint (and optional rootPathHint) are tried
first and take precedence over normal lookup, that successful hint resolution
will cache offsets via dataAttributesStore.setDataAttributes, may override the
returned sourceContentType with the bundle's contentType, and that failures in
hint resolution silently fall back to the normal resolution path (including
logging but not throwing). Reference the hint-related calls in the doc
(ans104OffsetSource.getDataItemOffset / getDataItemOffsetWithPath,
dataAttributesStore.setDataAttributes, and the final dataSource.getData call) so
callers know when hints are used, when they are ignored, and the implications
for caching and content-type.

In `@src/routes/data/handlers.ts`:
- Around line 510-543: Add TSDoc to document the new hint-header contract on
getRequestAttributes: explain that getRequestAttributes parses
X-AR-IO-Root-Transaction-Id and X-AR-IO-Root-Path from request headers,
validates them with isValidTxId, silently drops invalid values, and returns
optional rootTransactionIdHint and rootPathHint properties; mention header names
(headerNames.rootTransactionId, headerNames.rootPath), the formats accepted
(single TX ID and comma-separated TX IDs), and that callers should not rely on
errors for malformed hints but check for presence of the returned hint fields
instead.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 65b29254-2b7c-4bad-a785-17f293ecc427

📥 Commits

Reviewing files that changed from the base of the PR and between 0197e8e and ec4ccab.

📒 Files selected for processing (9)
  • src/constants.ts
  • src/data/root-parent-data-source.test.ts
  • src/data/root-parent-data-source.ts
  • src/routes/data/handlers.test.ts
  • src/routes/data/handlers.ts
  • src/types.d.ts
  • tools/README.md
  • tools/fetch-with-hint
  • tools/lib/fetch-with-hint.ts

Comment on lines +519 to +529
// Parse client-supplied root path hint (comma-separated TX IDs)
const rawRootPathHint = req.headers[headerNames.rootPath.toLowerCase()] as
| string
| undefined;
let rootPathHint: string[] | undefined;
if (rawRootPathHint != null) {
const parts = rawRootPathHint.split(',').map((s) => s.trim());
if (parts.length > 0 && parts.every(isValidTxId)) {
rootPathHint = parts;
}
}
Copy link
Contributor

@coderabbitai coderabbitai bot Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reject conflicting root/path hints at the request boundary.

rootPathHint is accepted whenever every segment is a valid TX ID, even if rootTransactionIdHint points at a different root. That lets an inconsistent pair through to the fast path instead of cleanly falling back, and downstream resolution can end up pairing a path-derived offset with the wrong root transaction.

🛡️ Suggested guard
   let rootPathHint: string[] | undefined;
   if (rawRootPathHint != null) {
     const parts = rawRootPathHint.split(',').map((s) => s.trim());
-    if (parts.length > 0 && parts.every(isValidTxId)) {
+    if (
+      parts.length > 0 &&
+      parts.every(isValidTxId) &&
+      (rootTransactionIdHint == null || parts[0] === rootTransactionIdHint)
+    ) {
       rootPathHint = parts;
     }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routes/data/handlers.ts` around lines 519 - 529, The handler currently
accepts rootPathHint whenever all segments are valid TX IDs without checking
rootTransactionIdHint, which allows conflicting hints; update the parsing logic
(around rawRootPathHint / rootPathHint and using isValidTxId) to detect when
rootTransactionIdHint is present and the inferred root from rootPathHint (e.g.,
its first TX ID) does not match rootTransactionIdHint, and in that case reject
the request at the boundary (return a 400 error) or clear rootPathHint so the
code falls back consistently; ensure you reference rawRootPathHint,
rootPathHint, rootTransactionIdHint and isValidTxId when adding the guard.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not adding handler-level validation for this. When a path hint is provided, the data source uses hintPath[0] as the resolved root TX ID, effectively ignoring any standalone rootTransactionIdHint. This makes conflicting hints a non-issue — the path always wins.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djwhitt, understood — thanks for the clarification. Since RootParentDataSource derives the root TX ID from hintPath[0] whenever a path hint is present, the standalone rootTransactionIdHint is superseded and the conflict is naturally resolved without any handler-level guard. I'll note this for future reviews.


✏️ Learnings added
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/routes/data/handlers.ts:520-530
Timestamp: 2026-03-07T15:50:07.869Z
Learning: In `src/data/root-parent-data-source.ts`, when both `rootPathHint` and `rootTransactionIdHint` are present in `RequestAttributes`, the data source uses `hintPath[0]` as the resolved root TX ID, effectively ignoring the standalone `rootTransactionIdHint`. "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 457
File: src/routes/data/handlers.ts:429-433
Timestamp: 2025-07-30T18:32:44.622Z
Learning: In src/routes/data/handlers.ts, the ID validation logic uses AND operators intentionally to only reject IDs that match the regex pattern /^[a-zA-Z0-9-_]{43}$/ but fail base64url round-trip validation. IDs that are null or don't match the regex pattern are allowed to proceed further in the pipeline rather than being immediately rejected with "Invalid ID", indicating different error handling is desired for different types of invalid IDs.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an optional “hint fast-path” for resolving a data item’s root transaction and nesting path via two new request headers, and provides a CLI helper to generate and use those hints.

Changes:

  • Introduces X-AR-IO-Root-Transaction-Id and X-AR-IO-Root-Path request headers, parsed into RequestAttributes.
  • Updates RootParentDataSource.getData() to attempt hint-based offset resolution before normal traversal and cache discovered offsets.
  • Adds tools/fetch-with-hint + documentation to resolve hint paths via GraphQL bundledIn traversal and fetch from the gateway with hint headers.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tools/lib/fetch-with-hint.ts New CLI implementation: resolves root/path via GraphQL and fetches /raw/:id with hint headers.
tools/fetch-with-hint Bash wrapper to run the TypeScript CLI via register.js.
tools/README.md Documents the new CLI tool and its options.
src/types.d.ts Extends RequestAttributes with rootTransactionIdHint and rootPathHint.
src/constants.ts Adds headerNames.rootPath constant.
src/routes/data/handlers.ts Parses/validates hint headers into RequestAttributes.
src/routes/data/handlers.test.ts Unit tests for hint header parsing behavior.
src/data/root-parent-data-source.ts Implements hint-based fast-path resolution and caching in getData().
src/data/root-parent-data-source.test.ts Unit tests for hint-resolution success/fallback behavior and caching.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +289 to +300
if (hintPath && hintPath.length > 0) {
bundleParseResult =
await this.ans104OffsetSource.getDataItemOffsetWithPath(
id,
hintPath,
signal,
);
} else {
bundleParseResult = await this.ans104OffsetSource.getDataItemOffset(
id,
hintRootTxId,
signal,
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When rootPathHint is provided, getDataItemOffsetWithPath() uses path[0] as the root bundle ID, but this code still fetches and caches using hintRootTxId. If the client sends inconsistent headers (or rootPathHint[0] differs), offsets will be computed against one root bundle and bytes will be fetched/cached against another, returning incorrect data and poisoning the attributes cache. Consider deriving the effective root TX from hintPath[0] when a path is present (and/or validating it matches hintRootTxId; otherwise ignore the hint and fall through).

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 6fb2855 — the code now consistently uses hintPath[0] as the root TX ID when a path hint is present, and falls back to hintRootTxId only when no path is provided.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally addressed in commit 9743de7 — the direct byte offset hint path (Step 0a) now computes SHA-256(signature) from the parsed data item header and verifies it matches the requested ID before serving data or caching offsets. This prevents cache poisoning from hints that point to a valid but wrong data item. On mismatch, the code logs and falls through to the next resolution path.

Comment on lines +65 to +71
// Path is [rootTxId, ...intermediate bundles] (root first)
const path = [currentId, ...chain.reverse()];
return { rootTxId: currentId, path };
}

chain.push(currentId);
if (verbose) {
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolveRootPath() currently pushes currentId into chain, which makes the returned path include the leaf dataItemId (and generally all children), but getDataItemOffsetWithPath expects a path of bundle TX IDs from the root to the immediate parent bundle (excluding the data item itself). As-is, the generated X-AR-IO-Root-Path won’t match the gateway’s expected path structure. Consider collecting parent bundle IDs instead and ensuring the returned path ends at the immediate parent bundle only.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 6fb2855 — the leaf item is now excluded from the resolved path.

Comment on lines +1744 to +1749
describe('hint-based resolution', () => {
it('should use hint with valid offset result and skip normal traversal', async () => {
const dataItemId = 'hint-data-item';
const hintRootTxId = 'hint-root-tx';
const dataStream = Readable.from([Buffer.from('hint data')]);

Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hint-based resolution tests don’t cover the case where rootTransactionIdHint and rootPathHint[0] disagree. Given the implementation currently uses rootPathHint for offset discovery but rootTransactionIdHint for fetch/caching, adding a regression test for mismatched hints (expecting fallback to normal traversal) would help prevent incorrect data/caching bugs.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not adding this test. When a path hint is provided, the data source consistently uses hintPath[0] as the root TX ID — the standalone rootTransactionIdHint is effectively ignored. This makes a "mismatch" between the two a non-issue by design, so testing for it wouldn't exercise meaningful behavior.

@codecov
Copy link

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 96.84543% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.65%. Comparing base (0197e8e) to head (b7b23b0).
⚠️ Report is 8 commits behind head on develop.

Files with missing lines Patch % Lines
src/data/root-parent-data-source.ts 96.74% 9 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #628      +/-   ##
===========================================
+ Coverage    76.50%   76.65%   +0.14%     
===========================================
  Files          104      104              
  Lines        35437    35572     +135     
  Branches      2565     2587      +22     
===========================================
+ Hits         27110    27266     +156     
+ Misses        8293     8271      -22     
- Partials        34       35       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

djwhitt and others added 2 commits March 7, 2026 08:25
Allow clients to supply X-AR-IO-Root-Data-Offset and X-AR-IO-Root-Data-Size
headers alongside X-AR-IO-Root-Transaction-Id to skip bundle parsing entirely,
enabling the fastest possible data item resolution path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract calculateFinalRegion and tryCacheAttributes helpers to
  eliminate 4x region calculation and 5x attribute storage duplication
- Fix premature caching of untrusted client hints (move to after fetch)
- Change direct byte hint from data payload offset to item offset so
  the gateway can parse the ANS-104 header to extract content type and
  compute the data offset itself
- Group rootDataOffsetHint/rootDataSizeHint into rootByteHint to
  enforce co-requirement at the type level
- Replace local parseNonNegativeInt with existing utility from http-utils
- Remove duplicate rootDataOffset key in headerNames constants
- Eliminate redundant attribute write-back for pre-computed values
- Eliminate double getDataAttributes fetch by passing prefetched result
  to traversal
- Make Ans104OffsetSource.parseDataItemHeader public for direct use
- Rename hint headers to X-AR-IO-Root-Item-Offset/Size

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 8 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +511 to +518
// Parse client-supplied root TX ID hint
const rawRootTxIdHint = req.headers[
headerNames.rootTransactionId.toLowerCase()
] as string | undefined;
const rootTransactionIdHint =
rawRootTxIdHint != null && isValidTxId(rawRootTxIdHint)
? rawRootTxIdHint
: undefined;
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

req.headers[...] can be a string | string[] | undefined in Node/Express. Casting to string and then calling isValidTxId() can throw at runtime if the header is sent multiple times (array), since string[].match is undefined. Consider normalizing header values (e.g., take the first element when it’s an array) before validation.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — header values are now normalized with Array.isArray(v) ? v[0] : v for all four hint header reads.

Comment on lines +520 to +527
// Parse client-supplied root path hint (comma-separated TX IDs)
const rawRootPathHint = req.headers[headerNames.rootPath.toLowerCase()] as
| string
| undefined;
let rootPathHint: string[] | undefined;
if (rawRootPathHint != null) {
const parts = rawRootPathHint.split(',').map((s) => s.trim());
if (parts.length > 0 && parts.every(isValidTxId)) {
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same header normalization issue for X-AR-IO-Root-Path: if the header value is an array, rawRootPathHint.split(',') will throw. Normalize to a single string before splitting (or handle arrays by joining).

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — same normalization applied to all hint headers including root path.

Comment on lines +68 to +73
// Path is [rootTxId, ...intermediate bundles] (root first)
const path = [currentId, ...chain.reverse()];
return { rootTxId: currentId, path };
}

chain.push(currentId);
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolveRootPath() builds path from chain.reverse(), but chain currently includes the starting dataItemId (because chain.push(currentId) happens before advancing). This makes the emitted X-AR-IO-Root-Path include the data item ID, which does not match Ans104OffsetSource.getDataItemOffsetWithPath()’s expected structure [rootTxId, ..., immediateParentBundle] (excluding the data item). Adjust the chain/path construction to only include bundle transaction IDs and end at the immediate parent bundle.

Suggested change
// Path is [rootTxId, ...intermediate bundles] (root first)
const path = [currentId, ...chain.reverse()];
return { rootTxId: currentId, path };
}
chain.push(currentId);
// Path is [rootTxId, ...intermediate bundles] (root first), excluding the data item.
let path: string[];
if (chain.length === 0) {
// Unbundled case: the starting ID is already the root transaction.
path = [currentId];
} else {
// Bundled case: chain already contains [immediateParentBundle, ..., rootTxId].
path = chain.reverse();
}
return { rootTxId: currentId, path };
}
// Track only bundle transaction IDs (exclude the starting data item).
chain.push(bundleId);

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 6fb2855 — the leaf item is now excluded from the resolved path.

Comment on lines +107 to +112
case '--offset':
directOffset = parseInt(args[++i], 10);
break;
case '--size':
directSize = parseInt(args[++i], 10);
break;
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--offset/--size parsing uses parseInt(), but if the user passes a non-numeric value, the result is NaN (still != null). That will incorrectly activate “direct offset mode” and send X-AR-IO-Root-Item-Offset/Size: NaN. Validate with Number.isFinite(...) (or reuse parseNonNegativeInt) and require both to be valid numbers before setting the direct offset headers.

Suggested change
case '--offset':
directOffset = parseInt(args[++i], 10);
break;
case '--size':
directSize = parseInt(args[++i], 10);
break;
case '--offset': {
const offsetArg = args[++i];
const parsedOffset = parseInt(offsetArg, 10);
if (!Number.isFinite(parsedOffset) || parsedOffset < 0) {
console.error(
`Invalid value for --offset: "${offsetArg}". Expected a non-negative integer.`,
);
usage();
return;
}
directOffset = parsedOffset;
break;
}
case '--size': {
const sizeArg = args[++i];
const parsedSize = parseInt(sizeArg, 10);
if (!Number.isFinite(parsedSize) || parsedSize < 0) {
console.error(
`Invalid value for --size: "${sizeArg}". Expected a non-negative integer.`,
);
usage();
return;
}
directSize = parsedSize;
break;
}

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fixing — this is a dev-only CLI tool for testing hint resolution, not a user-facing utility. The added validation isn't worth the churn.

--root-tx-id <id> Root TX ID (skips GraphQL resolution when used with --offset/--size)
--verbose Show resolution details
--help Show this help`);
process.exit(1);
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usage() always exits with status code 1, including for --help. Consider exiting 0 for the help path so scripting/CI can treat help output as non-error.

Suggested change
process.exit(1);
const argv = process.argv || [];
const hasHelpFlag = argv.includes('--help') || argv.includes('-h');
process.exit(hasHelpFlag ? 0 : 1);

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fixing — dev-only CLI tool, not worth the churn for a help exit code.

Comment on lines +532 to +546
// Parse client-supplied root item offset and size hints (both required together)
const rawRootItemOffsetHint = req.headers[
headerNames.rootItemOffset.toLowerCase()
] as string | undefined;
const rawRootItemSizeHint = req.headers[
headerNames.rootItemSize.toLowerCase()
] as string | undefined;
const rootItemOffsetParsed =
rawRootItemOffsetHint != null
? parseNonNegativeInt(rawRootItemOffsetHint)
: undefined;
const rootItemSizeParsed =
rawRootItemSizeHint != null
? parseNonNegativeInt(rawRootItemSizeHint)
: undefined;
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

X-AR-IO-Root-Item-Offset / X-AR-IO-Root-Item-Size have the same string | string[] header-value issue: if multiple headers are sent, parseNonNegativeInt() will receive a non-string at runtime (after the cast) and throw when calling .trim(). Normalize these header values to a single string before parsing.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — same Array.isArray normalization applied to offset and size headers as well.

Comment on lines +465 to +469
const data = await this.dataSource.getData({
id: hintRootTxId,
requestAttributes,
region: finalRegion,
parentSpan: span,
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a path hint is provided, getDataItemOffsetWithPath() computes offsets relative to hintPath[0] (root bundle ID), but the subsequent fetch uses id: hintRootTxId. If rootTransactionIdHint and rootPathHint[0] ever disagree (client mistake/malicious), this can fetch bytes from the wrong root TX. Consider either validating hintPath[0] === hintRootTxId (otherwise ignore the path hint) or using hintPath[0] consistently for both offset lookup and dataSource.getData/caching.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 6fb2855 — the code now consistently uses hintPath[0] as the root TX ID when a path hint is present.

Comment on lines +86 to +87
offset: dataOffset + (region.offset || 0),
size: region.size || dataSize,
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calculateFinalRegion() uses region.size || dataSize, which will treat a valid region.size of 0 as falsy and return the full dataSize instead. Use nullish coalescing (??) or explicit undefined checks so 0 is preserved (and similarly avoid region.offset || 0 unless you really want to treat 0 as unset).

Suggested change
offset: dataOffset + (region.offset || 0),
size: region.size || dataSize,
offset: dataOffset + (region.offset ?? 0),
size: region.size ?? dataSize,

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — changed || to ?? so that valid 0 values for offset and size are not treated as falsy.

…f from path

- Narrow try/catch in hint resolution to only wrap offset lookups,
  letting region calculation and data fetch errors propagate normally
- Derive root TX ID from most-specific hint (path[0] over header hint)
  to prevent mismatched offset/fetch when both hints are provided
- Skip pushing leaf data item into resolveRootPath chain so the path
  matches the [root, ...parents] format expected by getDataItemOffsetWithPath

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
tools/lib/fetch-with-hint.ts (1)

42-46: Please add TSDoc to the new helpers.

resolveRootPath() and main() are the core entry points in this tool, but neither documents expected arguments, modes, or failure behavior.

As per coding guidelines, "Add or improve TSDoc comments when modifying code to enhance documentation".

Also applies to: 85-214

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/lib/fetch-with-hint.ts` around lines 42 - 46, Add TSDoc comments to
both resolveRootPath and main: document each parameter (graphqlUrl, dataItemId,
verbose for resolveRootPath; main's args/modes), describe the return type
(Promise with rootTxId and path), list possible failure behaviors and exceptions
thrown/returned on error, and note side effects (network calls, logging, process
exit) and expected modes of operation; include brief examples or usage notes and
reference the functions resolveRootPath and main so maintainers can quickly
understand inputs, outputs, and error semantics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/data/root-parent-data-source.ts`:
- Around line 85-101: The code builds finalRegion but when region.offset is
provided and region.size is omitted it incorrectly uses size = dataSize and can
over-read; update the logic that computes finalRegion (using symbols
finalRegion, region, dataSize, dataOffset) so that if region.offset is defined
and region.size is undefined you set finalRegion.size = dataSize - region.offset
(clamped to >= 0), and unify/truncate sizing logic so both the case where
region.size is provided and the open-ended case apply the same truncation rules
to prevent reads past the item boundary.

In `@tools/lib/fetch-with-hint.ts`:
- Around line 97-117: The flag-parsing loop (the for loop over args) can accept
missing or invalid values and let directOffset, directSize, and directRootTxId
become undefined or NaN; update the parsing in the switch cases for '--offset',
'--size', '--root-tx-id', '--gateway', '--graphql', and '--output' to validate
that an argument exists after the flag, that numeric flags ('--offset' and
'--size') parse to a finite integer (use Number.isFinite/Number.isInteger or
equivalent) and that string flags are non-empty; on invalid or missing values
throw a clear error or exit with a message before headers are built so invalid
headers aren't emitted; after changes run yarn lint:fix on the file to satisfy
formatting.
- Around line 24-39: Change the usage() helper to accept an optional exitCode
parameter (e.g., usage(exitCode = 1)) and call process.exit(exitCode) instead of
unconditionally process.exit(1); update any callers that print help (the --help
handling in the CLI argument parsing) to call usage(0) so `fetch-with-hint
--help` exits with code 0 while other error paths still call usage() or
usage(1). Ensure the function signature and all invocations referencing usage()
and the process.exit call are updated accordingly.
- Around line 198-212: The download stream errors aren't propagated because
response.data errors aren't wired to the write stream or stdout; update the code
in the outputFile branch (where writeStream is created) and the stdout branch
(where response.data is piped to process.stdout) to use pipeline from
node:stream/promises with response.data as the source and writeStream or
process.stdout as the destination so any stream errors reject the awaiting
promise; ensure you import pipeline and replace the manual event listeners
around writeStream and response.data with await pipeline(response.data,
writeStream) and await pipeline(response.data, process.stdout) respectively,
preserving the verbose log after a successful pipeline.

---

Nitpick comments:
In `@tools/lib/fetch-with-hint.ts`:
- Around line 42-46: Add TSDoc comments to both resolveRootPath and main:
document each parameter (graphqlUrl, dataItemId, verbose for resolveRootPath;
main's args/modes), describe the return type (Promise with rootTxId and path),
list possible failure behaviors and exceptions thrown/returned on error, and
note side effects (network calls, logging, process exit) and expected modes of
operation; include brief examples or usage notes and reference the functions
resolveRootPath and main so maintainers can quickly understand inputs, outputs,
and error semantics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5c900e3d-b736-4e0f-a7b9-5f72a43a99b8

📥 Commits

Reviewing files that changed from the base of the PR and between 0351167 and 6fb2855.

📒 Files selected for processing (2)
  • src/data/root-parent-data-source.ts
  • tools/lib/fetch-with-hint.ts

Comment on lines +85 to +101
const finalRegion: Region = {
offset: dataOffset + (region.offset || 0),
size: region.size || dataSize,
};

if (region.offset !== undefined && region.offset >= dataSize) {
throw new Error(
`Requested region offset ${region.offset} exceeds data item size ${dataSize}`,
);
}

if (region.size !== undefined && region.offset !== undefined) {
const requestedEnd = region.offset + region.size;
if (requestedEnd > dataSize) {
finalRegion.size = dataSize - region.offset;
}
}
Copy link
Contributor

@coderabbitai coderabbitai bot Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Open-ended regions still over-read past the item boundary.

When region.offset is provided but region.size is omitted, line 87 sets finalRegion.size to dataSize. The truncation logic (lines 96-101) only executes when both region.size and region.offset are defined, so the size remains incorrect.

Example: dataSize=100, region={offset:10} → returns size=100 instead of 90, causing a read past the item boundary.

🐛 Proposed fix
   const finalRegion: Region = {
     offset: dataOffset + (region.offset || 0),
-    size: region.size || dataSize,
+    size:
+      region.size !== undefined
+        ? region.size
+        : region.offset !== undefined
+          ? dataSize - region.offset
+          : dataSize,
   };

   if (region.offset !== undefined && region.offset >= dataSize) {
     throw new Error(
       `Requested region offset ${region.offset} exceeds data item size ${dataSize}`,
     );
   }

-  if (region.size !== undefined && region.offset !== undefined) {
+  // Truncate if requested end exceeds data item bounds
+  if (region.size !== undefined) {
-    const requestedEnd = region.offset + region.size;
+    const requestedEnd = (region.offset || 0) + region.size;
     if (requestedEnd > dataSize) {
-      finalRegion.size = dataSize - region.offset;
+      finalRegion.size = dataSize - (region.offset || 0);
     }
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const finalRegion: Region = {
offset: dataOffset + (region.offset || 0),
size: region.size || dataSize,
};
if (region.offset !== undefined && region.offset >= dataSize) {
throw new Error(
`Requested region offset ${region.offset} exceeds data item size ${dataSize}`,
);
}
if (region.size !== undefined && region.offset !== undefined) {
const requestedEnd = region.offset + region.size;
if (requestedEnd > dataSize) {
finalRegion.size = dataSize - region.offset;
}
}
const finalRegion: Region = {
offset: dataOffset + (region.offset || 0),
size:
region.size !== undefined
? region.size
: region.offset !== undefined
? dataSize - region.offset
: dataSize,
};
if (region.offset !== undefined && region.offset >= dataSize) {
throw new Error(
`Requested region offset ${region.offset} exceeds data item size ${dataSize}`,
);
}
// Truncate if requested end exceeds data item bounds
if (region.size !== undefined) {
const requestedEnd = (region.offset || 0) + region.size;
if (requestedEnd > dataSize) {
finalRegion.size = dataSize - (region.offset || 0);
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/root-parent-data-source.ts` around lines 85 - 101, The code builds
finalRegion but when region.offset is provided and region.size is omitted it
incorrectly uses size = dataSize and can over-read; update the logic that
computes finalRegion (using symbols finalRegion, region, dataSize, dataOffset)
so that if region.offset is defined and region.size is undefined you set
finalRegion.size = dataSize - region.offset (clamped to >= 0), and
unify/truncate sizing logic so both the case where region.size is provided and
the open-ended case apply the same truncation rules to prevent reads past the
item boundary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the earlier open-ended regions comment — in practice we never receive offset without size (both are required together in the handler). Not adding clamping for an unreachable case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djwhitt, understood — noted that region at the calculateFinalRegion call site is also always either undefined or has both offset and size set together, making the open-ended case unreachable. I already have this captured from our earlier exchange (the rootByteHint structural guarantee learning), so no new learning needed. Thanks for the clarification!

(ฅ^•ﻌ•^ฅ)


🧠 Learnings used
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/data/root-parent-data-source.ts:0-0
Timestamp: 2026-03-07T15:50:06.787Z
Learning: In `src/routes/data/handlers.ts` (ar-io/ar-io-node), `rootByteHint` is only constructed when both `rootItemOffset` and `rootItemSize` headers are valid non-negative integers. As a result, `requestAttributes.rootByteHint` is always either `undefined` or a fully-populated `{ offset, size }` object — a partial hint (offset without size, or vice versa) is structurally impossible at the `RootParentDataSource.getData` call site.

Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/routes/data/handlers.ts:520-530
Timestamp: 2026-03-07T15:50:07.869Z
Learning: In `src/data/root-parent-data-source.ts`, when both `rootPathHint` and `rootTransactionIdHint` are present in `RequestAttributes`, the data source uses `hintPath[0]` as the resolved root TX ID, effectively ignoring the standalone `rootTransactionIdHint`. "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.

Learnt from: karlprieb
Repo: ar-io/ar-io-node PR: 228
File: src/workers/ans104-unbundler.ts:41-45
Timestamp: 2024-10-31T17:48:59.599Z
Learning: In `src/workers/ans104-unbundler.ts`, when defining `isNormalizedBundleDataItem`, avoid checking for `null` values of `root_parent_offset` and `data_offset`, as `null` values signify that the item is a `NormalizedOptimisticDataItem`, not a `NormalizedBundleDataItem`.

Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 457
File: src/routes/data/handlers.ts:429-433
Timestamp: 2025-07-30T18:32:44.622Z
Learning: In src/routes/data/handlers.ts, the ID validation logic uses AND operators intentionally to only reject IDs that match the regex pattern /^[a-zA-Z0-9-_]{43}$/ but fail base64url round-trip validation. IDs that are null or don't match the regex pattern are allowed to proceed further in the pipeline rather than being immediately rejected with "Invalid ID", indicating different error handling is desired for different types of invalid IDs.

Comment on lines +24 to +39
function usage() {
console.error(`Usage: fetch-with-hint <data-item-id> [options]

Fetches a data item from the gateway using client-supplied root TX ID
and nesting path hints resolved via GraphQL.

Options:
--gateway <url> Gateway URL (default: http://localhost:4000)
--graphql <url> GraphQL endpoint (default: https://arweave.net/graphql)
--output <file> Output file (default: stdout)
--offset <n> Data item offset hint (byte offset of item start within root TX)
--size <n> Data item size hint (total item size including headers)
--root-tx-id <id> Root TX ID (skips GraphQL resolution when used with --offset/--size)
--verbose Show resolution details
--help Show this help`);
process.exit(1);
Copy link
Contributor

@coderabbitai coderabbitai bot Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -name "fetch-with-hint.ts" -o -name "fetch-with-hint.js"

Repository: ar-io/ar-io-node

Length of output: 90


🏁 Script executed:

wc -l tools/lib/fetch-with-hint.ts

Repository: ar-io/ar-io-node

Length of output: 92


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | head -150

Repository: ar-io/ar-io-node

Length of output: 5152


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | tail -100

Repository: ar-io/ar-io-node

Length of output: 3212


🏁 Script executed:

rg "usage\(" tools/lib/fetch-with-hint.ts

Repository: ar-io/ar-io-node

Length of output: 127


🏁 Script executed:

head -20 package.json

Repository: ar-io/ar-io-node

Length of output: 650


🏁 Script executed:

rg "function\s+" tools/lib/fetch-with-hint.ts -A 3 | head -40

Repository: ar-io/ar-io-node

Length of output: 417


🏁 Script executed:

grep -A 5 "scripts" package.json | head -20

Repository: ar-io/ar-io-node

Length of output: 744


🏁 Script executed:

rg "^\s*\*\s*@" tools/lib/fetch-with-hint.ts

Repository: ar-io/ar-io-node

Length of output: 42


🏁 Script executed:

fd "*.ts" tools/ --exec grep -l "^\s*/\*\*" {} \; | head -5

Repository: ar-io/ar-io-node

Length of output: 343


🏁 Script executed:

grep "lint" package.json

Repository: ar-io/ar-io-node

Length of output: 526


🏁 Script executed:

find tools/ -name "*.ts" -exec grep -l "/\*\*" {} \;

Repository: ar-io/ar-io-node

Length of output: 734


🏁 Script executed:

cat tools/lib/fetch-with-hint.ts | grep -B 5 "async function resolveRootPath"

Repository: ar-io/ar-io-node

Length of output: 203


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '121,128p'

Repository: ar-io/ar-io-node

Length of output: 298


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '24,40p'

Repository: ar-io/ar-io-node

Length of output: 964


Return success exit code for --help.

The usage() function unconditionally exits with status 1, causing fetch-with-hint --help to report as a failure to shells and CI scripts. Add an optional exitCode parameter to allow the --help case to exit with 0.

Suggested fix
-function usage() {
+function usage(exitCode = 1) {
   console.error(`Usage: fetch-with-hint <data-item-id> [options]
@@
-  process.exit(1);
+  process.exit(exitCode);
 }
@@
       case '--help':
-        usage();
+        usage(0);
         break;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/lib/fetch-with-hint.ts` around lines 24 - 39, Change the usage() helper
to accept an optional exitCode parameter (e.g., usage(exitCode = 1)) and call
process.exit(exitCode) instead of unconditionally process.exit(1); update any
callers that print help (the --help handling in the CLI argument parsing) to
call usage(0) so `fetch-with-hint --help` exits with code 0 while other error
paths still call usage() or usage(1). Ensure the function signature and all
invocations referencing usage() and the process.exit call are updated
accordingly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fixing — dev-only CLI tool, not worth the churn for a help exit code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djwhitt, understood — fair call for a dev-only tool.


🧠 Learnings used
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/data/root-parent-data-source.ts:0-0
Timestamp: 2026-03-07T15:50:06.787Z
Learning: In `src/routes/data/handlers.ts` (ar-io/ar-io-node), `rootByteHint` is only constructed when both `rootItemOffset` and `rootItemSize` headers are valid non-negative integers. As a result, `requestAttributes.rootByteHint` is always either `undefined` or a fully-populated `{ offset, size }` object — a partial hint (offset without size, or vice versa) is structurally impossible at the `RootParentDataSource.getData` call site.

Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/routes/data/handlers.ts:520-530
Timestamp: 2026-03-07T15:50:07.869Z
Learning: In `src/data/root-parent-data-source.ts`, when both `rootPathHint` and `rootTransactionIdHint` are present in `RequestAttributes`, the data source uses `hintPath[0]` as the resolved root TX ID, effectively ignoring the standalone `rootTransactionIdHint`. "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.

Comment on lines +97 to +117
for (let i = 0; i < args.length; i++) {
const arg = args[i];
switch (arg) {
case '--gateway':
gateway = args[++i];
break;
case '--graphql':
graphqlUrl = args[++i];
break;
case '--output':
outputFile = args[++i];
break;
case '--offset':
directOffset = parseInt(args[++i], 10);
break;
case '--size':
directSize = parseInt(args[++i], 10);
break;
case '--root-tx-id':
directRootTxId = args[++i];
break;
Copy link
Contributor

@coderabbitai coderabbitai bot Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '90,160p'

Repository: ar-io/ar-io-node

Length of output: 2397


Validate direct-hint flags before building headers.

Missing option values currently become undefined, and parseInt() will accept partial numerics or produce NaN. This allows inputs like --offset foo or incomplete --root-tx-id/--offset/--size flags to emit invalid headers (e.g., "NaN" in numeric fields) or silently fall back to GraphQL mode instead of failing fast.

Suggested fix
+  const readOptionValue = (flag: string) => {
+    const value = args[++i];
+    if (value == null || value.startsWith('-')) {
+      throw new Error(`${flag} requires a value`);
+    }
+    return value;
+  };
+
   for (let i = 0; i < args.length; i++) {
     const arg = args[i];
     switch (arg) {
       case '--gateway':
-        gateway = args[++i];
+        gateway = readOptionValue('--gateway');
         break;
       case '--graphql':
-        graphqlUrl = args[++i];
+        graphqlUrl = readOptionValue('--graphql');
         break;
       case '--output':
-        outputFile = args[++i];
+        outputFile = readOptionValue('--output');
         break;
       case '--offset':
-        directOffset = parseInt(args[++i], 10);
+        directOffset = Number(readOptionValue('--offset'));
+        if (!Number.isInteger(directOffset) || directOffset < 0) {
+          throw new Error('--offset must be a non-negative integer');
+        }
         break;
       case '--size':
-        directSize = parseInt(args[++i], 10);
+        directSize = Number(readOptionValue('--size'));
+        if (!Number.isInteger(directSize) || directSize < 0) {
+          throw new Error('--size must be a non-negative integer');
+        }
         break;
       case '--root-tx-id':
-        directRootTxId = args[++i];
+        directRootTxId = readOptionValue('--root-tx-id');
         break;
@@
+  const hasAnyDirectHint =
+    directRootTxId != null || directOffset != null || directSize != null;
+  if (
+    hasAnyDirectHint &&
+    (directRootTxId == null || directOffset == null || directSize == null)
+  ) {
+    throw new Error(
+      '--root-tx-id, --offset, and --size must be provided together',
+    );
+  }

After applying fixes, run yarn lint:fix on this file to ensure code formatting compliance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/lib/fetch-with-hint.ts` around lines 97 - 117, The flag-parsing loop
(the for loop over args) can accept missing or invalid values and let
directOffset, directSize, and directRootTxId become undefined or NaN; update the
parsing in the switch cases for '--offset', '--size', '--root-tx-id',
'--gateway', '--graphql', and '--output' to validate that an argument exists
after the flag, that numeric flags ('--offset' and '--size') parse to a finite
integer (use Number.isFinite/Number.isInteger or equivalent) and that string
flags are non-empty; on invalid or missing values throw a clear error or exit
with a message before headers are built so invalid headers aren't emitted; after
changes run yarn lint:fix on the file to satisfy formatting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fixing — dev-only CLI tool for testing hint resolution. The added validation and error handling isn't worth the churn.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djwhitt, understood — makes sense for an internal testing tool. I'll leave it as-is.

(◕ᴗ◕✿)


🧠 Learnings used
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/data/root-parent-data-source.ts:0-0
Timestamp: 2026-03-07T15:50:06.787Z
Learning: In `src/routes/data/handlers.ts` (ar-io/ar-io-node), `rootByteHint` is only constructed when both `rootItemOffset` and `rootItemSize` headers are valid non-negative integers. As a result, `requestAttributes.rootByteHint` is always either `undefined` or a fully-populated `{ offset, size }` object — a partial hint (offset without size, or vice versa) is structurally impossible at the `RootParentDataSource.getData` call site.

Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 457
File: src/routes/data/handlers.ts:429-433
Timestamp: 2025-07-30T18:32:44.622Z
Learning: In src/routes/data/handlers.ts, the ID validation logic uses AND operators intentionally to only reject IDs that match the regex pattern /^[a-zA-Z0-9-_]{43}$/ but fail base64url round-trip validation. IDs that are null or don't match the regex pattern are allowed to proceed further in the pipeline rather than being immediately rejected with "Invalid ID", indicating different error handling is desired for different types of invalid IDs.

Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/routes/data/handlers.ts:520-530
Timestamp: 2026-03-07T15:50:07.869Z
Learning: In `src/data/root-parent-data-source.ts`, when both `rootPathHint` and `rootTransactionIdHint` are present in `RequestAttributes`, the data source uses `hintPath[0]` as the resolved root TX ID, effectively ignoring the standalone `rootTransactionIdHint`. "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.

Comment on lines +198 to +212
if (outputFile) {
const writeStream = fs.createWriteStream(outputFile);
response.data.pipe(writeStream);
await new Promise<void>((resolve, reject) => {
writeStream.on('finish', resolve);
writeStream.on('error', reject);
});
if (verbose) {
console.error(`Written to ${outputFile}`);
}
} else {
response.data.pipe(process.stdout);
await new Promise<void>((resolve) => {
response.data.on('end', resolve);
});
Copy link
Contributor

@coderabbitai coderabbitai bot Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '190,220p'

Repository: ar-io/ar-io-node

Length of output: 1097


🏁 Script executed:

head -20 tools/lib/fetch-with-hint.ts

Repository: ar-io/ar-io-node

Length of output: 420


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '1,50p'

Repository: ar-io/ar-io-node

Length of output: 1841


🏁 Script executed:

rg "import.*pipeline" tools/lib/

Repository: ar-io/ar-io-node

Length of output: 136


🏁 Script executed:

cd tools/lib && head -200 fetch-with-hint.ts | tail -80

Repository: ar-io/ar-io-node

Length of output: 2028


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '100,145p'

Repository: ar-io/ar-io-node

Length of output: 1365


🏁 Script executed:

rg "async function" tools/lib/fetch-with-hint.ts -A 5

Repository: ar-io/ar-io-node

Length of output: 431


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '42,47p'

Repository: ar-io/ar-io-node

Length of output: 278


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '85,100p'

Repository: ar-io/ar-io-node

Length of output: 655


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | wc -l

Repository: ar-io/ar-io-node

Length of output: 63


🏁 Script executed:

rg "pipeline" tools/lib/download-cdb64.ts -B 2 -A 8

Repository: ar-io/ar-io-node

Length of output: 837


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '85,95p'

Repository: ar-io/ar-io-node

Length of output: 509


🏁 Script executed:

git log --oneline tools/lib/fetch-with-hint.ts | head -5

Repository: ar-io/ar-io-node

Length of output: 148


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '85,98p'

Repository: ar-io/ar-io-node

Length of output: 598


🏁 Script executed:

rg "function main" tools/lib/ -A 3

Repository: ar-io/ar-io-node

Length of output: 4730


🏁 Script executed:

cat -n tools/lib/fetch-with-hint.ts | sed -n '213,221p'

Repository: ar-io/ar-io-node

Length of output: 272


🏁 Script executed:

cat -n tools/lib/download-cdb64.ts | sed -n '1,15p'

Repository: ar-io/ar-io-node

Length of output: 607


Add error handling for response stream failures during file transfer.

The current implementation does not propagate errors from response.data (the read stream). If the download fails mid-transfer, the promise may never reject:

  • File case (lines 199–204): Only listens to writeStream events; errors from response.data are ignored
  • Stdout case (lines 209–212): No error handler; failures cause silent hangs or incomplete output

Use pipeline() from node:stream/promises to properly handle errors in both directions:

Suggested fix
 import fs from 'node:fs';
+import { pipeline } from 'node:stream/promises';
@@
   if (outputFile) {
-    const writeStream = fs.createWriteStream(outputFile);
-    response.data.pipe(writeStream);
-    await new Promise<void>((resolve, reject) => {
-      writeStream.on('finish', resolve);
-      writeStream.on('error', reject);
-    });
+    await pipeline(response.data, fs.createWriteStream(outputFile));
     if (verbose) {
       console.error(`Written to ${outputFile}`);
     }
   } else {
-    response.data.pipe(process.stdout);
-    await new Promise<void>((resolve) => {
-      response.data.on('end', resolve);
-    });
+    await pipeline(response.data, process.stdout);
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (outputFile) {
const writeStream = fs.createWriteStream(outputFile);
response.data.pipe(writeStream);
await new Promise<void>((resolve, reject) => {
writeStream.on('finish', resolve);
writeStream.on('error', reject);
});
if (verbose) {
console.error(`Written to ${outputFile}`);
}
} else {
response.data.pipe(process.stdout);
await new Promise<void>((resolve) => {
response.data.on('end', resolve);
});
if (outputFile) {
await pipeline(response.data, fs.createWriteStream(outputFile));
if (verbose) {
console.error(`Written to ${outputFile}`);
}
} else {
await pipeline(response.data, process.stdout);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/lib/fetch-with-hint.ts` around lines 198 - 212, The download stream
errors aren't propagated because response.data errors aren't wired to the write
stream or stdout; update the code in the outputFile branch (where writeStream is
created) and the stdout branch (where response.data is piped to process.stdout)
to use pipeline from node:stream/promises with response.data as the source and
writeStream or process.stdout as the destination so any stream errors reject the
awaiting promise; ensure you import pipeline and replace the manual event
listeners around writeStream and response.data with await
pipeline(response.data, writeStream) and await pipeline(response.data,
process.stdout) respectively, preserving the verbose log after a successful
pipeline.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fixing — dev-only CLI tool for testing hint resolution. Adding stream error handling isn't worth the complexity for a diagnostic utility.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djwhitt, understood — makes sense for a dev-only diagnostic tool. Acknowledged.


🧠 Learnings used
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/data/root-parent-data-source.ts:0-0
Timestamp: 2026-03-07T15:50:06.787Z
Learning: In `src/routes/data/handlers.ts` (ar-io/ar-io-node), `rootByteHint` is only constructed when both `rootItemOffset` and `rootItemSize` headers are valid non-negative integers. As a result, `requestAttributes.rootByteHint` is always either `undefined` or a fully-populated `{ offset, size }` object — a partial hint (offset without size, or vice versa) is structurally impossible at the `RootParentDataSource.getData` call site.

Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 236
File: src/lib/data-root.ts:165-248
Timestamp: 2024-11-18T20:54:18.763Z
Learning: In this project, using `async` functions as Promise executors is acceptable, even if it might lead to unhandled rejections in general cases.

Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/routes/data/handlers.ts:520-530
Timestamp: 2026-03-07T15:50:07.869Z
Learning: In `src/data/root-parent-data-source.ts`, when both `rootPathHint` and `rootTransactionIdHint` are present in `RequestAttributes`, the data source uses `hintPath[0]` as the resolved root TX ID, effectively ignoring the standalone `rootTransactionIdHint`. "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.

Use ?? instead of || in calculateFinalRegion to avoid treating valid 0
values as falsy. Normalize Express header reads with Array.isArray check
to handle duplicate headers safely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
src/data/root-parent-data-source.ts (1)

125-138: Update this TSDoc for the new traversal contract.

The signature now accepts prefetchedAttributes and returns fromPreComputed, but the docblock still only describes the old traversal/null behavior.

As per coding guidelines, "**/*.{ts,tsx,js}: Add or improve TSDoc comments when modifying code to enhance documentation`."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/root-parent-data-source.ts` around lines 125 - 138, Update the TSDoc
for traverseToRootUsingAttributes to describe the new traversal contract:
document the prefetchedAttributes?: ContiguousDataAttributes parameter and
explain when it is used, and update the return description to list all returned
properties (rootTxId, totalOffset, rootDataOffset, size, and fromPreComputed)
and what each means; also clarify the null return case (when traversal is
incomplete due to missing attributes) and explicitly state what
fromPreComputed=true indicates (that the result came from
prefetched/pre-computed attributes rather than live traversal).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/data/root-parent-data-source.ts`:
- Around line 389-399: The precomputed cache writes that call tryCacheAttributes
(the "direct offset hint" call and the separate "root index" precompute write)
are omitting the known contentType, causing later hits to use the root
transaction's type; update both object literals passed to tryCacheAttributes to
include contentType set from the available variable that holds the item's
MIME/type (e.g., contentType or the local hintContentType variable), so the
cache entry persists the correct content type for subsequent reads.
- Around line 347-387: The direct-offset hint branch uses
parseDataItemHeader(hintRootTxId, hintItemOffset, hintItemSize, signal) and then
trusts headerInfo to fetch payload for hintRootTxId without verifying the item
actually corresponds to the requested id; validate the parsed data item's id (or
signature/owner) matches the requested id before computing
dataOffset/finalRegion and calling dataSource.getData so stale/malicious hints
cannot return other payloads. Locate the block that sets headerInfo and where
dataOffset/dataSize/hintContentType are derived (parseDataItemHeader,
headerInfo, calculateFinalRegion, getData) and add a check that compares the
parsed item id with the requested id (or verifies a cryptographic binding) and
bail/fall-through if it does not match, preserving existing logs/tracing and
error handling.
- Around line 335-336: The code currently reads
requestAttributes?.rootTransactionIdHint into hintRootTxId and branches, which
makes rootPathHint a no-op; change the resolution so you first check
requestAttributes?.rootPathHint and, if present, resolve it into the
transaction-id (using the existing path-to-transaction-id helper or the same
resolution logic used elsewhere) and assign that to hintRootTxId, otherwise fall
back to requestAttributes?.rootTransactionIdHint; update the direct-offset
branch (same pattern at the other occurrence) to use this resolved hintRootTxId
so "path" always takes precedence over the explicit transaction-id hint.

---

Nitpick comments:
In `@src/data/root-parent-data-source.ts`:
- Around line 125-138: Update the TSDoc for traverseToRootUsingAttributes to
describe the new traversal contract: document the prefetchedAttributes?:
ContiguousDataAttributes parameter and explain when it is used, and update the
return description to list all returned properties (rootTxId, totalOffset,
rootDataOffset, size, and fromPreComputed) and what each means; also clarify the
null return case (when traversal is incomplete due to missing attributes) and
explicitly state what fromPreComputed=true indicates (that the result came from
prefetched/pre-computed attributes rather than live traversal).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a9d7d5fd-334a-41a0-b789-09df4ce96917

📥 Commits

Reviewing files that changed from the base of the PR and between 6fb2855 and da402d2.

📒 Files selected for processing (2)
  • src/data/root-parent-data-source.ts
  • src/routes/data/handlers.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/routes/data/handlers.ts

Comment on lines +335 to +336
const hintRootTxId = requestAttributes?.rootTransactionIdHint;
if (hintRootTxId != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Resolve hintRootTxId from rootPathHint first.

Step 0 is currently gated on rootTransactionIdHint, and the direct-offset branch always uses that value. That makes rootPathHint a no-op when sent alone and bypasses the documented path precedence when both hints are present.

💡 Minimal fix
-      const hintRootTxId = requestAttributes?.rootTransactionIdHint;
+      const hintPath = requestAttributes?.rootPathHint;
+      const hintRootTxId =
+        hintPath?.[0] ?? requestAttributes?.rootTransactionIdHint;
       if (hintRootTxId != null) {
@@
-        const hintPath = requestAttributes?.rootPathHint;
         let bundleParseResult: {
Based on learnings, "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.

Also applies to: 418-418

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/root-parent-data-source.ts` around lines 335 - 336, The code
currently reads requestAttributes?.rootTransactionIdHint into hintRootTxId and
branches, which makes rootPathHint a no-op; change the resolution so you first
check requestAttributes?.rootPathHint and, if present, resolve it into the
transaction-id (using the existing path-to-transaction-id helper or the same
resolution logic used elsewhere) and assign that to hintRootTxId, otherwise fall
back to requestAttributes?.rootTransactionIdHint; update the direct-offset
branch (same pattern at the other occurrence) to use this resolved hintRootTxId
so "path" always takes precedence over the explicit transaction-id hint.

Compute SHA-256(signature) from the parsed data item header and compare
it against the requested ID before serving data or caching offsets. This
prevents a malicious or buggy client from poisoning the offset cache by
supplying hints that point to a different data item.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
src/data/root-parent-data-source.test.ts (1)

1916-1976: Consider adding getDataItemOffsetWithPath to beforeEach for consistency.

The method is dynamically added via (ans104OffsetSource as any).getDataItemOffsetWithPath = mock.fn(...) in this test. While this works, it's inconsistent with how getDataItemOffset and parseDataItemHeader are set up in beforeEach. If other tests depend on this method existing, they could fail unexpectedly.

💡 Suggested improvement

Add getDataItemOffsetWithPath to the beforeEach setup:

     ans104OffsetSource = {
       getDataItemOffset: mock.fn(),
       parseDataItemHeader: mock.fn(),
+      getDataItemOffsetWithPath: mock.fn(),
     } as any;

Then mock its implementation in individual tests as needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/root-parent-data-source.test.ts` around lines 1916 - 1976, Add a
default mock of getDataItemOffsetWithPath to the test suite setup so it's
consistently present like getDataItemOffset and parseDataItemHeader; update the
beforeEach to assign (ans104OffsetSource as any).getDataItemOffsetWithPath =
mock.fn(async () => {/* noop or default return */}) and leave individual tests
(like the one using hintPath) free to override that mock implementation as
needed; this ensures tests won't fail when they expect the method to exist and
keeps mocking consistent across the suite.
src/data/ans104-offset-source.ts (1)

765-775: Add TSDoc comment for the now-public method.

Since parseDataItemHeader is now public and will be called by RootParentDataSource, it should have a TSDoc comment describing its purpose, parameters, and return value — consistent with the other public methods in this class like getDataItemOffset and getDataItemOffsetWithPath.

📝 Suggested TSDoc
+  /**
+   * Parses the header of a data item at a specific offset within a bundle.
+   * Computes the data item ID from the signature and extracts header metadata.
+   *
+   * `@param` bundleId - The ID of the bundle containing the data item
+   * `@param` itemOffset - Byte offset where the data item begins
+   * `@param` totalSize - Total size of the data item (header + payload)
+   * `@param` signal - Optional abort signal to cancel the operation
+   * `@returns` Object containing the computed ID, header size, payload size, and optional content type
+   */
   async parseDataItemHeader(

As per coding guidelines: **/*.{ts,tsx,js}: Add or improve TSDoc comments when modifying code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/ans104-offset-source.ts` around lines 765 - 775, Add a TSDoc comment
for the now-public method parseDataItemHeader explaining its purpose (parses the
data item header at a given bundle offset and returns id, headerSize,
payloadSize, and optional contentType), and document each parameter (bundleId:
string, itemOffset: number, totalSize: number, signal?: AbortSignal) plus the
Promise return shape; follow the style and phrasing used by the other public
methods in this class (e.g., getDataItemOffset, getDataItemOffsetWithPath) so it
clearly states behavior, possible errors, and that the method is used by
RootParentDataSource.
src/data/root-parent-data-source.ts (1)

335-336: Step 0 requires rootTransactionIdHint even when rootPathHint is provided alone.

The hint fast-path is gated on rootTransactionIdHint != null. If a client provides only rootPathHint (which contains the root TX ID as path[0]), Step 0 is skipped entirely and the path hint goes unused.

Consider deriving hintRootTxId from the path when the explicit hint is absent:

💡 Suggested fix
-      const hintRootTxId = requestAttributes?.rootTransactionIdHint;
-      if (hintRootTxId != null) {
+      const hintPath = requestAttributes?.rootPathHint;
+      const hintRootTxId =
+        hintPath?.[0] ?? requestAttributes?.rootTransactionIdHint;
+      if (hintRootTxId != null) {
         // Step 0a: Direct item offset hint ...

Based on learnings: "Path always wins" — the path hint should take precedence and be usable even without a separate rootTransactionIdHint.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/root-parent-data-source.ts` around lines 335 - 336, The fast-path is
skipped when rootTransactionIdHint is null even if
requestAttributes.rootPathHint contains the root TX id; derive hintRootTxId from
the path when the explicit hint is absent so the path hint can be used. Update
the logic around hintRootTxId (the variable and its use in the Step 0 fast-path)
to set hintRootTxId to requestAttributes.rootTransactionIdHint if present,
otherwise to requestAttributes.rootPathHint.path[0] (with safe null checks), and
keep the existing conditional (if hintRootTxId != null) so the path-derived id
triggers Step 0.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/data/ans104-offset-source.ts`:
- Around line 765-775: Add a TSDoc comment for the now-public method
parseDataItemHeader explaining its purpose (parses the data item header at a
given bundle offset and returns id, headerSize, payloadSize, and optional
contentType), and document each parameter (bundleId: string, itemOffset: number,
totalSize: number, signal?: AbortSignal) plus the Promise return shape; follow
the style and phrasing used by the other public methods in this class (e.g.,
getDataItemOffset, getDataItemOffsetWithPath) so it clearly states behavior,
possible errors, and that the method is used by RootParentDataSource.

In `@src/data/root-parent-data-source.test.ts`:
- Around line 1916-1976: Add a default mock of getDataItemOffsetWithPath to the
test suite setup so it's consistently present like getDataItemOffset and
parseDataItemHeader; update the beforeEach to assign (ans104OffsetSource as
any).getDataItemOffsetWithPath = mock.fn(async () => {/* noop or default return
*/}) and leave individual tests (like the one using hintPath) free to override
that mock implementation as needed; this ensures tests won't fail when they
expect the method to exist and keeps mocking consistent across the suite.

In `@src/data/root-parent-data-source.ts`:
- Around line 335-336: The fast-path is skipped when rootTransactionIdHint is
null even if requestAttributes.rootPathHint contains the root TX id; derive
hintRootTxId from the path when the explicit hint is absent so the path hint can
be used. Update the logic around hintRootTxId (the variable and its use in the
Step 0 fast-path) to set hintRootTxId to requestAttributes.rootTransactionIdHint
if present, otherwise to requestAttributes.rootPathHint.path[0] (with safe null
checks), and keep the existing conditional (if hintRootTxId != null) so the
path-derived id triggers Step 0.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2bcb2dbe-e0ea-4d32-a8e6-c9c25a79c1f8

📥 Commits

Reviewing files that changed from the base of the PR and between da402d2 and 9743de7.

📒 Files selected for processing (3)
  • src/data/ans104-offset-source.ts
  • src/data/root-parent-data-source.test.ts
  • src/data/root-parent-data-source.ts

djwhitt and others added 2 commits March 7, 2026 11:06
The direct-offset hint branch was omitting contentType when caching
attributes, causing later cache hits to potentially serve data with the
wrong content type. This matches the pattern already used in the
path/linear hint branch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When only X-AR-IO-Root-Path is provided (without X-AR-IO-Root-TX-ID),
extract the root TX ID from path[0] to enable the Step 0 hint fast path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@djwhitt djwhitt merged commit 047d413 into develop Mar 7, 2026
4 checks passed
@djwhitt djwhitt deleted the PE-8987-root-tx-hint-headers branch March 7, 2026 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants