feat: client-supplied root TX ID and path hint headers#628
Conversation
Allow clients to supply X-AR-IO-Root-Transaction-Id and X-AR-IO-Root-Path request headers to bypass server-side index lookups. Hints are tried first in RootParentDataSource; failures silently fall through to normal resolution. Includes fetch-with-hint CLI tool that resolves hints via GraphQL and fetches data items with the hint headers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@coderabbitai Please review this. |
|
✅ Actions performedReview triggered.
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds three new client-supplied root hint headers and types, a hint-first fast-path in RootParentDataSource.getData with unified region calculation and conditional caching, exposes ANS-104 header parsing, a CLI tool to emit the new headers when fetching raw items, and extensive tests for hint behaviors. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Handler as Server/Handler
participant RPDS as RootParentDataSource
participant Ans104 as Ans104OffsetSource
participant Store as DataAttributesStore
participant Gateway as Gateway/RawFetch
Client->>Handler: GET /raw/{id} + X-AR-IO-Root-* headers
Handler->>Handler: parseRequestAttributes()
Handler->>RPDS: getData(attributes)
alt Hint-based fast-path
RPDS->>Ans104: parse header / resolve offsets (direct or via path)
Ans104-->>RPDS: offsets, headerSize, contentType, id
RPDS->>Store: tryCacheAttributes(context: hint-derived)
RPDS->>Gateway: GET raw/{bundleId}?region=calculated
Gateway-->>RPDS: data + contentType
RPDS-->>Handler: return data (propagated contentType)
else Hints absent/fail (legacy)
RPDS->>RPDS: traverseToRootUsingAttributes() (legacy traversal)
RPDS->>Store: setDataAttributes(...) (if discovered)
RPDS->>Gateway: GET via computed offsets
Gateway-->>RPDS: data
RPDS-->>Handler: return data
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
src/routes/data/handlers.ts (1)
510-543: Document the new hint-header contract ongetRequestAttributes().This export now owns the parsing and silent-drop behavior for
X-AR-IO-Root-Transaction-IdandX-AR-IO-Root-Path, but callers still have to infer that contract from implementation and tests.As per coding guidelines,
**/*.{ts,tsx,js}: Add or improve TSDoc comments when modifying code to enhance documentation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routes/data/handlers.ts` around lines 510 - 543, Add TSDoc to document the new hint-header contract on getRequestAttributes: explain that getRequestAttributes parses X-AR-IO-Root-Transaction-Id and X-AR-IO-Root-Path from request headers, validates them with isValidTxId, silently drops invalid values, and returns optional rootTransactionIdHint and rootPathHint properties; mention header names (headerNames.rootTransactionId, headerNames.rootPath), the formats accepted (single TX ID and comma-separated TX IDs), and that callers should not rely on errors for malformed hints but check for presence of the returned hint fields instead.src/data/root-parent-data-source.ts (1)
271-403: Add TSDoc for the hint fast path.This block changes
getData()'s contract in a non-obvious way: client hints now take precedence, may write cached offsets, may overridesourceContentType, and silently fall back on lookup failure. Please document that behavior ongetDataor the class so callers know when hints are used and when they are ignored.As per coding guidelines,
**/*.{ts,tsx,js}: Add or improve TSDoc comments when modifying code to enhance documentation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/data/root-parent-data-source.ts` around lines 271 - 403, Add/update TSDoc on the getData method (or the containing class) to clearly document the hint fast-path behavior: that requestAttributes.rootTransactionIdHint (and optional rootPathHint) are tried first and take precedence over normal lookup, that successful hint resolution will cache offsets via dataAttributesStore.setDataAttributes, may override the returned sourceContentType with the bundle's contentType, and that failures in hint resolution silently fall back to the normal resolution path (including logging but not throwing). Reference the hint-related calls in the doc (ans104OffsetSource.getDataItemOffset / getDataItemOffsetWithPath, dataAttributesStore.setDataAttributes, and the final dataSource.getData call) so callers know when hints are used, when they are ignored, and the implications for caching and content-type.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/data/root-parent-data-source.ts`:
- Around line 274-401: The current broad try/catch around hint resolution
catches not only the ans104 offset lookup but also later logic (region bounds
checks and this.dataSource.getData), causing real errors to be swallowed; narrow
the try so only the calls to this.ans104OffsetSource.getDataItemOffsetWithPath /
getDataItemOffset (the lookup that produces bundleParseResult) are wrapped and
caught, logging a hint-miss and leaving bundleParseResult null, while letting
any errors after bundleParseResult is set (region calculations,
this.dataAttributesStore.setDataAttributes, and the this.dataSource.getData
call) propagate normally so they aren’t converted into silent hint-miss
fallbacks.
- Around line 348-365: The code in root-parent-data-source.ts that builds
finalRegion (variables region, offset, finalRegion) incorrectly leaves
finalRegion.size as offset.size when region.offset is provided but region.size
is omitted, causing reads past the item; change the logic so that when
region.offset !== undefined and region.size is undefined you set
finalRegion.size to the remaining bytes (offset.size - region.offset), ensuring
it is clamped >= 0, and keep the existing checks that throw when region.offset
>= offset.size and adjust requestedEnd when both offset and size are provided.
In `@src/routes/data/handlers.ts`:
- Around line 519-529: The handler currently accepts rootPathHint whenever all
segments are valid TX IDs without checking rootTransactionIdHint, which allows
conflicting hints; update the parsing logic (around rawRootPathHint /
rootPathHint and using isValidTxId) to detect when rootTransactionIdHint is
present and the inferred root from rootPathHint (e.g., its first TX ID) does not
match rootTransactionIdHint, and in that case reject the request at the boundary
(return a 400 error) or clear rootPathHint so the code falls back consistently;
ensure you reference rawRootPathHint, rootPathHint, rootTransactionIdHint and
isValidTxId when adding the guard.
In `@tools/lib/fetch-with-hint.ts`:
- Around line 44-67: resolveRootPath is adding the leaf dataItemId into the
chain, producing paths like [root, dataItemId]; change the traversal so you
append the parent/bundle id (the next hop) to chain rather than the currentId
(leaf) before moving upward. Specifically, in the loop that reads bundleId and
updates currentId, push bundleId (or the value you will set currentId to) into
chain instead of pushing the starting dataItemId, so the final path built from
root and chain.reverse() becomes [root, ..., immediateParent]; apply the same
change to the equivalent logic around lines 70-74.
---
Nitpick comments:
In `@src/data/root-parent-data-source.ts`:
- Around line 271-403: Add/update TSDoc on the getData method (or the containing
class) to clearly document the hint fast-path behavior: that
requestAttributes.rootTransactionIdHint (and optional rootPathHint) are tried
first and take precedence over normal lookup, that successful hint resolution
will cache offsets via dataAttributesStore.setDataAttributes, may override the
returned sourceContentType with the bundle's contentType, and that failures in
hint resolution silently fall back to the normal resolution path (including
logging but not throwing). Reference the hint-related calls in the doc
(ans104OffsetSource.getDataItemOffset / getDataItemOffsetWithPath,
dataAttributesStore.setDataAttributes, and the final dataSource.getData call) so
callers know when hints are used, when they are ignored, and the implications
for caching and content-type.
In `@src/routes/data/handlers.ts`:
- Around line 510-543: Add TSDoc to document the new hint-header contract on
getRequestAttributes: explain that getRequestAttributes parses
X-AR-IO-Root-Transaction-Id and X-AR-IO-Root-Path from request headers,
validates them with isValidTxId, silently drops invalid values, and returns
optional rootTransactionIdHint and rootPathHint properties; mention header names
(headerNames.rootTransactionId, headerNames.rootPath), the formats accepted
(single TX ID and comma-separated TX IDs), and that callers should not rely on
errors for malformed hints but check for presence of the returned hint fields
instead.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 65b29254-2b7c-4bad-a785-17f293ecc427
📒 Files selected for processing (9)
src/constants.tssrc/data/root-parent-data-source.test.tssrc/data/root-parent-data-source.tssrc/routes/data/handlers.test.tssrc/routes/data/handlers.tssrc/types.d.tstools/README.mdtools/fetch-with-hinttools/lib/fetch-with-hint.ts
| // Parse client-supplied root path hint (comma-separated TX IDs) | ||
| const rawRootPathHint = req.headers[headerNames.rootPath.toLowerCase()] as | ||
| | string | ||
| | undefined; | ||
| let rootPathHint: string[] | undefined; | ||
| if (rawRootPathHint != null) { | ||
| const parts = rawRootPathHint.split(',').map((s) => s.trim()); | ||
| if (parts.length > 0 && parts.every(isValidTxId)) { | ||
| rootPathHint = parts; | ||
| } | ||
| } |
There was a problem hiding this comment.
Reject conflicting root/path hints at the request boundary.
rootPathHint is accepted whenever every segment is a valid TX ID, even if rootTransactionIdHint points at a different root. That lets an inconsistent pair through to the fast path instead of cleanly falling back, and downstream resolution can end up pairing a path-derived offset with the wrong root transaction.
🛡️ Suggested guard
let rootPathHint: string[] | undefined;
if (rawRootPathHint != null) {
const parts = rawRootPathHint.split(',').map((s) => s.trim());
- if (parts.length > 0 && parts.every(isValidTxId)) {
+ if (
+ parts.length > 0 &&
+ parts.every(isValidTxId) &&
+ (rootTransactionIdHint == null || parts[0] === rootTransactionIdHint)
+ ) {
rootPathHint = parts;
}
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/routes/data/handlers.ts` around lines 519 - 529, The handler currently
accepts rootPathHint whenever all segments are valid TX IDs without checking
rootTransactionIdHint, which allows conflicting hints; update the parsing logic
(around rawRootPathHint / rootPathHint and using isValidTxId) to detect when
rootTransactionIdHint is present and the inferred root from rootPathHint (e.g.,
its first TX ID) does not match rootTransactionIdHint, and in that case reject
the request at the boundary (return a 400 error) or clear rootPathHint so the
code falls back consistently; ensure you reference rawRootPathHint,
rootPathHint, rootTransactionIdHint and isValidTxId when adding the guard.
There was a problem hiding this comment.
Not adding handler-level validation for this. When a path hint is provided, the data source uses hintPath[0] as the resolved root TX ID, effectively ignoring any standalone rootTransactionIdHint. This makes conflicting hints a non-issue — the path always wins.
There was a problem hiding this comment.
@djwhitt, understood — thanks for the clarification. Since RootParentDataSource derives the root TX ID from hintPath[0] whenever a path hint is present, the standalone rootTransactionIdHint is superseded and the conflict is naturally resolved without any handler-level guard. I'll note this for future reviews.
✏️ Learnings added
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/routes/data/handlers.ts:520-530
Timestamp: 2026-03-07T15:50:07.869Z
Learning: In `src/data/root-parent-data-source.ts`, when both `rootPathHint` and `rootTransactionIdHint` are present in `RequestAttributes`, the data source uses `hintPath[0]` as the resolved root TX ID, effectively ignoring the standalone `rootTransactionIdHint`. "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 457
File: src/routes/data/handlers.ts:429-433
Timestamp: 2025-07-30T18:32:44.622Z
Learning: In src/routes/data/handlers.ts, the ID validation logic uses AND operators intentionally to only reject IDs that match the regex pattern /^[a-zA-Z0-9-_]{43}$/ but fail base64url round-trip validation. IDs that are null or don't match the regex pattern are allowed to proceed further in the pipeline rather than being immediately rejected with "Invalid ID", indicating different error handling is desired for different types of invalid IDs.
There was a problem hiding this comment.
Pull request overview
Adds an optional “hint fast-path” for resolving a data item’s root transaction and nesting path via two new request headers, and provides a CLI helper to generate and use those hints.
Changes:
- Introduces
X-AR-IO-Root-Transaction-IdandX-AR-IO-Root-Pathrequest headers, parsed intoRequestAttributes. - Updates
RootParentDataSource.getData()to attempt hint-based offset resolution before normal traversal and cache discovered offsets. - Adds
tools/fetch-with-hint+ documentation to resolve hint paths via GraphQLbundledIntraversal and fetch from the gateway with hint headers.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/lib/fetch-with-hint.ts | New CLI implementation: resolves root/path via GraphQL and fetches /raw/:id with hint headers. |
| tools/fetch-with-hint | Bash wrapper to run the TypeScript CLI via register.js. |
| tools/README.md | Documents the new CLI tool and its options. |
| src/types.d.ts | Extends RequestAttributes with rootTransactionIdHint and rootPathHint. |
| src/constants.ts | Adds headerNames.rootPath constant. |
| src/routes/data/handlers.ts | Parses/validates hint headers into RequestAttributes. |
| src/routes/data/handlers.test.ts | Unit tests for hint header parsing behavior. |
| src/data/root-parent-data-source.ts | Implements hint-based fast-path resolution and caching in getData(). |
| src/data/root-parent-data-source.test.ts | Unit tests for hint-resolution success/fallback behavior and caching. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (hintPath && hintPath.length > 0) { | ||
| bundleParseResult = | ||
| await this.ans104OffsetSource.getDataItemOffsetWithPath( | ||
| id, | ||
| hintPath, | ||
| signal, | ||
| ); | ||
| } else { | ||
| bundleParseResult = await this.ans104OffsetSource.getDataItemOffset( | ||
| id, | ||
| hintRootTxId, | ||
| signal, |
There was a problem hiding this comment.
When rootPathHint is provided, getDataItemOffsetWithPath() uses path[0] as the root bundle ID, but this code still fetches and caches using hintRootTxId. If the client sends inconsistent headers (or rootPathHint[0] differs), offsets will be computed against one root bundle and bytes will be fetched/cached against another, returning incorrect data and poisoning the attributes cache. Consider deriving the effective root TX from hintPath[0] when a path is present (and/or validating it matches hintRootTxId; otherwise ignore the hint and fall through).
There was a problem hiding this comment.
Fixed in commit 6fb2855 — the code now consistently uses hintPath[0] as the root TX ID when a path hint is present, and falls back to hintRootTxId only when no path is provided.
There was a problem hiding this comment.
Additionally addressed in commit 9743de7 — the direct byte offset hint path (Step 0a) now computes SHA-256(signature) from the parsed data item header and verifies it matches the requested ID before serving data or caching offsets. This prevents cache poisoning from hints that point to a valid but wrong data item. On mismatch, the code logs and falls through to the next resolution path.
| // Path is [rootTxId, ...intermediate bundles] (root first) | ||
| const path = [currentId, ...chain.reverse()]; | ||
| return { rootTxId: currentId, path }; | ||
| } | ||
|
|
||
| chain.push(currentId); | ||
| if (verbose) { |
There was a problem hiding this comment.
resolveRootPath() currently pushes currentId into chain, which makes the returned path include the leaf dataItemId (and generally all children), but getDataItemOffsetWithPath expects a path of bundle TX IDs from the root to the immediate parent bundle (excluding the data item itself). As-is, the generated X-AR-IO-Root-Path won’t match the gateway’s expected path structure. Consider collecting parent bundle IDs instead and ensuring the returned path ends at the immediate parent bundle only.
There was a problem hiding this comment.
Fixed in commit 6fb2855 — the leaf item is now excluded from the resolved path.
| describe('hint-based resolution', () => { | ||
| it('should use hint with valid offset result and skip normal traversal', async () => { | ||
| const dataItemId = 'hint-data-item'; | ||
| const hintRootTxId = 'hint-root-tx'; | ||
| const dataStream = Readable.from([Buffer.from('hint data')]); | ||
|
|
There was a problem hiding this comment.
The hint-based resolution tests don’t cover the case where rootTransactionIdHint and rootPathHint[0] disagree. Given the implementation currently uses rootPathHint for offset discovery but rootTransactionIdHint for fetch/caching, adding a regression test for mismatched hints (expecting fallback to normal traversal) would help prevent incorrect data/caching bugs.
There was a problem hiding this comment.
Not adding this test. When a path hint is provided, the data source consistently uses hintPath[0] as the root TX ID — the standalone rootTransactionIdHint is effectively ignored. This makes a "mismatch" between the two a non-issue by design, so testing for it wouldn't exercise meaningful behavior.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #628 +/- ##
===========================================
+ Coverage 76.50% 76.65% +0.14%
===========================================
Files 104 104
Lines 35437 35572 +135
Branches 2565 2587 +22
===========================================
+ Hits 27110 27266 +156
+ Misses 8293 8271 -22
- Partials 34 35 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Allow clients to supply X-AR-IO-Root-Data-Offset and X-AR-IO-Root-Data-Size headers alongside X-AR-IO-Root-Transaction-Id to skip bundle parsing entirely, enabling the fastest possible data item resolution path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract calculateFinalRegion and tryCacheAttributes helpers to eliminate 4x region calculation and 5x attribute storage duplication - Fix premature caching of untrusted client hints (move to after fetch) - Change direct byte hint from data payload offset to item offset so the gateway can parse the ANS-104 header to extract content type and compute the data offset itself - Group rootDataOffsetHint/rootDataSizeHint into rootByteHint to enforce co-requirement at the type level - Replace local parseNonNegativeInt with existing utility from http-utils - Remove duplicate rootDataOffset key in headerNames constants - Eliminate redundant attribute write-back for pre-computed values - Eliminate double getDataAttributes fetch by passing prefetched result to traversal - Make Ans104OffsetSource.parseDataItemHeader public for direct use - Rename hint headers to X-AR-IO-Root-Item-Offset/Size Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 10 changed files in this pull request and generated 8 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Parse client-supplied root TX ID hint | ||
| const rawRootTxIdHint = req.headers[ | ||
| headerNames.rootTransactionId.toLowerCase() | ||
| ] as string | undefined; | ||
| const rootTransactionIdHint = | ||
| rawRootTxIdHint != null && isValidTxId(rawRootTxIdHint) | ||
| ? rawRootTxIdHint | ||
| : undefined; |
There was a problem hiding this comment.
req.headers[...] can be a string | string[] | undefined in Node/Express. Casting to string and then calling isValidTxId() can throw at runtime if the header is sent multiple times (array), since string[].match is undefined. Consider normalizing header values (e.g., take the first element when it’s an array) before validation.
There was a problem hiding this comment.
Fixed — header values are now normalized with Array.isArray(v) ? v[0] : v for all four hint header reads.
| // Parse client-supplied root path hint (comma-separated TX IDs) | ||
| const rawRootPathHint = req.headers[headerNames.rootPath.toLowerCase()] as | ||
| | string | ||
| | undefined; | ||
| let rootPathHint: string[] | undefined; | ||
| if (rawRootPathHint != null) { | ||
| const parts = rawRootPathHint.split(',').map((s) => s.trim()); | ||
| if (parts.length > 0 && parts.every(isValidTxId)) { |
There was a problem hiding this comment.
Same header normalization issue for X-AR-IO-Root-Path: if the header value is an array, rawRootPathHint.split(',') will throw. Normalize to a single string before splitting (or handle arrays by joining).
There was a problem hiding this comment.
Fixed — same normalization applied to all hint headers including root path.
tools/lib/fetch-with-hint.ts
Outdated
| // Path is [rootTxId, ...intermediate bundles] (root first) | ||
| const path = [currentId, ...chain.reverse()]; | ||
| return { rootTxId: currentId, path }; | ||
| } | ||
|
|
||
| chain.push(currentId); |
There was a problem hiding this comment.
resolveRootPath() builds path from chain.reverse(), but chain currently includes the starting dataItemId (because chain.push(currentId) happens before advancing). This makes the emitted X-AR-IO-Root-Path include the data item ID, which does not match Ans104OffsetSource.getDataItemOffsetWithPath()’s expected structure [rootTxId, ..., immediateParentBundle] (excluding the data item). Adjust the chain/path construction to only include bundle transaction IDs and end at the immediate parent bundle.
| // Path is [rootTxId, ...intermediate bundles] (root first) | |
| const path = [currentId, ...chain.reverse()]; | |
| return { rootTxId: currentId, path }; | |
| } | |
| chain.push(currentId); | |
| // Path is [rootTxId, ...intermediate bundles] (root first), excluding the data item. | |
| let path: string[]; | |
| if (chain.length === 0) { | |
| // Unbundled case: the starting ID is already the root transaction. | |
| path = [currentId]; | |
| } else { | |
| // Bundled case: chain already contains [immediateParentBundle, ..., rootTxId]. | |
| path = chain.reverse(); | |
| } | |
| return { rootTxId: currentId, path }; | |
| } | |
| // Track only bundle transaction IDs (exclude the starting data item). | |
| chain.push(bundleId); |
There was a problem hiding this comment.
Fixed in commit 6fb2855 — the leaf item is now excluded from the resolved path.
| case '--offset': | ||
| directOffset = parseInt(args[++i], 10); | ||
| break; | ||
| case '--size': | ||
| directSize = parseInt(args[++i], 10); | ||
| break; |
There was a problem hiding this comment.
--offset/--size parsing uses parseInt(), but if the user passes a non-numeric value, the result is NaN (still != null). That will incorrectly activate “direct offset mode” and send X-AR-IO-Root-Item-Offset/Size: NaN. Validate with Number.isFinite(...) (or reuse parseNonNegativeInt) and require both to be valid numbers before setting the direct offset headers.
| case '--offset': | |
| directOffset = parseInt(args[++i], 10); | |
| break; | |
| case '--size': | |
| directSize = parseInt(args[++i], 10); | |
| break; | |
| case '--offset': { | |
| const offsetArg = args[++i]; | |
| const parsedOffset = parseInt(offsetArg, 10); | |
| if (!Number.isFinite(parsedOffset) || parsedOffset < 0) { | |
| console.error( | |
| `Invalid value for --offset: "${offsetArg}". Expected a non-negative integer.`, | |
| ); | |
| usage(); | |
| return; | |
| } | |
| directOffset = parsedOffset; | |
| break; | |
| } | |
| case '--size': { | |
| const sizeArg = args[++i]; | |
| const parsedSize = parseInt(sizeArg, 10); | |
| if (!Number.isFinite(parsedSize) || parsedSize < 0) { | |
| console.error( | |
| `Invalid value for --size: "${sizeArg}". Expected a non-negative integer.`, | |
| ); | |
| usage(); | |
| return; | |
| } | |
| directSize = parsedSize; | |
| break; | |
| } |
There was a problem hiding this comment.
Not fixing — this is a dev-only CLI tool for testing hint resolution, not a user-facing utility. The added validation isn't worth the churn.
| --root-tx-id <id> Root TX ID (skips GraphQL resolution when used with --offset/--size) | ||
| --verbose Show resolution details | ||
| --help Show this help`); | ||
| process.exit(1); |
There was a problem hiding this comment.
usage() always exits with status code 1, including for --help. Consider exiting 0 for the help path so scripting/CI can treat help output as non-error.
| process.exit(1); | |
| const argv = process.argv || []; | |
| const hasHelpFlag = argv.includes('--help') || argv.includes('-h'); | |
| process.exit(hasHelpFlag ? 0 : 1); |
There was a problem hiding this comment.
Not fixing — dev-only CLI tool, not worth the churn for a help exit code.
| // Parse client-supplied root item offset and size hints (both required together) | ||
| const rawRootItemOffsetHint = req.headers[ | ||
| headerNames.rootItemOffset.toLowerCase() | ||
| ] as string | undefined; | ||
| const rawRootItemSizeHint = req.headers[ | ||
| headerNames.rootItemSize.toLowerCase() | ||
| ] as string | undefined; | ||
| const rootItemOffsetParsed = | ||
| rawRootItemOffsetHint != null | ||
| ? parseNonNegativeInt(rawRootItemOffsetHint) | ||
| : undefined; | ||
| const rootItemSizeParsed = | ||
| rawRootItemSizeHint != null | ||
| ? parseNonNegativeInt(rawRootItemSizeHint) | ||
| : undefined; |
There was a problem hiding this comment.
X-AR-IO-Root-Item-Offset / X-AR-IO-Root-Item-Size have the same string | string[] header-value issue: if multiple headers are sent, parseNonNegativeInt() will receive a non-string at runtime (after the cast) and throw when calling .trim(). Normalize these header values to a single string before parsing.
There was a problem hiding this comment.
Fixed — same Array.isArray normalization applied to offset and size headers as well.
src/data/root-parent-data-source.ts
Outdated
| const data = await this.dataSource.getData({ | ||
| id: hintRootTxId, | ||
| requestAttributes, | ||
| region: finalRegion, | ||
| parentSpan: span, |
There was a problem hiding this comment.
When a path hint is provided, getDataItemOffsetWithPath() computes offsets relative to hintPath[0] (root bundle ID), but the subsequent fetch uses id: hintRootTxId. If rootTransactionIdHint and rootPathHint[0] ever disagree (client mistake/malicious), this can fetch bytes from the wrong root TX. Consider either validating hintPath[0] === hintRootTxId (otherwise ignore the path hint) or using hintPath[0] consistently for both offset lookup and dataSource.getData/caching.
There was a problem hiding this comment.
Fixed in commit 6fb2855 — the code now consistently uses hintPath[0] as the root TX ID when a path hint is present.
src/data/root-parent-data-source.ts
Outdated
| offset: dataOffset + (region.offset || 0), | ||
| size: region.size || dataSize, |
There was a problem hiding this comment.
calculateFinalRegion() uses region.size || dataSize, which will treat a valid region.size of 0 as falsy and return the full dataSize instead. Use nullish coalescing (??) or explicit undefined checks so 0 is preserved (and similarly avoid region.offset || 0 unless you really want to treat 0 as unset).
| offset: dataOffset + (region.offset || 0), | |
| size: region.size || dataSize, | |
| offset: dataOffset + (region.offset ?? 0), | |
| size: region.size ?? dataSize, |
There was a problem hiding this comment.
Fixed — changed || to ?? so that valid 0 values for offset and size are not treated as falsy.
…f from path - Narrow try/catch in hint resolution to only wrap offset lookups, letting region calculation and data fetch errors propagate normally - Derive root TX ID from most-specific hint (path[0] over header hint) to prevent mismatched offset/fetch when both hints are provided - Skip pushing leaf data item into resolveRootPath chain so the path matches the [root, ...parents] format expected by getDataItemOffsetWithPath Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
tools/lib/fetch-with-hint.ts (1)
42-46: Please add TSDoc to the new helpers.
resolveRootPath()andmain()are the core entry points in this tool, but neither documents expected arguments, modes, or failure behavior.As per coding guidelines, "Add or improve TSDoc comments when modifying code to enhance documentation".
Also applies to: 85-214
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/lib/fetch-with-hint.ts` around lines 42 - 46, Add TSDoc comments to both resolveRootPath and main: document each parameter (graphqlUrl, dataItemId, verbose for resolveRootPath; main's args/modes), describe the return type (Promise with rootTxId and path), list possible failure behaviors and exceptions thrown/returned on error, and note side effects (network calls, logging, process exit) and expected modes of operation; include brief examples or usage notes and reference the functions resolveRootPath and main so maintainers can quickly understand inputs, outputs, and error semantics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/data/root-parent-data-source.ts`:
- Around line 85-101: The code builds finalRegion but when region.offset is
provided and region.size is omitted it incorrectly uses size = dataSize and can
over-read; update the logic that computes finalRegion (using symbols
finalRegion, region, dataSize, dataOffset) so that if region.offset is defined
and region.size is undefined you set finalRegion.size = dataSize - region.offset
(clamped to >= 0), and unify/truncate sizing logic so both the case where
region.size is provided and the open-ended case apply the same truncation rules
to prevent reads past the item boundary.
In `@tools/lib/fetch-with-hint.ts`:
- Around line 97-117: The flag-parsing loop (the for loop over args) can accept
missing or invalid values and let directOffset, directSize, and directRootTxId
become undefined or NaN; update the parsing in the switch cases for '--offset',
'--size', '--root-tx-id', '--gateway', '--graphql', and '--output' to validate
that an argument exists after the flag, that numeric flags ('--offset' and
'--size') parse to a finite integer (use Number.isFinite/Number.isInteger or
equivalent) and that string flags are non-empty; on invalid or missing values
throw a clear error or exit with a message before headers are built so invalid
headers aren't emitted; after changes run yarn lint:fix on the file to satisfy
formatting.
- Around line 24-39: Change the usage() helper to accept an optional exitCode
parameter (e.g., usage(exitCode = 1)) and call process.exit(exitCode) instead of
unconditionally process.exit(1); update any callers that print help (the --help
handling in the CLI argument parsing) to call usage(0) so `fetch-with-hint
--help` exits with code 0 while other error paths still call usage() or
usage(1). Ensure the function signature and all invocations referencing usage()
and the process.exit call are updated accordingly.
- Around line 198-212: The download stream errors aren't propagated because
response.data errors aren't wired to the write stream or stdout; update the code
in the outputFile branch (where writeStream is created) and the stdout branch
(where response.data is piped to process.stdout) to use pipeline from
node:stream/promises with response.data as the source and writeStream or
process.stdout as the destination so any stream errors reject the awaiting
promise; ensure you import pipeline and replace the manual event listeners
around writeStream and response.data with await pipeline(response.data,
writeStream) and await pipeline(response.data, process.stdout) respectively,
preserving the verbose log after a successful pipeline.
---
Nitpick comments:
In `@tools/lib/fetch-with-hint.ts`:
- Around line 42-46: Add TSDoc comments to both resolveRootPath and main:
document each parameter (graphqlUrl, dataItemId, verbose for resolveRootPath;
main's args/modes), describe the return type (Promise with rootTxId and path),
list possible failure behaviors and exceptions thrown/returned on error, and
note side effects (network calls, logging, process exit) and expected modes of
operation; include brief examples or usage notes and reference the functions
resolveRootPath and main so maintainers can quickly understand inputs, outputs,
and error semantics.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 5c900e3d-b736-4e0f-a7b9-5f72a43a99b8
📒 Files selected for processing (2)
src/data/root-parent-data-source.tstools/lib/fetch-with-hint.ts
| const finalRegion: Region = { | ||
| offset: dataOffset + (region.offset || 0), | ||
| size: region.size || dataSize, | ||
| }; | ||
|
|
||
| if (region.offset !== undefined && region.offset >= dataSize) { | ||
| throw new Error( | ||
| `Requested region offset ${region.offset} exceeds data item size ${dataSize}`, | ||
| ); | ||
| } | ||
|
|
||
| if (region.size !== undefined && region.offset !== undefined) { | ||
| const requestedEnd = region.offset + region.size; | ||
| if (requestedEnd > dataSize) { | ||
| finalRegion.size = dataSize - region.offset; | ||
| } | ||
| } |
There was a problem hiding this comment.
Open-ended regions still over-read past the item boundary.
When region.offset is provided but region.size is omitted, line 87 sets finalRegion.size to dataSize. The truncation logic (lines 96-101) only executes when both region.size and region.offset are defined, so the size remains incorrect.
Example: dataSize=100, region={offset:10} → returns size=100 instead of 90, causing a read past the item boundary.
🐛 Proposed fix
const finalRegion: Region = {
offset: dataOffset + (region.offset || 0),
- size: region.size || dataSize,
+ size:
+ region.size !== undefined
+ ? region.size
+ : region.offset !== undefined
+ ? dataSize - region.offset
+ : dataSize,
};
if (region.offset !== undefined && region.offset >= dataSize) {
throw new Error(
`Requested region offset ${region.offset} exceeds data item size ${dataSize}`,
);
}
- if (region.size !== undefined && region.offset !== undefined) {
+ // Truncate if requested end exceeds data item bounds
+ if (region.size !== undefined) {
- const requestedEnd = region.offset + region.size;
+ const requestedEnd = (region.offset || 0) + region.size;
if (requestedEnd > dataSize) {
- finalRegion.size = dataSize - region.offset;
+ finalRegion.size = dataSize - (region.offset || 0);
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const finalRegion: Region = { | |
| offset: dataOffset + (region.offset || 0), | |
| size: region.size || dataSize, | |
| }; | |
| if (region.offset !== undefined && region.offset >= dataSize) { | |
| throw new Error( | |
| `Requested region offset ${region.offset} exceeds data item size ${dataSize}`, | |
| ); | |
| } | |
| if (region.size !== undefined && region.offset !== undefined) { | |
| const requestedEnd = region.offset + region.size; | |
| if (requestedEnd > dataSize) { | |
| finalRegion.size = dataSize - region.offset; | |
| } | |
| } | |
| const finalRegion: Region = { | |
| offset: dataOffset + (region.offset || 0), | |
| size: | |
| region.size !== undefined | |
| ? region.size | |
| : region.offset !== undefined | |
| ? dataSize - region.offset | |
| : dataSize, | |
| }; | |
| if (region.offset !== undefined && region.offset >= dataSize) { | |
| throw new Error( | |
| `Requested region offset ${region.offset} exceeds data item size ${dataSize}`, | |
| ); | |
| } | |
| // Truncate if requested end exceeds data item bounds | |
| if (region.size !== undefined) { | |
| const requestedEnd = (region.offset || 0) + region.size; | |
| if (requestedEnd > dataSize) { | |
| finalRegion.size = dataSize - (region.offset || 0); | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/data/root-parent-data-source.ts` around lines 85 - 101, The code builds
finalRegion but when region.offset is provided and region.size is omitted it
incorrectly uses size = dataSize and can over-read; update the logic that
computes finalRegion (using symbols finalRegion, region, dataSize, dataOffset)
so that if region.offset is defined and region.size is undefined you set
finalRegion.size = dataSize - region.offset (clamped to >= 0), and
unify/truncate sizing logic so both the case where region.size is provided and
the open-ended case apply the same truncation rules to prevent reads past the
item boundary.
There was a problem hiding this comment.
Same as the earlier open-ended regions comment — in practice we never receive offset without size (both are required together in the handler). Not adding clamping for an unreachable case.
There was a problem hiding this comment.
@djwhitt, understood — noted that region at the calculateFinalRegion call site is also always either undefined or has both offset and size set together, making the open-ended case unreachable. I already have this captured from our earlier exchange (the rootByteHint structural guarantee learning), so no new learning needed. Thanks for the clarification!
(ฅ^•ﻌ•^ฅ)
🧠 Learnings used
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/data/root-parent-data-source.ts:0-0
Timestamp: 2026-03-07T15:50:06.787Z
Learning: In `src/routes/data/handlers.ts` (ar-io/ar-io-node), `rootByteHint` is only constructed when both `rootItemOffset` and `rootItemSize` headers are valid non-negative integers. As a result, `requestAttributes.rootByteHint` is always either `undefined` or a fully-populated `{ offset, size }` object — a partial hint (offset without size, or vice versa) is structurally impossible at the `RootParentDataSource.getData` call site.
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/routes/data/handlers.ts:520-530
Timestamp: 2026-03-07T15:50:07.869Z
Learning: In `src/data/root-parent-data-source.ts`, when both `rootPathHint` and `rootTransactionIdHint` are present in `RequestAttributes`, the data source uses `hintPath[0]` as the resolved root TX ID, effectively ignoring the standalone `rootTransactionIdHint`. "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.
Learnt from: karlprieb
Repo: ar-io/ar-io-node PR: 228
File: src/workers/ans104-unbundler.ts:41-45
Timestamp: 2024-10-31T17:48:59.599Z
Learning: In `src/workers/ans104-unbundler.ts`, when defining `isNormalizedBundleDataItem`, avoid checking for `null` values of `root_parent_offset` and `data_offset`, as `null` values signify that the item is a `NormalizedOptimisticDataItem`, not a `NormalizedBundleDataItem`.
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 457
File: src/routes/data/handlers.ts:429-433
Timestamp: 2025-07-30T18:32:44.622Z
Learning: In src/routes/data/handlers.ts, the ID validation logic uses AND operators intentionally to only reject IDs that match the regex pattern /^[a-zA-Z0-9-_]{43}$/ but fail base64url round-trip validation. IDs that are null or don't match the regex pattern are allowed to proceed further in the pipeline rather than being immediately rejected with "Invalid ID", indicating different error handling is desired for different types of invalid IDs.
| function usage() { | ||
| console.error(`Usage: fetch-with-hint <data-item-id> [options] | ||
|
|
||
| Fetches a data item from the gateway using client-supplied root TX ID | ||
| and nesting path hints resolved via GraphQL. | ||
|
|
||
| Options: | ||
| --gateway <url> Gateway URL (default: http://localhost:4000) | ||
| --graphql <url> GraphQL endpoint (default: https://arweave.net/graphql) | ||
| --output <file> Output file (default: stdout) | ||
| --offset <n> Data item offset hint (byte offset of item start within root TX) | ||
| --size <n> Data item size hint (total item size including headers) | ||
| --root-tx-id <id> Root TX ID (skips GraphQL resolution when used with --offset/--size) | ||
| --verbose Show resolution details | ||
| --help Show this help`); | ||
| process.exit(1); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -name "fetch-with-hint.ts" -o -name "fetch-with-hint.js"Repository: ar-io/ar-io-node
Length of output: 90
🏁 Script executed:
wc -l tools/lib/fetch-with-hint.tsRepository: ar-io/ar-io-node
Length of output: 92
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | head -150Repository: ar-io/ar-io-node
Length of output: 5152
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | tail -100Repository: ar-io/ar-io-node
Length of output: 3212
🏁 Script executed:
rg "usage\(" tools/lib/fetch-with-hint.tsRepository: ar-io/ar-io-node
Length of output: 127
🏁 Script executed:
head -20 package.jsonRepository: ar-io/ar-io-node
Length of output: 650
🏁 Script executed:
rg "function\s+" tools/lib/fetch-with-hint.ts -A 3 | head -40Repository: ar-io/ar-io-node
Length of output: 417
🏁 Script executed:
grep -A 5 "scripts" package.json | head -20Repository: ar-io/ar-io-node
Length of output: 744
🏁 Script executed:
rg "^\s*\*\s*@" tools/lib/fetch-with-hint.tsRepository: ar-io/ar-io-node
Length of output: 42
🏁 Script executed:
fd "*.ts" tools/ --exec grep -l "^\s*/\*\*" {} \; | head -5Repository: ar-io/ar-io-node
Length of output: 343
🏁 Script executed:
grep "lint" package.jsonRepository: ar-io/ar-io-node
Length of output: 526
🏁 Script executed:
find tools/ -name "*.ts" -exec grep -l "/\*\*" {} \;Repository: ar-io/ar-io-node
Length of output: 734
🏁 Script executed:
cat tools/lib/fetch-with-hint.ts | grep -B 5 "async function resolveRootPath"Repository: ar-io/ar-io-node
Length of output: 203
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '121,128p'Repository: ar-io/ar-io-node
Length of output: 298
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '24,40p'Repository: ar-io/ar-io-node
Length of output: 964
Return success exit code for --help.
The usage() function unconditionally exits with status 1, causing fetch-with-hint --help to report as a failure to shells and CI scripts. Add an optional exitCode parameter to allow the --help case to exit with 0.
Suggested fix
-function usage() {
+function usage(exitCode = 1) {
console.error(`Usage: fetch-with-hint <data-item-id> [options]
@@
- process.exit(1);
+ process.exit(exitCode);
}
@@
case '--help':
- usage();
+ usage(0);
break;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/lib/fetch-with-hint.ts` around lines 24 - 39, Change the usage() helper
to accept an optional exitCode parameter (e.g., usage(exitCode = 1)) and call
process.exit(exitCode) instead of unconditionally process.exit(1); update any
callers that print help (the --help handling in the CLI argument parsing) to
call usage(0) so `fetch-with-hint --help` exits with code 0 while other error
paths still call usage() or usage(1). Ensure the function signature and all
invocations referencing usage() and the process.exit call are updated
accordingly.
There was a problem hiding this comment.
Not fixing — dev-only CLI tool, not worth the churn for a help exit code.
There was a problem hiding this comment.
@djwhitt, understood — fair call for a dev-only tool.
🧠 Learnings used
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/data/root-parent-data-source.ts:0-0
Timestamp: 2026-03-07T15:50:06.787Z
Learning: In `src/routes/data/handlers.ts` (ar-io/ar-io-node), `rootByteHint` is only constructed when both `rootItemOffset` and `rootItemSize` headers are valid non-negative integers. As a result, `requestAttributes.rootByteHint` is always either `undefined` or a fully-populated `{ offset, size }` object — a partial hint (offset without size, or vice versa) is structurally impossible at the `RootParentDataSource.getData` call site.
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/routes/data/handlers.ts:520-530
Timestamp: 2026-03-07T15:50:07.869Z
Learning: In `src/data/root-parent-data-source.ts`, when both `rootPathHint` and `rootTransactionIdHint` are present in `RequestAttributes`, the data source uses `hintPath[0]` as the resolved root TX ID, effectively ignoring the standalone `rootTransactionIdHint`. "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.
| for (let i = 0; i < args.length; i++) { | ||
| const arg = args[i]; | ||
| switch (arg) { | ||
| case '--gateway': | ||
| gateway = args[++i]; | ||
| break; | ||
| case '--graphql': | ||
| graphqlUrl = args[++i]; | ||
| break; | ||
| case '--output': | ||
| outputFile = args[++i]; | ||
| break; | ||
| case '--offset': | ||
| directOffset = parseInt(args[++i], 10); | ||
| break; | ||
| case '--size': | ||
| directSize = parseInt(args[++i], 10); | ||
| break; | ||
| case '--root-tx-id': | ||
| directRootTxId = args[++i]; | ||
| break; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '90,160p'Repository: ar-io/ar-io-node
Length of output: 2397
Validate direct-hint flags before building headers.
Missing option values currently become undefined, and parseInt() will accept partial numerics or produce NaN. This allows inputs like --offset foo or incomplete --root-tx-id/--offset/--size flags to emit invalid headers (e.g., "NaN" in numeric fields) or silently fall back to GraphQL mode instead of failing fast.
Suggested fix
+ const readOptionValue = (flag: string) => {
+ const value = args[++i];
+ if (value == null || value.startsWith('-')) {
+ throw new Error(`${flag} requires a value`);
+ }
+ return value;
+ };
+
for (let i = 0; i < args.length; i++) {
const arg = args[i];
switch (arg) {
case '--gateway':
- gateway = args[++i];
+ gateway = readOptionValue('--gateway');
break;
case '--graphql':
- graphqlUrl = args[++i];
+ graphqlUrl = readOptionValue('--graphql');
break;
case '--output':
- outputFile = args[++i];
+ outputFile = readOptionValue('--output');
break;
case '--offset':
- directOffset = parseInt(args[++i], 10);
+ directOffset = Number(readOptionValue('--offset'));
+ if (!Number.isInteger(directOffset) || directOffset < 0) {
+ throw new Error('--offset must be a non-negative integer');
+ }
break;
case '--size':
- directSize = parseInt(args[++i], 10);
+ directSize = Number(readOptionValue('--size'));
+ if (!Number.isInteger(directSize) || directSize < 0) {
+ throw new Error('--size must be a non-negative integer');
+ }
break;
case '--root-tx-id':
- directRootTxId = args[++i];
+ directRootTxId = readOptionValue('--root-tx-id');
break;
@@
+ const hasAnyDirectHint =
+ directRootTxId != null || directOffset != null || directSize != null;
+ if (
+ hasAnyDirectHint &&
+ (directRootTxId == null || directOffset == null || directSize == null)
+ ) {
+ throw new Error(
+ '--root-tx-id, --offset, and --size must be provided together',
+ );
+ }After applying fixes, run yarn lint:fix on this file to ensure code formatting compliance.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/lib/fetch-with-hint.ts` around lines 97 - 117, The flag-parsing loop
(the for loop over args) can accept missing or invalid values and let
directOffset, directSize, and directRootTxId become undefined or NaN; update the
parsing in the switch cases for '--offset', '--size', '--root-tx-id',
'--gateway', '--graphql', and '--output' to validate that an argument exists
after the flag, that numeric flags ('--offset' and '--size') parse to a finite
integer (use Number.isFinite/Number.isInteger or equivalent) and that string
flags are non-empty; on invalid or missing values throw a clear error or exit
with a message before headers are built so invalid headers aren't emitted; after
changes run yarn lint:fix on the file to satisfy formatting.
There was a problem hiding this comment.
Not fixing — dev-only CLI tool for testing hint resolution. The added validation and error handling isn't worth the churn.
There was a problem hiding this comment.
@djwhitt, understood — makes sense for an internal testing tool. I'll leave it as-is.
(◕ᴗ◕✿)
🧠 Learnings used
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/data/root-parent-data-source.ts:0-0
Timestamp: 2026-03-07T15:50:06.787Z
Learning: In `src/routes/data/handlers.ts` (ar-io/ar-io-node), `rootByteHint` is only constructed when both `rootItemOffset` and `rootItemSize` headers are valid non-negative integers. As a result, `requestAttributes.rootByteHint` is always either `undefined` or a fully-populated `{ offset, size }` object — a partial hint (offset without size, or vice versa) is structurally impossible at the `RootParentDataSource.getData` call site.
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 457
File: src/routes/data/handlers.ts:429-433
Timestamp: 2025-07-30T18:32:44.622Z
Learning: In src/routes/data/handlers.ts, the ID validation logic uses AND operators intentionally to only reject IDs that match the regex pattern /^[a-zA-Z0-9-_]{43}$/ but fail base64url round-trip validation. IDs that are null or don't match the regex pattern are allowed to proceed further in the pipeline rather than being immediately rejected with "Invalid ID", indicating different error handling is desired for different types of invalid IDs.
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/routes/data/handlers.ts:520-530
Timestamp: 2026-03-07T15:50:07.869Z
Learning: In `src/data/root-parent-data-source.ts`, when both `rootPathHint` and `rootTransactionIdHint` are present in `RequestAttributes`, the data source uses `hintPath[0]` as the resolved root TX ID, effectively ignoring the standalone `rootTransactionIdHint`. "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.
| if (outputFile) { | ||
| const writeStream = fs.createWriteStream(outputFile); | ||
| response.data.pipe(writeStream); | ||
| await new Promise<void>((resolve, reject) => { | ||
| writeStream.on('finish', resolve); | ||
| writeStream.on('error', reject); | ||
| }); | ||
| if (verbose) { | ||
| console.error(`Written to ${outputFile}`); | ||
| } | ||
| } else { | ||
| response.data.pipe(process.stdout); | ||
| await new Promise<void>((resolve) => { | ||
| response.data.on('end', resolve); | ||
| }); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '190,220p'Repository: ar-io/ar-io-node
Length of output: 1097
🏁 Script executed:
head -20 tools/lib/fetch-with-hint.tsRepository: ar-io/ar-io-node
Length of output: 420
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '1,50p'Repository: ar-io/ar-io-node
Length of output: 1841
🏁 Script executed:
rg "import.*pipeline" tools/lib/Repository: ar-io/ar-io-node
Length of output: 136
🏁 Script executed:
cd tools/lib && head -200 fetch-with-hint.ts | tail -80Repository: ar-io/ar-io-node
Length of output: 2028
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '100,145p'Repository: ar-io/ar-io-node
Length of output: 1365
🏁 Script executed:
rg "async function" tools/lib/fetch-with-hint.ts -A 5Repository: ar-io/ar-io-node
Length of output: 431
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '42,47p'Repository: ar-io/ar-io-node
Length of output: 278
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '85,100p'Repository: ar-io/ar-io-node
Length of output: 655
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | wc -lRepository: ar-io/ar-io-node
Length of output: 63
🏁 Script executed:
rg "pipeline" tools/lib/download-cdb64.ts -B 2 -A 8Repository: ar-io/ar-io-node
Length of output: 837
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '85,95p'Repository: ar-io/ar-io-node
Length of output: 509
🏁 Script executed:
git log --oneline tools/lib/fetch-with-hint.ts | head -5Repository: ar-io/ar-io-node
Length of output: 148
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '85,98p'Repository: ar-io/ar-io-node
Length of output: 598
🏁 Script executed:
rg "function main" tools/lib/ -A 3Repository: ar-io/ar-io-node
Length of output: 4730
🏁 Script executed:
cat -n tools/lib/fetch-with-hint.ts | sed -n '213,221p'Repository: ar-io/ar-io-node
Length of output: 272
🏁 Script executed:
cat -n tools/lib/download-cdb64.ts | sed -n '1,15p'Repository: ar-io/ar-io-node
Length of output: 607
Add error handling for response stream failures during file transfer.
The current implementation does not propagate errors from response.data (the read stream). If the download fails mid-transfer, the promise may never reject:
- File case (lines 199–204): Only listens to
writeStreamevents; errors fromresponse.dataare ignored - Stdout case (lines 209–212): No error handler; failures cause silent hangs or incomplete output
Use pipeline() from node:stream/promises to properly handle errors in both directions:
Suggested fix
import fs from 'node:fs';
+import { pipeline } from 'node:stream/promises';
@@
if (outputFile) {
- const writeStream = fs.createWriteStream(outputFile);
- response.data.pipe(writeStream);
- await new Promise<void>((resolve, reject) => {
- writeStream.on('finish', resolve);
- writeStream.on('error', reject);
- });
+ await pipeline(response.data, fs.createWriteStream(outputFile));
if (verbose) {
console.error(`Written to ${outputFile}`);
}
} else {
- response.data.pipe(process.stdout);
- await new Promise<void>((resolve) => {
- response.data.on('end', resolve);
- });
+ await pipeline(response.data, process.stdout);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (outputFile) { | |
| const writeStream = fs.createWriteStream(outputFile); | |
| response.data.pipe(writeStream); | |
| await new Promise<void>((resolve, reject) => { | |
| writeStream.on('finish', resolve); | |
| writeStream.on('error', reject); | |
| }); | |
| if (verbose) { | |
| console.error(`Written to ${outputFile}`); | |
| } | |
| } else { | |
| response.data.pipe(process.stdout); | |
| await new Promise<void>((resolve) => { | |
| response.data.on('end', resolve); | |
| }); | |
| if (outputFile) { | |
| await pipeline(response.data, fs.createWriteStream(outputFile)); | |
| if (verbose) { | |
| console.error(`Written to ${outputFile}`); | |
| } | |
| } else { | |
| await pipeline(response.data, process.stdout); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/lib/fetch-with-hint.ts` around lines 198 - 212, The download stream
errors aren't propagated because response.data errors aren't wired to the write
stream or stdout; update the code in the outputFile branch (where writeStream is
created) and the stdout branch (where response.data is piped to process.stdout)
to use pipeline from node:stream/promises with response.data as the source and
writeStream or process.stdout as the destination so any stream errors reject the
awaiting promise; ensure you import pipeline and replace the manual event
listeners around writeStream and response.data with await
pipeline(response.data, writeStream) and await pipeline(response.data,
process.stdout) respectively, preserving the verbose log after a successful
pipeline.
There was a problem hiding this comment.
Not fixing — dev-only CLI tool for testing hint resolution. Adding stream error handling isn't worth the complexity for a diagnostic utility.
There was a problem hiding this comment.
@djwhitt, understood — makes sense for a dev-only diagnostic tool. Acknowledged.
🧠 Learnings used
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/data/root-parent-data-source.ts:0-0
Timestamp: 2026-03-07T15:50:06.787Z
Learning: In `src/routes/data/handlers.ts` (ar-io/ar-io-node), `rootByteHint` is only constructed when both `rootItemOffset` and `rootItemSize` headers are valid non-negative integers. As a result, `requestAttributes.rootByteHint` is always either `undefined` or a fully-populated `{ offset, size }` object — a partial hint (offset without size, or vice versa) is structurally impossible at the `RootParentDataSource.getData` call site.
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 236
File: src/lib/data-root.ts:165-248
Timestamp: 2024-11-18T20:54:18.763Z
Learning: In this project, using `async` functions as Promise executors is acceptable, even if it might lead to unhandled rejections in general cases.
Learnt from: djwhitt
Repo: ar-io/ar-io-node PR: 628
File: src/routes/data/handlers.ts:520-530
Timestamp: 2026-03-07T15:50:07.869Z
Learning: In `src/data/root-parent-data-source.ts`, when both `rootPathHint` and `rootTransactionIdHint` are present in `RequestAttributes`, the data source uses `hintPath[0]` as the resolved root TX ID, effectively ignoring the standalone `rootTransactionIdHint`. "Path always wins" — conflicting hints are not an error at the handler boundary; the path hint takes full precedence in the resolution logic.
Use ?? instead of || in calculateFinalRegion to avoid treating valid 0 values as falsy. Normalize Express header reads with Array.isArray check to handle duplicate headers safely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/data/root-parent-data-source.ts (1)
125-138: Update this TSDoc for the new traversal contract.The signature now accepts
prefetchedAttributesand returnsfromPreComputed, but the docblock still only describes the old traversal/null behavior.As per coding guidelines, "
**/*.{ts,tsx,js}: Add or improve TSDoc comments when modifying code to enhance documentation`."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/data/root-parent-data-source.ts` around lines 125 - 138, Update the TSDoc for traverseToRootUsingAttributes to describe the new traversal contract: document the prefetchedAttributes?: ContiguousDataAttributes parameter and explain when it is used, and update the return description to list all returned properties (rootTxId, totalOffset, rootDataOffset, size, and fromPreComputed) and what each means; also clarify the null return case (when traversal is incomplete due to missing attributes) and explicitly state what fromPreComputed=true indicates (that the result came from prefetched/pre-computed attributes rather than live traversal).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/data/root-parent-data-source.ts`:
- Around line 389-399: The precomputed cache writes that call tryCacheAttributes
(the "direct offset hint" call and the separate "root index" precompute write)
are omitting the known contentType, causing later hits to use the root
transaction's type; update both object literals passed to tryCacheAttributes to
include contentType set from the available variable that holds the item's
MIME/type (e.g., contentType or the local hintContentType variable), so the
cache entry persists the correct content type for subsequent reads.
- Around line 347-387: The direct-offset hint branch uses
parseDataItemHeader(hintRootTxId, hintItemOffset, hintItemSize, signal) and then
trusts headerInfo to fetch payload for hintRootTxId without verifying the item
actually corresponds to the requested id; validate the parsed data item's id (or
signature/owner) matches the requested id before computing
dataOffset/finalRegion and calling dataSource.getData so stale/malicious hints
cannot return other payloads. Locate the block that sets headerInfo and where
dataOffset/dataSize/hintContentType are derived (parseDataItemHeader,
headerInfo, calculateFinalRegion, getData) and add a check that compares the
parsed item id with the requested id (or verifies a cryptographic binding) and
bail/fall-through if it does not match, preserving existing logs/tracing and
error handling.
- Around line 335-336: The code currently reads
requestAttributes?.rootTransactionIdHint into hintRootTxId and branches, which
makes rootPathHint a no-op; change the resolution so you first check
requestAttributes?.rootPathHint and, if present, resolve it into the
transaction-id (using the existing path-to-transaction-id helper or the same
resolution logic used elsewhere) and assign that to hintRootTxId, otherwise fall
back to requestAttributes?.rootTransactionIdHint; update the direct-offset
branch (same pattern at the other occurrence) to use this resolved hintRootTxId
so "path" always takes precedence over the explicit transaction-id hint.
---
Nitpick comments:
In `@src/data/root-parent-data-source.ts`:
- Around line 125-138: Update the TSDoc for traverseToRootUsingAttributes to
describe the new traversal contract: document the prefetchedAttributes?:
ContiguousDataAttributes parameter and explain when it is used, and update the
return description to list all returned properties (rootTxId, totalOffset,
rootDataOffset, size, and fromPreComputed) and what each means; also clarify the
null return case (when traversal is incomplete due to missing attributes) and
explicitly state what fromPreComputed=true indicates (that the result came from
prefetched/pre-computed attributes rather than live traversal).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a9d7d5fd-334a-41a0-b789-09df4ce96917
📒 Files selected for processing (2)
src/data/root-parent-data-source.tssrc/routes/data/handlers.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/routes/data/handlers.ts
src/data/root-parent-data-source.ts
Outdated
| const hintRootTxId = requestAttributes?.rootTransactionIdHint; | ||
| if (hintRootTxId != null) { |
There was a problem hiding this comment.
Resolve hintRootTxId from rootPathHint first.
Step 0 is currently gated on rootTransactionIdHint, and the direct-offset branch always uses that value. That makes rootPathHint a no-op when sent alone and bypasses the documented path precedence when both hints are present.
💡 Minimal fix
- const hintRootTxId = requestAttributes?.rootTransactionIdHint;
+ const hintPath = requestAttributes?.rootPathHint;
+ const hintRootTxId =
+ hintPath?.[0] ?? requestAttributes?.rootTransactionIdHint;
if (hintRootTxId != null) {
@@
- const hintPath = requestAttributes?.rootPathHint;
let bundleParseResult: {Also applies to: 418-418
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/data/root-parent-data-source.ts` around lines 335 - 336, The code
currently reads requestAttributes?.rootTransactionIdHint into hintRootTxId and
branches, which makes rootPathHint a no-op; change the resolution so you first
check requestAttributes?.rootPathHint and, if present, resolve it into the
transaction-id (using the existing path-to-transaction-id helper or the same
resolution logic used elsewhere) and assign that to hintRootTxId, otherwise fall
back to requestAttributes?.rootTransactionIdHint; update the direct-offset
branch (same pattern at the other occurrence) to use this resolved hintRootTxId
so "path" always takes precedence over the explicit transaction-id hint.
Compute SHA-256(signature) from the parsed data item header and compare it against the requested ID before serving data or caching offsets. This prevents a malicious or buggy client from poisoning the offset cache by supplying hints that point to a different data item. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (3)
src/data/root-parent-data-source.test.ts (1)
1916-1976: Consider addinggetDataItemOffsetWithPathtobeforeEachfor consistency.The method is dynamically added via
(ans104OffsetSource as any).getDataItemOffsetWithPath = mock.fn(...)in this test. While this works, it's inconsistent with howgetDataItemOffsetandparseDataItemHeaderare set up inbeforeEach. If other tests depend on this method existing, they could fail unexpectedly.💡 Suggested improvement
Add
getDataItemOffsetWithPathto thebeforeEachsetup:ans104OffsetSource = { getDataItemOffset: mock.fn(), parseDataItemHeader: mock.fn(), + getDataItemOffsetWithPath: mock.fn(), } as any;Then mock its implementation in individual tests as needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/data/root-parent-data-source.test.ts` around lines 1916 - 1976, Add a default mock of getDataItemOffsetWithPath to the test suite setup so it's consistently present like getDataItemOffset and parseDataItemHeader; update the beforeEach to assign (ans104OffsetSource as any).getDataItemOffsetWithPath = mock.fn(async () => {/* noop or default return */}) and leave individual tests (like the one using hintPath) free to override that mock implementation as needed; this ensures tests won't fail when they expect the method to exist and keeps mocking consistent across the suite.src/data/ans104-offset-source.ts (1)
765-775: Add TSDoc comment for the now-public method.Since
parseDataItemHeaderis now public and will be called byRootParentDataSource, it should have a TSDoc comment describing its purpose, parameters, and return value — consistent with the other public methods in this class likegetDataItemOffsetandgetDataItemOffsetWithPath.📝 Suggested TSDoc
+ /** + * Parses the header of a data item at a specific offset within a bundle. + * Computes the data item ID from the signature and extracts header metadata. + * + * `@param` bundleId - The ID of the bundle containing the data item + * `@param` itemOffset - Byte offset where the data item begins + * `@param` totalSize - Total size of the data item (header + payload) + * `@param` signal - Optional abort signal to cancel the operation + * `@returns` Object containing the computed ID, header size, payload size, and optional content type + */ async parseDataItemHeader(As per coding guidelines:
**/*.{ts,tsx,js}: Add or improve TSDoc comments when modifying code.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/data/ans104-offset-source.ts` around lines 765 - 775, Add a TSDoc comment for the now-public method parseDataItemHeader explaining its purpose (parses the data item header at a given bundle offset and returns id, headerSize, payloadSize, and optional contentType), and document each parameter (bundleId: string, itemOffset: number, totalSize: number, signal?: AbortSignal) plus the Promise return shape; follow the style and phrasing used by the other public methods in this class (e.g., getDataItemOffset, getDataItemOffsetWithPath) so it clearly states behavior, possible errors, and that the method is used by RootParentDataSource.src/data/root-parent-data-source.ts (1)
335-336: Step 0 requiresrootTransactionIdHinteven whenrootPathHintis provided alone.The hint fast-path is gated on
rootTransactionIdHint != null. If a client provides onlyrootPathHint(which contains the root TX ID aspath[0]), Step 0 is skipped entirely and the path hint goes unused.Consider deriving
hintRootTxIdfrom the path when the explicit hint is absent:💡 Suggested fix
- const hintRootTxId = requestAttributes?.rootTransactionIdHint; - if (hintRootTxId != null) { + const hintPath = requestAttributes?.rootPathHint; + const hintRootTxId = + hintPath?.[0] ?? requestAttributes?.rootTransactionIdHint; + if (hintRootTxId != null) { // Step 0a: Direct item offset hint ...Based on learnings: "Path always wins" — the path hint should take precedence and be usable even without a separate
rootTransactionIdHint.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/data/root-parent-data-source.ts` around lines 335 - 336, The fast-path is skipped when rootTransactionIdHint is null even if requestAttributes.rootPathHint contains the root TX id; derive hintRootTxId from the path when the explicit hint is absent so the path hint can be used. Update the logic around hintRootTxId (the variable and its use in the Step 0 fast-path) to set hintRootTxId to requestAttributes.rootTransactionIdHint if present, otherwise to requestAttributes.rootPathHint.path[0] (with safe null checks), and keep the existing conditional (if hintRootTxId != null) so the path-derived id triggers Step 0.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/data/ans104-offset-source.ts`:
- Around line 765-775: Add a TSDoc comment for the now-public method
parseDataItemHeader explaining its purpose (parses the data item header at a
given bundle offset and returns id, headerSize, payloadSize, and optional
contentType), and document each parameter (bundleId: string, itemOffset: number,
totalSize: number, signal?: AbortSignal) plus the Promise return shape; follow
the style and phrasing used by the other public methods in this class (e.g.,
getDataItemOffset, getDataItemOffsetWithPath) so it clearly states behavior,
possible errors, and that the method is used by RootParentDataSource.
In `@src/data/root-parent-data-source.test.ts`:
- Around line 1916-1976: Add a default mock of getDataItemOffsetWithPath to the
test suite setup so it's consistently present like getDataItemOffset and
parseDataItemHeader; update the beforeEach to assign (ans104OffsetSource as
any).getDataItemOffsetWithPath = mock.fn(async () => {/* noop or default return
*/}) and leave individual tests (like the one using hintPath) free to override
that mock implementation as needed; this ensures tests won't fail when they
expect the method to exist and keeps mocking consistent across the suite.
In `@src/data/root-parent-data-source.ts`:
- Around line 335-336: The fast-path is skipped when rootTransactionIdHint is
null even if requestAttributes.rootPathHint contains the root TX id; derive
hintRootTxId from the path when the explicit hint is absent so the path hint can
be used. Update the logic around hintRootTxId (the variable and its use in the
Step 0 fast-path) to set hintRootTxId to requestAttributes.rootTransactionIdHint
if present, otherwise to requestAttributes.rootPathHint.path[0] (with safe null
checks), and keep the existing conditional (if hintRootTxId != null) so the
path-derived id triggers Step 0.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 2bcb2dbe-e0ea-4d32-a8e6-c9c25a79c1f8
📒 Files selected for processing (3)
src/data/ans104-offset-source.tssrc/data/root-parent-data-source.test.tssrc/data/root-parent-data-source.ts
The direct-offset hint branch was omitting contentType when caching attributes, causing later cache hits to potentially serve data with the wrong content type. This matches the pattern already used in the path/linear hint branch. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When only X-AR-IO-Root-Path is provided (without X-AR-IO-Root-TX-ID), extract the root TX ID from path[0] to enable the Step 0 hint fast path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
X-AR-IO-Root-Transaction-IdandX-AR-IO-Root-Pathrequest headers that let clients supply root TX ID and nesting path hintsRootParentDataSource.getData()before server-side index lookups; failures silently fall through to normal resolutiontools/fetch-with-hintCLI tool that resolves hints via GraphQLbundledIntraversalCloses PE-8987
Test plan
getRequestAttributes(valid/invalid TX IDs, paths, absent headers)getDataItemOffsetWithPath, offsets cached)yarn lint:checkpassestools/fetch-with-hint <data-item-id> --verboseto verify end-to-end🤖 Generated with Claude Code