
exp: Serve kupo #898

Merged
scarmuega merged 22 commits into main from exp/kupo
Mar 7, 2026

Conversation

Contributor

@gonzalezzfelipe gonzalezzfelipe commented Feb 13, 2026

Missing:

  • /datums/hash
  • /scripts/hash
  • /checkpoints => won't implement

Depends on #899

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced MiniKupo, a read-only HTTP API compatible with Kupo for querying matches, datums, scripts, and metadata.
    • Added configurable HTTP endpoint with CORS support (default port 1442).
    • Implemented health status endpoint with Prometheus metrics output.
    • Added pattern matching for Cardano addresses, assets, and output references.
    • Included CLI proxy command for MiniKupo requests.
  • Documentation

    • Added comprehensive OpenAPI specification for MiniKupo API.
    • Added MiniKupo API usage guide and configuration documentation.
  • Data Support

    • Added datum state dumping capability.

@gonzalezzfelipe gonzalezzfelipe marked this pull request as draft February 13, 2026 21:34

coderabbitai Bot commented Feb 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR introduces Minikupo, a new optional HTTP API crate providing a Kupo-compatible read-only interface for Cardano blockchain data. It integrates a complete HTTP server with route handlers for matches, datums, scripts, metadata, and health endpoints, backed by pattern-based querying and domain data resolution.

Changes

  • Workspace & Dependency Setup (Cargo.toml, crates/minikupo/Cargo.toml)
    Added new optional dolos-minikupo dependency and minikupo feature aggregating axum, http-body-util, and tower-http crates. Registered crates/minikupo as workspace member. New crate manifest declares HTTP/async dependencies (axum, tokio, tower-http, pallas, serde).
  • Configuration Integration (crates/core/src/config.rs, src/bin/dolos/init.rs)
    Added MinikupoConfig struct with listen_address and permissive_cors fields. Extended ServeConfig with optional minikupo field. Integrated configuration prompting and CLI flag serve_minikupo in initialization flow.
  • Serve & CLI Wiring (src/serve/mod.rs, src/bin/dolos/main.rs, src/bin/dolos/minikupo.rs)
    Added feature-gated minikupo module and Command::Minikupo variant in CLI. Integrated minikupo driver initialization in load_drivers. Implemented request proxy entrypoint for routing external requests to the running HTTP server.
  • Data Dump Support (src/bin/dolos/data/dump_state.rs)
    Added DatumState to import list and implemented TableRow for DatumState, enabling datum dumping via "datums" command path.
  • HTTP API Library Core (crates/minikupo/src/lib.rs, crates/minikupo/src/types.rs)
    Implemented generic Facade<D: Domain> wrapper with script/datum resolution helpers. Defined Driver implementing async run() for starting the Axum server. Added comprehensive response types (BadRequest, Datum, Script, Health, Metadata) with serialization support and Prometheus metrics formatting.
  • Pattern Matching System (crates/minikupo/src/patterns.rs)
    Comprehensive pattern matching subsystem with a Pattern enum supporting wildcards, addresses, assets, and output references. Includes parsing from string format with support for bech32/hex/Base58 addresses and pattern validation with detailed error handling. Implements matching logic for addresses, assets, and output references.
  • Route Handlers (crates/minikupo/src/routes/mod.rs, crates/minikupo/src/routes/matches.rs, crates/minikupo/src/routes/datums.rs, crates/minikupo/src/routes/scripts.rs, crates/minikupo/src/routes/metadata.rs, crates/minikupo/src/routes/health.rs)
    Implemented five HTTP endpoint handlers: pattern-based match querying with filters/sorting, datum hash lookup, script hash lookup, block metadata retrieval with transaction filtering and CBOR encoding, and health status with optional Prometheus output. Includes comprehensive query validation and error mapping.
  • Documentation (crates/minikupo/openapi.yaml, docs/content/apis/minikupo.mdx)
    Added OpenAPI 3.0 specification covering all endpoints, data schemas, parameters, and response definitions. Added user documentation describing endpoints, configuration, usage examples, and limitations.

Sequence Diagram

sequenceDiagram
    actor Client
    participant Router as HTTP Router
    participant Facade as Facade<D>
    participant Domain as Domain
    participant Archive as Archive

    Client->>Router: GET /matches/{pattern}?filters
    Router->>Facade: Parse pattern & validate filters
    Facade->>Facade: Build pattern matcher
    Facade->>Domain: Query UTXOs by pattern
    Domain->>Archive: Fetch blocks & transactions
    Archive-->>Domain: Block data
    Domain-->>Facade: Matching UTXOs
    Facade->>Facade: Resolve scripts/datums
    Facade->>Archive: Get scripts & datums by hash
    Archive-->>Facade: Script/datum data
    Facade->>Facade: Serialize to JSON
    Facade-->>Router: Matches with metadata
    Router-->>Client: 200 OK + JSON array

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Suggested Reviewers

  • scarmuega

Poem

🐰 A Kupo-like friend hops into town,
With patterns and routes all around,
HTTP queries dance through the land,
While datums and scripts go hand in hand,
Minikupo's here to query with glee! 🌟

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 13.07%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'exp: Serve kupo' directly relates to the main change: adding experimental kupo serving functionality (implemented as minikupo). It clearly summarizes the primary objective.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch exp/kupo


Comment @coderabbitai help to get the list of available commands and usage tips.

@gonzalezzfelipe gonzalezzfelipe marked this pull request as ready for review February 18, 2026 14:32

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (12)
src/bin/dolos/data/dump_state.rs (1)

246-258: Consider truncating bytes in table output and aligning format handling with peer impls.

Two small observations:

  1. hex::encode(&self.bytes) emits the full datum payload. For large inline datums this produces very wide table/CSV rows that are hard to read. A length cap (e.g., first N bytes + "…") would match the diagnostic intent of this command.

  2. Every peer TableRow impl either handles OutputFormat::Dbsync explicitly or calls todo!() to make unsupported formats obvious. Silently ignoring _format here works, but hides the fact that no Dbsync-specific layout was designed for datums. A brief todo!("dbsync format not supported for datums state") (matching the pattern) or an explicit comment would make the intent clear.

✏️ Optional improvement
 impl TableRow for DatumState {
-    fn header(_format: OutputFormat) -> Vec<&'static str> {
-        vec!["hash", "refcount", "bytes"]
+    fn header(format: OutputFormat) -> Vec<&'static str> {
+        if matches!(format, OutputFormat::Dbsync) {
+            todo!("dbsync format not supported for datums state");
+        }
+        vec!["hash", "refcount", "bytes"]
     }
 
-    fn row(&self, key: &EntityKey, _network: AddressNetwork, _format: OutputFormat) -> Vec<String> {
+    fn row(&self, key: &EntityKey, _network: AddressNetwork, format: OutputFormat) -> Vec<String> {
+        if matches!(format, OutputFormat::Dbsync) {
+            todo!("dbsync format not supported for datums state");
+        }
+        const MAX_BYTES: usize = 32; // truncate display
         vec![
             hex::encode(key.as_ref()),
             self.refcount.to_string(),
-            hex::encode(&self.bytes),
+            if self.bytes.len() > MAX_BYTES {
+                format!("{}…", hex::encode(&self.bytes[..MAX_BYTES]))
+            } else {
+                hex::encode(&self.bytes)
+            },
         ]
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/bin/dolos/data/dump_state.rs` around lines 246 - 258, In the TableRow
impl for DatumState, avoid emitting the full datum payload by truncating
hex::encode(&self.bytes) to a capped length (e.g., take the first N bytes / hex
chars and append "…") when building the row in row(&self, key: &EntityKey,
_network: AddressNetwork, _format: OutputFormat); also explicitly handle the
_format parameter by matching on OutputFormat and either implement a
Dbsync-specific representation or call todo!("dbsync format not supported for
datums state") to follow the peer pattern and make unsupported formats explicit.
Cargo.toml (1)

173-174: Nit: inconsistent workspace member formatting.

"crates/kupo" is appended on the same line as "xtask" rather than on its own line like the other members.

Suggested formatting
-    "xtask", "crates/kupo",
+    "xtask",
+    "crates/kupo",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Cargo.toml` around lines 173 - 174, The workspace members array contains
"xtask" and "crates/kupo" on the same line; split them so each member is on its
own line for consistent formatting. Locate the workspace members list (the array
containing "xtask") and move "crates/kupo" onto a new line as a separate string
entry, matching the style used by the other members in the Cargo.toml workspace
declaration.
crates/cardano/src/indexes/query.rs (1)

378-480: Implementation is correct but the closure is deeply nested — consider extracting a helper.

The script_by_hash implementation correctly mirrors the plutus_data pattern and handles all script sources (witness native/plutus scripts and output ScriptRef variants). The early-return logic within the closure is correct.

The ~80-line closure with nested match arms could be extracted into a standalone function like find_script_in_block(block: &MultiEraBlock, hash: Hash<28>) -> Option<ScriptData> for readability, but this is optional given it follows the same inline-closure style as plutus_data.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/cardano/src/indexes/query.rs` around lines 378 - 480, The closure
passed to find_first_by_tag in script_by_hash is very large and nested—extract
its logic into a helper function like fn find_script_in_block(block:
&MultiEraBlock, hash: Hash<28>) -> Option<ScriptData> that iterates txs, checks
native_scripts, plutus_v1/2/3 scripts, and output.script_ref() variants
producing the same ScriptData (using ScriptLanguage and script bytes) and then
replace the inline closure in script_by_hash with a call to
find_script_in_block, passing the block and script_hash; keep the existing
signature/return types (ScriptData, Hash<28>, MultiEraBlock) so callers like
find_first_by_tag remain unchanged.
crates/kupo/Cargo.toml (1)

23-23: Consider making axum a workspace dependency.

axum = "0.8.4" is declared independently in both the root Cargo.toml (line 36) and here. Promoting it to [workspace.dependencies] avoids version drift across crates.
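The suggested promotion can be sketched as the following manifest fragments, using Cargo's standard workspace-inheritance syntax; the two files are indicated by comments:

```toml
# Root Cargo.toml: declare the shared version once.
[workspace.dependencies]
axum = "0.8.4"

# crates/kupo/Cargo.toml: inherit the workspace version.
[dependencies]
axum = { workspace = true }
```

With this in place, bumping axum happens in exactly one spot and every member crate picks up the same version.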

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/Cargo.toml` at line 23, The axum dependency is duplicated across
crate manifests; move the version spec into the workspace by adding axum =
"0.8.4" under [workspace.dependencies] in the root Cargo.toml and change the
per-crate declaration (e.g., in the kupo crate's Cargo.toml) to reference the
workspace dependency using axum = { workspace = true } (or remove the version
and set workspace = true) so all crates share the same axum version and avoid
drift.
crates/kupo/src/routes/datums.rs (1)

12-12: Inconsistent trait bounds with scripts.rs.

by_hash here uses D: Domain, while scripts.rs uses D: Domain + Clone + Send + Sync + 'static. The difference is likely because this handler doesn't use async query methods, but having consistent bounds across sibling route handlers would be cleaner and avoids issues if the implementation evolves to use async queries later.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/datums.rs` at line 12, The handler signature for
by_hash uses a narrower bound (D: Domain) than sibling handlers (e.g., in
scripts.rs) which use D: Domain + Clone + Send + Sync + 'static; update the
by_hash function generic bound to D: Domain + Clone + Send + Sync + 'static so
it matches other route handlers (adjust the fn signature for pub async fn
by_hash<D: ...>), and ensure any usages inside by_hash still compile with the
widened bounds.
crates/kupo/src/routes/scripts.rs (1)

42-54: Duplicated script mapping logic with matches.rs.

map_script_data here and map_script_json in matches.rs (lines 625-639) perform the same CardanoLanguage → ScriptLanguage translation and hex encoding. Consider extracting a shared helper (e.g., in types.rs or a shared utility) to avoid drift between the two.
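A shared helper along the lines the review suggests might look like the sketch below; CardanoLanguage, Script, the language tags, and the stdlib stand-in for hex::encode are all assumptions rather than the crate's real definitions:

```rust
// Sketch: consolidate the CardanoLanguage -> ScriptLanguage mapping and hex
// encoding in one place so scripts.rs and matches.rs cannot drift apart.

#[derive(Clone, Copy)]
enum CardanoLanguage {
    Native,
    PlutusV1,
    PlutusV2,
    PlutusV3,
}

#[derive(Debug, PartialEq)]
struct Script {
    language: &'static str,
    script: String,
}

// stdlib-only stand-in for hex::encode
fn hex_encode(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

// Single source of truth: both the /scripts handler and the matches
// serializer would call this instead of duplicating the match.
fn map_script(language: CardanoLanguage, bytes: &[u8]) -> Script {
    let language = match language {
        CardanoLanguage::Native => "native",
        CardanoLanguage::PlutusV1 => "plutus:v1",
        CardanoLanguage::PlutusV2 => "plutus:v2",
        CardanoLanguage::PlutusV3 => "plutus:v3",
    };
    Script { language, script: hex_encode(bytes) }
}

fn main() {
    let s = map_script(CardanoLanguage::PlutusV2, &[0xab, 0x01]);
    assert_eq!(s.language, "plutus:v2");
    assert_eq!(s.script, "ab01");
}
```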

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/scripts.rs` around lines 42 - 54, map_script_data in
scripts.rs duplicates the CardanoLanguage→ScriptLanguage mapping and hex
encoding logic also implemented by map_script_json in matches.rs; extract a
shared helper (e.g., fn script_from_bytes_or_data / fn map_cardano_script in
types.rs or a new utils module) that takes the CardanoLanguage and script bytes
and returns crate::types::Script (performing the match on CardanoLanguage and
hex::encode), then replace both map_script_data and map_script_json to call that
new helper to centralize the translation and encoding logic.
crates/kupo/src/routes/matches.rs (2)

429-535: build_matches loads all matching UTxOs into memory before filtering.

The entire UTxO set for a pattern is fetched and decoded before apply_filters runs (line 65). For broad patterns with many matches, this could cause significant memory pressure. The created_after/created_before filters require block info (obtained during build_matches), so they can't easily be pushed earlier. However, transaction_id and output_index filters could be applied at the refs level before fetching UTxOs from the state store.

This is acceptable for the experimental phase but worth keeping in mind for production use.
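The suggested pre-filtering could be sketched as below; TxoRef, Filters, and prefilter are simplified stand-ins for the crate's actual types, shown only to illustrate narrowing the ref set before the state-store fetch:

```rust
// Sketch: apply transaction_id / output_index filters to the ref set before
// fetching full UTxOs, so get_utxos only loads the subset that can match.

#[derive(Clone, Debug, PartialEq)]
struct TxoRef {
    tx_hash: [u8; 32],
    index: u32,
}

struct Filters {
    transaction_id: Option<[u8; 32]>,
    output_index: Option<u32>,
}

fn prefilter(refs: Vec<TxoRef>, f: &Filters) -> Vec<TxoRef> {
    refs.into_iter()
        // Keep a ref only if it matches every filter that was provided.
        .filter(|r| f.transaction_id.map_or(true, |h| r.tx_hash == h))
        .filter(|r| f.output_index.map_or(true, |i| r.index == i))
        .collect()
}

fn main() {
    let a = TxoRef { tx_hash: [1; 32], index: 0 };
    let b = TxoRef { tx_hash: [2; 32], index: 1 };
    let f = Filters { transaction_id: Some([1; 32]), output_index: None };
    let kept = prefilter(vec![a.clone(), b], &f);
    assert_eq!(kept, vec![a]); // only the matching tx_hash survives
}
```

The created_after/created_before filters would still run after the fetch, since they need block info that only becomes available inside build_matches.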

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/matches.rs` around lines 429 - 535, build_matches
currently fetches all UTxOs in refs via facade.state().get_utxos before applying
per-output filters, causing memory pressure; filter the refs set first by
applying transaction_id and output_index constraints (from any provided
OutputFilter or caller-provided criteria) so get_utxos is called only for the
subset needed. Specifically, inside build_matches, before calling
facade.state().get_utxos(refs.into_iter().collect()), iterate refs (UtxoSet) and
retain only those whose tx_hash and output_index match the requested
transaction_id/output_index filters, then call get_utxos with that filtered
collection; keep the rest of the processing (block cache, resolve_hashes,
mapping to MatchResponse) unchanged.

156-161: Silently ignored unspent query parameter.

Line 158 let _ = query.unspent; discards the unspent filter without explanation. Since spent tracking is not implemented (all results are unspent), this is functionally correct, but a comment explaining why would help maintainability.

Suggested improvement
+        // All results are unspent since spent tracking is not implemented.
+        // The `unspent` flag is accepted but has no effect.
         let _ = query.unspent;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/matches.rs` around lines 156 - 161, The code currently
discards query.unspent (the line let _ = query.unspent;) with no explanation;
replace that silent discard with a short comment above the line explaining that
spent tracking is not implemented and therefore all returned matches are
effectively unspent, so the unspent filter is a no-op (keep the discard to avoid
unused warnings). Reference the surrounding conversion function (the
TryFrom/constructor that returns Result<Self, MatchError>) and keep the existing
spent-related validation that returns MatchError::BadRequest(spent_hint()) for
spent/spent_after/spent_before.
crates/kupo/src/routes/health.rs (2)

60-64: Loose Accept header matching for Prometheus format.

value.contains("text/plain") will match any Accept header containing that substring (e.g., text/plain-custom). This is unlikely to cause real issues, but a more precise check (e.g., splitting on , and trimming) would be more robust. Low priority given this is experimental.
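A stricter check along these lines could split the header on commas and strip media-type parameters before comparing; this is a stdlib-only sketch, not the crate's code:

```rust
// Sketch: precise Accept-header matching. Split on commas, drop any
// parameters after ';', trim, and compare the bare media type.
fn accepts_text_plain(header: &str) -> bool {
    header
        .split(',')
        .map(|part| part.split(';').next().unwrap_or("").trim())
        .any(|media| media.eq_ignore_ascii_case("text/plain"))
}

fn main() {
    assert!(accepts_text_plain("text/plain"));
    assert!(accepts_text_plain("application/json, text/plain;q=0.9"));
    assert!(!accepts_text_plain("text/plain-custom")); // no longer matches
    assert!(!accepts_text_plain("application/json"));
}
```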

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/health.rs` around lines 60 - 64, The Accept header
check for Prometheus is too loose: replace the contains("text/plain") test used
when building wants_prometheus (the headers.get(header::ACCEPT) branch) with
logic that splits the header on commas, trims each media-range, then for each
part splits off any parameters at ';' and checks the bare media type equals
"text/plain" (or otherwise matches exactly the media-range before parameters).
Update the closure that maps the header value (the .map(|value| ... ) block) to
iterate over the comma-separated entries and return true only if any entry's
media type equals "text/plain".

50-58: Several health fields are hardcoded placeholders.

most_recent_node_tip, seconds_since_last_block, and network_synchronization are always None, and connection_status is always Connected. This means the health endpoint won't reflect actual degraded states (e.g., disconnection, stale data). If this is intentional for the experimental phase, consider adding a TODO comment to track it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/health.rs` around lines 50 - 58, The Health struct is
using hardcoded placeholders (ConnectionStatus::Connected and None for
most_recent_node_tip, seconds_since_last_block, network_synchronization); update
the health assembly in the function that builds Health so these fields reflect
real state: determine connection_status (e.g., by checking the node/db/peer
client used elsewhere), populate most_recent_node_tip from the actual tip source
(e.g., use most_recent_checkpoint or the node client), compute
seconds_since_last_block from the latest block/checkpoint timestamp, and set
network_synchronization by comparing local tip vs network tip; if this is
intentional temporarily, add a clear TODO comment near the Health construction
mentioning the missing implementations and where to source the real values
(referencing Health, ConnectionStatus::Connected, most_recent_node_tip,
seconds_since_last_block, network_synchronization).
crates/kupo/src/types.rs (1)

128-135: Consider making string_or_nan a function instead of a macro.

This macro is simple enough to be a generic function, which provides better type-checking and IDE support:

Suggested replacement
-macro_rules! string_or_nan {
-    ($value:expr) => {
-        match $value {
-            Some(inner) => inner.to_string(),
-            None => "NaN".to_string(),
-        }
-    };
-}
+fn string_or_nan(value: Option<impl std::fmt::Display>) -> String {
+    match value {
+        Some(inner) => inner.to_string(),
+        None => "NaN".to_string(),
+    }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/types.rs` around lines 128 - 135, The string_or_nan macro
should be converted to a generic function for better type-checking and IDE
support: replace the macro_rules! string_or_nan with a function like fn
string_or_nan<T: ToString>(value: Option<T>) -> String that returns
value.map(|v| v.to_string()).unwrap_or_else(|| "NaN".to_string()), then update
all call sites that used string_or_nan!(...) to call string_or_nan(...) instead;
remove the macro definition and ensure any modules using it import or reference
the new function name.
crates/kupo/src/routes/metadata.rs (1)

35-44: Errors are silently discarded — add tracing for debuggability.

Lines 38 and 43 discard the actual error with Err(_), returning bare 500 status codes. In production, this makes it very hard to diagnose why metadata requests fail. Consider logging at warn or error level before returning the status code.

Example
         Ok(Some(block)) => block,
         Ok(None) => return StatusCode::NOT_FOUND.into_response(),
-        Err(_) => return StatusCode::INTERNAL_SERVER_ERROR.into_response(),
+        Err(err) => {
+            tracing::error!(?err, slot_no, "failed to fetch block by slot");
+            return StatusCode::INTERNAL_SERVER_ERROR.into_response();
+        }
     };
 
     let block = match MultiEraBlock::decode(&block) {
         Ok(block) => block,
-        Err(_) => return StatusCode::INTERNAL_SERVER_ERROR.into_response(),
+        Err(err) => {
+            tracing::error!(?err, slot_no, "failed to decode block");
+            return StatusCode::INTERNAL_SERVER_ERROR.into_response();
+        }
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/metadata.rs` around lines 35 - 44, The match arms that
currently swallow errors should log the underlying error before returning a 500:
capture the error from facade.query().block_by_slot(slot_no).await and call
tracing::warn!/tracing::error! with a message like "block_by_slot failed"
including the error and slot_no, and likewise capture the error from
MultiEraBlock::decode(&block) and log it (e.g., "MultiEraBlock::decode failed"
with the error and identifying block info) before returning
StatusCode::INTERNAL_SERVER_ERROR; update the Err(_) arms in the code handling
block_by_slot and MultiEraBlock::decode to log the error and context using those
unique symbols.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/kupo/openapi.yaml`:
- Line 1205: Fix the typo in the OpenAPI description string "An absolut slot
number." by changing "absolut" to "absolute" so the description reads "An
absolute slot number."; locate the description entry containing that exact text
and update it accordingly.
- Line 1886: There is a typo in the OpenAPI spec where the field "operationdId"
is used instead of "operationId"; locate the two occurrences (the operations
named getDatumByHash and getScriptByHash) in openapi.yaml and rename the key
"operationdId" to "operationId" so code generators correctly pick up the
operation identifiers.

In `@crates/kupo/src/lib.rs`:
- Around line 54-78: The NormalizePathLayer is currently added inside
build_router_with_facade which applies it after route matching; move path
normalization so it wraps the router externally. Remove the trailing
.layer(NormalizePathLayer::trim_trailing_slash()) from build_router_with_facade
so it returns the plain Router, and in Driver::run (or the place that calls
build_router_with_facade) wrap the returned Router with
NormalizePathLayer::trim_trailing_slash() (or NormalizePath::new) before serving
so paths are normalized prior to route matching; update any type annotations if
needed to accept the wrapped router.

In `@crates/kupo/src/routes/datums.rs`:
- Around line 38-48: The hint string in datum_hash_hint() incorrectly claims the
hash must be "lowercase" while parse_datum_hash() (and its counterpart in
scripts.rs) accepts both uppercase and lowercase via is_ascii_hexdigit(); update
datum_hash_hint() (and the analogous hint in scripts.rs) to say "64 hex
characters" or similar (e.g., "Invalid datum hash. Hash must be 64 hex
characters.") so the message matches the actual validation in parse_datum_hash()
and the script hash validation function.

In `@crates/kupo/src/routes/metadata.rs`:
- Around line 129-135: The conversion in metadatum_to_model currently downcasts
alonzo::Metadatum::Int from i128 to i32 which rejects valid Cardano metadata;
change the conversion to i64 (i128 -> i64) in metadatum_to_model and update the
MetadatumInt type definition in types.rs to use i64 (or a
string/serde_json::Number if you need unbounded support), keeping the same
map_err handling for an out-of-range case; ensure the signature/serialization of
Metadatum::Int construction uses the updated MetadatumInt field type so metadata
integers outside i32 range are preserved.

In `@crates/kupo/src/types.rs`:
- Around line 55-58: MetadatumInt currently uses int: i32 which is too small for
Cardano metadata; change the field type to a wider integer (e.g., i128) by
updating pub struct MetadatumInt { pub int: i128 }, and then update any
parsing/serialization code that constructs or reads MetadatumInt (see the
metadata parsing logic in metadata.rs) to use i128 as well so integers are not
truncated; ensure serde derives still work or add appropriate serde conversions
if necessary.
- Around line 137-158: The Prometheus output in Health::to_prometheus currently
hardcodes kupo_configuration_indexes and kupo_connection_status to 1.0; update
the function to derive those values from self.configuration.indexes and
self.connection_status instead: map ConnectionStatus::Connected => 1.0 and
Disconnected => 0.0 for kupo_connection_status, and map Indexes::Installed =>
1.0 and Indexes::Deferred => 0.0 (or the project's intended numeric mapping) for
kupo_configuration_indexes, then interpolate those computed numeric strings into
the formatted metrics instead of the hardcoded "1.0".
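The metadata-integer widening suggested above can be sketched as a checked i128-to-i64 conversion with an explicit out-of-range error; the function name and error type are placeholders:

```rust
// Sketch: convert alonzo metadatum integers (i128) into an i64-backed model
// with an explicit out-of-range error, instead of truncating through i32.
fn metadatum_int(value: i128) -> Result<i64, String> {
    i64::try_from(value).map_err(|_| format!("metadatum int out of range: {value}"))
}

fn main() {
    // Values beyond i32::MAX are valid Cardano metadata and now survive.
    assert_eq!(metadatum_int(4_000_000_000), Ok(4_000_000_000i64));
    assert!(metadatum_int(i128::from(i64::MAX) + 1).is_err());
    assert_eq!(metadatum_int(-1), Ok(-1));
}
```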
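Deriving the two gauges from state rather than hardcoding 1.0 could look like this sketch; the enum names follow the review text, and the numeric mapping (Connected = 1.0, Disconnected = 0.0, Installed = 1.0, Deferred = 0.0) is the mapping the review proposes:

```rust
// Sketch: compute Prometheus gauge values from the health state instead of
// interpolating a hardcoded "1.0" into the metrics text.

enum ConnectionStatus {
    Connected,
    Disconnected,
}

enum Indexes {
    Installed,
    Deferred,
}

fn connection_metric(s: &ConnectionStatus) -> f64 {
    match s {
        ConnectionStatus::Connected => 1.0,
        ConnectionStatus::Disconnected => 0.0,
    }
}

fn indexes_metric(i: &Indexes) -> f64 {
    match i {
        Indexes::Installed => 1.0,
        Indexes::Deferred => 0.0,
    }
}

fn main() {
    // The computed values are interpolated into the metrics output.
    let line = format!(
        "kupo_connection_status {:.1}\nkupo_configuration_indexes {:.1}",
        connection_metric(&ConnectionStatus::Disconnected),
        indexes_metric(&Indexes::Installed),
    );
    assert_eq!(line, "kupo_connection_status 0.0\nkupo_configuration_indexes 1.0");
}
```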

---

Duplicate comments:
In `@crates/kupo/src/routes/matches.rs`:
- Around line 625-639: map_script_json duplicates the language/script conversion
logic already implemented in scripts.rs::map_script_data; refactor to reuse that
existing mapping instead of re-implementing it—either call the exported
scripts::map_script_data (or the existing helper) from map_script_json, or move
the conversion into a shared helper/From impl (e.g., impl From<ScriptData> for
crate::types::Script) and call that from both places; update map_script_json to
build serde_json::Value by converting ScriptData with the shared function and
serializing the resulting crate::types::Script.

---

Nitpick comments:
In `@Cargo.toml`:
- Around line 173-174: The workspace members array contains "xtask" and
"crates/kupo" on the same line; split them so each member is on its own line for
consistent formatting. Locate the workspace members list (the array containing
"xtask") and move "crates/kupo" onto a new line as a separate string entry,
matching the style used by the other members in the Cargo.toml workspace
declaration.

In `@crates/cardano/src/indexes/query.rs`:
- Around line 378-480: The closure passed to find_first_by_tag in script_by_hash
is very large and nested—extract its logic into a helper function like fn
find_script_in_block(block: &MultiEraBlock, hash: Hash<28>) ->
Option<ScriptData> that iterates txs, checks native_scripts, plutus_v1/2/3
scripts, and output.script_ref() variants producing the same ScriptData (using
ScriptLanguage and script bytes) and then replace the inline closure in
script_by_hash with a call to find_script_in_block, passing the block and
script_hash; keep the existing signature/return types (ScriptData, Hash<28>,
MultiEraBlock) so callers like find_first_by_tag remain unchanged.

In `@crates/kupo/Cargo.toml`:
- Line 23: The axum dependency is duplicated across crate manifests; move the
version spec into the workspace by adding axum = "0.8.4" under
[workspace.dependencies] in the root Cargo.toml and change the per-crate
declaration (e.g., in the kupo crate's Cargo.toml) to reference the workspace
dependency using axum = { workspace = true } (or remove the version and set
workspace = true) so all crates share the same axum version and avoid drift.

In `@crates/kupo/src/routes/datums.rs`:
- Line 12: The handler signature for by_hash uses a narrower bound (D: Domain)
than sibling handlers (e.g., in scripts.rs) which use D: Domain + Clone + Send +
Sync + 'static; update the by_hash function generic bound to D: Domain + Clone +
Send + Sync + 'static so it matches other route handlers (adjust the fn
signature for pub async fn by_hash<D: ...>), and ensure any usages inside
by_hash still compile with the widened bounds.

In `@crates/kupo/src/routes/health.rs`:
- Around line 60-64: The Accept header check for Prometheus is too loose:
replace the contains("text/plain") test used when building wants_prometheus (the
headers.get(header::ACCEPT) branch) with logic that splits the header on commas,
trims each media-range, then for each part splits off any parameters at ';' and
checks the bare media type equals "text/plain" (or otherwise matches exactly the
media-range before parameters). Update the closure that maps the header value
(the .map(|value| ... ) block) to iterate over the comma-separated entries and
return true only if any entry's media type equals "text/plain".
- Around line 50-58: The Health struct is using hardcoded placeholders
(ConnectionStatus::Connected and None for most_recent_node_tip,
seconds_since_last_block, network_synchronization); update the health assembly
in the function that builds Health so these fields reflect real state: determine
connection_status (e.g., by checking the node/db/peer client used elsewhere),
populate most_recent_node_tip from the actual tip source (e.g., use
most_recent_checkpoint or the node client), compute seconds_since_last_block
from the latest block/checkpoint timestamp, and set network_synchronization by
comparing local tip vs network tip; if this is intentional temporarily, add a
clear TODO comment near the Health construction mentioning the missing
implementations and where to source the real values (referencing Health,
ConnectionStatus::Connected, most_recent_node_tip, seconds_since_last_block,
network_synchronization).

In `@crates/kupo/src/routes/matches.rs`:
- Around line 429-535: build_matches currently fetches all UTxOs in refs via
facade.state().get_utxos before applying per-output filters, causing memory
pressure; filter the refs set first by applying transaction_id and output_index
constraints (from any provided OutputFilter or caller-provided criteria) so
get_utxos is called only for the subset needed. Specifically, inside
build_matches, before calling
facade.state().get_utxos(refs.into_iter().collect()), iterate refs (UtxoSet) and
retain only those whose tx_hash and output_index match the requested
transaction_id/output_index filters, then call get_utxos with that filtered
collection; keep the rest of the processing (block cache, resolve_hashes,
mapping to MatchResponse) unchanged.
- Around line 156-161: The code currently discards query.unspent (the line let _
= query.unspent;) with no explanation; replace that silent discard with a short
comment above the line explaining that spent tracking is not implemented and
therefore all returned matches are effectively unspent, so the unspent filter is
a no-op (keep the discard to avoid unused warnings). Reference the surrounding
conversion function (the TryFrom/constructor that returns Result<Self,
MatchError>) and keep the existing spent-related validation that returns
MatchError::BadRequest(spent_hint()) for spent/spent_after/spent_before.
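
The retain-then-fetch shape suggested for `build_matches` in the first item can be sketched with simplified stand-in types. `TxoRef` here is a hypothetical miniature of the real dolos reference type (hashes shortened to 4 bytes); only the "filter refs before the expensive `get_utxos` call" structure is the point.

```rust
// Sketch: apply transaction_id / output_index constraints to the reference
// set first, so the UTxO fetch only touches the subset that can match.
struct TxoRef {
    tx_hash: [u8; 4], // real hashes are 32 bytes; shortened for the sketch
    output_index: u32,
}

fn filter_refs(
    refs: Vec<TxoRef>,
    transaction_id: Option<[u8; 4]>,
    output_index: Option<u32>,
) -> Vec<TxoRef> {
    refs.into_iter()
        .filter(|r| transaction_id.map_or(true, |id| r.tx_hash == id))
        .filter(|r| output_index.map_or(true, |idx| r.output_index == idx))
        .collect()
}

fn main() {
    let refs = vec![
        TxoRef { tx_hash: [1; 4], output_index: 0 },
        TxoRef { tx_hash: [1; 4], output_index: 1 },
        TxoRef { tx_hash: [2; 4], output_index: 0 },
    ];
    // Only this filtered subset would be handed to get_utxos.
    let kept = filter_refs(refs, Some([1; 4]), Some(1));
    assert_eq!(kept.len(), 1);
    assert_eq!(kept[0].output_index, 1);
    println!("ok");
}
```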

In `@crates/kupo/src/routes/metadata.rs`:
- Around line 35-44: The match arms that currently swallow errors should log the
underlying error before returning a 500: capture the error from
facade.query().block_by_slot(slot_no).await and call
tracing::warn!/tracing::error! with a message like "block_by_slot failed"
including the error and slot_no, and likewise capture the error from
MultiEraBlock::decode(&block) and log it (e.g., "MultiEraBlock::decode failed"
with the error and identifying block info) before returning
StatusCode::INTERNAL_SERVER_ERROR; update the Err(_) arms in the code handling
block_by_slot and MultiEraBlock::decode to log the error and context using those
unique symbols.

In `@crates/kupo/src/routes/scripts.rs`:
- Around line 42-54: map_script_data in scripts.rs duplicates the
CardanoLanguage→ScriptLanguage mapping and hex encoding logic also implemented
by map_script_json in matches.rs; extract a shared helper (e.g., fn
script_from_bytes_or_data / fn map_cardano_script in types.rs or a new utils
module) that takes the CardanoLanguage and script bytes and returns
crate::types::Script (performing the match on CardanoLanguage and hex::encode),
then replace both map_script_data and map_script_json to call that new helper to
centralize the translation and encoding logic.
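
The shared helper suggested above can be sketched as follows. `CardanoLanguage`, `Script`, and the language strings are simplified stand-ins for the real dolos/kupo types (the exact label strings are an assumption), and the hex encoding is inlined to keep the sketch dependency-free.

```rust
// Sketch of a single map_cardano_script helper that both map_script_data
// and map_script_json could call, instead of duplicating the mapping.
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum CardanoLanguage { Native, PlutusV1, PlutusV2, PlutusV3 }

#[derive(Debug, PartialEq)]
struct Script { language: &'static str, script: String }

fn map_cardano_script(language: CardanoLanguage, bytes: &[u8]) -> Script {
    let language = match language {
        CardanoLanguage::Native => "native",
        CardanoLanguage::PlutusV1 => "plutus:v1",
        CardanoLanguage::PlutusV2 => "plutus:v2",
        CardanoLanguage::PlutusV3 => "plutus:v3",
    };
    // Manual hex encoding (the real code would keep using hex::encode).
    let script = bytes.iter().map(|b| format!("{b:02x}")).collect();
    Script { language, script }
}

fn main() {
    let s = map_cardano_script(CardanoLanguage::PlutusV2, &[0xab, 0x01]);
    assert_eq!(s.language, "plutus:v2");
    assert_eq!(s.script, "ab01");
    println!("ok");
}
```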

In `@crates/kupo/src/types.rs`:
- Around line 128-135: The string_or_nan macro should be converted to a generic
function for better type-checking and IDE support: replace the macro_rules!
string_or_nan with a function like fn string_or_nan<T: ToString>(value:
Option<T>) -> String that returns value.map(|v| v.to_string()).unwrap_or_else(||
"NaN".to_string()), then update all call sites that used string_or_nan!(...) to
call string_or_nan(...) instead; remove the macro definition and ensure any
modules using it import or reference the new function name.
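
The generic replacement the prompt describes is short enough to show in full; this matches the suggested signature exactly.

```rust
// Generic function replacing the string_or_nan! macro: better type checking
// and IDE support, same output.
fn string_or_nan<T: ToString>(value: Option<T>) -> String {
    value.map(|v| v.to_string()).unwrap_or_else(|| "NaN".to_string())
}

fn main() {
    assert_eq!(string_or_nan(Some(42u64)), "42");
    assert_eq!(string_or_nan::<u64>(None), "NaN");
    println!("ok");
}
```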

In `@src/bin/dolos/data/dump_state.rs`:
- Around line 246-258: In the TableRow impl for DatumState, avoid emitting the
full datum payload by truncating hex::encode(&self.bytes) to a capped length
(e.g., take the first N bytes / hex chars and append "…") when building the row
in row(&self, key: &EntityKey, _network: AddressNetwork, _format: OutputFormat);
also explicitly handle the _format parameter by matching on OutputFormat and
either implement a Dbsync-specific representation or call todo!("dbsync format
not supported for datums state") to follow the peer pattern and make unsupported
formats explicit.


SlotNo: &slotNo
type: integer
description: An absolut slot number.

⚠️ Potential issue | 🟡 Minor

Typo: "absolut" → "absolute".

-      description: An absolut slot number.
+      description: An absolute slot number.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
description: An absolut slot number.
description: An absolute slot number.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/openapi.yaml` at line 1205, Fix the typo in the OpenAPI
description string "An absolut slot number." by changing "absolut" to "absolute"
so the description reads "An absolute slot number."; locate the description
entry containing that exact text and update it accordingly.


/datums/{datum-hash}:
get:
operationdId: getDatumByHash

⚠️ Potential issue | 🟡 Minor

Typo: operationdId should be operationId.

This typo appears on both line 1886 (datums endpoint) and line 1915 (scripts endpoint). OpenAPI code generators rely on operationId for method naming and will ignore these misspelled fields.

Proposed fix
-      operationdId: getDatumByHash
+      operationId: getDatumByHash
-      operationdId: getScriptByHash
+      operationId: getScriptByHash
📝 Committable suggestion


Suggested change
operationdId: getDatumByHash
operationId: getDatumByHash
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/openapi.yaml` at line 1886, There is a typo in the OpenAPI spec
where the field "operationdId" is used instead of "operationId"; locate the two
occurrences (the operations named getDatumByHash and getScriptByHash) in
openapi.yaml and rename the key "operationdId" to "operationId" so code
generators correctly pick up the operation identifiers.

Comment on lines +54 to +78
pub(crate) fn build_router_with_facade<D>(facade: Facade<D>) -> Router
where
    D: Domain + Clone + Send + Sync + 'static,
{
    let permissive_cors = facade.config.permissive_cors.unwrap_or_default();
    let app = Router::new()
        .route("/matches/{*pattern}", get(routes::matches::by_pattern::<D>))
        .route("/datums/{datum-hash}", get(routes::datums::by_hash::<D>))
        .route("/scripts/{script-hash}", get(routes::scripts::by_hash::<D>))
        .route("/metadata/{slot-no}", get(routes::metadata::by_slot::<D>))
        .route("/health", get(routes::health::health::<D>))
        .with_state(facade)
        .layer(
            trace::TraceLayer::new_for_http()
                .make_span_with(trace::DefaultMakeSpan::new().level(Level::INFO))
                .on_response(trace::DefaultOnResponse::new().level(Level::INFO)),
        )
        .layer(if permissive_cors {
            CorsLayer::permissive()
        } else {
            CorsLayer::new()
        });

    app.layer(NormalizePathLayer::trim_trailing_slash())
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

axum Router::layer middleware execution order route matching

💡 Result:

  • Router::layer(...) wraps all previously added routes (and fallback) in that router. If you add more routes after calling .layer(...), those new routes won’t be wrapped. [1][2]
  • Execution order when chaining .layer(...) on a Router is “bottom-to-top” (onion model): the last .layer(...) you add runs first on the request, and the response unwinds in reverse. [1]
  • If you instead stack multiple layers with tower::ServiceBuilder, they run “top-to-bottom” in the order you declare them (first added layer sees the request first). [1][3]

Route matching interaction (important):

  • Router::layer middleware runs even if no route matches (i.e., it can observe/transform requests that will become 404 Not Found). This is why “global” middleware like tracing/CORS often goes here. [1][2]
  • Router::route_layer(...) runs only after a route is matched. This is recommended for middleware that might return early (e.g., auth), so it doesn’t accidentally turn a 404 Not Found into something else like 401 Unauthorized. [2]

Rule of thumb

  • Put “should apply to everything, even 404s” middleware on Router::layer.
  • Put “should apply only to real endpoints” middleware (authz, rate limiting) on route_layer. [2]

Sources:
[1] axum middleware docs (Ordering + ServiceBuilder note) (docs.rs)
[2] axum::Router docs (layer + route_layer semantics, matching note) (docs.rs)
[3] tower::ServiceBuilder docs (layer order) (docs.rs)

Citations:


🏁 Script executed:

# Find and examine the actual code in the repository
find . -type f -name "lib.rs" | grep kupo | head -5

Repository: txpipe/dolos

Length of output: 80


🏁 Script executed:

# Check the cargo manifest to understand dependencies
find . -type f -name "Cargo.toml" | grep kupo | head -5

Repository: txpipe/dolos

Length of output: 80


🏁 Script executed:

# Read the lib.rs file to see the actual implementation
cat -n crates/kupo/src/lib.rs | head -100

Repository: txpipe/dolos

Length of output: 3549


🏁 Script executed:

# Search for Driver::run to understand how the router is used
rg -n "Driver|run" crates/kupo/src/ -A 3 | head -50

Repository: txpipe/dolos

Length of output: 1043


🏁 Script executed:

# Check imports to understand which NormalizePathLayer is being used
head -30 crates/kupo/src/lib.rs

Repository: txpipe/dolos

Length of output: 706


🌐 Web query:

tower-http NormalizePathLayer how it works URI rewriting route matching

💡 Result:

tower_http::normalize_path::NormalizePathLayer is a Tower middleware that mutates the incoming request’s Uri (specifically path_and_query) before calling the inner service, so that downstream code (including routers) sees a “normalized” path. It supports two modes: trim and append trailing slashes. [1][2]

What exactly gets rewritten?

  • Trim mode (NormalizePathLayer::trim_trailing_slash()):
    • If the path ends with / (or starts with multiple /), it rewrites the path to a single-leading-slash form and removes all extra leading/trailing slashes via trim_matches('/').
    • Example: /foo/ → /foo, ///foo////?a=a → /foo?a=a.
    • The query string is preserved. [3]
  • Append mode (NormalizePathLayer::append_trailing_slash()):
    • Rewrites paths to ensure they end with exactly one / (again collapsing extra leading/trailing slashes).
    • Example: /foo → /foo/, ///foo//?a=a → /foo/?a=a.
    • The query string is preserved. [3]

This is internal URI rewriting (no redirect response is generated); the inner service simply receives the modified req.uri(). [2][3]

How this interacts with route matching (Axum)

In axum, where you attach the layer determines whether it affects routing:

  • Router::layer(...) runs after routing, so it cannot affect route matching (because matching already happened). Axum explicitly documents that middleware added with Router::layer “will run after routing” and thus “cannot be used to rewrite the request URI.” [4][5]
  • To make normalization affect routing, you must wrap the entire Router as a Service, i.e. apply the Tower layer around the router so it runs before the router sees the request. Axum documents this as the workaround for URI-rewriting middleware. [4]

Practical pattern (conceptually):

  • Works for matching: NormalizePathLayer::trim_trailing_slash().layer(router)
  • Doesn’t help matching: router.layer(NormalizePathLayer::trim_trailing_slash()) [4][5]

Sources: [1] tower-http normalize_path module docs, [2] NormalizePathLayer docs, [3] tower-http normalize_path.rs source, [4] axum “Rewriting request URI in middleware”, [5] axum Router docs.


🌐 Web query:

axum NormalizePathLayer trim_trailing_slash before route matching

💡 Result:

In axum, if you apply NormalizePathLayer::trim_trailing_slash() via Router::layer(...), it runs after the router has already matched the route, so /foo/ can still miss a /foo route and 404. To normalize before route matching, wrap the entire Router as a Service with the layer (i.e., make it the outermost service). [1]

use axum::{routing::get, Router};
use tower::layer::Layer;
use tower_http::normalize_path::NormalizePathLayer;

let router = Router::new().route("/foo", get(|| async { "ok" }));

// pre-routing normalization:
let app = NormalizePathLayer::trim_trailing_slash().layer(router);

This “pre-routing middleware” pattern is also the intended way to ensure layers run even when no route matches (commonly done for NormalizePathLayer). [2]

Note: axum also exposes trim_trailing_slash() / append_trailing_slash() as helpers on route types (feature normalize-path), but those are per-route wrappers, not a pre-router rewrite. [3]

Sources: [1] [2] [3]


🏁 Script executed:

# Check axum and tower-http versions in Cargo.toml
cat crates/kupo/Cargo.toml | grep -E "axum|tower"

Repository: txpipe/dolos

Length of output: 112


🏁 Script executed:

# Check the full return type signature and how it's used
rg -n "build_router|NormalizePath" crates/kupo/src/ -B 2 -A 2

Repository: txpipe/dolos

Length of output: 1660


🏁 Script executed:

# Look for any tests or other uses of build_router
rg -n "build_router" crates/ -l

Repository: txpipe/dolos

Length of output: 137


Move NormalizePathLayer outside the router to normalize paths before route matching.

In axum, Router::layer() applies middleware after route matching completes, so trailing slashes like /matches/{*path}/ fail to match routes like /matches/{*pattern} because the router sees the un-normalized path first. Per axum documentation, URI-rewriting middleware must wrap the router externally to affect routing.

Proposed fix
-pub fn build_router<D>(cfg: KupoConfig, domain: D) -> Router
+pub fn build_router<D>(cfg: KupoConfig, domain: D) -> NormalizePath<Router>
 where
     D: Domain + Clone + Send + Sync + 'static,
 {
-    build_router_with_facade(Facade {
+    let router = build_router_with_facade(Facade {
         inner: domain,
         config: cfg,
-    })
+    });
+    NormalizePathLayer::trim_trailing_slash().layer(router)
 }

In build_router_with_facade, remove the NormalizePathLayer from the returned value:

-    app.layer(NormalizePathLayer::trim_trailing_slash())
+    app

Then update Driver::run to use the returned NormalizePath<Router> directly:

-    axum::serve(listener, ServiceExt::<Request>::into_make_service(app))
+    axum::serve(listener, app.into_make_service())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/lib.rs` around lines 54 - 78, The NormalizePathLayer is
currently added inside build_router_with_facade which applies it after route
matching; move path normalization so it wraps the router externally. Remove the
trailing .layer(NormalizePathLayer::trim_trailing_slash()) from
build_router_with_facade so it returns the plain Router, and in Driver::run (or
the place that calls build_router_with_facade) wrap the returned Router with
NormalizePathLayer::trim_trailing_slash() (or NormalizePath::new) before serving
so paths are normalized prior to route matching; update any type annotations if
needed to accept the wrapped router.

Comment thread crates/kupo/src/routes/datums.rs Outdated
Comment on lines +38 to +48
fn parse_datum_hash(value: &str) -> Result<Vec<u8>, StatusCode> {
    if value.len() != 64 || !value.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(StatusCode::BAD_REQUEST);
    }

    let bytes = hex::decode(value).map_err(|_| StatusCode::BAD_REQUEST)?;
    Ok(bytes)
}

fn datum_hash_hint() -> String {
    "Invalid datum hash. Hash must be 64 lowercase hex characters.".to_string()

⚠️ Potential issue | 🟡 Minor

Validation accepts uppercase hex but hint says "lowercase".

is_ascii_hexdigit() on line 39 accepts both A-F and a-f, but the hint on line 48 tells users "Hash must be 64 lowercase hex characters." Either enforce lowercase (add && value.chars().all(|c| !c.is_ascii_uppercase())) or fix the hint to say "hex characters." The same inconsistency exists in scripts.rs line 33/57.

Proposed fix (update the hint)
 fn datum_hash_hint() -> String {
-    "Invalid datum hash. Hash must be 64 lowercase hex characters.".to_string()
+    "Invalid datum hash. Hash must be 64 hex characters.".to_string()
 }

And similarly in scripts.rs:

 fn script_hash_hint() -> String {
-    "Invalid script hash. Hash must be 56 lowercase hex characters.".to_string()
+    "Invalid script hash. Hash must be 56 hex characters.".to_string()
 }
📝 Committable suggestion


Suggested change
fn parse_datum_hash(value: &str) -> Result<Vec<u8>, StatusCode> {
    if value.len() != 64 || !value.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(StatusCode::BAD_REQUEST);
    }

    let bytes = hex::decode(value).map_err(|_| StatusCode::BAD_REQUEST)?;
    Ok(bytes)
}

fn datum_hash_hint() -> String {
    "Invalid datum hash. Hash must be 64 lowercase hex characters.".to_string()
}

fn parse_datum_hash(value: &str) -> Result<Vec<u8>, StatusCode> {
    if value.len() != 64 || !value.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(StatusCode::BAD_REQUEST);
    }

    let bytes = hex::decode(value).map_err(|_| StatusCode::BAD_REQUEST)?;
    Ok(bytes)
}

fn datum_hash_hint() -> String {
    "Invalid datum hash. Hash must be 64 hex characters.".to_string()
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/datums.rs` around lines 38 - 48, The hint string in
datum_hash_hint() incorrectly claims the hash must be "lowercase" while
parse_datum_hash() (and its counterpart in scripts.rs) accepts both uppercase
and lowercase via is_ascii_hexdigit(); update datum_hash_hint() (and the
analogous hint in scripts.rs) to say "64 hex characters" or similar (e.g.,
"Invalid datum hash. Hash must be 64 hex characters.") so the message matches
the actual validation in parse_datum_hash() and the script hash validation
function.
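
If the project instead chose to enforce lowercase (the alternative the comment mentions), the stricter validator might look like the sketch below. `parse_hash_strict` is hypothetical; the real code uses the `hex` crate, which is replaced here with a manual std-only decode so the sketch runs standalone.

```rust
// Sketch: strict lowercase-hex hash parsing (assumes an even expected_len).
fn parse_hash_strict(value: &str, expected_len: usize) -> Option<Vec<u8>> {
    if value.len() != expected_len
        || !value.chars().all(|c| c.is_ascii_hexdigit() && !c.is_ascii_uppercase())
    {
        return None;
    }
    // Decode pairs of hex characters into bytes.
    (0..value.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&value[i..i + 2], 16).ok())
        .collect()
}

fn main() {
    assert!(parse_hash_strict(&"ab".repeat(32), 64).is_some());
    assert!(parse_hash_strict(&"AB".repeat(32), 64).is_none()); // uppercase rejected
    assert!(parse_hash_strict("abc", 64).is_none()); // wrong length
    println!("ok");
}
```

With this variant the "lowercase" wording in the hint would become accurate; the lighter option, as suggested, is to keep the permissive validator and relax the hint instead.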

Comment on lines +129 to +135
fn metadatum_to_model(datum: &alonzo::Metadatum) -> Result<Metadatum, StatusCode> {
    match datum {
        alonzo::Metadatum::Int(value) => {
            let value: i128 = (*value).into();
            let value = i32::try_from(value).map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
            Ok(Metadatum::Int(MetadatumInt { int: value }))
        }

⚠️ Potential issue | 🟠 Major

The i128 → i32 conversion will reject valid Cardano metadata integers.

Cardano metadata integers (alonzo::Metadatum::Int) can hold values well beyond i32 range (e.g. timestamps, token quantities, arbitrary user values). The try_from on line 133 maps these to a 500 error, silently breaking the entire metadata response for that transaction.

At minimum, widen to i64. If Kupo compatibility requires unbounded integers, consider serializing as a JSON number string or using serde_json::Number.

Proposed fix (also update `MetadatumInt` in types.rs)

In crates/kupo/src/routes/metadata.rs:

         alonzo::Metadatum::Int(value) => {
             let value: i128 = (*value).into();
-            let value = i32::try_from(value).map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
+            let value = i64::try_from(value).map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
             Ok(Metadatum::Int(MetadatumInt { int: value }))
         }

In crates/kupo/src/types.rs:

 pub struct MetadatumInt {
-    pub int: i32,
+    pub int: i64,
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/metadata.rs` around lines 129 - 135, The conversion in
metadatum_to_model currently downcasts alonzo::Metadatum::Int from i128 to i32
which rejects valid Cardano metadata; change the conversion to i64 (i128 -> i64)
in metadatum_to_model and update the MetadatumInt type definition in types.rs to
use i64 (or a string/serde_json::Number if you need unbounded support), keeping
the same map_err handling for an out-of-range case; ensure the
signature/serialization of Metadatum::Int construction uses the updated
MetadatumInt field type so metadata integers outside i32 range are preserved.
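
A quick check of why the widening matters: realistic metadata values such as millisecond timestamps already exceed `i32` but fit comfortably in `i64`.

```rust
// Sketch of the widened conversion: i128 -> i64 instead of i128 -> i32.
fn metadatum_int_to_i64(value: i128) -> Option<i64> {
    i64::try_from(value).ok()
}

fn main() {
    let millis_timestamp: i128 = 1_700_000_000_000; // > i32::MAX, < i64::MAX
    assert!(i32::try_from(millis_timestamp).is_err()); // old code: 500 error
    assert_eq!(metadatum_int_to_i64(millis_timestamp), Some(1_700_000_000_000));
    assert_eq!(metadatum_int_to_i64(i128::from(i64::MAX) + 1), None); // still bounded
    println!("ok");
}
```

Values beyond `i64` would still map to an error, which is why the comment floats `serde_json::Number` or a string encoding if full Kupo compatibility demands unbounded integers.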

Comment on lines +55 to +58
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
pub struct MetadatumInt {
    pub int: i32,
}

⚠️ Potential issue | 🟠 Major

MetadatumInt.int is i32 — too narrow for Cardano metadata integers.

This is the root cause of the truncation issue flagged in metadata.rs. Cardano metadata integers can exceed i32 range. See the related comment in crates/kupo/src/routes/metadata.rs for the proposed fix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/types.rs` around lines 55 - 58, MetadatumInt currently uses
int: i32 which is too small for Cardano metadata; change the field type to a
wider integer (e.g., i128) by updating pub struct MetadatumInt { pub int: i128
}, and then update any parsing/serialization code that constructs or reads
MetadatumInt (see the metadata parsing logic in metadata.rs) to use i128 as well
so integers are not truncated; ensure serde derives still work or add
appropriate serde conversions if necessary.

Comment on lines +137 to +158
impl Health {
    pub fn to_prometheus(&self) -> String {
        let checkpoint = string_or_nan!(self.most_recent_checkpoint);
        let node_tip = string_or_nan!(self.most_recent_node_tip);
        let seconds_since_last_block = string_or_nan!(self.seconds_since_last_block);
        let network_synchronization = string_or_nan!(self.network_synchronization);

        format!(
            "# TYPE kupo_configuration_indexes gauge\n\
             kupo_configuration_indexes 1.0\n\n\
             # TYPE kupo_connection_status gauge\n\
             kupo_connection_status 1.0\n\n\
             # TYPE kupo_most_recent_checkpoint counter\n\
             kupo_most_recent_checkpoint {checkpoint}\n\n\
             # TYPE kupo_most_recent_node_tip counter\n\
             kupo_most_recent_node_tip {node_tip}\n\n\
             # TYPE kupo_network_synchronization gauge\n\
             kupo_network_synchronization {network_synchronization}\n\n\
             # TYPE kupo_seconds_since_last_block gauge\n\
             kupo_seconds_since_last_block {seconds_since_last_block}\n",
        )
    }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, find where BlockSlot is defined and check its Display impl
echo "=== Searching for BlockSlot definition ==="
rg -n 'struct BlockSlot' -B 2 -A 5 --type rust | head -50

echo -e "\n=== Checking Display impl for BlockSlot ==="
rg -n 'impl.*Display.*BlockSlot|impl Display for BlockSlot' -A 15 --type rust

echo -e "\n=== Checking for BlockSlot definition in dolos ==="
fd -e rs | xargs grep -l 'struct BlockSlot' | head -5

echo -e "\n=== Looking for string_or_nan macro definition ==="
rg -n 'macro_rules.*string_or_nan' -A 10 --type rust

Repository: txpipe/dolos

Length of output: 804


🏁 Script executed:

#!/bin/bash
# Examine the complete Health struct and to_prometheus implementation
echo "=== Health struct definition ==="
rg -n 'struct Health' -A 30 --type rust | head -50

echo -e "\n=== Health::to_prometheus complete implementation ==="
rg -n 'impl Health' -A 50 --type rust | head -80

Repository: txpipe/dolos

Length of output: 4664


🏁 Script executed:

#!/bin/bash
# Check Health struct fields more carefully
rg -n 'pub struct Health' -A 20 --type rust
echo -e "\n=== Check connection_status field type ==="
rg -n 'connection_status.*:' --type rust | grep -v test

echo -e "\n=== Check configuration field ==="
rg -n 'configuration.*:' --type rust | grep Health -B 5 -A 5

Repository: txpipe/dolos

Length of output: 2854


🏁 Script executed:

#!/bin/bash
# BlockSlot is likely imported from another crate; let's check the imports in types.rs
echo "=== Check imports in types.rs ==="
head -50 crates/kupo/src/types.rs | grep -E 'use|import'

echo -e "\n=== Search for BlockSlot in the entire codebase ==="
find . -name "*.rs" -type f ! -path "*/target/*" | xargs grep -l "struct BlockSlot\|type BlockSlot" | head -10

echo -e "\n=== Check if BlockSlot comes from dolos_core ==="
rg -n 'BlockSlot' crates/kupo/src/types.rs | head -5

Repository: txpipe/dolos

Length of output: 572


🏁 Script executed:

#!/bin/bash
# Check if there's a Display impl defined in this repo for BlockSlot
echo "=== Search for Display impl for BlockSlot anywhere in repo ==="
rg -n 'impl.*Display.*for.*BlockSlot' -A 10 --type rust

echo -e "\n=== Check what dolos_core exports ==="
find . -path "*/dolos_core/*" -name "*.rs" | xargs grep -l "BlockSlot" | head -5

echo -e "\n=== Look for BlockSlot re-exports ==="
rg -n 'pub.*BlockSlot\|pub.*use.*BlockSlot' --type rust

Repository: txpipe/dolos

Length of output: 321


🏁 Script executed:

#!/bin/bash
# Let's look at the actual file crates/kupo/src/types.rs more directly to understand BlockSlot
echo "=== Full Health struct and BlockSlot usage context ==="
sed -n '1,100p' crates/kupo/src/types.rs

Repository: txpipe/dolos

Length of output: 2944


🏁 Script executed:

#!/bin/bash
# Look at BlockSlot definition in the core crate
echo "=== BlockSlot definition in crates/core/src/lib.rs ==="
rg -n 'struct BlockSlot\|BlockSlot\s*=' crates/core/src/lib.rs -A 5 -B 2

echo -e "\n=== Check for Display impl on BlockSlot ==="
rg -n 'impl.*Display.*BlockSlot\|impl Display for BlockSlot' crates/core/src/ -A 15

Repository: txpipe/dolos

Length of output: 155


🏁 Script executed:

#!/bin/bash
# Check Cargo.toml to understand dolos_core structure
echo "=== Look at BlockSlot more carefully ==="
find crates/core -name "*.rs" -type f | xargs grep -l "BlockSlot"

echo -e "\n=== Check the main lib.rs or mod.rs for BlockSlot ==="
cat crates/core/src/lib.rs | head -200

Repository: txpipe/dolos

Length of output: 5528


🏁 Script executed:

#!/bin/bash
# Verify that BlockSlot = u64 by looking at the exact line
echo "=== Confirm BlockSlot definition ==="
rg -n 'pub type BlockSlot' crates/core/src/lib.rs -A 1 -B 1

echo -e "\n=== Verify u64 has valid Display impl ==="
# u64 is a primitive type, it definitely has Display
echo "u64 is a primitive Rust type with a built-in Display impl that produces numeric strings"

Repository: txpipe/dolos

Length of output: 304


Hardcoded Prometheus metric values ignore actual state.

Lines 146 and 148 emit kupo_configuration_indexes 1.0 and kupo_connection_status 1.0 unconditionally, regardless of self.connection_status (a ConnectionStatus enum with Connected/Disconnected variants) and self.configuration.indexes (an Indexes enum with Deferred/Installed variants). These metrics will never reflect state changes; derive the emitted values from the enum variants instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/types.rs` around lines 137 - 158, The Prometheus output in
Health::to_prometheus currently hardcodes kupo_configuration_indexes and
kupo_connection_status to 1.0; update the function to derive those values from
self.configuration.indexes and self.connection_status instead: map
ConnectionStatus::Connected => 1.0 and Disconnected => 0.0 for
kupo_connection_status, and map Indexes::Installed => 1.0 and Indexes::Deferred
=> 0.0 (or the project's intended numeric mapping) for
kupo_configuration_indexes, then interpolate those computed numeric strings into
the formatted metrics instead of the hardcoded "1.0".
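
The enum-to-gauge mapping the prompt asks for can be sketched as two small helpers. The enums below mirror the variants named in the comment; the 1.0/0.0 mapping is the assumption stated there, not a confirmed project convention.

```rust
// Sketch: derive gauge values from state instead of hardcoding "1.0".
enum ConnectionStatus { Connected, Disconnected }
enum Indexes { Installed, Deferred }

fn connection_gauge(status: &ConnectionStatus) -> f64 {
    match status {
        ConnectionStatus::Connected => 1.0,
        ConnectionStatus::Disconnected => 0.0,
    }
}

fn indexes_gauge(indexes: &Indexes) -> f64 {
    match indexes {
        Indexes::Installed => 1.0,
        Indexes::Deferred => 0.0,
    }
}

fn main() {
    assert_eq!(connection_gauge(&ConnectionStatus::Connected), 1.0);
    assert_eq!(connection_gauge(&ConnectionStatus::Disconnected), 0.0);
    assert_eq!(indexes_gauge(&Indexes::Installed), 1.0);
    assert_eq!(indexes_gauge(&Indexes::Deferred), 0.0);
    println!("ok");
}
```

These values would then be interpolated into the `format!` string in `to_prometheus` in place of the literal `1.0`s.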

@gonzalezzfelipe gonzalezzfelipe marked this pull request as draft February 18, 2026 16:07
@gonzalezzfelipe gonzalezzfelipe marked this pull request as ready for review March 5, 2026 14:37

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
crates/kupo/src/routes/datums.rs (1)

47-49: ⚠️ Potential issue | 🟡 Minor

Hint text contradicts accepted input format.

Line 39 accepts uppercase hex (is_ascii_hexdigit()), but Line 48 says “lowercase hex characters.” Please either enforce lowercase or update the hint to “hex characters.”

Proposed fix
 fn datum_hash_hint() -> String {
-    "Invalid datum hash. Hash must be 64 lowercase hex characters.".to_string()
+    "Invalid datum hash. Hash must be 64 hex characters.".to_string()
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/datums.rs` around lines 47 - 49, The hint text
returned by datum_hash_hint() incorrectly says “lowercase hex characters” while
the validation (using is_ascii_hexdigit()) accepts uppercase; update
datum_hash_hint() to say “64 hex characters” (or similar "hex characters"
wording) to match the current validator, so the message aligns with the
is_ascii_hexdigit() check rather than enforcing lowercase; adjust only the
string returned by datum_hash_hint() to remove the lowercase requirement.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/kupo/src/routes/matches.rs`:
- Around line 700-703: The asset-name filter fails to match empty-name assets
because apply_filters always constructs keys as "{policy_id}.{asset_name}" while
map_value stores empty-name assets under just "policy_id"; update apply_filters
(and the similar block at the other occurrence) to special-case an empty
asset_name: if filters.asset_name.as_ref() is Some and asset_name.is_empty(),
use the policy_id alone as the lookup key (i.e., key = policy_id) and check
item.value.assets.contains_key(&key); otherwise keep the existing
"{policy_id}.{asset_name}" behavior; reference parse_asset_name only for parsing
validations but do not rely on it for this empty-name lookup.
- Line 157: The code treats any presence of query.resolve_hashes as true by
using query.resolve_hashes.is_some(), causing "false" to be interpreted as true;
change the check to only enable resolution when the value is explicitly true
(e.g., use query.resolve_hashes == Some(true) or
query.resolve_hashes.unwrap_or(false)) so resolve_hashes is true only when the
client provided true; update the use-site where resolve_hashes is defined in
matches.rs (the resolve_hashes variable and any downstream logic that relies on
it) to use that explicit boolean check.
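
The empty-asset-name special case from the first item above can be sketched with a plain `HashMap` standing in for `item.value.assets`; `asset_key` is a hypothetical helper name, and the `"{policy_id}.{asset_name}"` key shape is taken from the comment itself.

```rust
// Sketch: with an empty asset name, the lookup key is the policy id alone,
// matching how map_value stores empty-name assets.
use std::collections::HashMap;

fn asset_key(policy_id: &str, asset_name: &str) -> String {
    if asset_name.is_empty() {
        policy_id.to_string()
    } else {
        format!("{policy_id}.{asset_name}")
    }
}

fn main() {
    let mut assets: HashMap<String, u64> = HashMap::new();
    assets.insert("policy1".to_string(), 5); // empty-name asset
    assets.insert("policy1.token".to_string(), 7); // named asset

    assert!(assets.contains_key(&asset_key("policy1", "")));
    assert!(assets.contains_key(&asset_key("policy1", "token")));
    assert!(!assets.contains_key(&asset_key("policy2", "")));
    println!("ok");
}
```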

---

Duplicate comments:
In `@crates/kupo/src/routes/datums.rs`:
- Around line 47-49: The hint text returned by datum_hash_hint() incorrectly
says “lowercase hex characters” while the validation (using is_ascii_hexdigit())
accepts uppercase; update datum_hash_hint() to say “64 hex characters” (or
similar "hex characters" wording) to match the current validator, so the message
aligns with the is_ascii_hexdigit() check rather than enforcing lowercase;
adjust only the string returned by datum_hash_hint() to remove the lowercase
requirement.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e47ed54e-a848-47f1-a5a2-529ce42574af

📥 Commits

Reviewing files that changed from the base of the PR and between 224fc35 and bcc7d05.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • Cargo.toml
  • crates/kupo/src/routes/datums.rs
  • crates/kupo/src/routes/matches.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • Cargo.toml

query: MatchesQuery,
facade: &Facade<D>,
) -> Result<Self, MatchError> {
let resolve_hashes = query.resolve_hashes.is_some();

⚠️ Potential issue | 🟡 Minor

resolve_hashes=false is interpreted as true.

Using query.resolve_hashes.is_some() enables hash resolution for any provided value, including false. This can produce unexpectedly heavy responses and extra backend lookups.

💡 Proposed fix
diff --git a/crates/kupo/src/routes/matches.rs b/crates/kupo/src/routes/matches.rs
@@
-        let resolve_hashes = query.resolve_hashes.is_some();
+        let resolve_hashes = match query.resolve_hashes.as_deref() {
+            None => false,
+            Some("") | Some("true") => true,
+            Some("false") => false,
+            Some(_) => {
+                return Err(MatchError::BadRequest(
+                    "Invalid value for 'resolve_hashes'. Use true/false.".to_string(),
+                ))
+            }
+        };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let resolve_hashes = query.resolve_hashes.is_some();
let resolve_hashes = match query.resolve_hashes.as_deref() {
None => false,
Some("") | Some("true") => true,
Some("false") => false,
Some(_) => {
return Err(MatchError::BadRequest(
"Invalid value for 'resolve_hashes'. Use true/false.".to_string(),
))
}
};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/matches.rs` at line 157, The code treats any presence
of query.resolve_hashes as true by using query.resolve_hashes.is_some(), causing
"false" to be interpreted as true; change the check to only enable resolution
when the value is explicitly true (e.g., use query.resolve_hashes == Some(true)
or query.resolve_hashes.unwrap_or(false)) so resolve_hashes is true only when
the client provided true; update the use-site where resolve_hashes is defined in
matches.rs (the resolve_hashes variable and any downstream logic that relies on
it) to use that explicit boolean check.
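The strict parsing this prompt describes can be sketched as a small stdlib-only function (the name and signature are illustrative; the real handler works on the axum query struct):

```rust
// Sketch of strict boolean parsing for an optional query flag, assuming the
// raw value arrives as Option<&str>. Absent means off; a bare flag or an
// explicit "true" means on; anything else but "false" is rejected.
fn parse_resolve_hashes(raw: Option<&str>) -> Result<bool, String> {
    match raw {
        None => Ok(false),                   // flag absent: default off
        Some("") | Some("true") => Ok(true), // bare `?resolve_hashes` or explicit true
        Some("false") => Ok(false),          // explicit false stays off
        Some(other) => Err(format!(
            "Invalid value for 'resolve_hashes': {other}. Use true/false."
        )),
    }
}

fn main() {
    assert!(!parse_resolve_hashes(None).unwrap());
    assert!(parse_resolve_hashes(Some("true")).unwrap());
    assert!(parse_resolve_hashes(Some("")).unwrap());
    assert!(!parse_resolve_hashes(Some("false")).unwrap());
    assert!(parse_resolve_hashes(Some("yes")).is_err());
}
```

Unlike `is_some()`, this only enables resolution for an explicit opt-in, and it surfaces a 400-style error for garbage values instead of silently treating them as true.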

Comment on lines +700 to +703
if let Some(asset_name) = filters.asset_name.as_ref() {
let key = format!("{policy_id}.{asset_name}");
if !item.value.assets.contains_key(&key) {
return false;

⚠️ Potential issue | 🟠 Major

Empty asset-name filtering is currently impossible.

parse_asset_name rejects empty hex values, and apply_filters only builds policy_id.asset_name keys. But map_value represents empty asset names as just policy_id. This makes it impossible to target empty-name assets via asset_name.

💡 Proposed fix
diff --git a/crates/kupo/src/routes/matches.rs b/crates/kupo/src/routes/matches.rs
@@
 fn parse_asset_name(value: &str) -> Result<String, MatchError> {
     let len = value.len();
-    if !(2..=64).contains(&len) || !len.is_multiple_of(2) {
+    if len > 64 || !len.is_multiple_of(2) {
         return Err(MatchError::BadRequest(policy_asset_hint()));
     }
     hex::decode(value)
         .map_err(|_| MatchError::BadRequest(policy_asset_hint()))
         .map(|_| value.to_lowercase())
 }
@@
             if let Some(asset_name) = filters.asset_name.as_ref() {
-                let key = format!("{policy_id}.{asset_name}");
+                let key = if asset_name.is_empty() {
+                    policy_id.to_string()
+                } else {
+                    format!("{policy_id}.{asset_name}")
+                };
                 if !item.value.assets.contains_key(&key) {
                     return false;
                 }
             }

Also applies to: 774-782

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/matches.rs` around lines 700 - 703, The asset-name
filter fails to match empty-name assets because apply_filters always constructs
keys as "{policy_id}.{asset_name}" while map_value stores empty-name assets
under just "policy_id"; update apply_filters (and the similar block at the other
occurrence) to special-case an empty asset_name: if filters.asset_name.as_ref()
is Some and asset_name.is_empty(), use the policy_id alone as the lookup key
(i.e., key = policy_id) and check item.value.assets.contains_key(&key);
otherwise keep the existing "{policy_id}.{asset_name}" behavior; reference
parse_asset_name only for parsing validations but do not rely on it for this
empty-name lookup.
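A minimal sketch of the key construction the fix proposes, assuming (per the comment) that `map_value` keys empty-name assets by policy id alone:

```rust
// Build the assets-map lookup key: empty asset names are stored under the
// policy id alone; non-empty names use the "{policy_id}.{asset_name}" form.
fn asset_key(policy_id: &str, asset_name: &str) -> String {
    if asset_name.is_empty() {
        policy_id.to_string()
    } else {
        format!("{policy_id}.{asset_name}")
    }
}

fn main() {
    assert_eq!(asset_key("abc123", ""), "abc123");
    assert_eq!(asset_key("abc123", "deadbeef"), "abc123.deadbeef");
}
```

With this helper used at both filter sites, a policy-only unit becomes reachable via an empty `asset_name` once `parse_asset_name` stops rejecting zero-length hex.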


@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (5)
crates/kupo/src/lib.rs (1)

131-132: ⚠️ Potential issue | 🟠 Major

Path normalization is applied too late to reliably affect route matching.

NormalizePathLayer should wrap the router externally (pre-routing) rather than being attached with Router::layer(...) here.

In axum 0.8, `Router::layer(...)` runs after route matching, so `NormalizePathLayer` must wrap the router externally (served via `ServiceExt::into_make_service`) for the rewritten path to affect matching.

Also applies to: 147-149

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/lib.rs` around lines 131 - 132, The
NormalizePathLayer::trim_trailing_slash() is being added via Router::layer(...)
(e.g. app.layer(...)) which runs after route matching; instead, wrap the router
with NormalizePathLayer externally so the layer runs before routing. Update the
construction so that the Router instance is wrapped by
NormalizePathLayer::trim_trailing_slash() (move the call out of Router::layer
and apply it around the Router/app), and make the same change for the other
occurrences around the code referenced (lines ~147-149) so path normalization
happens pre-routing for all routers.
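For intuition, the rewrite the layer performs can be sketched with a stdlib-only function; the real `NormalizePathLayer::trim_trailing_slash()` applies this to the request URI, which is why it has to run before the router sees the path:

```rust
// Stdlib sketch of trailing-slash normalization: strip trailing slashes but
// keep the bare root path "/" intact. This is an approximation of what the
// tower-http layer does to the URI, not its actual implementation.
fn trim_trailing_slash(path: &str) -> &str {
    let trimmed = path.trim_end_matches('/');
    if trimmed.is_empty() {
        "/" // "/" (and "//", …) normalize to the root path
    } else {
        trimmed
    }
}

fn main() {
    assert_eq!(trim_trailing_slash("/matches/"), "/matches");
    assert_eq!(trim_trailing_slash("/matches"), "/matches");
    assert_eq!(trim_trailing_slash("/"), "/");
}
```

If this rewrite happens after routing (via `Router::layer`), `/matches/` has already failed to match the `/matches` route by the time the path is cleaned up.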
crates/kupo/src/routes/matches.rs (2)

152-152: ⚠️ Potential issue | 🟠 Major

resolve_hashes=false is currently treated as enabled.

Presence-based parsing makes any supplied value truthy, including "false".

Proposed fix
-        let resolve_hashes = query.resolve_hashes.is_some();
+        let resolve_hashes = match query.resolve_hashes.as_deref() {
+            None => false,
+            Some("") | Some("true") => true,
+            Some("false") => false,
+            Some(_) => {
+                return Err(MatchError::BadRequest(
+                    "Invalid value for 'resolve_hashes'. Use true/false.".to_string(),
+                ))
+            }
+        };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/matches.rs` at line 152, The code treats any presence
of query.resolve_hashes as true (so "false" is truthy); update the logic around
let resolve_hashes = query.resolve_hashes.is_some() to actually parse the
supplied string into a boolean by using query.resolve_hashes (from the request)
and calling .and_then(|s| s.parse::<bool>().ok()).unwrap_or(false) (or
equivalent) so explicit "false"/"0" values evaluate to false while missing
parameter defaults to false; apply this change where resolve_hashes and
query.resolve_hashes are referenced to ensure correct boolean behavior.

654-657: ⚠️ Potential issue | 🟠 Major

Empty asset-name filtering is unreachable with current parser + key construction.

You encode empty-name assets as policy_id in map_value, but filters always look up policy_id.asset_name and parse_asset_name rejects empty values.

Proposed fix
 fn parse_asset_name(value: &str) -> Result<String, MatchError> {
     let len = value.len();
-    if !(2..=64).contains(&len) || !len.is_multiple_of(2) {
+    if len > 64 || !len.is_multiple_of(2) {
         return Err(MatchError::BadRequest(policy_asset_hint()));
     }
     hex::decode(value)
         .map_err(|_| MatchError::BadRequest(policy_asset_hint()))
         .map(|_| value.to_lowercase())
 }
             if let Some(asset_name) = filters.asset_name.as_ref() {
-                let key = format!("{policy_id}.{asset_name}");
+                let key = if asset_name.is_empty() {
+                    policy_id.to_string()
+                } else {
+                    format!("{policy_id}.{asset_name}")
+                };
                 if !item.value.assets.contains_key(&key) {
                     return false;
                 }
             }

Also applies to: 728-731

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/matches.rs` around lines 654 - 657, The asset-name
lookup currently always builds keys as "{policy_id}.{asset_name}", but
empty-name assets are stored in assets map as just the policy_id (no dot), so
change the key construction in the filtering checks that use filters.asset_name
(e.g., the block using filters.asset_name.as_ref() with
item.value.assets.contains_key) to handle empty asset names: if asset_name is
empty build key = policy_id, otherwise build key =
format!("{policy_id}.{asset_name}"); apply the same change to the other
occurrence around the 728-731 range so both lookups match how empty-name assets
are encoded in the map.
crates/kupo/src/routes/datums.rs (1)

29-30: ⚠️ Potential issue | 🟡 Minor

Datum hash hint is stricter than the parser.

The parser accepts uppercase hex too, but the hint says lowercase-only. The error message should match runtime validation.

Proposed fix
 fn datum_hash_hint() -> String {
-    "Invalid datum hash. Hash must be 64 lowercase hex characters.".to_string()
+    "Invalid datum hash. Hash must be 64 hex characters.".to_string()
 }

Also applies to: 37-38

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/datums.rs` around lines 29 - 30, The validation
accepts any ASCII hex digit (is_ascii_hexdigit) but the user-facing hint
currently says lowercase-only; update the hint/error text associated with the
datum hash validation so it describes "64-character hex string
(case-insensitive)" or equivalent, and make the same change for the duplicate
check at the other occurrence; locate the checks using the variable value and
the ASCII hex validation in crates/kupo/src/routes/datums.rs (the blocks around
the if value.len() != 64 || !value.chars().all(|c| c.is_ascii_hexdigit())
checks) and replace the lowercase-only wording with a case-insensitive hex
description so runtime validation and the hint match.
crates/kupo/src/routes/scripts.rs (1)

30-31: ⚠️ Potential issue | 🟡 Minor

Hint text contradicts accepted hash format.

Validation accepts uppercase hex, but the hint says lowercase-only. Please align the hint with actual validation behavior.

Proposed fix
 fn script_hash_hint() -> String {
-    "Invalid script hash. Hash must be 56 lowercase hex characters.".to_string()
+    "Invalid script hash. Hash must be 56 hex characters.".to_string()
 }

Also applies to: 39-40

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/kupo/src/routes/scripts.rs` around lines 30 - 31, The hint text that
currently says lowercase-only contradicts the validation in routes/scripts.rs
which accepts uppercase hex via the check if value.len() != 56 ||
!value.chars().all(|c| c.is_ascii_hexdigit()); update the hint to reflect
case-insensitive hexadecimal (e.g., "56-character hexadecimal (0-9, a-f or
A-F)") or alternatively change the validation to enforce lowercase consistently,
and apply the same fix to the other occurrence around the second check (lines
corresponding to the duplicate at 39-40).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@crates/kupo/src/lib.rs`:
- Around line 131-132: The NormalizePathLayer::trim_trailing_slash() is being
added via Router::layer(...) (e.g. app.layer(...)) which runs after route
matching; instead, wrap the router with NormalizePathLayer externally so the
layer runs before routing. Update the construction so that the Router instance
is wrapped by NormalizePathLayer::trim_trailing_slash() (move the call out of
Router::layer and apply it around the Router/app), and make the same change for
the other occurrences around the code referenced (lines ~147-149) so path
normalization happens pre-routing for all routers.

In `@crates/kupo/src/routes/datums.rs`:
- Around line 29-30: The validation accepts any ASCII hex digit
(is_ascii_hexdigit) but the user-facing hint currently says lowercase-only;
update the hint/error text associated with the datum hash validation so it
describes "64-character hex string (case-insensitive)" or equivalent, and make
the same change for the duplicate check at the other occurrence; locate the
checks using the variable value and the ASCII hex validation in
crates/kupo/src/routes/datums.rs (the blocks around the if value.len() != 64 ||
!value.chars().all(|c| c.is_ascii_hexdigit()) checks) and replace the
lowercase-only wording with a case-insensitive hex description so runtime
validation and the hint match.

In `@crates/kupo/src/routes/matches.rs`:
- Line 152: The code treats any presence of query.resolve_hashes as true (so
"false" is truthy); update the logic around let resolve_hashes =
query.resolve_hashes.is_some() to actually parse the supplied string into a
boolean by using query.resolve_hashes (from the request) and calling
.and_then(|s| s.parse::<bool>().ok()).unwrap_or(false) (or equivalent) so
explicit "false"/"0" values evaluate to false while missing parameter defaults
to false; apply this change where resolve_hashes and query.resolve_hashes are
referenced to ensure correct boolean behavior.
- Around line 654-657: The asset-name lookup currently always builds keys as
"{policy_id}.{asset_name}", but empty-name assets are stored in assets map as
just the policy_id (no dot), so change the key construction in the filtering
checks that use filters.asset_name (e.g., the block using
filters.asset_name.as_ref() with item.value.assets.contains_key) to handle empty
asset names: if asset_name is empty build key = policy_id, otherwise build key =
format!("{policy_id}.{asset_name}"); apply the same change to the other
occurrence around the 728-731 range so both lookups match how empty-name assets
are encoded in the map.

In `@crates/kupo/src/routes/scripts.rs`:
- Around line 30-31: The hint text that currently says lowercase-only
contradicts the validation in routes/scripts.rs which accepts uppercase hex via
the check if value.len() != 56 || !value.chars().all(|c| c.is_ascii_hexdigit());
update the hint to reflect case-insensitive hexadecimal (e.g., "56-character
hexadecimal (0-9, a-f or A-F)") or alternatively change the validation to
enforce lowercase consistently, and apply the same fix to the other occurrence
around the second check (lines corresponding to the duplicate at 39-40).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8f86d20f-9105-42c0-8c51-392adb45435a

📥 Commits

Reviewing files that changed from the base of the PR and between bcc7d05 and 4e5d0d8.

📒 Files selected for processing (4)
  • crates/kupo/src/lib.rs
  • crates/kupo/src/routes/datums.rs
  • crates/kupo/src/routes/matches.rs
  • crates/kupo/src/routes/scripts.rs


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (7)
crates/minikupo/openapi.yaml (2)

1205-1205: ⚠️ Potential issue | 🟡 Minor

Unresolved typo in slot description.

Line 1205 still says absolut instead of absolute.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/openapi.yaml` at line 1205, Replace the typo in the OpenAPI
description string "An absolut slot number." with the correct wording "An
absolute slot number." — locate the description entry that currently contains
the exact text "An absolut slot number." in the OpenAPI YAML and update it to
"An absolute slot number." so the schema description is spelled correctly.

1886-1886: ⚠️ Potential issue | 🟡 Minor

operationId is misspelled in two operations.

Line 1886 and Line 1915 use operationdId, which breaks operation naming for generators that rely on operationId.

Proposed fix
-      operationdId: getDatumByHash
+      operationId: getDatumByHash
-      operationdId: getScriptByHash
+      operationId: getScriptByHash

Also applies to: 1915-1915

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/openapi.yaml` at line 1886, Two occurrences use the
misspelled key "operationdId" which prevents generators from recognizing
operationId; replace the misspelled "operationdId" with the correct
"operationId" for the affected operations (e.g., the entry currently showing
operationdId: getDatumByHash and the other occurrence referenced in the comment)
so the OpenAPI operations have the proper operationId fields.
crates/minikupo/src/routes/matches.rs (2)

152-152: ⚠️ Potential issue | 🟡 Minor

resolve_hashes=false is still interpreted as enabled.

Line 152 uses is_some(), so any provided value turns resolution on. Parse explicit boolean semantics instead.

Suggested fix
-        let resolve_hashes = query.resolve_hashes.is_some();
+        let resolve_hashes = match query.resolve_hashes.as_deref() {
+            None => false,
+            Some("") | Some("true") => true,
+            Some("false") => false,
+            Some(_) => return Err(MatchError::BadRequest(
+                "Invalid value for 'resolve_hashes'. Use true/false.".to_string(),
+            )),
+        };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/src/routes/matches.rs` at line 152, The code uses
query.resolve_hashes.is_some() which treats any provided value (including false)
as enabled; change this to evaluate the actual boolean value instead (e.g., set
resolve_hashes = query.resolve_hashes.unwrap_or(false) or compare to Some(true))
so only an explicit true enables resolution; update the references to
resolve_hashes accordingly in the matches handling (look for the
query.resolve_hashes usage and the local resolve_hashes variable).

654-656: ⚠️ Potential issue | 🟠 Major

Empty asset-name filtering remains impossible.

map_value represents empty-name assets as policy_id, but Lines 654-656 and Lines 728-731 force policy_id.asset_name and reject empty names. This prevents filtering policy-only units.

Suggested fix
 fn parse_asset_name(value: &str) -> Result<String, MatchError> {
     let len = value.len();
-    if !(2..=64).contains(&len) || !len.is_multiple_of(2) {
+    if len > 64 || !len.is_multiple_of(2) {
         return Err(MatchError::BadRequest(policy_asset_hint()));
     }
     hex::decode(value)
         .map_err(|_| MatchError::BadRequest(policy_asset_hint()))
         .map(|_| value.to_lowercase())
 }
             if let Some(asset_name) = filters.asset_name.as_ref() {
-                let key = format!("{policy_id}.{asset_name}");
+                let key = if asset_name.is_empty() {
+                    policy_id.to_string()
+                } else {
+                    format!("{policy_id}.{asset_name}")
+                };
                 if !item.value.assets.contains_key(&key) {
                     return false;
                 }
             }

Also applies to: 728-731

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/src/routes/matches.rs` around lines 654 - 656, The code
currently builds a lookup key as format!("{policy_id}.{asset_name}") and rejects
assets when that key is missing, which breaks matching for empty-name assets
represented in map_value as just policy_id; change the key construction in the
asset-name filtering blocks (the branch that reads filters.asset_name.as_ref()
and the similar block later) to use policy_id alone when asset_name is empty
(e.g., if asset_name.is_empty() then key = policy_id.clone() else key =
format!("{policy_id}.{asset_name}")), and ensure you do not skip processing when
asset_name == "" so policy-only units are matched against item.value.assets
contains_key(&key).
crates/minikupo/src/types.rs (2)

145-148: ⚠️ Potential issue | 🟠 Major

Prometheus values are still hardcoded and ignore runtime state.

Lines 145-148 always emit 1.0, regardless of self.configuration.indexes and self.connection_status.

Suggested fix
 impl Health {
     pub fn to_prometheus(&self) -> String {
+        let configuration_indexes = match self.configuration.indexes {
+            Indexes::Installed => "1.0",
+            Indexes::Deferred => "0.0",
+        };
+        let connection_status = match self.connection_status {
+            ConnectionStatus::Connected => "1.0",
+            ConnectionStatus::Disconnected => "0.0",
+        };
         let checkpoint = string_or_nan!(self.most_recent_checkpoint);
         let node_tip = string_or_nan!(self.most_recent_node_tip);
         let seconds_since_last_block = string_or_nan!(self.seconds_since_last_block);
         let network_synchronization = string_or_nan!(self.network_synchronization);

         format!(
             "# TYPE kupo_configuration_indexes gauge\n\
-            kupo_configuration_indexes 1.0\n\n\
+            kupo_configuration_indexes {configuration_indexes}\n\n\
             # TYPE kupo_connection_status gauge\n\
-            kupo_connection_status 1.0\n\n\
+            kupo_connection_status {connection_status}\n\n\
             # TYPE kupo_most_recent_checkpoint counter\n\
             kupo_most_recent_checkpoint {checkpoint}\n\n\
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/src/types.rs` around lines 145 - 148, The Prometheus output
is hardcoded to "1.0" instead of using the runtime values; update the code that
builds the metrics string (the block producing "kupo_configuration_indexes 1.0"
and "kupo_connection_status 1.0") to interpolate the actual values from
self.configuration.indexes and self.connection_status. Locate the function or
impl in types.rs that formats these lines (search for the literal
"kupo_configuration_indexes" / "kupo_connection_status") and replace the fixed
"1.0" with the appropriate numeric formatted values (e.g., cast or convert the
index count and connection status to f64 or string) so the emitted metrics
reflect the current state. Ensure formatting matches Prometheus expected numeric
format and handle any Option/enum conversion for self.connection_status
consistently.
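A self-contained sketch of the gauge derivation in the proposed fix; the `Indexes` and `ConnectionStatus` enums mirror the names used in the diff and may differ from the crate's actual definitions:

```rust
// Derive Prometheus gauge values from runtime state instead of emitting a
// hardcoded 1.0 for every health state.
enum Indexes { Installed, Deferred }
enum ConnectionStatus { Connected, Disconnected }

fn health_gauges(indexes: &Indexes, status: &ConnectionStatus) -> String {
    let configuration_indexes = match indexes {
        Indexes::Installed => "1.0",
        Indexes::Deferred => "0.0",
    };
    let connection_status = match status {
        ConnectionStatus::Connected => "1.0",
        ConnectionStatus::Disconnected => "0.0",
    };
    format!(
        "kupo_configuration_indexes {configuration_indexes}\n\
         kupo_connection_status {connection_status}"
    )
}

fn main() {
    let a = health_gauges(&Indexes::Deferred, &ConnectionStatus::Connected);
    assert!(a.contains("kupo_configuration_indexes 0.0"));
    assert!(a.contains("kupo_connection_status 1.0"));
    let b = health_gauges(&Indexes::Installed, &ConnectionStatus::Disconnected);
    assert!(b.contains("kupo_configuration_indexes 1.0"));
    assert!(b.contains("kupo_connection_status 0.0"));
}
```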

56-58: ⚠️ Potential issue | 🟠 Major

MetadatumInt.int is still too narrow for metadata integers.

Keeping this as i32 forces lossy downcasting in metadata route conversion and rejects valid values.

Suggested fix
 pub struct MetadatumInt {
-    pub int: i32,
+    pub int: i128,
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/src/types.rs` around lines 56 - 58, The MetadatumInt struct's
field int is declared as i32, which is too narrow; widen the field (the suggested
fix above uses i128) and update any conversion/serialization/deserialization code
that references MetadatumInt::int (e.g., metadata route conversion helpers) to
accept/produce the wider type, avoiding lossy downcasting and rejection of valid
values.
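A quick check of the integer bounds motivating the change (assuming, as with Cardano's CBOR-encoded metadata, values up to 2^64 - 1 must round-trip):

```rust
// Demonstrate why a 32-bit (or even 64-bit signed) field cannot hold the
// full range of metadata integers. 2^64 - 1 is representable in CBOR but
// overflows both i32 and i64; i128 covers it.
fn main() {
    let v: i128 = 18_446_744_073_709_551_615; // 2^64 - 1
    assert!(i32::try_from(v).is_err()); // the current i32 field would reject this
    assert!(i64::try_from(v).is_err()); // even i64 is not wide enough
    assert!(v <= i128::MAX);            // i128 holds the full range
}
```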
crates/minikupo/src/lib.rs (1)

131-131: ⚠️ Potential issue | 🟠 Major

NormalizePathLayer should wrap the router before route matching.

Applying it at Line 131 via Router::layer risks post-match normalization behavior; wrap the router as a service before serving.

Suggested fix
 pub(crate) fn build_router_with_facade<D>(facade: Facade<D>) -> Router
 where
     D: Domain + Clone + Send + Sync + 'static,
 {
@@
-    app.layer(NormalizePathLayer::trim_trailing_slash())
+    app
 }
@@
     async fn run(cfg: Self::Config, domain: D, cancel: C) -> Result<(), ServeError> {
-        let app = build_router(cfg.clone(), domain);
+        let app = NormalizePathLayer::trim_trailing_slash().layer(build_router(cfg.clone(), domain));
@@
-        axum::serve(listener, ServiceExt::<Request>::into_make_service(app))
+        axum::serve(listener, app.into_make_service())
             .with_graceful_shutdown(async move { cancel.cancelled().await })

Also applies to: 147-148

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/src/lib.rs` at line 131, The
NormalizePathLayer::trim_trailing_slash() is currently added via Router::layer
(app.layer(...)) which runs after route matching; instead wrap the entire Router
as a service so normalization happens before matching. Move the
NormalizePathLayer::trim_trailing_slash() out of the Router::layer calls and
apply it as an outer service wrapper around the router when building/serving
(wrap the Router value with the NormalizePathLayer service or use ServiceBuilder
to layer NormalizePathLayer around the Router) — update occurrences referencing
app.layer(...) and any other Router::layer uses (including the instances around
lines with app.layer at 147-148) so the router is normalized prior to route
matching.
🧹 Nitpick comments (1)
crates/minikupo/src/routes/health.rs (1)

57-57: Avoid hardcoding the Kupo compatibility suffix inline.

Line 57 bakes 2.11.0 directly into runtime output; this is easy to forget during upgrades. Prefer a shared constant (or config-driven value) used by both docs and runtime responses.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/src/routes/health.rs` at line 57, Replace the hardcoded
"-2.11.0" suffix in the version formatting with a single shared constant (e.g.
KUPO_COMPAT_SUFFIX) and use that constant where the runtime health response
builds the version (the line containing version: format!("{}-2.11.0",
env!("CARGO_PKG_VERSION"))). Add the constant (pub const KUPO_COMPAT_SUFFIX:
&str = "-2.11.0";) in a shared module or near crate metadata so docs and runtime
can import it, then change the format call to format!("{}{}",
env!("CARGO_PKG_VERSION"), KUPO_COMPAT_SUFFIX) so upgrades only require updating
the constant.
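The shared-constant approach from this nitpick, sketched with the suffix value quoted in the comment (`KUPO_COMPAT_SUFFIX` is the hypothetical name suggested above, not an existing item in the crate):

```rust
// Keep the Kupo compatibility suffix in one place so an upgrade only
// touches the constant, and both docs and the runtime response stay in sync.
pub const KUPO_COMPAT_SUFFIX: &str = "-2.11.0";

fn version(pkg_version: &str) -> String {
    // In the real handler pkg_version would come from env!("CARGO_PKG_VERSION").
    format!("{pkg_version}{KUPO_COMPAT_SUFFIX}")
}

fn main() {
    assert_eq!(version("0.5.0"), "0.5.0-2.11.0");
}
```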
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/minikupo/openapi.yaml`:
- Around line 1956-2276: The OpenAPI spec includes operationIds (getPatterns,
putPatterns, matchPattern, deletePattern, putPattern, sampleCheckpoints,
getCheckpointBySlot, getMetadataBySlot, getHealth, getMetrics) that are not
wired in the application router, causing contract drift; fix by either (A)
adding corresponding route handlers and wiring in the router setup function
(e.g., the function that configures/mounts routes such as
configure_routes/create_router/register_routes) mapping each operationId to its
handler and returning the correct status/content-types, or (B) removing or
trimming those paths from openapi.yaml to match the actual mounted endpoints;
ensure any added handlers follow existing handler signatures and middleware
patterns and update API tests and OpenAPI generation so spec and router remain
in sync.

In `@crates/minikupo/src/routes/datums.rs`:
- Around line 29-39: The datum hash validator currently accepts uppercase hex;
update the validation in the function that returns Err(StatusCode::BAD_REQUEST)
so it enforces 64 characters and only lowercase hex (e.g., require value.len()
== 64 && value == value.to_lowercase() && value.chars().all(|c|
c.is_ascii_hexdigit())), and keep or adjust datum_hash_hint() to match the new
lowercase-only rule so the hint and validation are consistent.

In `@crates/minikupo/src/routes/scripts.rs`:
- Around line 30-40: The validation currently permits uppercase hex but the
user-facing hint from script_hash_hint() claims lowercase-only; change the
parser in the validation block to enforce lowercase hex by adding a lowercase
check (e.g. ensure value.len() == 56, all chars are hex, and value ==
value.to_lowercase()), and keep the error returns unchanged so that
Hash::new(bytes) still receives a 28-byte lowercase hex decode; adjust the
condition that currently uses value.chars().all(|c| c.is_ascii_hexdigit()) to
also verify lowercase (or compare value to value.to_lowercase()) to match
script_hash_hint().

---

Duplicate comments:
In `@crates/minikupo/openapi.yaml`:
- Line 1205: Replace the typo in the OpenAPI description string "An absolut slot
number." with the correct wording "An absolute slot number." — locate the
description entry that currently contains the exact text "An absolut slot
number." in the OpenAPI YAML and update it to "An absolute slot number." so the
schema description is spelled correctly.
- Line 1886: Two occurrences use the misspelled key "operationdId" which
prevents generators from recognizing operationId; replace the misspelled
"operationdId" with the correct "operationId" for the affected operations (e.g.,
the entry currently showing operationdId: getDatumByHash and the other
occurrence referenced in the comment) so the OpenAPI operations have the proper
operationId fields.

In `@crates/minikupo/src/lib.rs`:
- Line 131: The NormalizePathLayer::trim_trailing_slash() is currently added via
Router::layer (app.layer(...)) which runs after route matching; instead wrap the
entire Router as a service so normalization happens before matching. Move the
NormalizePathLayer::trim_trailing_slash() out of the Router::layer calls and
apply it as an outer service wrapper around the router when building/serving
(wrap the Router value with the NormalizePathLayer service or use ServiceBuilder
to layer NormalizePathLayer around the Router) — update occurrences referencing
app.layer(...) and any other Router::layer uses (including the instances around
lines with app.layer at 147-148) so the router is normalized prior to route
matching.

In `@crates/minikupo/src/routes/matches.rs`:
- Line 152: The code uses query.resolve_hashes.is_some() which treats any
provided value (including false) as enabled; change this to evaluate the actual
boolean value instead (e.g., set resolve_hashes =
query.resolve_hashes.unwrap_or(false) or compare to Some(true)) so only an
explicit true enables resolution; update the references to resolve_hashes
accordingly in the matches handling (look for the query.resolve_hashes usage and
the local resolve_hashes variable).
- Around line 654-656: The code currently builds a lookup key as
format!("{policy_id}.{asset_name}") and rejects assets when that key is missing,
which breaks matching for empty-name assets represented in map_value as just
policy_id; change the key construction in the asset-name filtering blocks (the
branch that reads filters.asset_name.as_ref() and the similar block later) to
use policy_id alone when asset_name is empty (e.g., if asset_name.is_empty()
then key = policy_id.clone() else key = format!("{policy_id}.{asset_name}")),
and ensure you do not skip processing when asset_name == "" so policy-only units
are matched against item.value.assets contains_key(&key).

In `@crates/minikupo/src/types.rs`:
- Around line 145-148: The Prometheus output is hardcoded to "1.0" instead of
using the runtime values; update the code that builds the metrics string (the
block producing "kupo_configuration_indexes 1.0" and "kupo_connection_status
1.0") to interpolate the actual values from self.configuration.indexes and
self.connection_status. Locate the function or impl in types.rs that formats
these lines (search for the literal "kupo_configuration_indexes" /
"kupo_connection_status") and replace the fixed "1.0" with the appropriate
numeric formatted values (e.g., cast or convert the index count and connection
status to f64 or string) so the emitted metrics reflect the current state.
Ensure formatting matches Prometheus expected numeric format and handle any
Option/enum conversion for self.connection_status consistently.
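A minimal sketch of the interpolation being asked for — the `Health` struct and its fields here are stand-ins for whatever `types.rs` actually holds, so only the formatting approach is the point:

```rust
/// Illustrative health snapshot; the real type lives in
/// crates/minikupo/src/types.rs and may shape these fields differently.
struct Health {
    indexes: usize,
    connected: bool,
}

/// Emit gauges from runtime state instead of the hardcoded "1.0".
/// `{:.1}` yields Prometheus-friendly numeric output like "3.0".
fn prometheus_metrics(h: &Health) -> String {
    format!(
        "# TYPE kupo_configuration_indexes gauge\n\
         kupo_configuration_indexes {:.1}\n\n\
         # TYPE kupo_connection_status gauge\n\
         kupo_connection_status {:.1}\n",
        h.indexes as f64,
        if h.connected { 1.0 } else { 0.0 },
    )
}
```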
- Around line 56-58: The MetadatumInt struct's field int is declared as i32
which is too narrow; change pub struct MetadatumInt { pub int: i64 } to use a
64-bit integer and update any conversion/serialization/deserialization code that
references MetadatumInt::int (e.g., metadata route conversion helpers) to
accept/produce i64 to avoid lossy/downcasting and rejection of valid values.
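The width problem is easy to demonstrate: values just past `i32::MAX` are legitimate metadatum integers but cannot round-trip through an `i32` field. The struct below mirrors the field name from the comment rather than the crate's exact type:

```rust
/// Illustrative stand-in for the struct discussed above; the real type
/// lives in crates/minikupo/src/types.rs.
pub struct MetadatumInt {
    pub int: i64,
}

/// A value such as 2^31 overflows i32 but fits comfortably in i64.
fn fits_i32(raw: i64) -> bool {
    i32::try_from(raw).is_ok()
}
```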

---

Nitpick comments:
In `@crates/minikupo/src/routes/health.rs`:
- Line 57: Replace the hardcoded "-2.11.0" suffix in the version formatting with
a single shared constant (e.g. KUPO_COMPAT_SUFFIX) and use that constant where
the runtime health response builds the version (the line containing version:
format!("{}-2.11.0", env!("CARGO_PKG_VERSION"))). Add the constant (pub const
KUPO_COMPAT_SUFFIX: &str = "-2.11.0";) in a shared module or near crate metadata
so docs and runtime can import it, then change the format call to
format!("{}{}", env!("CARGO_PKG_VERSION"), KUPO_COMPAT_SUFFIX) so upgrades only
require updating the constant.
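A sketch of the shared constant, with the name taken from the review comment; at the real call site the first argument would be `env!("CARGO_PKG_VERSION")`, resolved at build time:

```rust
/// Single source of truth for the advertised Kupo compatibility version.
pub const KUPO_COMPAT_SUFFIX: &str = "-2.11.0";

/// Build the health-response version string from the crate version plus
/// the compatibility suffix, so upgrades touch only the constant.
fn version_string(pkg_version: &str) -> String {
    format!("{}{}", pkg_version, KUPO_COMPAT_SUFFIX)
}
```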

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b0a84b7d-42bc-4320-85dc-79bd92907c9a

📥 Commits

Reviewing files that changed from the base of the PR and between 7b347be and b313ece.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
  • Cargo.toml
  • crates/core/src/config.rs
  • crates/minikupo/Cargo.toml
  • crates/minikupo/openapi.yaml
  • crates/minikupo/src/lib.rs
  • crates/minikupo/src/patterns.rs
  • crates/minikupo/src/routes/datums.rs
  • crates/minikupo/src/routes/health.rs
  • crates/minikupo/src/routes/matches.rs
  • crates/minikupo/src/routes/metadata.rs
  • crates/minikupo/src/routes/mod.rs
  • crates/minikupo/src/routes/scripts.rs
  • crates/minikupo/src/types.rs
  • docs/content/apis/minikupo.mdx
  • src/bin/dolos/init.rs
  • src/bin/dolos/main.rs
  • src/bin/dolos/minikupo.rs
  • src/serve/mod.rs
✅ Files skipped from review due to trivial changes (2)
  • crates/minikupo/src/routes/mod.rs
  • docs/content/apis/minikupo.mdx
🚧 Files skipped from review as they are similar to previous changes (3)
  • crates/core/src/config.rs
  • src/bin/dolos/init.rs
  • src/serve/mod.rs

Comment on lines +1956 to +2276
/patterns:
get:
operationId: getPatterns
tags: ["Patterns"]
summary: Get Patterns
description: |
Retrieve all patterns currently configured on the server.
responses:
200:
description: OK
headers: *default-headers
content:
"application/json;charset=utf-8":
schema:
type: array
items:
$ref: "#/components/schemas/Pattern"

put: &putPatterns
operationId: putPatterns
tags: ["Patterns"]
summary: Bulk Add Patterns
description: |
Add many patterns at once. See [Add Pattern (*)](#operation/putPattern) for more details.
requestBody:
content:
"application/json;charset=utf-8":
schema:
type: object

additionalProperties: false
required:
- patterns
- rollback_to
properties:
patterns:
type: array
description: A list of patterns on addresses, assets or transactions as described in [Patterns](#section/Patterns).
items:
$ref: "#/components/schemas/Pattern"
rollback_to:
$ref: "#/components/schemas/ForcedRollback/properties/rollback_to"
limit:
$ref: "#/components/schemas/ForcedRollback/properties/limit"
responses:
200:
description: OK
headers:
Content-Type:
$ref: "#/components/headers/Content-Type"
content:
"application/json;charset=utf-8":
schema:
type: array
items:
$ref: "#/components/schemas/Pattern"
400:
description: Bad Request
content:
"application/json;charset=utf-8":
schema:
$ref: "#/components/schemas/BadRequest"

/patterns/{pattern}:
get: &getPattern
operationId: matchPattern
tags: ["Patterns"]
summary: Get Pattern (*)
description: |
Retrieve all patterns that include<sup>*</sup> the provided pattern. Remember that an address is itself a pattern!
This endpoint is thereby useful to check whether an address is matched by a given pattern configuration (the returned
list would be non-empty).

> <sup><strong>(*) Definition</strong></sup> <br/>
> If all results matched by `y` are also matched by `x`, then `x` is said to include `y`.
parameters:
- $ref: "#/components/parameters/pattern"
responses:
200:
description: OK
Content-Type:
$ref: "#/components/headers/Content-Type"
content:
"application/json;charset=utf-8":
schema:
type: array
items:
$ref: "#/components/schemas/Pattern"

delete: &deletePattern
operationId: deletePattern
tags: ["Patterns"]
summary: Delete Pattern (*)
description: |
Removes patterns from the database and active filtering. Note that this does
**NOT** remove the corresponding matches nor will it halt or restart synchronization.
The server will continue filtering new blocks, but with that pattern removed.
parameters:
- $ref: "#/components/parameters/pattern"
responses:
200:
description: OK
Content-Type:
$ref: "#/components/headers/Content-Type"
content:
"application/json;charset=utf-8":
schema:
$ref: "#/components/schemas/Deleted"
400:
description: Bad Request
content:
"application/json;charset=utf-8":
schema:
$ref: "#/components/schemas/BadRequest"

put: &putPattern
operationId: putPattern
tags: ["Patterns"]
summary: Add Pattern (*)
description: |
Add a new pattern to watch and force the server to re-sync from a given point. Only blocks discovered from that point will be matched against the new pattern.

This endpoint may not be instantaneous, as the forced rollback may only occur after the next block arrives in the indexer. On the main network, this usually takes
no more than 20 seconds. On test networks, however, where block density tends to be much lower, this may take longer.

Note also that a very long rollback will take a substantial amount of time to be processed by the server; if the server is shut down while a rollback is ongoing, the
rollback will be aborted and the server will remain where it was, although with the extra pattern added.
parameters:
- $ref: "#/components/parameters/pattern"
requestBody:
content:
"application/json;charset=utf-8":
schema:
$ref: "#/components/schemas/ForcedRollback"
responses:
200:
description: OK
Content-Type:
$ref: "#/components/headers/Content-Type"
content:
"application/json;charset=utf-8":
schema:
type: array
items:
$ref: "#/components/schemas/Pattern"
400:
description: Bad Request
content:
"application/json;charset=utf-8":
schema:
$ref: "#/components/schemas/BadRequest"

/checkpoints:
get:
operationId: sampleCheckpoints
tags: ["Checkpoints"]
summary: Sample Checkpoints
description: |
Retrieve a **sample of** all checkpoints currently in the database, in descending `slot_no` order.
This is useful to know where the synchronization is at. On restart, the synchronization will continue from the most recent
checkpoint that is also valid on the network.
responses:
200:
description: OK
headers: *default-headers
content:
"application/json;charset=utf-8":
schema:
type: array
items:
$ref: "#/components/schemas/Point"
304:
$ref: "#/components/responses/304"

/checkpoints/{slot-no}:
get:
operationId: getCheckpointBySlot
tags: ["Checkpoints"]
summary: Get Checkpoint By Slot
description: |
Retrieve a checkpoint by its (absolute) slot number. The query is flexible by default, meaning that if there's no checkpoint at the given slot, the server
will look for the closest (i.e. most recent) slot that is **before** the provided slot number. This is particularly useful to find ancestors to known slots
in order to use them for references on-chain.
parameters:
- $ref: "#/components/parameters/slot-no"
- $ref: "#/components/parameters/strict"
responses:
200:
description: OK
headers: *default-headers
content:
"application/json;charset=utf-8":
schema:
oneOf:
- $ref: "#/components/schemas/Point"
- type: "null"
304:
$ref: "#/components/responses/304"

/metadata/{slot-no}:
get:
operationId: getMetadataBySlot
tags: ["Metadata"]
summary: Get Metadata By Slot
description: |
Retrieve all metadata data seen in a block at the given slot, possibly filtered by transaction id.

Metadata are ordered according to their respective transaction's order in the block.
parameters:
- <<: *transaction-id
description: |
Filters results by transaction id to retrieve items originating from that transaction only.
responses:
200:
description: OK
headers:
<<: *default-headers
X-Block-Header-Hash:
<<: *HeaderHash
description: |
The block's header hash digest (`blake2b-256`) of the corresponding block at the provided slot. Clients are expected to control that this hash matches the one they expect, especially when fetching metadata from blocks in the unstable region (i.e. less than `k=2160` blocks from the network tip). Indeed, Cardano is a distributed system after all and data isn't guaranteed to be immediately immutable. Data present in very recent blocks may therefore be different between two successive requests.
content:
"application/json;charset=utf-8":
schema:
type: array
items:
$ref: "#/components/schemas/Metadata"
304:
$ref: "#/components/responses/304"

/health:
get:
operationId: getHealth
tags: ["Health"]
summary: Get Health
description: |
Retrieve Kupo's application health status. Note that this call is cheap and does not halt the various concurrent tasks performed by Kupo.

This endpoint has two possible content-types: `application/json` or `text/plain`. The latter returns health in a format suitable for [Prometheus](https://prometheus.io/docs/introduction/overview/). The server defaults to `application/json`, but you can control this behavior by passing a corresponding `Accept` header.

In addition, the server may return any of the following response codes:
- 200: when healthy
- 202: when connected to a chain producer but still syncing
- 503: when disconnected from a chain producer
responses:
200:
description: Healthy
headers: *default-headers
content:
"application/json;charset=utf-8":
schema:
$ref: "#/components/schemas/Health"

"text/plain;charset=utf-8":
schema:
$ref: "#/components/schemas/HealthPrometheus"
example: &HealthPrometheusExample |
# TYPE kupo_configuration_indexes gauge
kupo_configuration_indexes 1.0

# TYPE kupo_connection_status gauge
kupo_connection_status 1.0

# TYPE kupo_most_recent_checkpoint counter
kupo_most_recent_checkpoint 61264845

# TYPE kupo_most_recent_node_tip counter
kupo_most_recent_node_tip 82838775

# TYPE kupo_network_synchronization gauge
kupo_network_synchronization 0.73956

# TYPE kupo_seconds_since_last_block gauge
kupo_seconds_since_last_block 2.0

202:
description: Syncing
headers: *default-headers
content:
"application/json;charset=utf-8":
schema:
$ref: "#/components/schemas/Health"

"text/plain;charset=utf-8":
example: *HealthPrometheusExample
schema:
$ref: "#/components/schemas/HealthPrometheus"

503:
description: Unavailable
headers: *default-headers
content:
"application/json;charset=utf-8":
schema:
$ref: "#/components/schemas/Health"

"text/plain;charset=utf-8":
example: *HealthPrometheusExample
schema:
$ref: "#/components/schemas/HealthPrometheus"

/metrics:
get:
operationId: getMetrics
tags: ["Health"]
summary: Get Metrics
description: |
Like [`/health`](#operation/getHealth), but always returns `200 OK` as a status.
responses:
200:
description: Metrics
headers: *default-headers
content:
"text/plain;charset=utf-8":
example: *HealthPrometheusExample
schema:
$ref: "#/components/schemas/HealthPrometheus"

"application/json;charset=utf-8":
schema:
$ref: "#/components/schemas/Health"

⚠️ Potential issue | 🟠 Major

OpenAPI advertises endpoints that are not mounted in the router.

From Line 1956 onward, this spec includes operations like /patterns, /checkpoints, and /metrics, but the current router wiring (in crates/minikupo/src/lib.rs) mounts a narrower set. This creates API contract drift and client-facing 404/405 behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/openapi.yaml` around lines 1956 - 2276, The OpenAPI spec
includes operationIds (getPatterns, putPatterns, matchPattern, deletePattern,
putPattern, sampleCheckpoints, getCheckpointBySlot, getMetadataBySlot,
getHealth, getMetrics) that are not wired in the application router, causing
contract drift; fix by either (A) adding corresponding route handlers and wiring
in the router setup function (e.g., the function that configures/mounts routes
such as configure_routes/create_router/register_routes) mapping each operationId
to its handler and returning the correct status/content-types, or (B) removing
or trimming those paths from openapi.yaml to match the actual mounted endpoints;
ensure any added handlers follow existing handler signatures and middleware
patterns and update API tests and OpenAPI generation so spec and router remain
in sync.

Comment on lines +29 to +39
if value.len() != 64 || !value.chars().all(|c| c.is_ascii_hexdigit()) {
return Err(StatusCode::BAD_REQUEST);
}

let bytes = hex::decode(value).map_err(|_| StatusCode::BAD_REQUEST)?;
Ok(bytes.as_slice().into())
}

fn datum_hash_hint() -> String {
"Invalid datum hash. Hash must be 64 lowercase hex characters.".to_string()
}

⚠️ Potential issue | 🟡 Minor

Hash validation allows uppercase despite a lowercase-only contract.

Line 29 accepts uppercase hex, while Line 38 says lowercase-only (and the spec documents lowercase patterns). Enforce lowercase or relax the hint/spec.

Proposed fix (strict lowercase)
-    if value.len() != 64 || !value.chars().all(|c| c.is_ascii_hexdigit()) {
+    if value.len() != 64
+        || !value
+            .bytes()
+            .all(|b| b.is_ascii_digit() || (b'a'..=b'f').contains(&b))
+    {
         return Err(StatusCode::BAD_REQUEST);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/src/routes/datums.rs` around lines 29 - 39, The datum hash
validator currently accepts uppercase hex; update the validation in the function
that returns Err(StatusCode::BAD_REQUEST) so it enforces 64 characters and only
lowercase hex (e.g., require value.len() == 64 && value == value.to_lowercase()
&& value.chars().all(|c| c.is_ascii_hexdigit())), and keep or adjust
datum_hash_hint() to match the new lowercase-only rule so the hint and
validation are consistent.
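The strict lowercase check proposed in the diff above generalizes over length, so the same predicate covers both the 64-char datum hash and the 56-char script hash discussed below. This is a standalone sketch; the real handlers return `StatusCode::BAD_REQUEST` instead of a bool:

```rust
/// True only for strings of exactly `expected_len` lowercase hex characters.
/// Unlike `is_ascii_hexdigit`, this rejects 'A'..='F'.
fn is_lowercase_hex(value: &str, expected_len: usize) -> bool {
    value.len() == expected_len
        && value
            .bytes()
            .all(|b| b.is_ascii_digit() || (b'a'..=b'f').contains(&b))
}
```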

Comment on lines +30 to +40
if value.len() != 56 || !value.chars().all(|c| c.is_ascii_hexdigit()) {
return Err(StatusCode::BAD_REQUEST);
}

let bytes = hex::decode(value).map_err(|_| StatusCode::BAD_REQUEST)?;
let bytes: [u8; 28] = bytes.try_into().map_err(|_| StatusCode::BAD_REQUEST)?;
Ok(Hash::new(bytes))
}

fn script_hash_hint() -> String {
"Invalid script hash. Hash must be 56 lowercase hex characters.".to_string()

⚠️ Potential issue | 🟡 Minor

Script hash validation and error hint are inconsistent.

Line 30 allows uppercase hex, but Line 40 states lowercase-only. Align parser behavior with the declared contract.

Proposed fix (strict lowercase)
-    if value.len() != 56 || !value.chars().all(|c| c.is_ascii_hexdigit()) {
+    if value.len() != 56
+        || !value
+            .bytes()
+            .all(|b| b.is_ascii_digit() || (b'a'..=b'f').contains(&b))
+    {
         return Err(StatusCode::BAD_REQUEST);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/minikupo/src/routes/scripts.rs` around lines 30 - 40, The validation
currently permits uppercase hex but the user-facing hint from script_hash_hint()
claims lowercase-only; change the parser in the validation block to enforce
lowercase hex by adding a lowercase check (e.g. ensure value.len() == 56, all
chars are hex, and value == value.to_lowercase()), and keep the error returns
unchanged so that Hash::new(bytes) still receives a 28-byte lowercase hex
decode; adjust the condition that currently uses value.chars().all(|c|
c.is_ascii_hexdigit()) to also verify lowercase (or compare value to
value.to_lowercase()) to match script_hash_hint().
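For the decode step that follows validation, here is a stdlib-only sketch of turning a 56-char hex string into the fixed 28-byte array the script route needs. The real code uses the `hex` crate and a pallas `Hash`; note that `to_digit(16)` also accepts uppercase, so lowercase enforcement belongs in the validation step above:

```rust
/// Decode a 56-character hex string into [u8; 28], returning None on any
/// malformed input (wrong length or non-hex character).
fn decode_hash28(value: &str) -> Option<[u8; 28]> {
    if value.len() != 56 {
        return None;
    }
    let mut out = [0u8; 28];
    for (i, chunk) in value.as_bytes().chunks(2).enumerate() {
        // Each byte is two hex nibbles: high then low.
        let hi = (chunk[0] as char).to_digit(16)?;
        let lo = (chunk[1] as char).to_digit(16)?;
        out[i] = ((hi << 4) | lo) as u8;
    }
    Some(out)
}
```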

@scarmuega scarmuega merged commit 1a700d5 into main Mar 7, 2026
12 checks passed
@scarmuega scarmuega deleted the exp/kupo branch March 7, 2026 13:56