feat(ledger): trade-ledger foundation — schema + no-op access layer#115
Conversation
Adds postgres:17 service, initial migration, sqlx-cli migrate script, and Ledger traits in Rust + Go with no-op defaults. Binaries stay backward-compatible when DATABASE_URL is unset.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Pablosinyores
left a comment
There was a problem hiding this comment.
Review — feat(ledger): trade-ledger foundation
Foundation slice is well-scoped and the "zero behaviour change when DATABASE_URL unset" promise holds in code. Verified locally: cargo build -p aether-common, cargo test -p aether-common --lib db:: (2/2), go build ./internal/db/..., go test ./internal/db/..., go vet ./internal/db/..., docker compose config, bash -n scripts/db_migrate.sh — all clean. Schema↔payload mapping walked column-by-column on both sides.
Hot-path latency impact verified zero: NoopLedger is empty-body vtable dispatch, no alloc, no atomic. OnceLock/sync.Once only fires in the constructor.
Blocking (CRITICAL)
1. internal/db/ledger.go:40 — Error string is not nullable
SQL inclusion_results.error TEXT is nullable. Go zero-value "" will write an empty string instead of NULL, so WHERE error IS NULL returns zero rows for successful inclusions. Rust counterpart at crates/common/src/db.rs:70 correctly uses Option<String>.
- Error string
+ Error *string2. internal/db/ledger.go:39 — LandedTxHash []byte has no length contract
Schema is semantically a 32-byte tx hash; Rust counterpart at crates/common/src/db.rs:69 is Option<B256> (exactly 32 bytes). The unbounded []byte lets a caller silently write 0/20/64 bytes — Rust readers would then panic or get garbage. Suggest *[32]byte (nullable) or at minimum a documented length contract.
Should Fix Before Merge (WARNING)
go.mod:29—github.com/google/uuidis marked// indirectbut is now imported directly ininternal/db/ledger.go:16. Anygo mod tidywill rewritego.mod/go.sum. Rungo mod tidyand commit the result.crates/common/src/db.rs:117-136—uuid_compat::Uuidhas no documented byte-order contract.google/uuidis RFC 4122; if the futureuuid::Uuidswap uses a different layout, the FK correlationarbs.arb_id↔bundles.arb_idbreaks silently. Add// TODO: replace with uuid::Uuid (RFC 4122 byte order) before PgLedger impl shipson the struct so the swap is gated on the byte-layout match.crates/common/src/db.rs:52—NewPool.protocol: Stringis stringly-typed;ProtocolTypeenum already exists atcrates/common/src/types.rs:7. Casing drift in the DB will silently breakWHERE protocol = ...filters. Bind through the enum'sDisplayimpl.deploy/docker/docker-compose.yml:177— postgres service starts unconditionally on everydocker compose up, consuming port 5432 and ~50MB RAM even when no binary is wired to it. Therethservice at line 199 usesprofiles: [node]for the same reason. Addprofiles: [ledger](or similar) to preserve the "no surprise" contract for current operators.- Clock-authority inconsistency —
internal/db/ledger.go:25(SubmittedAt) and:41(ResolvedAt) are client-settime.Times that override the SQLDEFAULT now(), whilearbs.tshas no Rust field and is DB-authoritative. If the Go executor's clock drifts vs Postgres,submitted_at/resolved_ataren't safely comparable witharbs.ts. Either document this as intentional (Go clock = "when Go acted", DB clock = "when row was written") or unify.
Nice to Have (SUGGESTION)
migrations/0001_trade_ledger.sql:62—inclusion_results_bundle_id_idxis redundant with the composite PK(bundle_id, builder)(line 59); the PK B-tree already serves any leading-column query. Drop it.internal/db/ledger.go:57-59— interface methods take structs by value.NewBundle.SignedTxHexis a ~500B string. NoopLedger is fine (compiler elides), but the future PgLedger is forced to copy on every call. Pointer receivers cost nothing now and avoid a future interface-breaking change.crates/common/src/db.rs:81— trait methods are infallible by design. This boxes the future PgLedger into log-and-drop, which works for fire-and-forget but rules out caller-side retry/metric emission. Consider-> Result<(), LedgerError>with an extension trait that auto-logs-and-drops for hot-path callers.
What's Good
- Single coherent commit, conventional format, clean message.
- 9 files, no drive-by edits, scope matches the PR body.
NUMERIC(78,0)correctly sized for U256 (2^256 has 78 digits).BYTEA/TIMESTAMPTZchoices are correct.OnceLock<()>/sync.Oncestartup-warn is symmetric across both binaries with identical log message, so operators can grepledger disabledon either side.- Trait object safety is explicitly tested (
db.rs:151). - No existing code imports
aether_common::dborinternal/dboutside the new files — runtime behaviour vsmainis genuinely unchanged.
Verdict
REQUEST CHANGES — gated on the two CRITICAL items above. Once those land, the WARNINGs are cheap fixes that lock the schema/trait into a shape that won't fight the PgLedger follow-up.
…sion NewInclusion previously used a non-nullable string for Error and an unbounded []byte for LandedTxHash. Both diverged from the SQL schema and the Rust counterpart (Option<String>, Option<B256>): - string would silently write "" to a NULL-allowed column, breaking WHERE error IS NULL queries used by the reconciliation job. - []byte allowed any length, while inclusion_results.landed_tx_hash is semantically a 32-byte tx hash and Rust enforces B256. Switching to *string and *[32]byte makes nil round-trip to SQL NULL and pushes the 32-byte invariant into the type system, eliminating silent corruption when readers expect a fixed-length hash. Refs PR #115 review.
go mod tidy hoists github.com/google/uuid (used by internal/db) and the four go.opentelemetry.io packages already imported directly elsewhere out of the indirect block. Required after the trade-ledger foundation imported uuid as a direct dep. Refs PR #115 review.
Replaces the stringly-typed protocol field with the ProtocolType enum so callers cannot construct a pool record with a value the rest of the system does not recognise. The Postgres column remains TEXT — the PgLedger impl will serialise via ProtocolType's serde repr, preserving schema compatibility while pushing the validation up to compile time. Adds Default for ProtocolType (UniswapV2) so NewPool keeps its derived Default; UniswapV2 = 1 mirrors the on-chain enum's first variant and never silently maps to an undefined protocol code. Refs PR #115 review.
Pins the in-memory layout to RFC 4122 network byte order, matching
Postgres's uuid binary wire format and uuid::Uuid::{as_bytes,
from_bytes}. Documents the contract on the module and on both
constructors so the eventual swap to uuid::Uuid is a straight
round-trip with no endianness surprises.
Refs PR #115 review.
Devs not using the trade ledger should not pay for an idle Postgres process. Adding the service to the 'ledger' profile keeps it out of the default 'docker compose up' set; opt in with: docker compose --profile ledger up -d postgres Mirrors the existing 'node' profile used for the optional Reth service. .env.example updated with the new bring-up sequence. Refs PR #115 review.
Splits the timestamp columns into two classes: - Event-time (arbs.ts, bundles.submitted_at, inclusion_results .resolved_at) are client-set; DEFAULT now() is a fallback for ad-hoc psql inserts only. Application writers populate the value at the moment the event happens to keep latency analysis and back-tests honest. - Persistence-boundary (pool_registry.first_seen/last_seen, pnl_daily.updated_at) are DB-set, since they record when the row landed, not when the event occurred. Documents the policy at the head of the migration so future writers do not silently fall back to now() and skew downstream analysis. Refs PR #115 review.
The manual Default impl introduced in 0944ef3 trips clippy::derivable_impls under cargo clippy --workspace -- -D warnings, which is the gate the CI rust job runs. Replacing it with #[derive (Default)] + #[default] on UniswapV2 keeps the same semantics (UniswapV2 = 1 as the safe sentinel) and unblocks CI. Refs PR #115 review.
Pablosinyores
left a comment
There was a problem hiding this comment.
Re-review — all prior findings addressed
Verified locally on e4fc873: cargo build -p aether-common, cargo test -p aether-common --lib db:: (2/2), cargo clippy -p aether-common --all-targets -- -D warnings, go build ./internal/db/..., go test ./internal/db/..., go vet ./internal/db/... — all clean. CI green (rust 2m55s, go 29s, solidity 17s, toolchain 4s). mergeStateStatus: CLEAN.
Blocking findings — both resolved
internal/db/ledger.go:48-49 — LandedTxHash *[32]byte and Error *string now match the Rust counterpart's nullability + length contract. The struct doc explicitly calls out the bugs the previous shapes would have caused ("" instead of NULL; arbitrary-length tx hashes), which closes the loop nicely for a future reader.
Non-blocking findings — all resolved
go.mod:7-12—github.com/google/uuidand the fourgo.opentelemetry.io/otelpackages promoted to the direct require block.go mod tidyis now a no-op for contributors.crates/common/src/db.rs:124-137—uuid_compat::Uuidbyte-order contract documented as RFC 4122 network byte order, with explicit notes on Postgres binary wire format compatibility,uuid::Uuid::as_bytes/from_bytesround-trip, and the gotcha aroundfrom_bytes_le/ Windows GUIDs. The futureuuid::Uuidswap is now a typed contract, not an implicit assumption.crates/common/src/db.rs:59—NewPool.protocolis nowProtocolType, notString. Stringly-typed casing drift in the DB is no longer possible.crates/common/src/types.rsalso derivesDefaultmapping toProtocolType::UniswapV2 = 1so theDefaultderive onNewPoolkeeps working — the rationale is in the doc comment, which I think is the right safe sentinel.deploy/docker/docker-compose.yml—postgresservice is now gated behindprofiles: [ledger]. Verified:docker compose config --servicesreturns 9 services (no postgres);docker compose --profile ledger config --servicesreturns 10 (postgres added). The "no surprise" contract for current operators is intact.migrations/0001_trade_ledger.sql:11-22— clock-authority policy is now an explicit comment in the schema header: event-time columns (arbs.ts,bundles.submitted_at,inclusion_results.resolved_at) are CLIENT-SET, persistence-boundary columns (pool_registry.first_seen/last_seen,pnl_daily.updated_at) are DB-SET. The reasoning (event-vs-row time skew under load) is captured. Future wiring PRs have a clear contract.
One minor follow-up (non-blocking, just for tracking)
The new clock-authority policy says arbs.ts is CLIENT-SET, but NewArb at crates/common/src/db.rs:30-48 still has no ts field. The wiring follow-up that ships PgLedger will need to add pub ts: chrono::DateTime<Utc> (or equivalent) to NewArb and have callers populate it at arb-creation time. Same applies to NewBundle.SubmittedAt and NewInclusion.ResolvedAt on the Go side, which already have client-set fields — those are good. Just calling out the asymmetry so it doesn't slip when PgLedger lands.
Quality of the fixes is high — author didn't just patch the field types but also added doc comments explaining the bugs the original shapes would have caused, which makes the intent durable.
Verdict: APPROVE
…sion NewInclusion previously used a non-nullable string for Error and an unbounded []byte for LandedTxHash. Both diverged from the SQL schema and the Rust counterpart (Option<String>, Option<B256>): - string would silently write "" to a NULL-allowed column, breaking WHERE error IS NULL queries used by the reconciliation job. - []byte allowed any length, while inclusion_results.landed_tx_hash is semantically a 32-byte tx hash and Rust enforces B256. Switching to *string and *[32]byte makes nil round-trip to SQL NULL and pushes the 32-byte invariant into the type system, eliminating silent corruption when readers expect a fixed-length hash. Refs PR #115 review.
go mod tidy hoists github.com/google/uuid (used by internal/db) and the four go.opentelemetry.io packages already imported directly elsewhere out of the indirect block. Required after the trade-ledger foundation imported uuid as a direct dep. Refs PR #115 review.
Replaces the stringly-typed protocol field with the ProtocolType enum so callers cannot construct a pool record with a value the rest of the system does not recognise. The Postgres column remains TEXT — the PgLedger impl will serialise via ProtocolType's serde repr, preserving schema compatibility while pushing the validation up to compile time. Adds Default for ProtocolType (UniswapV2) so NewPool keeps its derived Default; UniswapV2 = 1 mirrors the on-chain enum's first variant and never silently maps to an undefined protocol code. Refs PR #115 review.
Pins the in-memory layout to RFC 4122 network byte order, matching
Postgres's uuid binary wire format and uuid::Uuid::{as_bytes,
from_bytes}. Documents the contract on the module and on both
constructors so the eventual swap to uuid::Uuid is a straight
round-trip with no endianness surprises.
Refs PR #115 review.
Devs not using the trade ledger should not pay for an idle Postgres process. Adding the service to the 'ledger' profile keeps it out of the default 'docker compose up' set; opt in with: docker compose --profile ledger up -d postgres Mirrors the existing 'node' profile used for the optional Reth service. .env.example updated with the new bring-up sequence. Refs PR #115 review.
Splits the timestamp columns into two classes: - Event-time (arbs.ts, bundles.submitted_at, inclusion_results .resolved_at) are client-set; DEFAULT now() is a fallback for ad-hoc psql inserts only. Application writers populate the value at the moment the event happens to keep latency analysis and back-tests honest. - Persistence-boundary (pool_registry.first_seen/last_seen, pnl_daily.updated_at) are DB-set, since they record when the row landed, not when the event occurred. Documents the policy at the head of the migration so future writers do not silently fall back to now() and skew downstream analysis. Refs PR #115 review.
Replaces fire-and-forget tokio::spawn per write with a single dedicated
writer task draining a bounded mpsc::channel(1024). Every Ledger trait
method becomes a non-awaiting try_send; saturation drops the row and
bumps a metric instead of fanning out unbounded background tasks while
Postgres is slow.
Adds LedgerMetrics, registered on the engine's existing prometheus
Registry via the new EngineMetrics::registry() accessor so a single
/metrics endpoint emits both engine and ledger families:
aether_ledger_writes_total{op, result} Counter
aether_ledger_drops_total{op} Counter
aether_ledger_queue_depth Gauge
aether_ledger_write_latency_ms{op} Histogram
The writer task observes per-op latency from dequeue to query
completion, decrements queue_depth on dequeue, and tags every write
ok/err. A failing write logs and drops — never propagates to the
engine.
Channel size sized for ~3 s of bursty inserts at 200 arbs/s peak, so
saturation is the alert signal that Postgres is the bottleneck. The
sqlx pool stays bounded at 8 connections, so even if every slot stalls
the channel still drops cleanly rather than blocking.
Refs PR #115 follow-up; PR-1 of the 3-PR plan to close issue #95.
Replaces fire-and-forget tokio::spawn per write with a single dedicated
writer task draining a bounded mpsc::channel(1024). Every Ledger trait
method becomes a non-awaiting try_send; saturation drops the row and
bumps a metric instead of fanning out unbounded background tasks while
Postgres is slow.
Adds LedgerMetrics, registered on the engine's existing prometheus
Registry via the new EngineMetrics::registry() accessor so a single
/metrics endpoint emits both engine and ledger families:
aether_ledger_writes_total{op, result} Counter
aether_ledger_drops_total{op} Counter
aether_ledger_queue_depth Gauge
aether_ledger_write_latency_ms{op} Histogram
The writer task observes per-op latency from dequeue to query
completion, decrements queue_depth on dequeue, and tags every write
ok/err. A failing write logs and drops — never propagates to the
engine.
Channel size sized for ~3 s of bursty inserts at 200 arbs/s peak, so
saturation is the alert signal that Postgres is the bottleneck. The
sqlx pool stays bounded at 8 connections, so even if every slot stalls
the channel still drops cleanly rather than blocking.
Refs PR #115 follow-up; PR-1 of the 3-PR plan to close issue #95.
Summary
postgres:17compose service (healthcheck + named volume).migrations/0001_trade_ledger.sql— 5 tables:arbs,bundles,inclusion_results,pool_registry,pnl_daily. U256 →NUMERIC(78,0), variable-shape path data →JSONB, all timestampsTIMESTAMPTZ. Indexes per the issue spec.scripts/db_migrate.sh— sqlx-cli wrapper, loads.env, applies migrations.Ledgertrait +NoopLedgerdefault in Rust (crates/common/src/db.rs) withNewArb/NewPool/InclusionUpdatepayloads.Ledgerinterface +NoopLedgerdefault in Go (internal/db/ledger.go) withNewBundle/NewInclusion/PnLDailyDeltapayloads..env.exampledocumentsDATABASE_URL+POSTGRES_USER/PASSWORD/DB(commented out so unset behaviour stays the default).Backward compatibility: binaries still build and run identically to current
main.NoopLedgerdiscards every write and logs once at startup so operators can grep forledger disabled.Files Changed
migrations/0001_trade_ledger.sqlscripts/db_migrate.shcrates/common/src/db.rsLedgertrait +NoopLedgercrates/common/src/lib.rsdbmodulecrates/common/Cargo.tomlserde_json+tracingto runtime depsinternal/db/ledger.goLedgerinterface +NoopLedgerinternal/db/ledger_test.godeploy/docker/docker-compose.ymlpostgres:17service +postgres-datavolume.env.exampleDATABASE_URL+POSTGRES_*linesAcceptance Criteria (issue scope satisfied by this PR)
docker compose up postgresbrings up Postgres 17 with healthcheck (compose config validated; reviewer to do liveup)scripts/db_migrate.shis in place; applies0001_trade_ledger.sqlidempotently against any reachable PostgresDATABASE_URLunset: both binaries boot and run identically to today (no writes, no errors)cargo build --workspace,cargo clippy --workspace --all-targets -- -D warnings,cargo test -p aether-commoncleango build ./...,go test ./... -race -count=1,go vet ./...cleanTest plan
cargo build --workspace— greencargo clippy --workspace --all-targets -- -D warnings— cleancargo test -p aether-common --release— 40 pass incl new NoopLedger testsgo build ./...— greengo test ./... -count=1— green incl new ledger smoke testgo vet ./...— cleandocker compose -f deploy/docker/docker-compose.yml config— postgres service +postgres-datavolume registered correctlydocker compose up postgres -d+DATABASE_URL=postgres://aether:aether@localhost:5432/aether ./scripts/db_migrate.shapplies the migrationOut of scope — tracked under #95, will land in follow-ups
PgLedger—sqlx::PgPoolimpl ofLedger; engine wiring (insert_arbon everyARB PUBLISHED,insert_poolon auto-registered pools); workspacesqlx+uuiddeps.PgLedger—pgxpoolimpl; executor wiring (InsertBundleon bundle build,InsertInclusiononGetBundleStatsresolve).pnl_dailyrollup — scheduled job, once per UTC day.postgres:17service container in the Rust integration-test job, schema-applied + round-trip test.Refs #95