feat: v0.1.34 — validator commission, anchored session grid, settlement fixes, app/supplier hardening#1949
Merged
oten91 merged 41 commits intoMay 28, 2026
Conversation
cd668c7 to
35866b9
Compare
…v0.1.34 upgrade handler - Reject duplicate addresses in ValidateServiceRevShare - Iterate sorted map keys in distributeSupplierRewardsToShareholders for defense-in-depth - Reject module accounts as supplier owner address in StakeSupplier - Handle failed unbond refunds gracefully in EndBlocker instead of halting chain - Add DeduplicateSupplierRevShareAddresses migration for existing state - Add v0.1.34 upgrade handler with rev share dedup migration - Add accountKeeper to supplier keeper for module account detection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the GitHub API is unreachable, Ignite's resolveDevVersion returns "development" or "nightly" instead of a real version, which fails the >= v29 comparison. Skip the check in that case so local builds and CI don't break on transient API outages.
…session queries Applications previously overwrote their service config destructively on every restake (msg_server_stake_application.go: app.ServiceConfigs = msg.Services), so historical session queries (GetSession at a past height) resolved against the latest config rather than the one active at that height. This is the application-side analogue of the supplier service_config_history mechanism and the supplier session-query non-determinism already fixed. Design: lazy history with a flat-snapshot fallback. - An application that has never changed its service config keeps an empty service_config_history; GetActiveServiceConfigs falls back to the flat ServiceConfigs for all heights. This preserves today's behavior (new apps are immediately active) and requires NO upgrade migration or genesis backfill. - A service swap records history lazily: on the first change the prior config is materialized as active since height 1, then closed at the next session start while the new config opens at the next session start. Same-service restakes (stake bumps) only refresh the flat snapshot. - History is never pruned (keep-forever). Apps are orders of magnitude fewer than supplier service rows, and keeping it avoids the pruning-induced historical-query non-determinism seen on the supplier side. Changes: - proto: ApplicationServiceConfigUpdate message + Application.service_config_history (field 9). - Application.GetActiveServiceConfigs(height) / ApplicationServiceConfigUpdate.IsActive(height) / BackfillServiceConfigHistory. - recordApplicationServiceConfigChange replaces the destructive overwrite in createApplication/updateApplication and is reused by application transfer. - session_hydrator resolves the app's service via GetActiveServiceConfigs(blockHeight), matching supplier hydration. - GetApplication normalizes nil history to an empty slice (same convention as ServiceConfigs/PendingUndelegations) for stable in-memory + event round-trip representation. - genesis validates history entries when present. - v0.1.34 upgrade handler: no application migration needed (documented). Tests: application history activation/deactivation, empty-history fallback, lazy recording on swap vs no-op on same-service restake; existing keeper event tests updated for the new normalized repeated field.
…cations Consistency follow-up to the service config history audit: GetAllApplications now normalizes nil ServiceConfigHistory to an empty slice, matching GetApplication and the iterator accessor. Proto marshals nil and empty slices identically (no consensus or genesis round-trip impact), so this only keeps the in-memory representation consistent across read paths.
Settlement previously split the "proposer" reward pool purely by
network-wide stake weight, ignoring validator commission entirely
(validators earned only their self-bonded share, ~0.27% of the pool on
mainnet). Commission rates were effectively decorative.
Rewrite DistributeValidatorRewards as a two-level, commission-aware
allocation:
- Level 1: split the pool across validators by bonded-stake weight (LRM).
- Level 2 per validator: carve commission = floor(poolShare x rate) to
the operator account, then split the remainder among that validator's
delegators (incl. self-delegation) by stake weight (LRM).
Commission is read deterministically (LegacyDec.MulInt().TruncateInt()).
commission=0 reproduces the prior behavior exactly. At current mainnet
rates validators move from ~0.27% to ~16.56% of the pool.
This is consensus-breaking (different bank operations for the same
settled claims). It activates atomically when validators run the v0.1.34
binary at the upgrade height; no KVStore migration or handler logic is
required. Documented in the v0.1.34 upgrade handler comment.
Add EventValidatorRewardDistribution: one summary event per bonded
validator per op_reason per settlement block (bounded by validator-set
size, so it does not multiply with delegators or claims and preserves the
#1758 event-count reduction). It exposes pool_share, commission,
self/external delegator rewards, and total delegated stake so a
delegator's per-validator earnings can be reconstructed off-chain even
though their cross-validator rewards collapse into a single
EventSettlementBatch transfer. session_end_block_height is threaded from
FlushBatchedValidatorRewards.
Settlement auto-unstake compared application stake against the hardcoded apptypes.DefaultMinStake (1 POKT) instead of the on-chain min_stake param (1,000 POKT on mainnet), so apps that dropped below the real min_stake were never force-unbonded and lingered as zombies burning stake. - Fix 1: ProcessTokenLogicModules reads applicationKeeper.GetParams().MinStake. Stake only drops via settlement burn, so this catches every future crossing in the same block it happens. - Fix 2: one-time backfill MarkBelowMinStakeApplicationsUnbonding, run from the v0.1.34 upgrade handler, clears the pre-upgrade backlog of idle below-min_stake apps. A recurring scan was rejected (future crossings already caught by Fix 1). - No orphaned-payment risk: hydrator stops assigning at UnstakeSessionEndHeight and unbonding period (60 blocks) > settlement lag (~33 blocks), so in-flight claims settle before the app is removed. - Tests: discriminating settlement test (min_stake=1,000 POKT) + sweep test. - testutil mock GetParams now returns DefaultParams (non-nil MinStake).
… event
The lazy service config history feature (ad62f6d0e, 81851bd81) normalizes
Application.ServiceConfigHistory to a non-nil empty slice, so the
EventMorseApplicationClaimed now carries []{} instead of nil. Update the
expected fixture to match, mirroring the normalization fixed in
GetAllApplications (81851bd81).
… test Pre-existing lint failure: inner `found` shadowed the outer declaration. Rename to `gotFound` to satisfy govet shadow check.
…ssion (#543) Make num_blocks_per_session changeable to ANY value via governance without misaligning in-flight sessions. Session boundaries are now computed relative to a per-epoch grid anchor stored in shared Params, instead of a single modulo from block 1 (the #543 bug: a non-divisor N change re-derived in-flight sessions onto a different grid -> session-id mismatch -> claims rejected, suppliers unpaid). Consensus-breaking; ships in v0.1.34. Core: - proto: add derived (non-governable) session_grid_anchor_height and session_number_at_anchor to shared Params (fields 12/13). - session.go: epoch-relative GetSessionStartHeight/EndHeight/Number with a guard that falls back to the genesis block-1 grid when params describe a later epoch than the query height (anchor > h). Bit-identical to legacy when anchor in {0,1}. - handler: recordParamsHistory stamps the anchor + seeds the genesis epoch at height 1; sanitizes governance-supplied anchor values. - shared EndBlocker promotes a params epoch to live at its effective height (keeper.EndBlocker), ordered last among shared-params consumers in app_config. - v0.1.34 upgrade handler stamps live anchor=1 and seeds params history at height 1 (protects in-flight unbonders, no new field/backfill). Narrow Option B: only a num_blocks_per_session change defers its LIVE write to the boundary (so in-flight sessions keep the old N); every other shared param still takes effect on live immediately, preserving existing semantics. Blocking preconditions for a production N change, shipped together: - F1: unbonding sweeps (supplier/application/gateway) compute the unbonding end height with params at the unstake height, not live -> no early release on an N decrease. Adds GetParamsAtHeight to the application/gateway SharedKeeper ifaces. - F2: settlement per-session budget divisor reads params at the claim's session-end height (token_logic_modules.go). - O1: off-chain difficulty cache-clear (cache/options.go) now detects the real settlement boundary (tail+1 blocks back) instead of a within-session offset that never fired once N <= tail. - O2: new shared ParamsAtHeight query endpoint; off-chain sharedQuerier window methods read params at-height (with a cache-friendly fast path when the query height is within the live epoch). Tests: anchored-grid unit tests (legacy-equivalence, non-divisor transition, monotonic numbering, anchor-after-height guard); EndBlocker promotion off-by-one; End/BeginBlocker ordering assertions; in-flight settlement repro (non-divisor 4->3, session id stable + settles + paid); param-update integration suites adapted to deferral.
…onding events Same nil-vs-empty normalization as the Morse claim event fix: lazy service config history (ad62f6d0e, 81851bd81) makes the unbonding-begin/end events carry an empty (non-nil) ServiceConfigHistory. Update the expected fixture.
…ing-app claim The lazy service config history feature (ad62f6d0e) now populates Application.ServiceConfigHistory on stake/re-stake. Update the TestClaimMorseExistingApplication fixture to expect the 2-entry history the feature produces: nosvc deactivated and s.appServiceConfig activated at the next session start boundary. Add an app-side test helper (testutil/application/service_config_history.go) mirroring the supplier-side CreateServiceConfigUpdateHistoryFromServiceConfigs, and derive the transition height from shared params rather than hard-coding it. Test-only change; no production behavior affected.
…d same-session swaps Two bugs in the lazy service config history feature surfaced during a branch audit: 1. Transfer-merge dropped history for merged-in services. mergeAppServiceConfigs pre-mutated dstApp.ServiceConfigs to the union before recordApplicationServiceConfigChange compared the receiver against that same slice, so the comparison was always equal and history was never updated. A transfer into an app that had a prior service swap (non-empty history) left the merged-in service unresolvable by session hydration. Fixed by returning the union from mergeAppServiceConfigs without mutating dstApp, so the change is detected. 2. A same-session double swap (svcA -> svcB -> svcC) left a zero-width history entry (activation == deactivation) for the never-served intermediate service. Harmless at runtime (IsActive always false) but rejected by GenesisState.Validate, breaking genesis export/import round-trips. Fixed by dropping any entry still scheduled to activate at the next session boundary instead of deactivating it at its own activation height. Adds regression tests for both paths.
…on boundary Extends the #543 anchored-grid Option B deferral from num_blocks_per_session to all session-timing params: the session/claim/proof window offsets and grace period offset. Settlement reads LIVE params, and claim/proof window math depends on these offsets. Applying an offset change to live params mid-flight re-measured an in-flight claim's window on the new value, so a shrunk window could mark a claim expired before its proof was ever submittable, orphaning it. Deferring the live write to the EndBlocker (promoted at the next session boundary) keeps in-flight sessions and claims on the params they were created under. Unbonding-period and compute-unit params keep the legacy immediate-live behavior. The bulk UpdateParams path delegates to recordParamsHistory, so it is covered automatically. Converts the five offset param tests to assert the deferred behavior via a shared requireSessionTimingParamDeferred helper.
… params Session-timing shared params (num_blocks_per_session and the session/claim/proof window offsets and grace period offset) are deferred to the next session boundary by the shared EndBlocker per the #543 anchored grid (Option B). Live params lag the recorded value until the boundary, so the e2e update-params assertion must wait for that promotion before querying live — comparing immediately would compare against the OLD value still in effect and falsely fail. assertExpectedModuleParamsUpdated now waits for the next session boundary + 1 when any session-timing param is in the expected updated set for the shared module. The boundary is computed from the current live shared params, which still describe the current epoch (deferral keeps them there until promotion). Other modules and non-timing shared params take the legacy path with no wait.
…heus values
Newer prometheus-operator CRD validation rejects null selectors with
`spec.<X>Selector.matchLabels in body must be of type object: "null"`,
breaking `helm upgrade` of the observability stack on a fresh localnet.
Replace `matchLabels: null` with `matchLabels: {}` on all four selectors
(scrapeConfigSelector, serviceMonitorSelector, ruleSelector, podMonitorSelector).
Empty object is the canonical "match all" form under Kubernetes label-selector
semantics and is accepted by the CRD.
The shared module Params message carries two DERIVED runtime fields stamped per epoch by recordParamsHistory and the EndBlocker (#543 anchored grid): SessionGridAnchorHeight and SessionNumberAtAnchor. These are not governance-settable and not constant beyond the genesis seed — they advance at every session boundary, so on any post-genesis chain they differ from the DefaultParams baseline (1, 1) that assertExpectedModuleParamsUpdated builds the expected struct from. A first e2e run showed a real mismatch at the assertion site: expected: SessionGridAnchorHeight=1, SessionNumberAtAnchor=1 actual: SessionGridAnchorHeight=511, SessionNumberAtAnchor=52 Read the live values from the actual response and overlay them onto the expected struct so the comparison is restricted to the governance-settable params. Combined with the previously-landed session-boundary wait, the shared update-params assertion is now stable against both deferred promotion and the running anchor advance.
…n changes The grid anchor and session number at the anchor are derived metadata that describe the CURRENT epoch's session grid. They must move when num_blocks_per_session changes — the new N does not align with the old grid, so the anchor records the new starting point. For any other param change, nextSessionStartHeight is already on the current grid, so a new anchor at that height is mathematically equivalent to the previous epoch's anchor: identical boundaries, identical session numbers. Previously, recordParamsHistory stamped a new anchor on every param update regardless of whether N changed, defensively making each history entry self-describing. The cost was a visibly drifting anchor on a chain where N never changes — surprising for observers, and broke the canonical "compare against DefaultParams + overlay" e2e assertion pattern for the shared module (forcing a special-case overlay of live anchor fields into the expected struct). Carry the previous epoch's anchor forward when N is unchanged. Boundary math and session-number monotonicity are preserved. The field is now stable unless N moves, which matches its semantic intent.
QueryParamsResponse.GetParams() returns a Params value, and the GetSessionGridAnchorHeight/GetSessionNumberAtAnchor getters are pointer-receiver methods, so calling them on the returned value failed to build with: cannot call pointer method GetSessionGridAnchorHeight on Params Use direct field access on the local copy returned by GetParams.
…ndow-offset claims Closes the cross-session window-offset orphan class (O2). A claim's settlement height is sessionEndHeight + GetSessionEndToProofWindowCloseBlocks(P_at_sessionEnd) + 1, where P_at_sessionEnd is the shared params epoch effective at the claim's sessionEndHeight. Settlement previously computed expiringSessionEndHeight from LIVE params only, so a window-offset change promoted to live between the claim's session end and its proof-window-close height (a normal occurrence — claim lifecycles commonly span two session boundaries) shifted the lookup off the claim's actual stored sessionEndHeight and orphaned the claim forever. The Phase 1 settlement loop now iterates a DEDUPED set of candidate sessionEndHeights, computed by walking params history backward from the current block bounded by 4*N (with a 240-block floor) and applying each recent epoch's offsets. Epochs with identical offsets collapse to one candidate, so the common case (no recent change) is a single iterator pass with zero added cost. When offsets differ between live and a recent prior epoch, both candidates are scanned and the claim is located by its actual stored sessionEndHeight. Adds IterateParamsHistoryReverse(ctx, fromHeight, fn) on the shared keeper to support the backward walk and exposes it on the SharedKeeper interface used by tokenomics. Includes TestWindowOffsetChange_CrossSessionClaimStillSettles: shrinks ProofWindowCloseOffsetBlocks while a session is in flight, runs the EndBlocker to promote the new offsets at the boundary, then settles at the height computed from the OLD offsets — asserts the claim still settles and the supplier is paid. Verified to fail (claim orphaned, 0 settled) when the candidate-set scan is reverted to the legacy single-iterator path. The legacy GetExpiringClaimsIterator is marked DEPRECATED but kept for external and test compatibility; settlement no longer calls it.
…ettlementContext Splits the per-epoch candidate sessionEndHeight computation into two functions: GetExpiringClaimsSessionEndHeights(ctx, blockHeight) // public candidateSessionEndHeightsForLiveParams(ctx, liveParams, blockHeight) // internal The internal variant takes live shared params as an injected argument so the settlement Phase 1 loop can pass the snapshot already held in settlementContext, avoiding a redundant store read. The exported variant queries live shared params itself and delegates to the internal one — usable by observability tooling and benchmarks that need the candidate set without having a settlementContext. No behavior change.
paramsAnyMapFromParamsStruct uses reflect.Value.NumField() to enumerate every field on a Params struct and emit a paramsAnyMap. With the #543 anchored-grid addition of two derived fields on shared.Params (session_grid_anchor_height and session_number_at_anchor), the helper now emits those fields too — and buildSharedMsgUpdateParams' switch fatally fails for any param name without a matching case, since these derived fields have no ParamX constant and are not governance-settable. The Stake_Supplier scenarios (User_can_unstake_a_Supplier, User_can_restake_a_Supplier_waiting_for_it_to_become_active_again) failed with: parse_params_test.go:248: ERROR: unexpected "int64" type param name "session_grid_anchor_height" parse_params_test.go:248: ERROR: unexpected "int64" type param name "session_number_at_anchor" Skip the derived field names in the reflection helper via a small denylist. Derived fields are stamped per-epoch by the shared keeper and must never be carried by a MsgUpdateParam(s) payload.
…droom LocalNet's pre-staked applications (app1, app2, app3) were calibrated with only 68 upokt of headroom above application.min_stake (100 POKT). That margin was safe under the pre-#1846 bug where settlement's auto-unbond check used the hardcoded apptypes.DefaultMinStake (1 POKT) instead of the on-chain param (100 POKT). With #1846 fixed (commit 64956d288, this branch), the check now reads the on-chain param correctly, so any settlement burn larger than 68 upokt pushes the app below min_stake and triggers MarkBelowMinStakeApplicationsUnbonding. That blew up the e2e suite: scenario 1 burns ~92,400 upokt of app1's stake, app1 then auto-unbonds, every subsequent scenario fails with "application is not active" or "application not found". Bump each pre-staked app to 200 POKT (200,000,000 upokt). That leaves ~100 POKT of headroom above min_stake — enough for ~1,000 settlement burns at the e2e's ~100,000 upokt-per-scenario rate. The matching application{1,2,3}_stake_config.yaml files under localnet/pocketd/config/ are gitignored runtime artifacts; bump them locally to 200000001upokt (1 above genesis, per the convention noted in this file) so manual `make stake_app_*` commands continue to produce a state change.
…parison Captures the perf characteristics of the #1758 settlement event aggregation work already on the branch. Benches: - BenchmarkAggregation_MainnetScale — aggregating realistic per-claim bank operations at mainnet scale (200 suppliers / batch). - BenchmarkEventManager_PreVsPostAggregation — measures the EventManager-level overhead before vs. after aggregation (v0.1.31 per-claim vs. v0.1.33 batched). - BenchmarkProtoMarshal_PreVsPost — proto marshal cost for the same pre/post comparison. - TestSettlementEventMemoryComparison — file-size + memory delta sanity check. - TestSettlementDownstreamOverhead — ABCI event proto bytes + index-key byte counts the indexer downstream pays. Bench-only (test build tag); excluded from regular `go test ./...` runs.
…ght scan
Codifies the perf claim made when introducing the O2 fix
(commit 1e3463e16): K=1 legacy parity at zero added cost, bounded overhead at
K=2 and K=3 candidates.
Matrix: K ∈ {1, 2, 3} candidate sessionEndHeights × M ∈ {200, 1000, 2551} claims
distributed evenly across the K candidates. Measured on M1, 2s/sub-bench:
K=1/M=2551 → 4.12 ms/op (baseline; equivalent to the legacy single-iterator path)
K=2/M=2551 → 4.36 ms/op (no measurable overhead vs K=1 at the same claim load)
K=3/M=2551 → 8.39 ms/op (~2x baseline; per-prefix-iterator overhead, not the
candidate-set computation itself)
Even the worst sub-bench is 0.012% of the ~68s mainnet block time.
Idempotent (read-only against the claim store) so b.N can grow arbitrarily large.
Bench-only (test build tag); excluded from regular `go test ./...` runs.
…OKT genesis Pairs with 57d5e5baa which bumped pre-staked applications in config.yml to 200 POKT for #1846 headroom. The application{1,2,3}_stake_config.yaml files drive `make stake_app_*` / tools/scripts/stake-apps.sh for manual restakes; bumping them keeps the restake amount > genesis amount (per the convention in config.yml) so manual stake commands continue to produce a state change.
…ration The v0.1.34 upgrade handler runs three migrations at the upgrade height — a bug in any of them halts the chain. Two of the three are covered in their respective module tests: - DeduplicateSupplierRevShareAddresses in x/supplier/keeper/migrate_duplicate_revshare_test.go - MarkBelowMinStakeApplicationsUnbonding in x/application/keeper/unbond_applications_test.go The third migration (the anchored-session-grid genesis seed, #543) had no dedicated test, and the handler descriptor (plan name, empty StoreUpgrades, non-nil CreateUpgradeHandler) had no static sanity check. This file adds: - TestUpgrade_0_1_34_PlanMetadata: pins the plan name to "v0.1.34", asserts no StoreUpgrades, and ensures CreateUpgradeHandler is wired. Catches the "plan name typo or nil handler" class of chain-halt bug at test time rather than at the upgrade height on mainnet. - TestUpgrade_0_1_34_SeedAnchoredSessionGrid: replicates the seed step against a fresh shared keeper and asserts the post-state: live params carry anchor=1 / session_number_at_anchor=1, params history has the genesis epoch recorded at effective_height=1 (the entry F1/F2 at-height reads need to resolve pre-upgrade heights to the legacy grid), and boundary math under the seeded anchor matches the legacy block-1 grid at several reference heights. - TestUpgrade_0_1_34_SeedIsIdempotent: re-running the seed leaves the same state — relevant for forensic replays crossing the upgrade height. The seed step is inlined into each test (rather than calling an unexported helper) so any change to the seed values or order in the handler will diverge this test and surface the change.
…g transfer When an application with non-empty ServiceConfigHistory is transferred to a NEW address (no merge), the post-transfer dst app shallow-copied srcApp.ServiceConfigHistory verbatim — each entry's ApplicationAddress field still named the source app. Indexers and queries that filter history by app address would miss the dst app's pre-transfer entries. Allocate a fresh history slice with cloned entries on the new-address path, rewriting ApplicationAddress to dst's address. Avoids aliasing srcApp's view (srcApp is removed below this point, but the clarity is worth the small alloc). Adds TestMsgServer_TransferApplication_NewAddressRewritesHistoryAppAddress as a regression — verified to fail (entries still naming src) when the rewrite block is removed. The transfer-to-existing-address (merge) path is unaffected: it does not copy srcApp.ServiceConfigHistory onto dstApp, and any new entries appended by recordApplicationServiceConfigChange already use dstApp.Address.
…scan correctness Closes two medium-priority test gaps left after the O2 fix: (1) Window-offset GROW direction. The existing TestWindowOffsetChange_CrossSessionClaimStillSettles only exercised the SHRINK case (oldProofClose=2 -> newProofClose=1). Refactor the test body into a shared helper runWindowOffsetCrossSessionTest(t, oldProofClose, newProofClose) and add the GROW companion (1 -> 2). The two tests now bracket the candidate scan mechanism from both directions of an offset change. (2) K=3 candidate-set correctness. The candidate-scan bench (settlement_candidate_scan_bench_test.go) exercised the K=3 case for PERFORMANCE but did not assert end-to-end CORRECTNESS — that the right number of candidates is returned AND that each candidate's iterator yields the corresponding claim. The new TestSettlementCandidateScan_MultiEpochCorrectness sets up three distinct params epochs (live + two history entries within the lookback) and verifies the candidate scan returns exactly three sessionEndHeight values, each containing the inserted claim. Catches future regressions where a multi-epoch lookback yields the wrong candidates or the wrong claims.
35866b9 to
767285d
Compare
settle_pending_claims_bench_test.go intentionally uses the lower-level EmitEvent API to measure raw bank-event aggregation overhead. Converting to EmitTypedEvent would change what's being benchmarked. The deprecation is paired with the telemetry/OTel migration in SDK v0.53.5+, planned for post-v0.1.34 work. Narrow path-scoped exclusion, mirrors the existing SimulationOperations pattern. Does not affect production code paths or other benches.
…t + revshare nil-safe Addresses 4 P1 findings from the v0.1.34 branch audit. All are chain-halt or fund-loss class. Pivoted on the supplier bank-send-err finding: instead of returning the error (which would chain-halt on legacy module-owned suppliers — the exact pre-existing state the new owner-check is supposed to leave alone), emit a new typed event for indexer/governance visibility and continue. This trades silent loss for tracked loss, without introducing a new halt vector. ## 1. x/application/keeper/unbond_applications.go:72 — nil MinStake guard EndBlockerUnbondApplications dereferenced k.GetParams(ctx).MinStake.Amount with no nil guard. The sister fn MarkBelowMinStakeApplicationsUnbonding in the same file already defended (line 119) with a DefaultMinStake fallback. Inconsistent treatment of mirror logic = real chain-halt vector if shared params are ever uninitialized. Fix: mirror line-119 guard. Pre-check appMinStake != nil; fall back to apptypes.DefaultMinStake. ## 2. x/supplier/keeper/unbond_suppliers.go:102 — same nil-deref Audit flagged the application-side site, missed the symmetric supplier site at line 102. Same fix applied with suppliertypes.DefaultMinStake. ## 3. x/supplier/keeper/unbond_suppliers.go:88-94 — observable stuck-coins path EndBlockerUnbondSuppliers silently logged the bank.SendCoinsFromModuleToAccount error and removed the supplier anyway, stranding stake in the supplier module pool with no recovery path. The audit suggested either matching the app-module halt policy (return err) or emitting a recoverable event. The halt approach was tried and abandoned: chain-halting on pre-existing legacy state (a module-owned supplier from before the stake-time module-account-owner check) would be strictly worse than the original silent loss — the very next session-end would re-trigger the same broken send and brick the chain. Final approach: emit a new typed event EventSupplierStakeStuckInModulePool capturing operator_address, owner_address, stuck_coin, the bank error string, and the session-end height. Removing the supplier from state still proceeds so the unbonding queue makes progress. Indexers/governance can detect stuck coins from the event stream and propose a reclaim path. Proto added in proto/pocket/supplier/event.proto with a comment explaining the legacy-state rationale; .pb.go regenerated via make proto_regen. ## 4. x/supplier/keeper/migrate_duplicate_revshare.go — nil-safe iteration DeduplicateSupplierRevShareAddresses runs from the v0.1.34 upgrade handler and dereferences rs.Address / rs.RevSharePercentage without nil-checking the slice entries. Pre-validation-era state could carry nil entries, panicking the upgrade handler. The outer loop nil-checks configUpdate.Service but not the rev-share slice contents. Fix: skip nil rs entries in both hasDuplicateRevShareAddresses and mergeRevShareDuplicates. ## Verification - make proto_regen: clean - go build ./...: clean - golangci-lint run x/application/keeper/... x/supplier/keeper/...: 0 issues - All existing TestEndBlockerUnbond* tests pass F1 integration test (third audit P1 item — N-decrease during in-flight unbonding actor) lands in a follow-up commit.
…-overdue, session-end gating, F1 (N-decrease) Adds the supplier-side EndBlockerUnbondSuppliers test file. Six pre-existing tests authored earlier (long-overdue replay, multi-overdue at different heights, mid-session no-op, stake-return amounts, full lifecycle, store cleanup) are committed here as a unit alongside the new F1 regression test: TestEndBlockerUnbondSuppliers_NumBlocksPerSessionDecreaseDoesNotReleaseEarly F1 (#543) is the funds-loss-on-N-decrease blocking precondition for the anchored session grid. A supplier that began unbonding under N=oldN must NOT be released early when num_blocks_per_session is later decreased mid- unbonding — that would shorten the promised unbonding window and let the supplier withdraw stake before in-flight claims settle. The fix in EndBlockerUnbondSuppliers computes unbonding-end via GetParamsAtHeight(unstakeSessionEndHeight) — the epoch effective when the supplier began unbonding — instead of live params. The test exercises this end-to-end: 1. N=20; stake + unstake supplier → captures unstakeSessionEndHeight. 2. Plant a new shared-params epoch with N=4 at the next session boundary (simulating governance promotion of a deferred N change). 3. Walk to the buggy newN-derived 'early horizon' (still strictly before the real oldN-derived horizon) and call EndBlockerUnbondSuppliers. Assert 0 unbonded + supplier still in state. Under the bug this would have released the supplier. 4. Walk to the true oldN-derived horizon, call EndBlockerUnbondSuppliers again. Assert 1 unbonded + supplier removed. Confirms the original unbonding commitment is honored even after governance shrinks N.
…grid seed, deprecated iterator switch, cross-delegation doc, validator log clarity
Addresses 5 P2 findings from the v0.1.34 branch audit. None are critical
on their own; bundled here to clear the audit backlog before the RC merge.
## 1. x/shared/keeper/params.go — safe unmarshal in params-history reads
Both GetParamsAtHeight (iterator path) and GetParamsHistoryEntry (exact-
key path) used cdc.MustUnmarshal on the bytes returned from the store. A
corrupted history entry — partial write, schema downgrade, bit rot —
would have panicked the chain at the EndBlocker every block after the
corruption. Defense in depth: switch to cdc.Unmarshal, log the failure,
and:
- GetParamsAtHeight falls through to live params (same recovery as
"no entry found"); downstream callers already tolerate this.
- GetParamsHistoryEntry returns (Params{}, false) — same shape as the
"missing entry" path, which is the EndBlocker's contract for the
deferred-promotion lookup.
Matches the audit_fix_patterns memory rule: MustUnmarshal is fine for
keeper store reads of expected-shape data, but secondary indexes and
history scans need the safer path.
## 2. app/upgrades/v0.1.34.go — idempotent grid seed at height 1
seedAnchoredSessionGrid unconditionally wrote SetParamsAtHeight(ctx, 1, ...).
On mainnet this is fine (history is empty pre-upgrade). On testnets where
a previous rehearsal may have already seeded a different grid, the
unconditional write would silently overwrite the operator's customization.
Fix: only write if no entry exists at height 1
(via the new GetParamsHistoryEntry helper). Logs a skip notice when
preserving an existing entry. Live-params SetParams stays unconditional —
the upgrade contract is to STAMP the anchor on live, only history is
preserved.
## 3. tests/integration/tokenomics/token_logic_modules/tlm_suite_test.go — switch off deprecated GetExpiringClaimsIterator
assertNoPendingClaims used the legacy single-iterator GetExpiringClaimsIterator
which only considers the LIVE-params-derived candidate sessionEndHeight.
After a window-offset change, cross-epoch claims would silently escape
the assertion (O2 class), producing a false-pass with leftover claims.
Switched to walk the deduplicated set from GetExpiringClaimsSessionEndHeights
+ per-candidate GetSessionEndHeightClaimsIterator — matches the production
Phase-1 settlement scan. Tightens what the helper actually asserts and
removes the now-unused tokenomicskeeper import.
## 4. proto/pocket/tokenomics/event.proto — document cross-delegation bucketing
Audit flagged that when the same pokt address is both a validator account
AND a delegator on a different validator, EventSettlementBatch buckets the
combined income under the validator-side op_reason. Per-validator
EventValidatorRewardDistribution reporting stays correct — but indexers
summing "VALIDATOR vs DELEGATOR" totals from EventSettlementBatch alone
would mis-classify the cross-delegation accounts.
Added a CROSS-DELEGATION ACCOUNTING NOTE block to the
EventValidatorRewardDistribution proto comment explaining the correct
aggregation pattern (sum from this per-validator event, not from
EventSettlementBatch alone). Documentation-only; no on-chain behavior
change. .pb.go regenerated.
## 5. x/tokenomics/token_logic_module/distribution_validator.go — clearer reward-distribution log
Log line previously reported "total bonded: <totalBondedTokens>" — the
chain-level number passed in from validateAndPrepareValidatorRewards.
But the actual denominator used in calculateProportionalRewards is
totalValidatorStake (excludes any validators dropped by buildValidatorStakes,
e.g., unparseable operator addresses). Two numbers, one log label.
Fix: log BOTH — eligible_stake (the real denominator) AND
validator_set_total (the chain-level number). They normally match;
divergence is now observable in audit logs.
## Verification
- make proto_regen: clean
- go build ./...: clean
- golangci-lint run on changed packages: 0 issues
- go test ./x/shared/keeper/... ./app/upgrades/...: green
…h commission, two-changes-same-session, revshare merge sum >100%
Adds four regression tests requested by the v0.1.34 branch audit. None are
critical on their own, but each pins behavior at a boundary the audit
flagged as untested.
## 1. TestValidatorRewardDistribution_FullCommission
x/tokenomics/token_logic_module/validator_distribution_test.go
Edge case: validator with 100% commission. Production code short-circuits
the remainder path (commission = poolShare → remainder = 0) and returns
BEFORE calling GetValidatorDelegations. The test mirrors that mock pattern,
asserts the full reward reaches the validator account, and pins the
EventValidatorRewardDistribution shape:
- commission_upokt = full reward
- self_delegation_reward_upokt = 0
- delegators_reward_upokt = 0
- total_delegated_stake_upokt = 0 (short-circuit doesn't read delegations)
- num_delegators = 0
Guards against accidental division-by-zero / commission>pool regressions
at the upper boundary.
## 2. TestValidatorRewardDistribution_CrossDelegatorWithCommission
x/tokenomics/token_logic_module/validator_distribution_test.go
Multi-validator (A, B) where A's account address is ALSO a delegator on
B. Both have 10% commission. Verifies:
- A's combined transfer (commission + self + delegator-on-B slice = 118k)
is bucketed under VALIDATOR op_reason — the bank-batch accumulator
keys on (recipient, op_reason), not source, so cross-delegation income
flows into VALIDATOR for accounts that are ALSO validators.
- No DELEGATOR-tagged transfer exists for A.
- EventValidatorRewardDistribution for validator B still correctly
reports delegators_reward_upokt = 18k (A's external-delegator slice).
This locks in the behavior documented in the new CROSS-DELEGATION
ACCOUNTING NOTE on the EventValidatorRewardDistribution proto. Indexers
that build VALIDATOR vs DELEGATOR totals MUST sum from the per-validator
event, not from EventSettlementBatch alone.
## 3. TestTwoSessionTimingParamChanges_SameSession_LastOneWins
tests/integration/tokenomics/two_param_changes_same_session_test.go
Two governance UpdateParam messages for num_blocks_per_session in the
SAME in-flight session. Both compute the same effective_height (next
session boundary) → the second SetParamsAtHeight overwrites the first
at that key. Asserts:
- Live params unchanged after both writes (both are deferred).
- Params-history entry at the boundary reflects the SECOND (final)
value, not the first (intermediate).
- Shared EndBlocker at the boundary promotes the SECOND value.
- GetParamsAtHeight at the boundary agrees with the live promotion.
C3 scenario covered. The same write path handles the other five session-
timing params; this representative test pins the overwrite behavior for
all of them.
## 4. TestDeduplicateSupplierRevShareAddresses_MergedSumExceeds100
x/supplier/keeper/migrate_duplicate_revshare_test.go
Boundary: input revshare entries that already exceed 100% in aggregate
(e.g., [A:60, A:50, B:30] → A:110, B:30, sum 140%). Pre-validation-era
state can carry this shape. The migration must:
- Not panic on the merge.
- Produce a deterministic merged list in first-occurrence order.
- NOT clamp percentages — surface the pathological state so the next
stake-update's ValidateBasic rejects it (migrate first, validate
later separation of concerns).
The audit-flagged sum-of-merged-duplicates > 100% case is now regression-
proof.
## Verification
- go test ./x/tokenomics/token_logic_module/... ./x/supplier/keeper/... ./tests/integration/tokenomics/...: all green
- golangci-lint run on touched packages: 0 issues
- One goimports fix on validator_distribution_test.go applied
…tory entries Mirrors the unmarshal-safety hardening applied to GetParamsAtHeight and GetParamsHistoryEntry in aa8961e. IterateParamsHistoryReverse was the only remaining MustUnmarshal site on the params-history read path — audit flagged the asymmetry. The iterator's sole production caller is candidateSessionEndHeightsForLiveParams (O2 cross-session candidate scan). A corrupted history entry there would have halted the chain at every settlement block past the corruption. Fix: switch to cdc.Unmarshal, log the failure, and skip the offending entry rather than the whole iteration. Skipping yields a degraded scan (that epoch's candidates may be missed) which is observable and recoverable; halting is not.
…n supplier revshare distribution
Closes a chain-halt vector latent on `feat/v0.1.34-session-grid-and-settlement-fixes`:
the `mergeRevShareDuplicates` migration sums duplicate-address percentages without
re-validating the resulting total, so a pre-v0.1.34 supplier carrying duplicate
revshare entries (combined sum > 100%) can end up in store with an invalid
config. At settlement, the old `GetSupplierShareholderAmountMap` computed a
NEGATIVE remainder for that supplier and stuffed it into the first shareholder's
share — when the overshoot exceeded the first shareholder's positive share, the
final amount became negative, and `cosmostypes.NewCoin` panicked → AppHash halt
on the next settlement block touching that supplier.
Three defensive layers, all consensus-deterministic:
1. `GetSupplierShareholderAmountMap` validates pre-flight:
- empty list → error
- any nil entry → error (no `rs.Address` nil-deref)
- duplicate recipient address → error (prevents silent map-overwrite data
loss when raw sum happens to equal 100, e.g. `[(a,30),(a,70)]`)
- sum != 100 → error
- belt-and-suspenders: negative remainder after math → error (unreachable
when the sum-check passes; kept to defend against future bypass paths)
2. `distributeSupplierRewardsToShareholders` routes ALL of the above to a new
owner-fallback branch instead of propagating the error up. The full
`amountToDistribute` is paid to the supplier's `owner_address` — the
proto-level field populated at stake time and guaranteed present for the
supplier's lifecycle. Chain keeps making progress; supplier's owner
collects the (otherwise-stuck) revenue until operator restakes with clean
revshare.
3. New `EventSupplierRevShareFallbackDistribution` proto event carries:
operator address, owner address, service_id, session_end_block_height,
amount, op_reason, observed_sum_percentage, human-readable reason. Indexers
can build a "broken supplier config" dashboard alongside
`EventSupplierStakeStuckInModulePool`. Expected occurrences on a healthy
mainnet: zero; non-zero count post-upgrade points operator outreach at
pre-existing duplicate-revshare suppliers.
Defensive branch when owner address is empty (impossible after stake
validation, defensive only): returns a settlement-side error so the claim is
treated as faulty by the settlement loop. No panic, no halt.
`ctx context.Context` threaded through `distributeSupplierRewardsToShareholders`
so the new event can be emitted via `sdkCtx.EventManager().EmitTypedEvent`. Both
TLM call sites (Relay Burn Equals Mint, Global Mint) pass their existing
`tlm*.ctx`.
Tests (all in `x/tokenomics/token_logic_module/distribution_supplier_test.go`):
- Unit tests on `GetSupplierShareholderAmountMap`: happy path, remainder
allocation, sum > 100 rejection, sum < 100 rejection, duplicate-address
rejection, nil-entry rejection, empty-list rejection.
- Integration tests on `distributeSupplierRewardsToShareholders` with a real
EventManager: happy path (no fallback event), sum=140 (owner fallback +
event, configured recipients receive ZERO), sum<100 symmetric case,
duplicate-addresses-with-sum-100 (still falls back, prevents silent data
loss), empty owner (faulty-claim error without panic).
Adjusts 3 pre-existing call sites in
`x/tokenomics/keeper/token_logic_modules_test.go` for the (map, error) signature.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s + defensive dedup guard + LRM conservation telemetry + comment accuracy
Four audit follow-ups, all log/comment/defensive-guard scope. Zero settlement-output
change on any reachable mainnet path; the guards only activate on consensus-impossible
state, so the net behavior on a running network is identical.
1. **H2 — Distinguish "delegation query failed" from "zero delegations" in the
validator pool branch.** Previously a `!ok || empty-map || zero-stake` fell
through with no surfacing log at the caller — the underlying cause was buried in
collectValidatorDelegationStakes Warn/Debug lines. Now an explicit switch logs:
- `!ok` → already-logged by callee, no double-log
- `len(delegatorStakeAmounts) == 0 && validator.GetBondedTokens().IsPositive()` →
Warn (bonded validator with zero delegations — should be unreachable because
bond-time requires self-delegation; genesis-imported edge case worth surfacing)
- `totalDelegatedStake.IsZero()` → Warn (all delegations had zero shares)
Reward credit unchanged in all three cases — the full remainder is still paid to
the operator account.
2. **L2 — Fix misleading `BigInt().Int64()` panic comment.** The function silently
truncates the bottom 64 bits on overflow, it does not panic. Bound argument
(`len(stakeholders)` ≤ a few hundred) added so future readers can see why the
truncation is safe.
3. **L6 — LRM conservation telemetry.** `queueRewardTransfers` now takes the
expected total reward and logs Error on a sum mismatch. Telemetry only — does
NOT return an error (a logging invariant must not block settlement). Skipped
when expected is zero so the log stays readable for callers that don't pass a
meaningful total.
4. **H3 — Defensive dedupe guard on AccAddress in `buildValidatorStakes`.**
`validatorStakeAmounts[accAddr]` previously overwrote on duplicate, which
would have (a) dropped one validator from the Level-1 denominator and
(b) double-counted the other in totalValidatorStake. The staking module
enforces ValAddress uniqueness in the bonded set, so the branch is
consensus-impossible — but the guard is free and the failure mode is
subtle and settlement-corrupting, so we log Error and skip the second
occurrence rather than risk a silent state corruption.
Test: TestBuildValidatorStakes_DuplicateAccAddressIsSkipped — feeds two
validators with the same OperatorAddress + one unique, asserts that the
stake map has 2 entries (not 3, not double-counting the duplicate) and the
total stake reflects ONLY the unique contributions.
H2/L2/L6 are log/comment scope and were verified by running the full
`x/tokenomics/...` test suite — all pass.
Skipped: H1 (audit-claimed nil-Claim halt in GetSessionEndHeight). False
positive — `ClaimSettlementResult.Claim` is a VALUE field (`types.Claim`, not
`*types.Claim`) so it can never be nil; the entire `GetSessionHeader()`-then-
`GetSessionEndBlockHeight()` chain is gogoproto-generated nil-safe (returns 0
on nil SessionHeader). No halt vector.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dempotency + cross-session event doc Four audit follow-ups bundled. None affect a running network: M1 changes EndBlocker behavior only on legacy module-account-owned applications (zero expected on mainnet, same as supplier-side fix); M4 is proto comment only; M5 changes only the SetParams branch on testnet re-runs (mainnet first run is unaffected because anchor is zero pre-upgrade); M6 changes only the genesis-export halt path on corrupted entries (no impact on healthy data). M1 — `EventApplicationStakeStuckInModulePool` (mirror of supplier-side fix) -------------------------------------------------------------------------- `UnbondApplication` previously returned the bank-send error from SendCoinsFromModuleToAccount, which would halt the chain at EndBlocker on any legacy module-account-owned application (impossible to stake post-v0.1.34 due to the new owner check, but pre-v0.1.34 state can carry one). Now the bank-send error is logged, an `EventApplicationStakeStuckInModulePool` typed event is emitted (operator address, stuck coin, reason string, session_end_height — exact shape of the supplier-side stuck-coins event), and the application is removed from state anyway to keep the EndBlocker unbonding queue making progress. Coins remain stranded in the application module pool; indexers track them; governance can propose a reclaim transfer. No dedicated unit test landed for this fix — mirrors the precedent set by the supplier-side stuck event in `de74942eb` which also had no dedicated test (the failure-injection path requires adding a `WithBankSendFailure` option to the keeper scaffold). The change is a one-for-one mirror of the supplier-side code shape and was verified by running the full `x/application/...` suite + integration tests. M4 — Cross-session `session_end_block_height` proto caveat ---------------------------------------------------------- When O2-style cross-session settlement processes claims from multiple session_end_block_heights in the same block (rare path, only after window- offset change), `EventValidatorRewardDistribution.session_end_block_height` reports the FIRST claim's session — not every contributing session. Added a proto comment block calling out the caveat and pointing indexers at the settlement block height (from the SDK header) as the canonical timestamp. Did NOT rename or add new fields (proto-breaking change at upgrade for indexer consumers is worse than the approximation in the rare cross-session path). M5 — Symmetric idempotency guard around SetParams in v0.1.34 handler -------------------------------------------------------------------- The `SetParamsAtHeight(ctx, 1, ...)` write was guarded against re-runs via a `GetParamsHistoryEntry(ctx, 1)` existence check. The companion `SetParams` write was unconditional, so on a testnet re-run the live params got re-stamped with anchor=1 + session=1 even if the operator had pinned a different grid in a previous rehearsal. Now the `SetParams` write is gated on `liveParams.SessionGridAnchorHeight == 0` (mirror of the history-side guard, semantically equivalent to "anchor not yet set"). Mainnet first-run: anchor is zero pre-upgrade, so the stamp proceeds unchanged. Testnet re-run with prior pinning: the stamp is skipped and a log line surfaces the existing anchor. Also adds a second `historySeedParams` snapshot taken after the live-params stamp branch so the two seeds (live + history) stay in lock-step even when one branch is skipped. M6 — `GetAllParamsHistory` safe-Unmarshal (matches sister readers) ------------------------------------------------------------------ The three production readers on the params-history surface were already Unmarshal+log+skip-corrupted (GetParamsAtHeight in `aa8961e87`, GetParamsHistoryEntry in `aa8961e87`, IterateParamsHistoryReverse in `6a315a1de`). `GetAllParamsHistory` — the genesis-export reader — was the remaining MustUnmarshal site. A corrupted entry would have halted genesis export with a panic. Same Unmarshal+log+skip pattern as the sister fns now. Healthy chain data is unaffected; the only behavioral difference is on a corrupted entry, where the new code logs an Error and continues instead of panicking.
…se the genesis-seeded-OLD-epoch orphan class
Audit pass 3 caught a real chain-state bug: `candidateSessionEndHeightsForLiveParams`
used a `max(4*N, 240)` global lookback floor and stopped the reverse params-history
walk on the FIRST entry below it. On a long-running chain (mainnet ~720K blocks at
the time of the first post-upgrade window-offset change), the v0.1.34 upgrade
handler's history seed at h=1 is the ONLY representation of the OLD epoch. That
entry sits far below any reasonable `blockHeight - max(4*N, 240)` floor, so the
iterator early-stopped on it and the OLD epoch's offsets never produced a candidate
sessionEndHeight.
Practical impact: any claim with `sessionEndHeight` reachable only via the OLD
epoch's offsets (i.e. a claim from the LAST session under the OLD epoch, whose
proof-window-close happens to coincide with the settlement block under the OLD
tail) would be silently orphaned forever. The PR-body claim that the v0.1.34
branch "closes the O2 orphan class" was true ONLY for short-window cases where
both old and new history entries fit within 240 blocks of the settlement block —
NOT for the practically-relevant mainnet first-post-upgrade window-offset change.
Fix:
- Drop the `effHeight < earliest` early-stop entirely.
- Track `nextEffHeight` across the reverse iteration so each historical epoch's
owned session-end range can be bounded by `nextEffHeight - 1`.
- For each epoch encountered, compute its OWN tail (sum of claim+proof window
offsets under that epoch's params) and skip the `addCandidate` call when
`(nextEffHeight - 1) + thisEpochTail + 1 < blockHeight` (no in-flight claim
from this epoch possible).
- ALWAYS continue iterating: older epochs may have had different — potentially
longer — proof window offsets, so they may still own in-flight claims even
when the epoch immediately following them does not. The check is a per-epoch
skip, not a global terminator.
Performance: params history is governance-rate-limited (at most a handful of
entries per N change per year on mainnet). The unbounded reverse walk is bounded
in practice and dominated by per-iteration map-lookup cost. Benchmarks at K=1/2/3
×M=200/1k/2.5k from the prior O2 fix remain representative — they exercised
small-K paths and the new code retains the same per-iteration cost.
Regression test: TestSettlementCandidateScan_GenesisSeededOldEpochOnLongRunningChain
sets up the exact scenario the audit identified — blockHeight=720_301, history
{h=1: OLD (tail=60), h=720_241: NEW (tail=30)}, NEW live, OLD's last session
ending at h=720_240 with proof-window-close at 720_240 + 60 + 1 = 720_301 (i.e.
the LAST settlement block on which an OLD-epoch claim is still in-flight).
Without the fix the test fails on `require.Contains(candidates, 720_240)` because
the iterator early-stops on h=1.
Updates the function-level docstring to call out the per-epoch in-flight filter
(not a global lookback cutoff) and the regression test as the pin for the
genesis-seeded-h=1 scenario.
Companion to commits 7100aaa (revshare fallback), e1b559f (validator
distribution polish), 1dac883 (app stuck-coins + idempotency + GetAllParamsHistory).
…umentation (audit pass 3 MED1+MED2) MED1 — Upgrade handler: seed grid before unbond ----------------------------------------------- v0.1.34 upgrade handler previously ran `unbondBelowMinStakeApplications` BEFORE `seedAnchoredSessionGrid`. The unbond pass calls `GetSessionEndHeight` which routes through `sessionGridAnchor` — with an unstamped (anchor=0) live params snapshot the call resolves to the legacy block-1 grid via the §3.4 fallback path. Output is correct on mainnet first run BUT is order-dependent on that fallback continuing to behave that way. Reordered so `seedAnchoredSessionGrid` runs first; the unbond pass then reads a stamped (anchor=1, session=1) live params snapshot and never relies on the fallback. Output state IS IDENTICAL on mainnet first run (the fallback and the stamped path return the same session-end height) but the dependency is now explicit instead of latent. MED2 — MsgUpdateParam cross-param-loss documentation ---------------------------------------------------- Adds a code-level comment block on `UpdateParam` (singular) documenting the known limitation: chained MsgUpdateParam calls in the same in-flight session, each targeting a DIFFERENT session-timing param, lose the first call's change because the second call's base is the (stale) LIVE params and the second SetParamsAtHeight overwrites the first at the same effective_height key. Workaround: bulk MsgUpdateParams (msg_update_params.go) takes a full Params struct in one shot and writes the union atomically. Governance proposals that modify multiple session-timing params MUST use the bulk variant. Same-param sequential updates work correctly (last-write-wins; the existing TestTwoSessionTimingParamChanges_SameSession_LastOneWins pins this). A consensus-affecting fix that reads from "next-effective state" instead of LIVE is deferred to a follow-up release with explicit governance signal. Neither change modifies behavior of a running network: MED1's reorder produces identical output on mainnet first run, MED2 is documentation only.
oten91
added a commit
that referenced
this pull request
May 29, 2026
…iner throughput, indexer events (#1952) v0.1.34 release. Squash-merge of `rc/0.1.34` into `main`. Adds 6 commits not yet on `main` (the other 24 in `v0.1.33..HEAD` already landed on `main` piecemeal). Full release delta: v0.1.33...rc/0.1.34 ## What merged **deps (`287a9b9a1`, #1950)** — cosmos-sdk 0.53.7 (distribution overflow fix, historical-grpc routing, OTel), CometBFT fork v0.38.23, shannon-sdk pagination, security fixes. **core / consensus-breaking (`5bf44d2c3`, #1949)** — validator commission on settlement: two-level distribution (commission carved per-validator, remainder to delegators by stake). Anchored session grid (num_blocks_per_session changeable for QoS). Settlement fixes + app/supplier hardening. Ships with the v0.1.34 upgrade handler. **app (`994be94f8`, #1951)** — in-process file streamer for KV state archival. **perf(relayer) — off-chain (`51fc0484a`)** — stop silent relay drops at high throughput: - `relayminer_relays_dropped_total` metric at the drop site (served-but-unpaid relays now measurable). - Configurable buffers `served_relays_buffer_size` (default 1000) / `mining_pipeline_buffer_size` (default 50); defaults match prior hardcoded values. - `channel.MapParallel` + `mining_workers` (0 = GOMAXPROCS) for the marshal+hash stage (pure, per-relay-independent; SMST insert is commutative). Bench: serial 36.3ms → parallel-8 9.2ms (~3.95×). **docs(relayer) (`af547155d`)** — document the three knobs in JSON schema + example YAML + config reference. Also closes a pre-existing gap: `enable_eager_relay_request_validation` was missing from the schema. **feat(events) — consensus-breaking proto fields (`54f96b43b`)** — indexer event-payload asks: - `num_estimated_relays` (uint64) added to `EventClaimCreated`/`EventClaimUpdated`/`EventProofSubmitted`/`EventProofUpdated` (field 13) and `EventClaimExpired` (field 13); `EventClaimSettled` already had it (field 20). New `Claim.GetNumEstimatedRelays` helper (single source of the formula); also removes the indexer divide-by-zero when `num_relays == 0`. - `supplier_stake_after_slash` (string) added to `EventSupplierSlashed` (field 9) — post-slash stake directly, a single scalar (NOT a full `shared.Supplier` embed; consistent with `EventSupplierStaked` having dropped its embed for size). ## Notes - Relayer perf/docs are off-chain (no coordinated upgrade). Settlement/commission + new event fields are consensus-breaking and gated by the v0.1.34 upgrade handler. - Event additions are additive proto fields (new field numbers, backward-compatible). Indexer "B-items" need no protocol change — the data was already emitted. ## Testing `make proto_regen` clean; full `go build ./...`; proof + tokenomics + observable + relayer (proxy/miner/config/session) suites green; `MapParallel` tests pass under `-race`. Docusaurus validate green (after re-run of a flaky corepack/yarn cache-setup step — unrelated to content). --------- Co-authored-by: Jorge S. Cuesta <jorge@poktscan.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
v0.1.34 release bundle. Six consensus-breaking features + one upgrade handler, all gated behind the
v0.1.34plan name. Three independent audit passes have landed; every BLOCKER/HIGH/MED finding is closed or explicitly documented below.Features
1. Validator commission on settlement (
feat(tokenomics))Two-level reward distribution in
distribution_validator.go: commission carved per-validator, remainder to delegators by stake weight. Validator share rises from ~0.27% → ~16.56% of pool. Adds additiveEventValidatorRewardDistributionfor per-validator breakdown — needed because cross-validator delegatorEventSettlementBatchrows merge.2. Anchored session grid — safely change
num_blocks_per_session(feat(shared), #543)Epoch-relative session boundary math anchored at
(session_grid_anchor_height, session_number_at_anchor). v0.1.34 upgrade handler seeds anchor=1 / number=1 + records a params-history entry at effective_height=1, so pre-upgrade heights resolve to the legacy block-1 grid via the at-height resolver.Option B locked — N changes deferred to the next session boundary via shared
EndBlocker. Extended in this branch to all six session-timing params (num_blocks_per_session, grace period, claim/proof window open/close offsets).Anchor stability refactor: anchor advances only when
num_blocks_per_sessionactually changes — not on every session or every non-N param change.3. Cross-session window-offset claim safety (
fix(tokenomics))Closes the O2 orphan class. Settlement walks the full params-history reverse and computes a candidate
sessionEndHeightunder each epoch's offsets, so claims created under old offsets settle at their storedsessionEndHeighteven when live offsets at settlement height resolve to a different one.Per-epoch in-flight filter: each historical epoch's owned session-end range is bounded by
nextEffHeight - 1; epochs with no in-flight claims atblockHeightare skipped but the iterator continues (older epochs may have had longer tails). No global lookback cutoff — the v0.1.34 upgrade handler's genesis-seeded history entry ath=1must remain visitable on a long-running chain (blockHeight>> 240). See Pass 3 audit response for the regression this avoids.Bench: K=1 legacy parity; K=2/3 bounded — max 8.4ms at 2551 claims (0.012% of block time). Params history is governance-rate-limited so the unbounded walk is bounded in practice.
4. Lazy service config history (
feat(application))Deterministic session queries for applications via lazy backfill of
ServiceConfigHistory. Fixes nil-history footguns inGetAllApplications, transfer merge, and same-session swaps.ApplicationAddresson every history entry.ActivationHeight == nextSessionStartHeight.5. On-chain
min_stakefor app auto-unstake (fix(tokenomics), #1846)Use the live on-chain param, not the compiled-in default.
6. Duplicate revshare validation + module-account owner check (
fix)Same-address revshare entries now rejected at validation; module accounts can't be owners. v0.1.34 upgrade handler deduplicates pre-existing on-chain entries.
Upgrade handler (
app/upgrades/v0.1.34.go)SessionGridAnchorHeight == 0, history-seed gated onGetParamsHistoryEntry(ctx, 1)existence).DeduplicateSupplierRevShareAddresses— cleans pre-existing duplicate revshare rows.MarkBelowMinStakeApplicationsUnbonding— flags apps below the live min_stake for unbonding.StoreUpgrades(no new stores).Audit findings landed on this branch
Two independent audit passes, both broadly resolved. Squash-resistant changelog of every fix below.
Pass 1 (P0–P2) — landed earlier
P0:
fbb88fd93SharedMsgUpdateParamsbulk path now records params history.P1 — commit
de74942eb:x/application/keeper/unbond_applications.go:72— nil-MinStakeguard at unbond entry (mirrors line-119 defensive pattern +DefaultMinStakefallback).x/supplier/keeper/unbond_suppliers.go:102— same nil-MinStakeguard, sister site missed in original audit.x/supplier/keeper/unbond_suppliers.go:88-94— bank-send error no longer halts on legacy module-account-owned suppliers. Emits newEventSupplierStakeStuckInModulePool+ removes supplier + continues. Coins remain in module pool; indexer tracks them.x/supplier/keeper/migrate_duplicate_revshare.go— nil-ServiceRevenueShareskip in bothhasDuplicateRevShareAddressesandmergeRevShareDuplicates.P2 — commits
aa8961e87+6a315a1de:x/shared/keeper/params.go— three readers (GetParamsAtHeight,GetParamsHistoryEntry,IterateParamsHistoryReverse) switched fromMustUnmarshaltoUnmarshal+log+fallback.app/upgrades/v0.1.34.go— idempotent grid seed at h=1 viaGetParamsHistoryEntryexistence check.tlm_suite_test.go assertNoPendingClaims— switched off deprecatedGetExpiringClaimsIterator; walksGetExpiringClaimsSessionEndHeights+ per-candidate iterator. Fixes silent-pass on cross-epoch claims (O2).proto/pocket/tokenomics/event.protoEventValidatorRewardDistribution— CROSS-DELEGATION ACCOUNTING NOTE block added so indexers know cross-delegator income is bucketed under VALIDATOR inEventSettlementBatch.distribution_validator.go— log line now reports botheligible_stakeANDvalidator_set_total.P2 test gaps — commit
3d6def673:TestValidatorRewardDistribution_FullCommission(100% commission edge case)TestValidatorRewardDistribution_CrossDelegatorWithCommission(pins cross-delegation bucketing)TestTwoSessionTimingParamChanges_SameSession_LastOneWins(C3: 2ndSetParamsAtHeightoverwrites first at same effective_height)TestDeduplicateSupplierRevShareAddresses_MergedSumExceeds100(migration must not panic on boundary case)F1 supplier regression — commit
7058a0d1c:TestEndBlockerUnbondSuppliers_NumBlocksPerSessionDecreaseDoesNotReleaseEarly+ 5 sibling unbonding tests.GetParamsAtHeight(unstakeSessionEndHeight)); regression risk low.Lint:
82fce701cSA1019 exclusion forEmitEventin settlement bench.c5a0222f3goimports realignment.Pass 2 (independent audit, 2026-05-27) — landed in this update
Chain-halt vector closed — commit
7100aaa64:distribution_supplier.goagainst migrated state with sum > 100%:GetSupplierShareholderAmountMapnow returns(map, error), pre-validates: empty list, nil entries, duplicate addresses, sum != 100, negative remainder.distributeSupplierRewardsToShareholdersfalls back to paying the FULL amount tosupplier.OwnerAddress(proto field, populated at stake time) when validation fails.EventSupplierRevShareFallbackDistributionproto event inpocket/tokenomics/event.protocarries: operator, owner, service_id, session_end_block_height, amount, op_reason, observed_sum_pct, rejection reason.GetSupplierShareholderAmountMap, 5 integration with realEventManager(happy path, sum=140, sum=80, dup-with-sum-100, empty-owner faulty claim).ctx context.Contextthreaded intodistributeSupplierRewardsToShareholdersso events can be emitted viasdkCtx.EventManager().EmitTypedEvent.Validator-distribution polish — commit
e1b559f0c:distribution_validator.go— three distinguishing log paths in the zero-delegation branch (!okskipped, bonded-with-zero-delegations → Warn, all-shares-zero → Warn). Reward credit unchanged in all three.BigInt().Int64()comment fix — does NOT panic, silently truncates. Bound argument (len(stakeholders) ≤ ~hundreds) added.queueRewardTransfersLRM conservation telemetry — takesexpectedTotalReward, logs Error on mismatch. Does NOT return error (logging invariant must not block settlement).buildValidatorStakes— defensive AccAddress dedup guard. Consensus-impossible (staking module keys by OperatorAddress) but settlement-corrupting if silent overwrite ever fired. Logs Error + skips duplicate.TestBuildValidatorStakes_DuplicateAccAddressIsSkipped— 2 unique entries, total stake reflects ONLY unique contributions.App stuck-coins parity + history idempotency — commit
1dac8836c:UnbondApplicationno longer returns bank-send error fromSendCoinsFromModuleToAccount. Emits newEventApplicationStakeStuckInModulePool(operator, stuck coin, reason, session_end_height) + removes app + continues. Mirror of supplier-side fix inde74942eb. No dedicated test — mirrors supplier-side precedent which also lacked one (failure injection requires scaffold work).EventValidatorRewardDistribution.session_end_block_heightproto comment — caveat for the cross-session O2 candidate-scan path: field reports FIRST claim's session, not every contributing session. Indexers should use settlement block height from SDK header for canonical timestamp on cross-session batches.SetParamsidempotency guard. Live-params anchor stamp now gated onliveParams.SessionGridAnchorHeight == 0(mirror of the history-seed guard). Mainnet first-run unaffected (anchor is zero pre-upgrade). Testnet re-run with prior pinning is now preserved.GetAllParamsHistory— safe-Unmarshal+log+skip-corrupted, mirror of the three sister production readers. Genesis-export only; no consensus impact.Pass 3 (independent audit, 2026-05-27) — landed
HIGH — O2 lookback bug closed — commit
5662502b6:candidateSessionEndHeightsForLiveParamspreviously stopped the reverse params-history walk on the first entry belowblockHeight - max(4*N, 240). On a long-running chain, the v0.1.34 upgrade handler's history seed at h=1 sits far below that floor, so the OLD epoch's offsets never produced a candidate sessionEndHeight after the first post-upgrade window-offset change.earliestearly-stop; replace with a per-epoch in-flight filter that tracksnextEffHeightacross the reverse iteration. Each epoch's owned session-end range is bounded bynextEffHeight - 1; the iterator continues past epochs with no in-flight claims because OLDER epochs may have had different (potentially longer) offsets.TestSettlementCandidateScan_GenesisSeededOldEpochOnLongRunningChain— blockHeight=720_301, history {h=1: OLD (tail=60), h=720_241: NEW (tail=30)}, OLD's last session at h=720_240 with proof-close at 720_240+60+1=720_301. Pre-fix the test fails onrequire.Contains(candidates, 720_240); post-fix passes.MED1 — Upgrade handler ordering — commit
1d4ed9661:seedAnchoredSessionGridnow runs BEFOREunbondBelowMinStakeApplications. The unbond pass callsGetSessionEndHeightwhich routes throughsessionGridAnchor— previously the (unstamped) anchor=0 live params triggered the §3.4 fallback path. Reordered so the unbond pass reads stamped (anchor=1, session=1) live params and never relies on the fallback.MED2 —
MsgUpdateParamcross-param same-session loss — commit1d4ed9661:MsgUpdateParams(which takes the full Params struct in one shot) writes the union atomically and must be used for multi-param governance proposals.TestTwoSessionTimingParamChanges_SameSession_LastOneWinspins this).LOW — Transfer-merge
ServiceConfigHistorydiscard — acknowledged in earlier doc. Niche edge case.LOW — App + gateway F1 regression tests deferred — acknowledged, see [application_gateway_f1_test_blocker memory].
Explicitly skipped (with reasons)
GetSessionEndHeight) — FALSE POSITIVE.ClaimSettlementResult.Claimis a VALUE field (types.Claim, not*types.Claim), can never be nil.GetSessionHeader()andGetSessionEndBlockHeight()are gogoproto-generated nil-safe getters returning 0 cleanly on a zero-valueClaim. No halt vector exists.WithRealSharedKeeperoption toNewApplicationModuleKeepers+NewGatewayModuleKeepers(~50 lines, post-v0.1.34).ServiceConfigHistorydiscard — new-address path properly clones (verified line 161-165); merge path doesn't propagate srcApp's history entries. Niche edge case — only matters if someone queries past-session configs against the merged dst. Design call needed; not tag-blocking.isAtClaimSettlementHeightlive params inpkg/client/query/cache/options.go— off-chain only (relay-miner difficulty cache); not consensus.Pre-upgrade ops checklist (operator + indexer side, NOT this PR)
EventValidatorRewardDistribution,EventSupplierStakeStuckInModulePool,EventApplicationStakeStuckInModulePool,EventSupplierRevShareFallbackDistributionBEFORE upgrade height committed.EventValidatorRewardDistributionper-validator rather than fromEventSettlementBatchtotals.EventSettlementBatchamount jumps ~60× (~0.27% → ~16.56% of pool) at upgrade height. Annotate.max_snapshot_chunkstoconfig.toml(default seeded by upgraded binary; existing nodes without statesync are unaffected).MsgUpdateParams(full Params struct, atomic), NOT chained singularMsgUpdateParamcalls in the same session. The singular path's stale-LIVE base causes the second call to overwrite the first at the same effective_height key (audit Pass 3 MED2). Same-param sequential updates are safe (last-write-wins).Tests
app/upgrades/v0.1.34_test.go— plan metadata, grid seed, idempotencytests/integration/tokenomics/window_offset_change_settlement_test.go— O2 shrink + growtests/integration/tokenomics/two_param_changes_same_session_test.go— C3 last-write-winsx/tokenomics/keeper/settlement_candidate_scan_test.go— K=3 correctness +TestSettlementCandidateScan_GenesisSeededOldEpochOnLongRunningChain(Pass 3 regression: long-running chain + genesis-seeded h=1 OLD epoch)x/tokenomics/keeper/settlement_candidate_scan_bench_test.go— K=1/2/3 × M=200/1k/2.5k matrixx/tokenomics/keeper/settle_pending_claims_bench_test.go— settlement event aggregation memory benchx/tokenomics/token_logic_module/distribution_supplier_test.go— 12 tests covering owner-fallback + sum/dup/negative-remainder guardsx/tokenomics/token_logic_module/validator_distribution_test.go— addsTestBuildValidatorStakes_DuplicateAccAddressIsSkipped,TestValidatorRewardDistribution_FullCommission,TestValidatorRewardDistribution_CrossDelegatorWithCommissionx/supplier/keeper/unbond_suppliers_test.go— 7 unbonding tests including F1 regressionx/supplier/keeper/migrate_duplicate_revshare_test.go—TestDeduplicateSupplierRevShareAddresses_MergedSumExceeds100applicationhas 0 balance,applicationis undelegated fromgateway#1846 headroomTest status
make test_all: green (one pre-existing flake inpkg/observable/channel TestReplayObservableunder CPU contention — passes in isolation, documented by Makefile banner)make test_integration: greenapplicationhas 0 balance,applicationis undelegated fromgateway#1846 + anchor-field denylist)golangci-lint v2.11.4clean acrossx/tokenomics/... x/supplier/... x/shared/... x/application/... tests/integration/tokenomics/... app/upgrades/...Out of scope (separate branches)
rc/0.1.34rc/0.1.34Consensus safety checklist
settlementContext)ctx.BlockTime()only, notime.Now()big.Ratonly, nobig.NewFloatMustUnmarshalreplaced withUnmarshal+log+fallback at all params-history read sites (4 of 4:GetParamsAtHeight,GetParamsHistoryEntry,IterateParamsHistoryReverse,GetAllParamsHistory)5662502b6with regression test; MED1 (handler ordering) reordered in1d4ed9661; MED2 (cross-param same-session) documented + workaround pinned in ops checklist