Refactor connectionId, nodeId, clusterId #375

Merged
ArgusLi merged 10 commits into main from fix/connection-id-node-id on Jun 24, 2026
@ArgusLi ArgusLi commented Jun 22, 2026

Contributor

Description

Previously the metrics state was keyed by the db-suffixed connectionId (--db) even though the metrics process is unique per Valkey node. That mismatch caused state to be stored and looked up under inconsistent keys.

This branch establishes a single id space for metrics and re-keys all node-level state to the db-less nodeId (-).
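
As a rough illustration of the re-keying, the sketch below shows two db-scoped connections to the same node resolving to one node-level key. The id format (a trailing `-db<N>` suffix) and the helper name `stripDbSuffix` are assumptions made for this sketch; the PR's actual id format and its `toNodeId` implementation are not shown on this page.

```typescript
// Hypothetical sketch: the id format here is assumed, not taken from the PR.
// Assumed connectionId shape: "<nodeId>-db<N>", e.g. "localhost-6379-db0".
function stripDbSuffix(connectionId: string): string {
  // Drop a trailing "-db<digits>" suffix; ids without one are returned unchanged.
  return connectionId.replace(/-db\d+$/, "")
}

// Node-level metrics state is keyed by the db-less id, so connections to
// different dbs on the same node read and write the same entry.
const hotKeysByNode = new Map<string, string[]>()
hotKeysByNode.set(stripDbSuffix("localhost-6379-db0"), ["user:1"])

// A connection to db 3 on the same node finds the entry stored via db 0.
const sameEntry = hotKeysByNode.get(stripDbSuffix("localhost-6379-db3"))
```

Keying by the db-suffixed id instead would have stored the entry under one key and looked it up under another, which is the mismatch the description refers to.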

Bug fixes

  • Fixed double hot-key response.
  • Fixed standalone hot-keys showing no result.

Tooling

  • Renamed eslint.config.js to eslint.config.mjs so the ESM-only config loads (the plugins are ESM-only and the root package is CommonJS).
  • Made npm run lint apply fixes automatically.

ArgusLi added 4 commits June 18, 2026 14:17
Signed-off-by: Argus Li <argus@argusli.dev>
Signed-off-by: Argus Li <argus@argusli.dev>
Signed-off-by: Argus Li <argus@argusli.dev>
Signed-off-by: Argus Li <argus@argusli.dev>
@ArgusLi ArgusLi marked this pull request as draft June 22, 2026 22:41
ArgusLi added 5 commits June 23, 2026 13:21
Signed-off-by: Argus Li <argus@argusli.dev>
Signed-off-by: Argus Li <argus@argusli.dev>
Signed-off-by: Argus Li <argus@argusli.dev>
Signed-off-by: Argus Li <argus@argusli.dev>
Signed-off-by: Argus Li <argus@argusli.dev>
@ArgusLi ArgusLi marked this pull request as ready for review June 23, 2026 22:23
  return
if (typeof clusterId === "string") {
  const nodeIds = Object.keys(clusterNodesRegistry[clusterId] ?? {}).filter((id) => metricsServerMap.has(id))
  const responses = await Promise.all(

Collaborator

A node may be in a semi-alive state, for example during a failover, and its request will then fail the whole promise. Maybe worth switching to allSettled instead?

Collaborator

In other words, if one node is going to be killed anyway, do we need to fail an operation that succeeded on all the live nodes?

Contributor Author

I think that's a good change. For this PR, my focus was on retaining existing logic and just ensuring my changes don't break anything. Considering #373 is waiting for this PR so that the bug where errors are not propagated to the right ID is fixed, I would prefer to not add this and the additional handling in this PR, and add it in its own PR right after.
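
For reference, the deferred follow-up might look roughly like the sketch below. It assumes the per-node request can reject for an unreachable node; the `NodeResponse` type and the `postToAll` helper are hypothetical names for illustration and are not code from this PR.

```typescript
// Hypothetical response shape, modeled on the fallback object in the diff.
type NodeResponse = { success: boolean; message: string; data: unknown }

// Sends to every node and turns rejections into failed responses, so one
// unreachable node does not reject the whole batch the way Promise.all would.
async function postToAll(
  nodeIds: string[],
  post: (nodeId: string) => Promise<NodeResponse>, // stand-in for postConfigToNode
): Promise<NodeResponse[]> {
  const settled = await Promise.allSettled(nodeIds.map((nodeId) => post(nodeId)))
  return settled.map((result) =>
    result.status === "fulfilled"
      ? result.value
      : { success: false, message: String(result.reason), data: {} },
  )
}
```

The returned array keeps the same order as `nodeIds`, so each entry can still be matched back to its node.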

const responses = await Promise.all(
  nodeIds.map((nodeId) => postConfigToNode(metricsServerMap.get(nodeId)?.metricsURI, config)),
)
const firstFailure = responses.find((r) => !r.success)

Collaborator

Following on from that, how about returning the entire list of failing nodes, so the user doesn't have to re-send an update just to discover another failing node?

That would mean using filter instead of find.
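
A small sketch of that suggestion follows. The `NodeResult` type, which pairs each response with its nodeId, is an assumption for illustration; the `responses` array in the diff does not carry the nodeId.

```typescript
// Hypothetical result shape that keeps the nodeId next to its outcome.
type NodeResult = { nodeId: string; success: boolean; message: string }

// Returns every failing node, not just the first one.
function failingNodes(results: NodeResult[]): NodeResult[] {
  return results.filter((r) => !r.success)
}

const results: NodeResult[] = [
  { nodeId: "node-a", success: true, message: "" },
  { nodeId: "node-b", success: false, message: "timeout" },
  { nodeId: "node-c", success: false, message: "connection refused" },
]

// find() would stop at node-b; filter() reports node-b and node-c together.
const failures = failingNodes(results)
```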

  sendUpdateError(ws, { clusterId }, firstFailure)
} else {
  // All nodes responses are the same so we use the first.
  sendUpdateFulfilled(ws, { clusterId }, responses[0] ?? { success: true, message: "", data: {} })

Collaborator

Updating a config is a rarely used operation, so we don't have to optimize for network traffic. I'd prefer an explicit node: status response.
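
One possible shape for such a response is sketched below. The `NodeStatus` type and the `toStatusByNode` helper are hypothetical and only illustrate the idea of reporting one status per node instead of a single shared response.

```typescript
// Hypothetical per-node status; not the payload type used in the PR.
type NodeStatus = { success: boolean; message: string }

// Builds a nodeId -> status record from parallel arrays of ids and responses.
function toStatusByNode(nodeIds: string[], responses: NodeStatus[]): Record<string, NodeStatus> {
  const statusByNode: Record<string, NodeStatus> = {}
  nodeIds.forEach((nodeId, i) => {
    // A missing response is reported as a failure rather than silently dropped.
    statusByNode[nodeId] = responses[i] ?? { success: false, message: "no response" }
  })
  return statusByNode
}

const statusByNode = toStatusByNode(
  ["node-a", "node-b"],
  [{ success: true, message: "" }, { success: false, message: "timeout" }],
)
```

The client can then show exactly which nodes accepted the config and which did not.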

const nodes =
typeof clusterId === "string"
? clusterNodesRegistry[clusterId]
typeof clusterId === "string"

Collaborator

Did lint add trailing spaces?

Contributor Author

It appears to have.

// `targetId` keys node-level metrics state: `clusterId` for a cluster, else
// the db-less `nodeId`. (Connection-scoped state below still uses `id`, the
// db-suffixed connectionId.)
const nodeId = toNodeId(id!)

Collaborator

Does it make sense to just store nodeId in frontend state instead of stripping the db suffix from the connectionId every time we need it?

Contributor Author

We are actually using targetId as the key in the frontend state for node-level metrics state. You can see this in selectors such as selectCommandLogs()(), selectHotKeys()()...

The reason for keeping connectionId-keyed state as well is that some things still need to be scoped by db, such as the connection details, keys, etc.

Collaborator

I'm referring more to storing nodeId in connection state, so components can read it directly instead of calling toNodeId every time. But this isn't a blocker if you don't think it's worth the time.

Contributor Author

Adding nodeId to connection state means that if the connectionId changes, we have to ensure nodeId stays consistent with it, which is one more thing to manage. toNodeId() is also a cheap operation, so there's no real performance benefit from caching its result in connection state.

So in my mind, a really small performance gain isn't worth an additional piece of state we have to manage.
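
The approach argued for in this thread, deriving the key at read time instead of storing it, could be sketched as follows. All names here (`MetricsState`, `dropDbSuffix`, `resolveTargetId`, `selectHotKeysSketch`) and the `-db<N>` suffix format are assumptions for illustration, not the PR's actual selectors or id format.

```typescript
// Hypothetical state shape: node-level hot keys, keyed by targetId.
type MetricsState = { hotKeys: Record<string, string[]> }

// Stand-in for toNodeId; assumes a trailing "-db<N>" suffix on connectionIds.
const dropDbSuffix = (connectionId: string): string => connectionId.replace(/-db\d+$/, "")

// targetId is the clusterId for a cluster, else the db-less nodeId.
function resolveTargetId(connectionId: string, clusterId?: string): string {
  return typeof clusterId === "string" ? clusterId : dropDbSuffix(connectionId)
}

// Curried selector: the key is derived on each call, so there is no stored
// nodeId that could drift out of sync with the connectionId.
const selectHotKeysSketch =
  (connectionId: string, clusterId?: string) =>
  (state: MetricsState): string[] =>
    state.hotKeys[resolveTargetId(connectionId, clusterId)] ?? []
```

Because the derivation is a single string operation, recomputing it per read costs very little compared with keeping a second field consistent.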

Comment thread apps/frontend/src/components/ui/monitor-warning-banner.tsx
Comment thread apps/frontend/src/state/valkey-features/config/configSlice.ts
Comment thread apps/server/src/actions/commandLogs.ts
Signed-off-by: Argus Li <argus@argusli.dev>
@ArgusLi ArgusLi merged commit 0a58993 into main Jun 24, 2026
8 checks passed
@ArgusLi ArgusLi deleted the fix/connection-id-node-id branch June 24, 2026 17:48
3 participants