valkey restore post-ready: recover target context and sentinel registration in post-restore-sentinel.sh #2595

@weicao

Summary

Valkey restore post-ready / Sentinel registration could fail because, in restore jobs, post-restore-sentinel.sh could not reliably recover the target context or discover the current primary.

This issue tracks the addon-side stop points that were reproduced, narrowed with evidence, and then fixed in post-restore-sentinel.sh.

Reproduced stop points

1. Single-shot primary discovery was too rigid

post-ready step 1 could fail with:

  • could not find primary among data pods — Sentinel registration failed
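
One way to relax single-shot discovery is a bounded retry loop. The following is a minimal sketch, not the code in post-restore-sentinel.sh: the function names, the MAX_ATTEMPTS and RETRY_DELAY_SECONDS variables, the port default, and the unauthenticated valkey-cli call are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative only: helper names and tunables below are hypothetical.

# Print the replication role reported by one data pod.
# Assumes valkey-cli is on PATH and no auth is required (assumption).
probe_role() {
  local host="$1"
  valkey-cli -h "$host" -p "${VALKEY_PORT:-6379}" info replication 2>/dev/null \
    | tr -d '\r' | awk -F: '$1 == "role" { print $2 }'
}

# Print the first host whose role is "master"; return 1 if none is found.
find_primary_once() {
  local host
  for host in "$@"; do
    if [ "$(probe_role "$host")" = "master" ]; then
      printf '%s\n' "$host"
      return 0
    fi
  done
  return 1
}

# Repeat discovery up to MAX_ATTEMPTS times, sleeping between attempts,
# so a primary that appears shortly after restore is still picked up.
find_primary_with_retry() {
  local max="${MAX_ATTEMPTS:-10}" delay="${RETRY_DELAY_SECONDS:-3}" attempt=1
  while [ "$attempt" -le "$max" ]; do
    if find_primary_once "$@"; then
      return 0
    fi
    echo "primary not found (attempt ${attempt}/${max})" >&2
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$max" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

The retry is bounded so that a restore with no reachable primary still fails in finite time instead of hanging the post-ready job.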

2. Restore job context variables were absent

In post-ready jobs:

  • DP_TARGET_POD_NAME was empty
  • DP_TARGET_NAMESPACE was empty

3. DP_DB_HOST could be only pod.headless-service

The fallback path ran, but DP_DB_HOST did not always include the namespace or cluster domain, so the target context still could not be reconstructed.

Fix scope

The validated addon fix stays in:

  • addons/valkey-bk/dataprotection/post-restore-sentinel.sh

It includes three minimal changes:

  1. bounded retry for primary discovery
  2. fallback from DP_TARGET_* to DP_DB_HOST
  3. namespace fallback from the current serviceaccount namespace when DP_DB_HOST is only pod.headless-service
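
Changes 2 and 3 can be sketched as a single resolution function. This is a hedged illustration rather than the script's actual code: resolve_target_context, TARGET_POD, TARGET_NAMESPACE, and SA_NAMESPACE_FILE are hypothetical names, and the sketch assumes DP_DB_HOST has the form pod.headless-service, optionally followed by the namespace and cluster domain. The service account namespace path is the standard Kubernetes mount location.

```shell
#!/usr/bin/env bash
# Illustrative only: function and output variable names are hypothetical.

# Standard in-pod location of the service account namespace.
SA_NAMESPACE_FILE="${SA_NAMESPACE_FILE:-/var/run/secrets/kubernetes.io/serviceaccount/namespace}"

# Sets TARGET_POD and TARGET_NAMESPACE; returns non-zero if either is unresolved.
resolve_target_context() {
  TARGET_POD="${DP_TARGET_POD_NAME:-}"
  TARGET_NAMESPACE="${DP_TARGET_NAMESPACE:-}"
  local host="${DP_DB_HOST:-}"

  # Fallback 1: take the pod name from the first label of DP_DB_HOST.
  if [ -z "$TARGET_POD" ] && [ -n "$host" ]; then
    TARGET_POD="${host%%.*}"
  fi

  # Fallback 1 (continued): the third label, when present, is the namespace.
  # cut -s prints nothing for input without dots, and an empty field
  # when the host is only pod.headless-service.
  if [ -z "$TARGET_NAMESPACE" ] && [ -n "$host" ]; then
    TARGET_NAMESPACE="$(printf '%s\n' "$host" | cut -s -d. -f3)"
  fi

  # Fallback 2: use the namespace the restore job itself runs in.
  # Assumes the job runs in the same namespace as the target pod.
  if [ -z "$TARGET_NAMESPACE" ] && [ -r "$SA_NAMESPACE_FILE" ]; then
    TARGET_NAMESPACE="$(cat "$SA_NAMESPACE_FILE")"
  fi

  [ -n "$TARGET_POD" ] && [ -n "$TARGET_NAMESPACE" ]
}
```

Explicit DP_TARGET_* values always win, so the fallbacks only take effect in jobs where those variables are empty.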

Validation boundary

This issue covers only the current Valkey restore post-ready / Sentinel registration stop point.
It does not claim that every restore scenario passes.

Validation evidence

On the targeted rerun after the third patch:

  • postReady-0 = Completed
  • postReady-1 = Completed
  • restore reached phase=Completed
  • type=PostReady -> True / Succeed
  • logs contained:
    • resolved target context ...
    • current primary is ...
    • Post-restore Sentinel registration complete.
