WIP: Wait for Kubernetes readiness#347

Open
timflannagan wants to merge 1 commit into agentregistry-dev:main from timflannagan:fix/kube-wait

@timflannagan
Collaborator

Description

We now keep Kubernetes deployments in an in-flight state until the live platform resources report readiness, and the CLI wait path surfaces current condition details instead of treating apply as success.

Previously, the Kubernetes adapter returned deployed immediately after apply, so --wait could exit successfully while the underlying Agent or MCP resources were still not ready. This is a follow-up to #296, which introduced the initial scaffolding but didn't actually work.

Now, Kubernetes deploys stay in deploying until GetDeploymentByID refreshes their state from live CR conditions, and wait failures include the current deployment error text.
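
The refresh-on-read behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the Deployment fields, the StateRefresher interface shape, the in-memory Store, and the choice to return refresh errors to the caller are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Deployment is a hypothetical, trimmed-down record; the real model has more fields.
type Deployment struct {
	ID     string
	Status string // "deploying", "deployed", or "failed"
	Error  string
}

// StateRefresher is an assumed shape for the optional platform hook.
type StateRefresher interface {
	RefreshDeploymentState(d Deployment) (status, errText string, err error)
}

// Store is an in-memory stand-in for the database.
type Store map[string]Deployment

// GetDeploymentByID returns the stored record. If the record is still in
// flight and a refresher is available, it first refreshes the state from the
// platform and persists any change.
func GetDeploymentByID(s Store, r StateRefresher, id string) (Deployment, error) {
	d, ok := s[id]
	if !ok {
		return Deployment{}, errors.New("deployment not found: " + id)
	}
	if d.Status != "deploying" || r == nil {
		return d, nil // terminal states are never refreshed
	}
	status, errText, err := r.RefreshDeploymentState(d)
	if err != nil {
		// Assumption: surface refresh errors rather than hiding them.
		return d, fmt.Errorf("refreshing deployment %s: %w", id, err)
	}
	if status != d.Status || errText != d.Error {
		d.Status, d.Error = status, errText
		s[id] = d // persist only when something changed
	}
	return d, nil
}

// fakeRefresher reports a fixed state, standing in for live CR conditions.
type fakeRefresher struct{ status, errText string }

func (f fakeRefresher) RefreshDeploymentState(Deployment) (string, string, error) {
	return f.status, f.errText, nil
}

func main() {
	s := Store{"d1": {ID: "d1", Status: "deploying"}}
	d, _ := GetDeploymentByID(s, fakeRefresher{"deploying", "waiting for Agent demo to report Ready"}, "d1")
	fmt.Println(d.Status, d.Error)
	d, _ = GetDeploymentByID(s, fakeRefresher{"deployed", ""}, "d1")
	fmt.Println(d.Status, d.Error)
}
```

Refreshing only records in the deploying state keeps reads of terminal deployments cheap and avoids overwriting a final status.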

Fixes #230.

Change Type

/kind fix

Changelog

NONE

Additional Notes

Signed-off-by: timflannagan <timflannagan@gmail.com>
@timflannagan
Collaborator Author

Worked locally but mostly AI generated. Want to manually validate these changes. Adding WIP label for now.

@timflannagan changed the title from "internal: Wait for Kubernetes readiness" to "WIP: Wait for Kubernetes readiness" on Mar 13, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes Kubernetes deployments being reported as successful immediately after apply by keeping them in an in-flight state until live Kubernetes resources report readiness, and by surfacing current condition/error details in the CLI --wait path.

Changes:

  • Kubernetes deploy now returns deploying after apply, and adds a platform hook to refresh managed deployment state from live CR conditions.
  • GetDeploymentByID refreshes managed deploying deployments via the platform adapter and persists updated status/error back to the DB.
  • CLI wait logic now includes the deployment’s current error text in failure/timeout messages, with new tests.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| internal/registry/service/registry_service.go | Adds optional state refresh hook and refresh-on-GetDeploymentByID logic plus patch-change detection helper. |
| internal/registry/service/registry_service_test.go | Adds tests for preserving in-flight status and refreshing managed deploying state. |
| internal/registry/platforms/kubernetes/deployment_adapter_kubernetes.go | Returns deploying from Deploy and implements RefreshDeploymentState. |
| internal/registry/platforms/kubernetes/deployment_adapter_kubernetes_platform.go | Implements Kubernetes readiness evaluation by listing CRs by deployment label and interpreting conditions into status/error patches. |
| internal/registry/platforms/kubernetes/deployment_adapter_kubernetes_platform_test.go | Adds unit tests covering readiness transitions and condition-derived messages. |
| internal/cli/common/wait.go | Enhances wait error messaging and makes poll/timeout configurable for tests. |
| internal/cli/common/wait_test.go | Adds tests verifying polling behavior and failure detail propagation. |

Comment on lines +436 to +448
		case metav1.ConditionFalse:
			result.status = "deploying"
			result.message = kubernetesFormatConditionMessage("remote MCP server", server.Name, *cond)
			return result
		}
	}
	if cond := meta.FindStatusCondition(server.Status.Conditions, "Accepted"); cond != nil && cond.Status == metav1.ConditionTrue {
		result.status = "deployed"
		return result
	}

	result.status = "deploying"
	result.message = fmt.Sprintf("waiting for remote MCP server %s to report Accepted condition", server.Name)
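
For readers without the full diff, the condition-to-status mapping in the excerpt above can be approximated with standard-library types only. This is a sketch under stated assumptions: Condition is a local stand-in for metav1.Condition, findCondition imitates meta.FindStatusCondition, the "Ready" condition type and the evaluate helper are hypothetical (the excerpt does not show which condition the switch inspects), and the message format is invented for illustration.

```go
package main

import "fmt"

// Condition is a local stand-in for metav1.Condition so the sketch needs no
// Kubernetes dependencies.
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Reason  string
	Message string
}

// findCondition imitates meta.FindStatusCondition: it returns the condition
// with the given type, or nil when it is absent.
func findCondition(conds []Condition, condType string) *Condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// evaluate maps a resource's conditions to a deployment status and message.
// Assumption: a "Ready" condition is checked first, then "Accepted".
func evaluate(name string, conds []Condition) (status, message string) {
	if c := findCondition(conds, "Ready"); c != nil {
		switch c.Status {
		case "True":
			return "deployed", ""
		case "False":
			return "deploying", fmt.Sprintf("remote MCP server %s: Ready=False (%s): %s", name, c.Reason, c.Message)
		}
		// "Unknown" falls through to the Accepted check below.
	}
	if c := findCondition(conds, "Accepted"); c != nil && c.Status == "True" {
		return "deployed", ""
	}
	return "deploying", fmt.Sprintf("waiting for remote MCP server %s to report Accepted condition", name)
}

func main() {
	status, msg := evaluate("demo", []Condition{
		{Type: "Ready", Status: "False", Reason: "Pending", Message: "pods not scheduled"},
	})
	fmt.Println(status, msg)

	status, msg = evaluate("demo", []Condition{{Type: "Accepted", Status: "True"}})
	fmt.Println(status, msg)
}
```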
Comment on lines +952 to +957
	if patch.ProviderConfig != nil {
		return true
	}
	if patch.ProviderMetadata != nil {
		return true
	}
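
The excerpt above treats any non-nil ProviderConfig or ProviderMetadata as a change. For comparison, here is a minimal sketch of a change-detection helper that compares values before reporting a change. The Deployment and Patch types and the patchChanges name are hypothetical and cover only two string fields; the real helper in registry_service.go may differ.

```go
package main

import "fmt"

// Deployment holds the currently stored values (hypothetical, trimmed down).
type Deployment struct {
	Status string
	Error  string
}

// Patch uses pointer fields so that nil means "leave this field alone".
type Patch struct {
	Status *string
	Error  *string
}

// patchChanges reports whether applying p to cur would alter any field.
// A non-nil field that equals the stored value does not count as a change.
func patchChanges(cur Deployment, p Patch) bool {
	if p.Status != nil && *p.Status != cur.Status {
		return true
	}
	if p.Error != nil && *p.Error != cur.Error {
		return true
	}
	return false
}

func main() {
	cur := Deployment{Status: "deploying"}
	same, next := "deploying", "deployed"
	fmt.Println(patchChanges(cur, Patch{Status: &same}))
	fmt.Println(patchChanges(cur, Patch{Status: &next}))
}
```

Comparing values lets a refresh that observes no new state skip the database write.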

Development

Successfully merging this pull request may close these issues.

Deploying an agent to k8s reports successful state even when Agent isn't ready
