e2e: reduce pod-pending-timeout to fail fast on stuck pods #5001
joepvd wants to merge 2 commits into openshift:main
The default --pod-pending-timeout of 60m causes e2e tests to waste up to an hour waiting when a step pod is stuck in Pending state (e.g. due to scheduling or image pull issues). Set it to 10m for e2e tests, which is long enough for normal scheduling but fails fast when something is fundamentally wrong. This mirrors the existing --lease-acquire-timeout=2s override already applied for the same reason.
Made-with: Cursor
/test e2e
/retest-required
/test e2e
/test images
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: danilo-gemoli, joepvd
/test images
Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes.
@joepvd: The following tests failed, say
/hold
Revision 8f8e712 was retested 3 times: holding
Summary
The default --pod-pending-timeout of 60m causes e2e tests to waste up to an hour waiting when a step pod is stuck in Pending state (e.g. due to scheduling or image pull issues).
Set it to 10m for e2e tests, which is long enough for normal scheduling but fails fast when something is fundamentally wrong.
This mirrors the existing --lease-acquire-timeout=2s override already applied for the same reason.
Context
Observed in the PR #4996 e2e job, where configurable-leases-check-leases (a trivial step that should complete in under a second) failed after 59m59s, almost exactly the 60m --pod-pending-timeout default.
Made with Cursor