e2e: reduce pod-pending-timeout to fail fast on stuck pods by joepvd · Pull Request #5001 · openshift/ci-tools

joepvd · 2026-03-11T07:49:38Z

Summary

The default --pod-pending-timeout of 60m causes e2e tests to waste up to an hour waiting when a step pod is stuck in Pending state (e.g. due to scheduling or image pull issues).
Set it to 10m for e2e tests, which is long enough for normal scheduling but fails fast when something is fundamentally wrong.
This mirrors the existing --lease-acquire-timeout=2s override already applied for the same reason.

Context

Observed in PR #4996 e2e job where configurable-leases-check-leases (a trivial step that should complete in under a second) failed after 59m59s — almost exactly the 60m --pod-pending-timeout default.

Made with Cursor

Summary by CodeRabbit

Tests
- Enhanced end-to-end testing: set an explicit pod pending timeout for the CI test runner (20m in the final revision, below the 60m default) so that e2e runs fail sooner when a step pod is stuck in Pending.


openshift-ci-robot · 2026-03-11T07:49:41Z

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

coderabbitai · 2026-03-11T07:50:26Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 25161fbe-628b-464e-b1bf-2c2d7f2163da

📥 Commits

Reviewing files that changed from the base of the PR and between ae2d54e and 8f8e712.

📒 Files selected for processing (1)

test/e2e/framework/ci-operator.go

🚧 Files skipped from review as they are similar to previous changes (1)

test/e2e/framework/ci-operator.go

Walkthrough

Added the --pod-pending-timeout=20m flag to the ci-operator command invocation in the test framework, adjusting the pod pending timeout to 20 minutes without other behavioral changes.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CI Operator Flag Configuration `test/e2e/framework/ci-operator.go` | Added `--pod-pending-timeout=20m` flag to the ci-operator command invocation (one-line CLI argument change). |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title describes reducing pod-pending-timeout to fail fast on stuck pods. However, the actual change implements a 20-minute timeout, not a reduction to the 10-minute value mentioned in the PR description. | Clarify whether the title accurately reflects the final timeout value (20m vs 10m) to ensure the title matches the implemented change. |
| Test Structure And Quality | ❓ Inconclusive | Cannot access the modified file to assess Ginkgo test code quality. Repository appears to be in an inaccessible state or file does not exist in expected location. | Provide access to test/e2e/framework/ci-operator.go content or verify the file path and repository state to enable proper Ginkgo test quality assessment. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Stable And Deterministic Test Names | ✅ Passed | The modified file ci-operator.go is a framework utility file containing no Ginkgo test definitions (It, Describe, Context, When), making the custom check for stable test names inapplicable. |


danilo-gemoli · 2026-03-11T08:37:31Z

/test e2e
/lgtm

joepvd · 2026-03-11T10:31:30Z

/retest-required

test/e2e/framework/ci-operator.go

joepvd · 2026-03-11T14:47:58Z

/test e2e

joepvd · 2026-03-11T19:40:20Z

/test images
/test e2e

danilo-gemoli · 2026-03-12T13:32:58Z

/lgtm
/retest-required

openshift-ci · 2026-03-12T13:38:55Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danilo-gemoli, joepvd

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [danilo-gemoli]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

danilo-gemoli · 2026-03-13T08:05:33Z

/test images

openshift-ci-robot · 2026-03-13T09:33:26Z

Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes. Use command to trigger second stage.

openshift-ci-robot · 2026-03-13T16:27:28Z

/retest-required

Remaining retests: 0 against base HEAD 2b0ca42 and 2 for PR HEAD 8f8e712 in total

openshift-ci-robot · 2026-03-13T20:26:50Z

/retest-required

Remaining retests: 0 against base HEAD cc514eb and 1 for PR HEAD 8f8e712 in total

openshift-ci-robot · 2026-03-14T00:26:49Z

/retest-required

Remaining retests: 0 against base HEAD aa49356 and 0 for PR HEAD 8f8e712 in total

openshift-ci · 2026-03-14T02:54:24Z

@joepvd: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/breaking-changes | `8f8e712` | link | false | `/test breaking-changes` |
| ci/prow/images | `8f8e712` | link | true | `/test images` |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci-robot · 2026-03-14T03:26:40Z

/hold

Revision 8f8e712 was retested 3 times: holding

openshift-ci bot requested review from deepsm007 and droslean March 11, 2026 07:50

openshift-ci bot assigned danilo-gemoli Mar 11, 2026

openshift-ci bot added the lgtm and approved labels Mar 11, 2026

joepvd commented Mar 11, 2026 on test/e2e/framework/ci-operator.go (review thread outdated and resolved)

Commit 8f8e712: Bump to 20 mins

openshift-ci bot removed the lgtm label Mar 11, 2026

openshift-ci bot added the lgtm label Mar 12, 2026

openshift-ci bot added the do-not-merge/hold label Mar 14, 2026
