netvsp: eqe 135 handling clashing with fhr #2909

Open
erfrimod wants to merge 2 commits into microsoft:main from erfrimod:erfrimod/vf-reconfiguration-fhr

Conversation

erfrimod (Contributor) commented Mar 7, 2026

Problems occur when a device arrival or removal happens during VF reconfiguration delays and waits. This change modifies the VF Manager to treat a device arrival or removal as an indication that the old VF is gone and that the reconfiguration should be abandoned.

Example: EQE 135 arrives -> delay -> FHR -> device removal

  • If ManaDeviceRemoved fires during Reconfiguring, clears vf_reconfig_backoff → no more retries
  • If ManaDeviceArrived fires while vf_reconfig_backoff is Some, code panics (assert!)
  • If the uevent add occurs while state is Reconfiguring, the handling code maps it to Continue → event discarded

@erfrimod erfrimod requested a review from a team as a code owner March 7, 2026 01:49
Copilot AI review requested due to automatic review settings March 7, 2026 01:49

Copilot AI (Contributor) left a comment

Pull request overview

Adjusts the netvsp VF manager worker’s reconfiguration state machine to better tolerate VF arrival/removal events that occur during VF reconfiguration delays (e.g., EQE 135 + FHR timing), avoiding panics and inappropriate retry behavior.

Changes:

  • Treat uevent “device exists” notifications as ManaDeviceArrived even while in Reconfiguring.
  • Replace a panic on device arrival during reconfig backoff with “abandon reconfiguration” behavior (clears backoff + logs).
  • On device removal, clear any outstanding reconfig backoff (with logging) rather than silently resetting.
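The three changes above can be sketched as a small, self-contained state machine. This is only an illustration under stated assumptions, not the code in this PR: `VfManagerSketch`, `Event`, `Action`, `map_uevent`, `abandon_reconfig`, the 100 ms backoff value, and the return to `Idle` after abandoning are all hypothetical. Only the names `vf_reconfig_backoff`, `ManaDeviceArrived`, `ManaDeviceRemoved`, and `Reconfiguring` come from the discussion.

```rust
use std::time::Duration;

/// Hypothetical worker states; not the real netvsp types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Idle,
    Reconfiguring,
}

/// Hypothetical events seen by the worker.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    /// An EQE 135-style request to reconfigure the VF.
    ReconfigRequested,
    /// A uevent reporting that the device exists.
    UeventDeviceExists,
    ManaDeviceArrived,
    ManaDeviceRemoved,
}

/// What the sketch decided to do, returned so callers can inspect it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    StartBackoff,
    AbandonReconfig,
    HandleArrival,
    HandleRemoval,
}

struct VfManagerSketch {
    state: State,
    vf_reconfig_backoff: Option<Duration>,
}

/// Map the uevent to an arrival regardless of state, so it is not
/// discarded while the worker is in `Reconfiguring`.
fn map_uevent(event: Event) -> Event {
    match event {
        Event::UeventDeviceExists => Event::ManaDeviceArrived,
        other => other,
    }
}

impl VfManagerSketch {
    fn new() -> Self {
        Self { state: State::Idle, vf_reconfig_backoff: None }
    }

    /// Clear any outstanding backoff and log it. Returns true if a
    /// reconfiguration was actually in flight.
    fn abandon_reconfig(&mut self, reason: &str) -> bool {
        match self.vf_reconfig_backoff.take() {
            Some(backoff) => {
                eprintln!("abandoning VF reconfiguration ({reason}), backoff was {backoff:?}");
                self.state = State::Idle; // assumption: abandoning returns to Idle
                true
            }
            None => false,
        }
    }

    fn handle(&mut self, event: Event) -> Vec<Action> {
        let mut actions = Vec::new();
        match map_uevent(event) {
            Event::ReconfigRequested => {
                self.state = State::Reconfiguring;
                // Arbitrary illustrative delay.
                self.vf_reconfig_backoff = Some(Duration::from_millis(100));
                actions.push(Action::StartBackoff);
            }
            Event::ManaDeviceArrived => {
                // Previously an assert!; here the reconfiguration is abandoned.
                if self.abandon_reconfig("device arrived") {
                    actions.push(Action::AbandonReconfig);
                }
                actions.push(Action::HandleArrival);
            }
            Event::ManaDeviceRemoved => {
                if self.abandon_reconfig("device removed") {
                    actions.push(Action::AbandonReconfig);
                }
                actions.push(Action::HandleRemoval);
            }
            Event::UeventDeviceExists => unreachable!("mapped to ManaDeviceArrived above"),
        }
        actions
    }
}
```

In this sketch, feeding `ReconfigRequested` followed by `UeventDeviceExists` yields an abandon action plus normal arrival handling rather than a panic or a dropped event, and a later `ManaDeviceRemoved` finds no backoff left to clear.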

github-actions bot commented Mar 7, 2026

ben-zen commented Mar 11, 2026

LGTM
