tabletmanager: detach DemotePrimary rollback contexts#20343
Conversation
Signed-off-by: Mohamed Hamza <mhamza@fastmail.com>
DemotePrimary rollback defers reused the caller context after partial demotion failures. If that context was canceled or timed out, rollback could fail before restoring read-write state or primary-side semi-sync. This gives each rollback operation its own bounded context detached from caller cancellation. Signed-off-by: Mohamed Hamza <mhamza@fastmail.com>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
There was a problem hiding this comment.
Pull request overview
This PR fixes a failure-mode in TabletManager.demotePrimary() where rollback defers could reuse a canceled/timed-out caller context, causing rollback operations (restoring read-write state and/or primary-side semi-sync) to fail prematurely. The change ensures rollback actions run under their own bounded context detached from caller cancellation.
Changes:
- Use
context.WithoutCancel(ctx)+context.WithTimeout(..., topo.RemoteOperationTimeout)for rollback operations indemotePrimary()so rollback isn’t short-circuited by caller cancellation. - Add a unit test intended to validate rollback behavior under a canceled demotion context.
- Introduce a generated GoMock
MysqlDaemonmock to support the new test.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| go/vt/vttablet/tabletmanager/rpc_replication.go | Detaches rollback contexts for semi-sync and read-only restoration paths during partial demotion failures. |
| go/vt/vttablet/tabletmanager/rpc_replication_test.go | Adds a regression test for the rollback-context behavior (currently needs strengthening to assert the intended property). |
| go/vt/mysqlctl/mock/mysqldaemon.go | Adds generated GoMock implementation for mysqlctl.MysqlDaemon to enable targeted unit testing. |
Files not reviewed (1)
- go/vt/mysqlctl/mock/mysqldaemon.go: Generated file
| mockMysqlDaemon.EXPECT().SetSemiSyncEnabled(gomock.Any(), true, true).DoAndReturn( | ||
| func(ctx context.Context, primary bool, replica bool) error { | ||
| return ctx.Err() | ||
| }, | ||
| ), | ||
| mockMysqlDaemon.EXPECT().SetSuperReadOnly(gomock.Any(), false).DoAndReturn( | ||
| func(ctx context.Context, on bool) (mysqlctl.ResetSuperReadOnlyFunc, error) { | ||
| return nil, ctx.Err() | ||
| }, | ||
| ), | ||
| mockMysqlDaemon.EXPECT().SetReadOnly(gomock.Any(), false).DoAndReturn( | ||
| func(ctx context.Context, on bool) error { | ||
| return ctx.Err() | ||
| }, | ||
| ), |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #20343 +/- ##
===========================================
- Coverage 69.67% 63.21% -6.46%
===========================================
Files 1614 60 -1554
Lines 216793 11293 -205500
===========================================
- Hits 151044 7139 -143905
+ Misses 65749 4154 -61595
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
The regression test needed specific ordering and conditions that were simpler to organize with a gomock version of mysqldaemon rather than the existing fake.
Signed-off-by: Mohamed Hamza <mhamza@fastmail.com>
Signed-off-by: Mohamed Hamza <mhamza@fastmail.com>
Description
DemotePrimaryrollback defers reused the caller context after partial demotion failures. If that context was canceled or timed out, rollback could fail before restoring read-write state or primary-side semi-sync.This gives each rollback operation its own bounded context detached from caller cancellation.
Related Issue(s)
Fixes #20342
Checklist
Deployment Notes
No deployment notes.
AI Disclosure
This PR was written primarily by GPT-5 Codex under user direction.