chunked builtin backup engine#20167
Conversation
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
There was a problem hiding this comment.
Pull request overview
This PR adds optional chunking to the builtinbackupengine so that large MySQL files can be backed up and restored as independently-compressed pieces, enabling higher parallelism (especially beneficial for object stores like S3) and improving restore throughput.
Changes:
- Introduces chunk metadata in the backup manifest (
FileEntry.Chunks) and new flags to control chunking (--builtinbackup-file-chunk-threshold,--builtinbackup-file-chunk-size). - Updates builtin backup/restore to split large files into chunks for parallel backup and to restore chunked files via parallel
WriteAt(pwrite-style) writes into a pre-sized destination. - Adds unit and end-to-end tests validating chunk name parsing and verifying chunked vs non-chunked backups via MANIFEST inspection.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| go/vt/mysqlctl/file_close_test.go | Updates tests for the new backupFile(..., chunkIndex) signature. |
| go/vt/mysqlctl/builtinbackupengine.go | Core implementation: chunking flags, manifest schema, chunked backup work scheduling, and parallel chunk restore. |
| go/vt/mysqlctl/builtinbackupengine_test.go | Adds unit tests for parsing storage names (parseBackupName). |
| go/test/endtoend/backup/vtctlbackup/backup_utils.go | Adds helpers to verify chunking by reading MANIFEST and counting chunks. |
| go/test/endtoend/backup/vtctlbackup/backup_test.go | Adds end-to-end tests for chunked and non-chunked builtin backups with forced small thresholds/sizes. |
| go/flags/endtoend/vttestserver.txt | Documents new builtinbackup chunking flags in end-to-end flag snapshots. |
| go/flags/endtoend/vttablet.txt | Documents new builtinbackup chunking flags in end-to-end flag snapshots. |
| go/flags/endtoend/vtctld.txt | Documents new builtinbackup chunking flags in end-to-end flag snapshots. |
| go/flags/endtoend/vtcombo.txt | Documents new builtinbackup chunking flags in end-to-end flag snapshots. |
| go/flags/endtoend/vtbackup.txt | Documents new builtinbackup chunking flags in end-to-end flag snapshots. |
Comments suppressed due to low confidence (2)
go/vt/mysqlctl/builtinbackupengine.go:1372
- Chunk restore goroutines close over loop variables
jandfe(and usedest/fe.Nameinside the closure). This can result in writing the wrong chunk offset/data and misreporting errors/logs. Rebind the loop variables (e.g.j := j,feLocal := fe) before starting each goroutine.
for j := range fe.Chunks {
g.Go(func() error {
chunk := &fe.Chunks[j]
select {
go/vt/mysqlctl/builtinbackupengine.go:1394
- Non-chunked restore goroutine closes over
i/fefrom the enclosing for-loop. This can cause it to restore the wrong file index and log/record errors under the wrong name. Capture locals (e.g.iLocal := i,feLocal := fe) before g.Go.
// Non-chunked file: restore as before.
g.Go(func() error {
name := strconv.Itoa(i)
select {
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
|
Promptless prepared a documentation update related to this change. Triggered by PR #20167 Added documentation for the new |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #20167 +/- ##
===========================================
- Coverage 69.67% 52.95% -16.72%
===========================================
Files 1614 46 -1568
Lines 216793 7315 -209478
===========================================
- Hits 151044 3874 -147170
+ Misses 65749 3441 -62308
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
mattlord
left a comment
There was a problem hiding this comment.
go/vt/mysqlctl/builtinbackupengine.go:1386-1390 ignores final close errors for chunked restore destinations. Non-chunked restore propagates destination close failures because they can mean data was not safely flushed; chunked restore only logs them and can report success, then later attempt to start MySQL on an incomplete/corrupt file. Please collect these close errors and return them, and also check the dest.Close() in createChunkedDestinations at line 1367.
go/vt/mysqlctl/builtinbackupengine.go:196-198 / :691-694 has no bound on the chunk count. A small --builtinbackup-file-chunk-size typo, e.g. 1, can allocate one FileChunk and one work item per byte of a large InnoDB file before backup starts. Please enforce a sane minimum chunk size or a max chunks-per-file limit before allocating.
go/vt/mysqlctl/builtinbackupengine.go:281-283 validates chunk size even when chunking is disabled or when taking an incremental backup that may not use chunking. I’d validate chunk-size > 0 only when chunk-threshold > 0, reject negative thresholds explicitly, and fix the message to say must be > 0.
I agree with the compatibility caveat too: once chunking is enabled, those backups are not restorable by older Vitess versions because old restore code ignores Chunks and looks for whole-file objects. That should be called out in release notes summary.
|
@mattlord PR updated! |
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
90da58e to
26c261d
Compare
mattlord
left a comment
There was a problem hiding this comment.
Just one (yay!) note... In go/vt/mysqlctl/builtinbackupengine.go:1329-1359 we should preserve chunked-destination close failures across retries. restoreFileEntries can return both retryable per-chunk errors from bh.Error() and direct errors from closing the shared chunked destination file (:1419-1424, after return bh.Error() at :1507-1508). restoreFiles then retries only bh.GetFailedFiles() and returns success if that retry succeeds, which drops the first-pass close failure. A close error can be where delayed writeback/ENOSPC is reported, so retrying only one failed chunk does not prove the already-written chunks are safe. Please keep direct restore errors separate and return them after retry, or mark the whole affected file/chunks for retry. A regression test should combine one retryable chunk failure with a chunked destination close failure.
Thanks, @rvrangel !
|
@mattlord if we hit a close error (specially for ENOSPC but also for failed flushes to disk) it might be better to skip retries altogether as it is very likely the same issue will fail again. what do you think if on the deferred function (builtinbackupengine.go#L1419-L1426) we just wrapped the |
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
|
added the above fix and a test for it too |
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
mattlord
left a comment
There was a problem hiding this comment.
In 1go/vt/mysqlctl/builtinbackupengine.go:717I think that we should preserve first-pass backupWorkItems errors. The first backup pass still seems to discard thebackupWorkItems’ return value, but the retry path treats that same return as significant. If bh.EndBackup(ctx)returns a direct error that is not represented inbh.GetFailedFiles()`, this can continue into MANIFEST upload and report a successful backup. Please capture the first-pass error and return it when there are no per-file failures to retry, or preserve it across retry if needed. Unless I'm missing something?
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
| - `--builtinbackup-file-chunk-threshold` (default `0`, chunking disabled): files larger than this size in bytes are split into chunks during backup. | ||
| - `--builtinbackup-file-chunk-size` (default `1073741824` / 1 GiB): the target size in bytes for each chunk. | ||
|
|
||
| **Compatibility note:** Backups created with chunking enabled are **not restorable by older Vitess versions** that do not understand the `Chunks` field in the backup MANIFEST. Non-chunked backups (the default) remain fully compatible with older versions. |
There was a problem hiding this comment.
This is forward compatible though right? Taking as an example a vtbackup job running periodically, if vtbackup starts being configured to use chunks will it be able to restore the previous non-chunked backup?
There was a problem hiding this comment.
yes, that is correct. if the Chunks field in the MANIFEST is empty, it will use the same approach and functions as before.
There was a problem hiding this comment.
I added a new e2e test that takes a back without chunking and restores it on a replica with chunking enabled in 29ad57c
| // How many times we will retry file close operations. Note that a file operation that | ||
| // returns a vtrpc.Code_FAILED_PRECONDITION error is considered fatal and we will | ||
| // not retry. | ||
| maxFileCloseRetries = 20 |
There was a problem hiding this comment.
Any reason to make this a non-constant?
There was a problem hiding this comment.
this was done to avoid the exponential back-off on one of the tests by changing the retry value:
| workItems = append(workItems, restoreWorkItem{ | ||
| fe: fe, | ||
| chunk: &fe.Chunks[j], | ||
| chunkIndex: j, |
There was a problem hiding this comment.
We infer the chunk index via j, the position of the chunk in the slice of chunks (fe.Chunks). fes can be built during a retry in the function restoreFiles before being sent down to restoreFileEntries.
The retry-failed-files code path in restoreFiles builds a new fes using only failed files/chunks. It is not impossible that a chunk that was previously in the index e.g. 10 now becomes index e.g. 0. We would run into a similar scenario each time a subset of chunks failed and are being retried.
To keep the chunk index consistent, can we infer its index another way than by looking at the slice index?
There was a problem hiding this comment.
ah, good point! We actually don't use the index for restoring, just when we are logging, so we can drop it and usage the StorageName of the chunk instead
There was a problem hiding this comment.
I would like to see more E2E tests for this with these characteristics:
- Assertions on a deterministic amount of chunks and files
- Large amount of chunks
- With chunked and unchunked files
- With different storage handlers, like S3 using MinIO (e.g.
go/test/endtoend/backup/s3/s3_builtin_test.go) - Using vtbackup
- With failure scenarios
- Showing the forward compatibility of restoring an unchunked backup with chunks parameters
- Showing the failure scenario of what happens when a chunked backup is restored without the chunks parameters
Can you also provide a performance comparison between chunking and not-chunking. Mostly to get an idea of what the performance impact can be on a stable cluster and hardware.
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
|
@frouioui ack, let me take a look to see if I can cover most of that this week |
Description
This is the first PR as part of #20159
This PR adds chunked parallel backup/restore to the builtin backup engine. Files larger than a configurable threshold are split into independently-compressed chunks during backup, which can then be restored in parallel using writes at known offsets.
Changes:
--builtinbackup-file-chunk-threshold(default 0, disabled) and--builtinbackup-file-chunk-size(default 1GiB)Related Issue(s)
Checklist
Deployment Notes
AI Disclosure
PR created by me with support of Claude, fully tested by me before publishing on our own branch and tested with unit tests and e2e on the
mainbranch