PXB-3757 Document --check-tables feature in 8.4.0-6 #475
alina-derkach-oaza wants to merge 3 commits into
new file: docs/innodb-btree-check.md
80916db to 4082a04
modified: docs/innodb-btree-check.md modified: docs/xtrabackup-option-reference.md
4082a04 to 95af2d0
| * Runtime depends on the number of tablespaces and indexes
| * Validation does not replace logical consistency checks such as `CHECK TABLE`
Add a table that compares physical (--check-tables) and logical (CHECK TABLE) checks, highlighting what each detects and what each misses.
Our --check-tables and the server's CHECK TABLE are the same 😄 Essentially, both call the same InnoDB function (btr_validate_index).
We should also add a link to the How to section.
[Side topic] Ideally, I would love to see all workflows in the How to section, i.e. duplicating some links. We have already covered a good number of how-tos. The list in my mind so far (including what we have right now): How to
| Structural corruption that can pass checksum validation includes:
| * Broken sibling page links
| * Incorrect `PAGE_INDEX_ID` assignments
| * Missing or misplaced minimum-record flags
| * Invalid parent-to-child page references
| * Shared external LOB (large object) pages
| * All-zero pages with valid checksums
OK, this section explains what structural integrity means. I guess it should go before the "How --check-tables works" section.
| ## Why checksum validation is not enough
| Percona XtraBackup verifies page checksums during `--backup`. Checksum validation detects physical page corruption, including:
| * Torn pages
| * Storage bit rot
| * Corrupted transfers
| * Filesystem-level damage
| Checksum validation confirms page integrity at the byte level. B-tree structure validation requires additional checks across related pages.
Storage bit-rot is an AI word 😄
This is a bit of introduction on what xtrabackup does so far without the --check-tables feature:
- Currently, the backup ensures that every page it copies is valid by verifying the checksum (that is, the checksum computed over the data part must equal the value stored in the page header). InnoDB uses crc32 checksums.
- If the server and xtrabackup operate on the same page, xtrabackup recopies the page until the checksum matches. This way the backup never copies a page with an invalid checksum from the server. [There is a certain number of retries, after which it will fail the backup. This detail is not very important for now.]
- A successful backup always ensures checksum-correct backups.
Now we introduce why "checksum-correct backups" are not sufficient:
- Disk corruption / file system corruption: although the data is checksum-correct when the backup is taken, corruption can happen later, after the backup is created. Prepare can detect such corruption only if the page is touched by the redo log that xtrabackup copied. This is why it is good practice to run CHECK TABLE after restore. This ensures that all pages are checksum-verified and that the structural integrity is also checked.
- Pages can be checksum-correct but structurally wrong. This you mentioned below.
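To make the "verify, then recopy on mismatch" behaviour from this comment concrete, here is a minimal Python sketch. It is a toy model only: the 4-byte checksum header, the use of `zlib.crc32`, the helper names (`make_page`, `page_is_valid`, `copy_page_with_retry`), and the retry count are all assumptions for illustration. InnoDB's real page layout and checksum computation differ.

```python
import struct
import zlib

PAGE_SIZE = 16384   # InnoDB's default page size
HEADER_SIZE = 4     # toy layout: checksum stored in the first 4 bytes (assumption)


def make_page(payload: bytes) -> bytes:
    """Build a toy page: stored checksum followed by the zero-padded data part."""
    body = payload.ljust(PAGE_SIZE - HEADER_SIZE, b"\0")
    return struct.pack(">I", zlib.crc32(body)) + body


def page_is_valid(page: bytes) -> bool:
    """Recompute the checksum over the data part and compare it with the header."""
    (stored,) = struct.unpack(">I", page[:HEADER_SIZE])
    return stored == zlib.crc32(page[HEADER_SIZE:])


def copy_page_with_retry(read_page, max_retries: int = 10) -> bytes:
    """Re-read a page until its checksum matches; give up after max_retries."""
    for _ in range(max_retries):
        page = read_page()
        if page_is_valid(page):
            return page
    raise RuntimeError("page checksum mismatch after retries")
```

Note that a check like `page_is_valid` looks at one page in isolation, which is why it cannot notice a broken link between two pages that are each checksum-correct.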
| * All-zero pages with valid checksums
| Applying the redo log during `--prepare` copies the existing structural corruption from the source server into the prepared backup. As a result, backups can remain physically consistent while containing logically corrupted indexes.
Well, this can happen during the backup. We don't really verify structural integrity during backup, so the corruption can be introduced at backup time.
It can also happen during the prepare phase (if a redo log entry breaks the structure of an index). This is very rare and indicates a server bug. Maybe we can drop the mention of corruption caused by applying the redo log (prepare phase): it is rare and complex for users to understand.
| A failed validation operation returns a non-zero exit code and logs the following message:
| ```text
| Table check failed. The backup may be corrupted.
Individual table names are reported. At the end we have:
2026-05-15T13:42:24.670469+01:00 0 [ERROR] [MY-011825] [Xtrabackup] Table check failed. The backup may be corrupted.
| A failed validation operation returns a non-zero exit code and logs the following message:
| ```text
Sample log when xtrabackup is processing tables.
2026-05-15T13:42:23.691691+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: test/t1
2026-05-15T13:42:23.697349+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: test/t_lob
2026-05-15T13:42:23.782555+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/dd_properties
2026-05-15T13:42:23.782835+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/innodb_dynamic_metadata
2026-05-15T13:42:23.782992+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/innodb_table_stats
2026-05-15T13:42:23.783112+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/innodb_index_stats
2026-05-15T13:42:23.783276+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/innodb_ddl_log
2026-05-15T13:42:23.783446+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/catalogs
2026-05-15T13:42:23.783767+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/character_sets
2026-05-15T13:42:23.784568+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/check_constraints
2026-05-15T13:42:23.785049+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/collations
2026-05-15T13:42:23.788787+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/column_statistics
2026-05-15T13:42:23.789259+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/column_type_elements
2026-05-15T13:42:23.793267+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/columns
2026-05-15T13:42:23.927658+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/events
2026-05-15T13:42:23.928928+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/foreign_key_column_usage
2026-05-15T13:42:23.930218+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/foreign_keys
2026-05-15T13:42:23.932221+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/index_column_usage
| A successful validation operation ends with:
| ```text
| All table checks passed
sample log:
2026-05-15T15:41:57.808327+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/replication_group_member_actions
2026-05-15T15:41:57.808630+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/replication_group_configuration_version
2026-05-15T15:41:57.808810+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/server_cost
2026-05-15T15:41:57.808998+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/engine_cost
2026-05-15T15:41:57.809190+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/proxies_priv
2026-05-15T15:41:57.809511+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/ndb_binlog_index
2026-05-15T15:41:58.051499+01:00 0 [Note] [MY-011825] [Xtrabackup] All table checks passed.
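For readers who want to post-process this output, the following Python sketch shows one way to summarize a log in the format of the samples above. It assumes only the `Checking: <schema>/<table>` lines and the two final messages quoted in this thread; the function name `summarize_check_log` is hypothetical and not part of xtrabackup.

```python
import re

# Matches per-table progress lines such as "... [Xtrabackup] Checking: test/t1"
CHECKING = re.compile(r"\[Xtrabackup\] Checking: (\S+)")


def summarize_check_log(log_text: str):
    """Return (tables_checked, status) for an xtrabackup --check-tables log."""
    tables = CHECKING.findall(log_text)
    if "Table check failed" in log_text:
        status = "failed"
    elif "All table checks passed" in log_text:
        status = "passed"
    else:
        status = "unknown"  # neither final message was found
    return tables, status


sample = (
    "2026-05-15T15:41:57.809190+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/proxies_priv\n"
    "2026-05-15T15:41:57.809511+01:00 2 [Note] [MY-011825] [Xtrabackup] Checking: mysql/ndb_binlog_index\n"
    "2026-05-15T15:41:58.051499+01:00 0 [Note] [MY-011825] [Xtrabackup] All table checks passed.\n"
)
tables, status = summarize_check_log(sample)
print(len(tables), status)
```

Because a failed validation also returns a non-zero exit code, a script would normally rely on the exit code first and use log parsing only for reporting.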
| Applying the redo log during `--prepare` copies the existing structural corruption from the source server into the prepared backup. As a result, backups can remain physically consistent while containing logically corrupted indexes.
| ## How `--check-tables` works
Ah, sorry, you wrote it here. I mentioned the "How check-tables works" section in another comment. Please see if we can integrate something from that comment here.
Either way, we have to decide the order of sections/subsections.
Pre work:
1. How xtrabackup ensures checksum-correct backups (happens during --backup)
2. What page-checksum-correct backups guarantee
3. What page-checksum-correct backups do NOT guarantee
a. Structural corruption etc. What structural correctness means here
Current work:
- What this feature is
- How to use this feature
a. Use during prepare
b. Cannot be used during --backup
- How this feature works
| * Validation increases CPU and I/O usage on the backup host
| * Runtime depends on the number of tablespaces and indexes
I think we should recommend using --check-tables on the final prepare only, because verifying for corruption after every incremental could be slow. --check-tables verifies all the tables every time the option is used.
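The recommendation in this comment could be illustrated with a small Python helper that builds the prepare commands for a base backup plus incrementals and adds `--check-tables` only to the last step. This is a sketch under assumptions: the helper name `prepare_commands` and all paths are hypothetical, and the way `--check-tables` combines with `--incremental-dir` should be confirmed against the option reference before it goes into the docs.

```python
def prepare_commands(base_dir, incremental_dirs):
    """Build xtrabackup --prepare commands; only the final one gets --check-tables."""
    steps = [None] + list(incremental_dirs)  # None stands for the base backup itself
    commands = []
    for position, inc_dir in enumerate(steps):
        is_last = position == len(steps) - 1
        cmd = ["xtrabackup", "--prepare", f"--target-dir={base_dir}"]
        if inc_dir is not None:
            cmd.append(f"--incremental-dir={inc_dir}")
        if is_last:
            # Validate once, after all changes have been applied.
            cmd.append("--check-tables")
        else:
            # Intermediate steps only apply the redo log.
            cmd.append("--apply-log-only")
        commands.append(cmd)
    return commands


# Hypothetical paths, for illustration only.
for cmd in prepare_commands("/data/backups/base",
                            ["/data/backups/inc1", "/data/backups/inc2"]):
    print(" ".join(cmd))
```

Running validation once at the end avoids re-checking every table after each incremental, which is the cost this comment is concerned about.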
No description provided.