fix inconsistencies in original size computation, fixes #8898 (1.4-maint) #9003
|
Fix and test mostly written by Junie AI, some cleanups by me. |
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## 1.4-maint #9003 +/- ##
=============================================
- Coverage 80.60% 80.59% -0.01%
=============================================
Files 38 38
Lines 11250 11247 -3
Branches 1767 1767
=============================================
- Hits 9068 9065 -3
Misses 1611 1611
Partials 571 571 |
|
I finally did some simple tests with these changes and the results are more consistent than what I got before. I will run some bigger tests overnight.
What I also do not understand: what are these new metadata stats used for? Are they stored or reported at any time? If not, maybe they do not need to be generated at all. |
|
I did a few more tests for archive creation under Linux with this PR added:
- Original size exactly matches the size of the original files (if only regular files are used; symlinks are not counted. I have not tested other special cases).
- The compressed size always looks plausible.
- The deduplicated size still has some unexpected results:
Do I understand the code correctly that the sizes reported by borg info are "cached" values that are stored in the repo rather than recalculated, so that they should not change between create and info?
The empty repo without any archives reported 0 for all sizes. And here is an example for a bigger archive:
I did not test any operations other than archive creation; I would have to do that next. |
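For anyone who wants to repeat the original-size check above, here is a minimal sketch of how the expected value could be computed on the source tree. It assumes "original size" should equal the sum of the sizes of regular files, with symlinks and other special files skipped, as observed in the tests above. `regular_files_size` is a hypothetical helper for illustration, not part of borg, and hard-linked files are counted once per path here, which may or may not match what borg reports.

```python
import os
import stat


def regular_files_size(root):
    """Sum the sizes of regular files below root.

    Symlinks and other special files are skipped, because os.lstat()
    does not follow links and only S_ISREG entries are counted.
    Assumption: a hard-linked file is counted once per path.
    """
    total = 0
    # os.walk() does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total
```

The result can then be compared by hand with the "Original size" that borg create prints for an archive of the same directory.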
80cb794 to
0fa6f57
Compare
|
@MichaelDietzel I let Junie do a bit more work. First it added a bloody workaround, but on second try I guess it found the correct way. :-) borg now in general does not account for metadata chunks in "this archive" stats, neither in "create" nor in "info". |
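To make the idea concrete, here is a toy model of a "this archive" deduplicated size that ignores metadata chunks. This is only a sketch under my own assumptions, not borg's actual code: a chunk counts as unique to the archive when every reference to it in the repo comes from this archive, and the names `deduplicated_size`, `archive_chunks`, `repo_refcounts` and `metadata_ids` are made up for the example.

```python
from collections import Counter


def deduplicated_size(archive_chunks, repo_refcounts, metadata_ids=frozenset()):
    """Toy model: size of the chunks that only this archive references.

    archive_chunks: list of (chunk_id, size) pairs, repeats allowed
    repo_refcounts: dict chunk_id -> number of references in the whole repo
    metadata_ids:   chunk ids holding archive metadata, excluded from the sum
    """
    own_refs = Counter(chunk_id for chunk_id, _size in archive_chunks)
    sizes = dict(archive_chunks)  # one entry per distinct chunk id
    return sum(
        size
        for chunk_id, size in sizes.items()
        if chunk_id not in metadata_ids
        and repo_refcounts[chunk_id] == own_refs[chunk_id]
    )


# Hypothetical data: "a" is used twice by this archive only, "b" is shared
# with another archive, "m" is a metadata chunk.
chunks = [("a", 10), ("b", 20), ("a", 10), ("m", 5)]
refcounts = {"a": 2, "b": 2, "m": 1}
print(deduplicated_size(chunks, refcounts, metadata_ids={"m"}))
```

In this model the same function gives the same answer no matter which command calls it, which is the property wanted for "create" and "info".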
…"This archive" deduplicated size (refs borgbackup#9003)
…cated size" stats by excluding metadata chunks fixes issue found in borgbackup#9003 comments.
0fa6f57 to
b4efc1c
Compare
|
@MichaelDietzel guess this is as good as it gets? there is still some discrepancy between "this archive" and "all archives" (which is rather "whole repo", computed in a rather different way). |
|
Thanks, I will take a look. |
|
@MichaelDietzel Thanks for helping! BTW, I am preparing a borg 1.4.2 release, would be good if this PR could go into that. |
d3064e0 to
094b6ba
Compare
add tests ensuring:
- borg info and create report same "This archive" deduplicated size
- before/after borg recreate it reports same "This archive" deduplicated size
- this/all archive(s) stats are same if 1 archive is in repo
- all archives stats is same for borg create and borg info
note that some stats differences are expected.
borgbackup#9003 do not account archive metadata, only file contents.
|
Tests rewritten to use borg's json output rather than regex-parsing the console output. |
bdb1c66 to
e1c18f7
Compare
borgbackup#9003 do not account archive metadata, only file contents.
e1c18f7 to
9b0cb70
Compare
|
Thank you for all your hard work, and sorry for taking so long to find time to review this. Compared to the version I tested previously, the statistics printed by create and by info now match each other! I just think I do not fully understand the reasoning behind the
|
|
@MichaelDietzel I addressed all your feedback in the recent commit, thanks for finding these issues! |
69a7336 to
9e80850
Compare
add tests ensuring:
- borg info and create report same "This archive" deduplicated size
- borg info and create report same "All archives" deduplicated size
- before/after borg recreate it reports same "This archive" deduplicated size
note that some stats differences are expected, because the repo-level
deduplication ("All archives") is computed in a different way than "This archive".
borgbackup#9003 do not account archive metadata, only file contents.
9e80850 to
ce27418
Compare
|
Collapsed some commits, improved commit comments. |