potential silent data corruption with dump-0.4b46-6 on 16TB+ filesystems

Bug #1980392 reported by Greg Oster
This bug affects 1 person
Affects        Status  Importance  Assigned to  Milestone
dump (Ubuntu)  New     Undecided   Unassigned

Bug Description

On an ext4 filesystem that is 16TB+ in size and that holds 16TB+ of data, dump will claim that it has successfully dumped the data when, in fact, it has not. The primary issue occurs on files whose logical block addresses are larger than 32 bits. E.g. if dumping block 4294968320 (2**32+1024), the block actually dumped will be 1024, not 4294968320, and the operator is not notified of the substitution. There is also no indication of the issue on restore, unless a 'diff' (or some other validation method) is applied to the original and restored files.
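
The arithmetic behind the example above can be sketched in a few lines of Python. This is an illustration only, not code from dump: it assumes the block number is narrowed to an unsigned 32-bit value somewhere along the way, and it assumes 4 KiB filesystem blocks (a common ext4 configuration), which is what would put the 32-bit boundary at 16 TiB. The helper name `truncate_to_u32` is hypothetical.

```python
# Illustrative sketch only; not taken from the dump source.
U32_MASK = 0xFFFFFFFF  # keeps only the low 32 bits of a value


def truncate_to_u32(block_number: int) -> int:
    """Model storing a wider block number in an unsigned 32-bit field."""
    return block_number & U32_MASK


BLOCK_SIZE = 4096  # assumption: 4 KiB ext4 blocks

# First block number that no longer fits in 32 bits, and its byte offset.
first_bad_block = 2**32
boundary_bytes = first_bad_block * BLOCK_SIZE

wanted = 2**32 + 1024             # 4294968320, the block from the report
dumped = truncate_to_u32(wanted)  # what a 32-bit field would hold instead

print(boundary_bytes // 2**40, "TiB boundary")
print("wanted block", wanted, "-> dumped block", dumped)
```

Under these assumptions, any block below the boundary passes through unchanged, which would explain why smaller filesystems are not affected.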

The upstream report (with patches) can be found here:
https://sourceforge.net/p/dump/bugs/174/

There is a secondary issue: a file that is split across tapes will be corrupted if the 'c_firstrec' block ID overflows. Patches are still needed for this issue.
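
For readers unfamiliar with this kind of overflow, the sketch below models what happens when a growing record counter is stored in a fixed-width field. It is not taken from dump; it assumes, purely for illustration, that the field is a signed 32-bit integer, and the helper name `wrap_int32` is hypothetical. The actual width and signedness of 'c_firstrec' should be checked against the dump headers.

```python
# Illustrative sketch only; the field width and signedness are assumptions.
def wrap_int32(value: int) -> int:
    """Model assigning a wide value to a signed 32-bit field (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


# A counter that has just passed the largest value a signed 32-bit field holds.
records_written = 2**31 + 5
stored = wrap_int32(records_written)

print("records written:", records_written)
print("value stored in a 32-bit field:", stored)
```

A reader that trusts the stored value would then look for the continuation of the file at the wrong position on the next tape, which is consistent with the corruption described above.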

The original corruption was observed on Ubuntu 20.04.4 LTS amd64 with dump-0.4b46-6, and the issues were verified to be present in the upstream dump-0.4b47 as well. While it should be possible to create a sparse ext4 filesystem that would exhibit the problem, I've been unable to create one, and have been relying on actual data to diagnose the issue and test the patches.
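
Since the corruption only shows up when the original and restored files are compared, a check along the following lines could be run after a test restore. This is a minimal sketch, not part of dump or restore: the function names are hypothetical, it compares SHA-256 digests of regular files only, and it ignores metadata such as ownership and timestamps.

```python
import hashlib
import os


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def differing_files(original_root: str, restored_root: str) -> list[str]:
    """List relative paths that are missing or differ in the restored tree."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(original_root):
        for name in filenames:
            original = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are compared.
            if os.path.islink(original) or not os.path.isfile(original):
                continue
            rel = os.path.relpath(original, original_root)
            restored = os.path.join(restored_root, rel)
            if not os.path.isfile(restored):
                bad.append(rel)
            elif file_digest(original) != file_digest(restored):
                bad.append(rel)
    return sorted(bad)
```

Calling `differing_files("/data", "/mnt/restore-test")` (both paths are placeholders) returns an empty list when every regular file was restored intact. On a 16TB+ filesystem this reads all data twice, so it is slow, but it does not depend on dump or restore reporting the problem themselves.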

Let me know if any other info is needed.

Thanks.

Later...

Greg Oster
