decompresses invalid bz2 files

Bug #910057 reported by Mikolaj Izdebski
This bug affects 1 person
Affects  Status  Importance  Assigned to  Milestone
pbzip2   New     Undecided   Unassigned

Bug Description

pbzip2 version 1.1.7 successfully decompresses some invalid bz2 files if the input is coming from stdin.

$ ./pbzip2 -dcp4 --ignore-trailing-garbage=1 <tests/0797c0037ad38395907f0a1279395dd0.bz2 >/dev/null
pbzip2: *WARNING: Trailing garbage after EOF ignored!

$ echo $?
0

$ ./pbzip2 -dcp4 --ignore-trailing-garbage=1 tests/0797c0037ad38395907f0a1279395dd0.bz2 >/dev/null
pbzip2: *ERROR: Data integrity (CRC) error in data! Skipping...
Terminator thread: premature exit requested - quitting...

$ echo $?
1

$ bzip2 -dc <tests/0797c0037ad38395907f0a1279395dd0.bz2 >/dev/null

bzip2: Data integrity error when decompressing.
 Input file = (stdin), output file = (stdout)

It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.

$ echo $?
2

A list of 43 files that trigger the bug is attached. These files are part of "PROTOS Genome Test Suite c10-archive" and can be found at: https://www.ee.oulu.fi/research/ouspg/PROTOS_Test-Suite_c10-archive
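
To check the whole list in one go, the comparison above can be scripted. The following is only a rough sketch: it reuses the ./pbzip2 command line from this report, and the helper names (status_from_stdin, status_from_file, find_mismatches) are made up for illustration. It reports every file for which the stdin and file-argument invocations exit with different statuses.

```python
import subprocess

# Command line copied from the report; the ./pbzip2 path is an assumption
# about where the binary lives relative to the working directory.
PBZIP2 = ["./pbzip2", "-dcp4", "--ignore-trailing-garbage=1"]


def status_from_stdin(path):
    """Exit status when the archive is fed to pbzip2 on stdin."""
    with open(path, "rb") as archive:
        done = subprocess.run(PBZIP2, stdin=archive,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
    return done.returncode


def status_from_file(path):
    """Exit status when the archive is passed as a file argument."""
    done = subprocess.run(PBZIP2 + [path],
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    return done.returncode


def find_mismatches(paths, via_stdin=status_from_stdin, via_file=status_from_file):
    """Return (path, stdin_status, file_status) wherever the two modes disagree."""
    mismatches = []
    for path in paths:
        stdin_status = via_stdin(path)
        file_status = via_file(path)
        if stdin_status != file_status:
            mismatches.append((path, stdin_status, file_status))
    return mismatches
```

Calling find_mismatches() with the paths of the attached files should list the ones that reproduce the discrepancy. The two runner arguments are there only so the comparison logic can be exercised without a pbzip2 binary.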

Mikolaj Izdebski (zurgunt) wrote :
Yavor Nikolov (yavor-nikolov) wrote :

The problem is that pbzip2's current definition of "ignorable trailing garbage" differs slightly from the one bzip2 uses.

* bzip2 appears to inspect the trailing data: if it looks like the beginning of a bzip2 stream, this is considered a dangerous scenario and the whole operation fails with an error.
* pbzip2 currently does no such check: with --ignore-trailing-garbage=1 any trailing garbage is ignored, and whatever has been decompressed up to that point is considered OK.
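
The difference between the two policies can be sketched roughly as follows. This is not code from either tool, only an illustration. It assumes that "looks like the beginning of a bzip2 stream" means the trailing bytes start with the stream magic "BZh" followed by a block-size digit 1 to 9. The function names and the strict flag are hypothetical.

```python
import re

# A bzip2 stream begins with "BZh" followed by a block-size digit '1'..'9'.
_STREAM_MAGIC = re.compile(rb"BZh[1-9]")


def looks_like_stream_start(data: bytes) -> bool:
    """True if the bytes begin with a bzip2 stream header."""
    return _STREAM_MAGIC.match(data) is not None


def classify_trailing_bytes(trailing: bytes, strict: bool) -> str:
    """Decide what to do with bytes left over after the last good stream.

    strict=True models the bzip2-like policy described above (assumption);
    strict=False models the current pbzip2 policy with
    --ignore-trailing-garbage=1.
    """
    if not trailing:
        return "clean"
    if strict and looks_like_stream_start(trailing):
        # Leftover data that resembles a stream is not harmless garbage.
        return "error"
    return "ignored"
```

Under this sketch, trailing bytes such as b"BZh9..." would be reported as an error in strict mode but silently ignored otherwise, which matches the discrepancy seen in this report.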
