Negative compression ratio when listing a large gzip file

Bug #1022287 reported by tommy
This bug affects 8 people
Affects        Status     Importance  Assigned to  Milestone
gzip (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Please note the uncompressed size and the ratio in the listing below; both appear to be wrong. The uncompressed size should actually be more than 200 GiB. This looks like an integer overflow.

tommy@tommy-ubuntu-1110:/media/PartBackup/sdb3$ gzip -l sdb3.ntfsclone.img.gz
         compressed uncompressed ratio uncompressed_name
       185808678561 4242664346 -4279.5% sdb3.ntfsclone.img
tommy@tommy-ubuntu-1110:/media/PartBackup/sdb3$ ls -l sdb3.ntfsclone.img.gz
-rw-rw-r-- 1 tommy tommy 185808678561 Jul 8 17:04 sdb3.ntfsclone.img.gz
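
The numbers above are consistent with the uncompressed size wrapping around a 32-bit counter. The following is a minimal sketch of that arithmetic, not gzip's actual code. It assumes the ratio is computed as (uncompressed - compressed) / uncompressed, which reproduces the figures shown, and that the stored size is the true size modulo 2^32. The function names are made up for illustration.

```python
MOD = 2 ** 32  # assumption: the size is kept in a 32-bit field


def listed_ratio(compressed, reported_uncompressed):
    """Ratio in percent, assuming (uncompressed - compressed) / uncompressed."""
    return 100.0 * (reported_uncompressed - compressed) / reported_uncompressed


def candidate_true_sizes(reported, at_least, count=3):
    """Sizes congruent to `reported` modulo 2**32 that are >= `at_least`."""
    # Smallest k with reported + k * MOD >= at_least (ceiling division).
    k = max(0, -(-(at_least - reported) // MOD))
    return [reported + (k + i) * MOD for i in range(count)]


compressed = 185808678561
reported = 4242664346

# Matches the -4279.5% printed by `gzip -l` in the listing above.
print(f"{listed_ratio(compressed, reported):.1f}%")

# Possible real sizes above 200 GiB; which one is correct cannot be
# determined from the 32-bit value alone.
for size in candidate_true_sizes(reported, 200 * 2 ** 30):
    print(size, f"({size / 2 ** 30:.2f} GiB)")
```

The reported value alone only narrows the real size down to one of these candidates, which is why the listing cannot be corrected without reading the whole file.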

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in gzip (Ubuntu):
status: New → Confirmed
Eric Druid (eric-druid+ubuntu) wrote :

To me this looks like an integer overflow in the uncompressed file size:

druid@#### ~ gzip -l dump.sql.gz
         compressed uncompressed ratio uncompressed_name
         2850064545 1792003237 -59.0% dump.sql
druid@#### ~ gzip --version
gzip 1.4
Copyright (C) 2007 Free Software Foundation, Inc.
Copyright (C) 1993 Jean-loup Gailly.
This is free software. You may redistribute copies of it under the terms of
the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
There is NO WARRANTY, to the extent permitted by law.

Written by Jean-loup Gailly.

Gert van den Berg (mohag1) wrote :

See page 7 of https://www.ietf.org/rfc/rfc1952.txt (ISIZE is the relevant field)

This is a limitation of the gzip file format: ISIZE is a 32-bit field that holds the uncompressed size modulo 2^32.

It is also documented in the gzip man page under "BUGS", with a (slow) workaround.

One way to fix it might be to define an "extra field" carrying a larger original file size. The extra fields go into the header, though, rather than the trailer, so the field cannot be populated correctly for data piped through gzip: the size is not yet known when the header is written. (This also seems to be the suggestion here: https://lists.gnu.org/archive/html/bug-gzip/2010-08/msg00009.html )
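
To make the ISIZE point concrete, here is a rough Python sketch that reads the 32-bit trailer field directly and, separately, obtains the real size by decompressing the whole stream and counting bytes (the same idea as the slow workaround mentioned above). The helper names `read_isize` and `true_uncompressed_size` are hypothetical, and the trailer read assumes a single-member file; for concatenated members the last four bytes only describe the final member.

```python
import gzip
import struct


def read_isize(path):
    """Return ISIZE: the last 4 bytes of the file, little-endian unsigned."""
    with open(path, "rb") as f:
        f.seek(-4, 2)  # 4 bytes before end of file
        return struct.unpack("<I", f.read(4))[0]


def true_uncompressed_size(path, chunk_size=1 << 20):
    """Count uncompressed bytes by streaming through the whole file (slow)."""
    total = 0
    with gzip.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

For a file under 4 GiB the two values agree; for a larger file one would expect `read_isize(path) == true_uncompressed_size(path) % 2**32`, at the cost of a full decompression pass.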
