deb package archive should use "lzma -1" or "xz -1e" compression by default

Bug #1037193 reported by Jérôme on 2012-08-15
This bug affects 1 person
Affects          Status    Importance   Assigned to   Milestone
dpkg             New       Undecided    Unassigned
dpkg (Ubuntu)    Opinion   Wishlist     Unassigned

Bug Description

If I understand correctly, the compression method must keep the memory needed for decompression low so that packages can be installed on any system. That may be one reason gzip was chosen (in addition to its decompression speed).

However, would it be possible to use the "--fast" or "-1" option of the lzma or xz compression programs? At such a low compression level, the memory required for decompression is about the same as the memory required by gunzip, yet the compression ratio is better.

Below are a few details of the comparison:
----------------------------------------------------------
j@j-lt:~/tmp$ 7z x /var/cache/apt/archives/linux-libc-dev_2.6.32-42.95_i386.deb

7-Zip 9.04 beta Copyright (c) 1999-2009 Igor Pavlov 2009-05-30
p7zip Version 9.04 (locale=fr_FR.UTF-8,Utf16=on,HugeFiles=on,1 CPU)

Processing archive: /var/cache/apt/archives/linux-libc-dev_2.6.32-42.95_i386.deb

Extracting control.tar.gz
Extracting data.tar.gz

Everything is Ok

Files: 2
Size: 856328
Compressed: 856520
j@j-lt:~/tmp$ gunzip -c data.tar.gz > data.tar
j@j-lt:~/tmp$ gzip -c9 data.tar > data.tar.gz-9
j@j-lt:~/tmp$ lzma -c1 data.tar > data.tar.lzma-1
j@j-lt:~/tmp$ xz -c1e data.tar > data.tar.xz-1
j@j-lt:~/tmp$ ls -lh
total 6,0M
-rw-r--r-- 1 j j 18K 2012-07-25 19:36 control.tar.gz
-rw-r--r-- 1 j j 3,0M 2012-08-15 17:46 data.tar
-rw-r--r-- 1 j j 820K 2012-07-25 19:36 data.tar.gz
-rw-r--r-- 1 j j 820K 2012-08-15 17:46 data.tar.gz-9
-rw-r--r-- 1 j j 765K 2012-08-15 17:47 data.tar.lzma-1
-rw-r--r-- 1 j j 718K 2012-08-15 17:52 data.tar.xz-1
j@j-lt:~/tmp$
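
The same kind of comparison can be reproduced without the command-line tools. What follows is only a minimal sketch using Python's standard gzip and lzma modules. The helper name compare_sizes and the sample payload are assumptions made for illustration, and the sizes it reports will differ from the listing above, which was measured on a real data.tar.

```python
import gzip
import lzma


def compare_sizes(data: bytes) -> dict:
    """Return the compressed size in bytes per method (hypothetical helper)."""
    return {
        "raw": len(data),
        # Roughly what `gzip -9` does.
        "gzip-9": len(gzip.compress(data, compresslevel=9)),
        # Legacy .lzma container at preset 1, similar to `lzma -1`.
        "lzma-1": len(lzma.compress(data, format=lzma.FORMAT_ALONE, preset=1)),
        # .xz container at preset 1 with the "extreme" flag, similar to `xz -1e`.
        "xz-1e": len(
            lzma.compress(
                data, format=lzma.FORMAT_XZ, preset=1 | lzma.PRESET_EXTREME
            )
        ),
    }


if __name__ == "__main__":
    # Stand-in payload (assumption); read a real data.tar for meaningful numbers.
    sample = b"".join(b"#define SYMBOL_%d %d\n" % (i, i * 7) for i in range(50000))
    for name, size in compare_sizes(sample).items():
        print(f"{name:8s} {size:10d} bytes")
```

To compare against a real package, replace the stand-in payload with the bytes of an extracted data.tar.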

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-libc-dev 2.6.32-42.95
Regression: No
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-41.94-generic 2.6.32.59+drm33.24
Uname: Linux 2.6.32-41-generic i686
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: i386
ArecordDevices: Error: [Errno 2] No such file or directory
BootDmesg: (Nothing has been logged yet.)
Date: Wed Aug 15 18:07:22 2012
Dependencies:

Lsusb:
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 001 Device 002: ID 046d:c016 Logitech, Inc. M-UV69a/HP M-UV96 Optical Wheel Mouse
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Sony Corporation PCG-FX301(FR)
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-41-generic root=UUID=cc325b47-77d9-4b57-98af-0db1393969c6 ro quiet splash
ProcEnviron:
 LANG=fr_FR.UTF-8
 SHELL=/bin/dash
SourcePackage: linux
dmi.bios.date: 07/04/2001
dmi.bios.vendor: Sony Corporation
dmi.bios.version: R0104K5
dmi.board.name: QII-Project
dmi.board.vendor: Sony Corporation
dmi.board.version: 1A
dmi.chassis.asset.tag: 962C8394A050580000000000
dmi.chassis.type: 10
dmi.chassis.vendor: Sony Corporation
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnSonyCorporation:bvrR0104K5:bd07/04/2001:svnSonyCorporation:pnPCG-FX301(FR):pvr01:rvnSonyCorporation:rnQII-Project:rvr1A:cvnSonyCorporation:ct10:cvrN/A:
dmi.product.name: PCG-FX301(FR)
dmi.product.version: 01
dmi.sys.vendor: Sony Corporation

Jérôme (jerome-bouat) wrote :
Brad Figg (brad-figg) on 2012-08-15
Changed in linux (Ubuntu):
status: New → Confirmed
affects: linux (Ubuntu) → dpkg (Ubuntu)
Changed in dpkg (Ubuntu):
status: Confirmed → New
Jérôme (jerome-bouat) wrote :

Of course, deb packages for applications that require more memory at runtime could use a higher compression level.

summary: - deb package archive should use "lzma -1" or "xz -1e" compression
+ deb package archive should use "lzma -1" or "xz -1e" compression by
+ default
Jérôme (jerome-bouat) wrote :

Maybe the binary package builder could read, from the source package, an estimate of the memory the application requires at runtime, and use it to choose the right compression level automatically.

BenHagan (smooth-texan) wrote :

This is a feature request, not a bug that needs to be forwarded upstream. Marking as such.

Daniel Manrique (roadmr) on 2012-08-16
Changed in dpkg (Ubuntu):
importance: Undecided → Wishlist
BenHagan (smooth-texan) on 2012-08-16
Changed in dpkg (Ubuntu):
status: New → Incomplete
status: Incomplete → Confirmed
Dimitri John Ledkov (xnox) wrote :

This is a significant, high-impact change to the archive. A blueprint about this is linked to this bug, and the topic is currently planned for discussion at UDS-R.

Changed in dpkg (Ubuntu):
status: Confirmed → Opinion
Jérôme (jerome-bouat) wrote :

You can also perform the same comparison between bzip2 and lzma.

According to the bzip2 man page, decompressing a bzip2 archive that was compressed at level "-9" requires roughly 3700 kB of memory.

By contrast, the xz man page states that decompressing an xz archive that was compressed at level "-4" requires roughly 3 MB of memory.
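
The decompression memory budget can also be checked programmatically. The sketch below relies on the memlimit parameter of Python's lzma.LZMADecompressor; the helper name fits_in_memory, the payload, and the 4 MiB budget are assumptions chosen for illustration. The memory actually needed per preset depends on the xz/liblzma version, so no specific outcome is claimed here.

```python
import lzma

MIB = 1024 * 1024


def fits_in_memory(xz_bytes: bytes, limit_bytes: int) -> bool:
    """Return True if the .xz stream decodes within limit_bytes (hypothetical helper).

    Note: LZMAError is also raised for corrupt input, so False means
    "could not be decoded under this limit", whatever the cause.
    """
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ, memlimit=limit_bytes)
    try:
        decoder.decompress(xz_bytes)
    except lzma.LZMAError:
        return False
    return True


if __name__ == "__main__":
    payload = b"example payload\n" * 4096  # stand-in data (assumption)
    for preset in (1, 4, 6):
        packed = lzma.compress(payload, format=lzma.FORMAT_XZ, preset=preset)
        print(f"preset {preset}: fits in 4 MiB -> {fits_in_memory(packed, 4 * MIB)}")
```

The limit applies to the decoder's memory (dominated by the dictionary size recorded in the stream), not to the size of the decompressed data.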

In the example below, compared with the bzip2 compression method, the lzma method:
- saves up to 18% of archive size
- requires less memory for decompression.

Perhaps deb archives could be packaged according to the following rules:
- "xz -1e" by default
- "xz -4e" for archives which were previously compressed with bzip2
- "xz -9e" for archives whose application requires a lot of memory at runtime
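
The rules above could be sketched as a small selection function. This is purely illustrative: the function name, its parameters, and the order in which the rules are checked (the runtime-memory rule taking priority) are assumptions, not anything dpkg implements.

```python
def choose_xz_options(previous_compression: str, memory_hungry: bool = False) -> str:
    """Pick an xz level following the proposed rules (hypothetical helper).

    previous_compression: how the package's data.tar was compressed so far,
                          e.g. "gzip" or "bzip2".
    memory_hungry:        True if the packaged application needs a lot of
                          memory at runtime anyway.
    """
    if memory_hungry:
        return "-9e"
    if previous_compression == "bzip2":
        return "-4e"
    return "-1e"  # default for everything else


if __name__ == "__main__":
    print(choose_xz_options("gzip"))
    print(choose_xz_options("bzip2"))
    print(choose_xz_options("gzip", memory_hungry=True))
```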

Below are the compression results for bzip2 and lzma:
-----------------------------------------------------------------
j@dt:~/tmp$ 7z x /var/cache/apt/archives/linux-image-2.6.32-42-generic_2.6.32-42.95_amd64.deb

7-Zip 9.04 beta Copyright (c) 1999-2009 Igor Pavlov 2009-05-30
p7zip Version 9.04 (locale=fr_FR.UTF-8,Utf16=on,HugeFiles=on,2 CPUs)

Processing archive: /var/cache/apt/archives/linux-image-2.6.32-42-generic_2.6.32-42.95_amd64.deb

Extracting control.tar.gz
Extracting data.tar.bz2

Everything is Ok

Files: 2
Size: 31896829
Compressed: 31897022
j@dt:~/tmp$ bunzip2 -c data.tar.bz2 > data.tar
j@dt:~/tmp$ bzip2 -c9 data.tar > data.tar.bz2-9
j@dt:~/tmp$ lzma -c4 data.tar > data.tar.lzma-4
j@dt:~/tmp$ xz -c4e data.tar > data.tar.xz-4e
j@dt:~/tmp$ xz -c1e data.tar > data.tar.xz-1e
j@dt:~/tmp$ lzma -c1 data.tar > data.tar.lzma-1
j@dt:~/tmp$ ls -lk
total 289852
-rw-r--r-- 1 j j 91 2012-07-25 18:44 control.tar.gz
-rw-r--r-- 1 j j 117370 2012-08-19 14:52 data.tar
-rw-r--r-- 1 j j 31059 2012-07-25 18:44 data.tar.bz2
-rw-r--r-- 1 j j 31059 2012-08-19 15:02 data.tar.bz2-9
-rw-r--r-- 1 j j 29946 2012-08-19 15:28 data.tar.lzma-1
-rw-r--r-- 1 j j 26193 2012-08-19 15:10 data.tar.lzma-4
-rw-r--r-- 1 j j 28509 2012-08-19 15:27 data.tar.xz-1e
-rw-r--r-- 1 j j 25605 2012-08-19 15:14 data.tar.xz-4e
j@dt:~/tmp$
-----------------------------------------------------------------
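
The savings relative to bzip2 can be worked out from the listing above. The short sketch below only redoes that arithmetic; the sizes are copied from the `ls -lk` output (1K units), and the names SIZES_K and savings_vs are made up for illustration.

```python
# Sizes in 1K units, copied from the `ls -lk` listing above.
SIZES_K = {
    "data.tar.bz2-9": 31059,
    "data.tar.lzma-1": 29946,
    "data.tar.lzma-4": 26193,
    "data.tar.xz-1e": 28509,
    "data.tar.xz-4e": 25605,
}


def savings_vs(baseline: str, sizes: dict) -> dict:
    """Return the percentage saved by each file relative to the baseline file."""
    base = sizes[baseline]
    return {
        name: 100.0 * (base - size) / base
        for name, size in sizes.items()
        if name != baseline
    }


SAVINGS = savings_vs("data.tar.bz2-9", SIZES_K)

if __name__ == "__main__":
    for name, percent in sorted(SAVINGS.items(), key=lambda item: item[1]):
        print(f"{name:16s} {percent:5.1f}% smaller than bzip2 -9")
```

For "xz -4e" this gives (31059 - 25605) / 31059, about 17.6%, which is consistent with the "up to 18%" figure quoted earlier.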

Dave Vasilevsky (djvasi) wrote :

If compression time is a concern, it's possible to do multi-threaded xz compression. See my project, pixz (https://github.com/vasi/pixz); of course, it wouldn't be terribly difficult for someone to implement this specifically in dpkg, either.

Jérôme (jerome-bouat) wrote :

I don't think compression time is a concern when building a deb archive. Moreover, if you want to use several cores, just build several archives in parallel.
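
As a rough illustration of building several archives in parallel, the sketch below compresses independent payloads concurrently, each with a single-threaded xz encoder. The function names and payloads are assumptions; it uses a thread pool on the understanding that CPython's lzma module releases the GIL while compressing, and a process pool would be the alternative if that does not hold in a given environment.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def compress_one(payload: bytes) -> bytes:
    """Compress one archive payload, roughly like `xz -1e` (hypothetical helper)."""
    return lzma.compress(
        payload, format=lzma.FORMAT_XZ, preset=1 | lzma.PRESET_EXTREME
    )


def compress_many(payloads: list, workers: int = 4) -> list:
    """Compress several independent payloads concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_one, payloads))


if __name__ == "__main__":
    # Stand-in payloads (assumption); in practice each would be one data.tar.
    archives = [bytes([65 + i]) * 200000 for i in range(4)]
    for index, packed in enumerate(compress_many(archives)):
        print(f"archive {index}: {len(packed)} bytes")
```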
