rpm doesn't handle sparse files efficiently

Bug #638632 reported by Jeff Johnson on 2010-09-15
Status: Won't Fix

Bug Description


Some ELF programs/shared libraries are sparse.
E.g. on x86-64 if -Wl,-z,relro is used, most of the shared libraries
have around 1MB of zeros in the middle.
$ rpm -qf --qf '%{name}-%{version}-%{release}.%{arch}\n' /usr/lib64/gconv/IBM904.so
$ cp -a --sparse=always /usr/lib64/gconv{,.new}
$ du -sk /usr/lib64/gconv{,.new}
197972 /usr/lib64/gconv
6368 /usr/lib64/gconv.new

Either rpm could do what e.g. cpio --sparse does when unpacking,
or it could special-case just those ELF files that have a sufficiently
big gap between PT_LOAD segments.
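
For illustration, the first option (writing holes instead of zero blocks while unpacking) could look roughly like the sketch below. This is not rpm's or cpio's code; the name copy_sparse and the 4096-byte block size are assumptions made for the example.

```python
import os


def copy_sparse(src_path, dst_path, block_size=4096):
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    Hypothetical helper for illustration; block_size is an assumed value.
    Returns the apparent size of the destination file.
    """
    zero_block = bytes(block_size)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(block_size)
            if not chunk:
                break
            if chunk == zero_block[:len(chunk)]:
                # Skip ahead without writing, which leaves a hole on
                # filesystems that support sparse files.
                dst.seek(len(chunk), os.SEEK_CUR)
            else:
                dst.write(chunk)
        # If the input ended in zeros, nothing was written at the tail,
        # so set the final length explicitly.
        dst.truncate(dst.tell())
    return os.stat(dst_path).st_size
```

The copy has the same contents and apparent size as the original. Whether any disk blocks are actually saved depends on the filesystem.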

rpm decompresses into a mmap'd buffer, so sparse files
are filled in by zlib I believe.

Possibly blocks of zeros can be eliminated after installing.

Special handling for PT_LOAD and elf within cpio unpacking
is almost certainly not the right approach.

> Special handling for PT_LOAD and elf within cpio unpacking
> is almost certainly not the right approach.

If you don't want to do this, you have to record somewhere in the
.rpm file which parts of the original files were sparsely allocated.
The problem is that, in general, you cannot just force every file to
be sparse. There are noticeable differences: the file layout will be
non-sequential in some cases (-> performance), or there may be no disk
space left to allocate when data is later written into the sparse
area.
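
As a rough sketch of what "recording which parts were sparsely allocated" could mean at package build time, the data extents of a file can be enumerated with lseek(2) and SEEK_DATA/SEEK_HOLE. The helper name data_extents is made up for this example, and it assumes a platform where Python exposes os.SEEK_DATA and os.SEEK_HOLE (e.g. Linux). On filesystems without hole support the whole file is reported as one extent.

```python
import errno
import os


def data_extents(path):
    """Return a list of (start, end) byte ranges that contain data.

    Everything outside these ranges (up to the file size) is a hole.
    Hypothetical helper; relies on SEEK_DATA/SEEK_HOLE support.
    """
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # only a trailing hole remains
                raise
            # Every file has an implicit hole at EOF, so this always succeeds.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

An unpacker that had such a list could recreate exactly the original holes rather than guessing from runs of zero bytes.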

On the other hand, these issues will never pop up with executables.
So it makes sense to treat them specially if you don't want to do it
100% right.

OK, I know how to implement sparse file handling in rpm for PT_LOAD
segment gaps.
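
To make "PT_LOAD segment gaps" concrete, here is a minimal sketch that lists the file-offset gaps between consecutive PT_LOAD segments. It is not the rpm implementation. It assumes a little-endian ELF64 image (as on x86-64), and the name pt_load_gaps and the min_gap threshold are invented for the example. A gap is only a candidate hole; a real implementation would still need to confirm that the bytes in it are zero.

```python
import struct

PT_LOAD = 1
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header, 56 bytes


def pt_load_gaps(data, min_gap=4096):
    """Return (start, end) file-offset gaps between PT_LOAD segments.

    `data` holds at least the ELF header and program header table.
    Only gaps of at least `min_gap` bytes (an assumed threshold) are listed.
    """
    if len(data) < EHDR.size or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if data[4] != 2 or data[5] != 1:  # ELFCLASS64, ELFDATA2LSB
        raise ValueError("only little-endian ELF64 is handled here")
    fields = EHDR.unpack_from(data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]

    segments = []
    for i in range(e_phnum):
        p_type, _, p_offset, _, _, p_filesz, _, _ = PHDR.unpack_from(
            data, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD:
            segments.append((p_offset, p_offset + p_filesz))
    segments.sort()

    gaps = []
    for (_, prev_end), (next_start, _) in zip(segments, segments[1:]):
        if next_start - prev_end >= min_gap:
            gaps.append((prev_end, next_start))
    return gaps
```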

If this change is actually necessary, rather than desirable,
then I suggest you expedite through other channels than bugzilla.

NEEDINFO, I know what to do in rpm, but perhaps installing
all executables sparsely needs to be carefully thought through
before attempting an implementation.

I've never seen any Unix user who was not surprised by, say,
    cp -R /usr /var/tmp
when sparse files are involved: one's disk space mysteriously
vanishes. And yes, I know there are options to handle sparse files
correctly; I just don't believe those options are widely known or
used.

I cannot see any problem with installing all binaries sparsely. You
can recognize ELF binaries easily and then treat them appropriately.

Sure, there's no problem installing ELF files sparsely in rpm.

The problem is what changes when users do "cp -R" or run any other
command that does not copy files sparsely. Disk space can and will
be consumed when the sparse blocks are filled in, and that is
invariably surprising to end-users.
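
The size difference in question is the one between a file's apparent size and the space actually allocated for it, which is what du reports. A small sketch for checking a single file follows. The name disk_usage is made up for the example, and it assumes st_blocks is counted in 512-byte units, which holds on Linux but is not guaranteed everywhere.

```python
import os


def disk_usage(path):
    """Return (apparent_size, allocated_bytes) for one file.

    Assumes st_blocks is in 512-byte units (true on Linux).
    A sparse file typically shows allocated_bytes < apparent_size.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512
```

Running this on a file before and after a non-sparse copy shows the allocated figure growing toward the apparent size, while the apparent size stays the same.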

Again, not my call. You want sparse file handling in rpm, it's
easier to implement than to discuss the merits of sparse files.

Off to distribution, please attach to some tracking bug so that
I have a clue when and where sparse file rpm installs are desired.

Currently not being considered for U5 unless informed otherwise.

Jeff Johnson (n3npq) on 2010-09-15
tags: added: fedora sparse
Changed in fedora:
importance: Unknown → Medium
status: Unknown → Won't Fix