pbzip2 -d --ignore-trailing-garbage=1 hangs on certain archives with higher #CPUs (-p# >2)

Bug #740502 reported by Yavor Nikolov
This bug affects 1 person
Affects: pbzip2
Status: Fix Released
Importance: High
Assigned to: Yavor Nikolov
Milestone: 1.1.3

Bug Description

Issue initially reported on https://bugs.gentoo.org/show_bug.cgi?id=320313:

Kevin Korb 2011-03-22 19:59:19 UTC

I have also run into this issue with binary packages built by portage. I built
a package for gcc on one system but was unable to install it on others...

If I install pbzip2-1.1.1 and do a 'tar -tvf
/usr/portage/packages/sys-devel/gcc-4.4.5.tbz2' I get:
pbzip2: *ERROR unconsumed in after BZ2_bzDecompress loop:ret=4; block=0;
seq=37; avail_in=4513
Terminator thread: premature exit requested - quitting...
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

All that is expected. However, if I upgrade to pbzip2-1.1.2 which ignores
trailing garbage and run the same tar command it gets to the next to the last
file within the archive and then simply hangs.

An strace on the running pbzip2 -d process shows it stuck on:
futex(0x410eabd8, FUTEX_WAIT, 15911, NULL

The file is simply the result of an 'emerge -b gcc' but just in case it is hard
to duplicate I have posted the 12MB file here:
http://bikergeeks.com/gcc-4.4.5.tbz2

I have not had any other issues with binary packages built by portage on either
version so I am not sure what is different about the gcc package. It does work
fine with regular bzip2.
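
For readers unfamiliar with the term, "trailing garbage" here means bytes left over after the last complete bzip2 stream in a file. The following is a minimal Python sketch of that idea using the standard `bz2` module. It is not how pbzip2 implements `--ignore-trailing-garbage`, and the function name `split_trailing_garbage` is made up for illustration.

```python
import bz2


def split_trailing_garbage(blob: bytes):
    """Decode the leading bzip2 streams of blob.

    Returns (decompressed_data, leftover_bytes). Anything that is not a
    complete, valid bzip2 stream is returned as leftover (a simplification:
    a truncated or corrupt stream is also reported as leftover).
    """
    out = []
    rest = blob
    while rest:
        dec = bz2.BZ2Decompressor()
        try:
            chunk = dec.decompress(rest)
        except OSError:
            # rest does not start with a valid bzip2 stream
            break
        if not dec.eof:
            # the stream ended before its end-of-stream marker
            break
        out.append(chunk)
        # bytes that follow the end of this stream (possibly another stream)
        rest = dec.unused_data
    return b"".join(out), rest
```

Because pbzip2 can write its output as several concatenated streams, the loop keeps decoding until the remaining bytes no longer form a complete stream.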

----
Yavor Nikolov 2011-03-22 20:34:46 UTC

(In reply to comment #26)
Thanks for reporting that, Kevin!
How many CPUs were used (or which -p# parameter)?

----
Kevin Korb 2011-03-22 20:42:02 UTC

(In reply to comment #27)
I had only tried it with 2 CPUs and no -p.

I didn't see a way to inject the -p1 into the tar command so I just called the
bzip2 symlink to pbzip2 directly. With -p1 it worked with the warning about
trailing garbage. However, with -p2 it also worked. Therefore this may be
more about the use of 'bzip2 -d' from within tar.

----
Yavor Nikolov 2011-03-22 20:49:35 UTC

What is the output of
pbzip2 --ignore-trailing-garbage=1 -d -k -v gcc-4.4.5.tbz2

It's not about tar - that's a pbzip2 bug. I got a similar hang with -p4 and -p3
though things worked fine with -p2. (Maybe you have dual-core CPUs or
hyper-threading enabled, or the issue is just not very deterministic).

----
Kevin Korb 2011-03-22 20:56:26 UTC

(In reply to comment #29)
> What is the output of
> pbzip2 --ignore-trailing-garbage=1 -d -k -v gcc-4.4.5.tbz2
>
> It's not about tar - that's a pbzip2 bug. I got a similar hang with -p4 and -p3
> though things worked fine with -p2. (Maybe you have dual-core CPUs or
> hyper-threading enabled, or the issue is just not very deterministic).

Actually, I had forgotten that I had switched to my quad core desktop for more
rapid testing and re-installing of pbzip2.

# pbzip2 --ignore-trailing-garbage=1 -d -k -v gcc-4.4.5.tbz2
Parallel BZIP2 v1.1.2 - by: Jeff Gilchrist [http://compression.ca]
[Feb. 19, 2011] (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <email address hidden>

         # CPUs: 4
 Maximum Memory: 100 MB
 Ignore Trailng Garbage: on
-------------------------------------------
         File #: 1 of 1
     Input Name: gcc-4.4.5.tbz2
    Output Name: gcc-4.4.5.tbz2.out

 BWT Block Size: 900k
     Input Size: 11626360 bytes
Decompressing data...
Completed: 58%
[hang]

This was done on my desktop which actually does have 4 CPUs. But if I add -p2
to force 2 threads like my dual core server I get:

# pbzip2 --ignore-trailing-garbage=1 -d -k -v -p2 gcc-4.4.5.tbz2
Parallel BZIP2 v1.1.2 - by: Jeff Gilchrist [http://compression.ca]
[Feb. 19, 2011] (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <email address hidden>

         # CPUs: 2
 Maximum Memory: 100 MB
 Ignore Trailng Garbage: on
-------------------------------------------
         File #: 1 of 1
     Input Name: gcc-4.4.5.tbz2
    Output Name: gcc-4.4.5.tbz2.out

 BWT Block Size: 900k
     Input Size: 11626360 bytes
Decompressing data...
pbzip2: *WARNING: Trailing garbage after EOF ignored!
    Output Size: 33013760 bytes
-------------------------------------------

     Wall Clock: 2.201255 seconds

and success.
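
A hang like the one above is easier to compare across thread counts when each run has a time limit. The sketch below is one way to script that in Python. `probe` and `pbzip2_cmd` are hypothetical helper names, the `-c` (write to stdout) flag is added here so repeated runs do not collide with an existing output file, and the timeout value is an arbitrary choice.

```python
import subprocess


def pbzip2_cmd(archive, threads):
    """Build a decompression command line like the one used in this report."""
    return [
        "pbzip2",
        "--ignore-trailing-garbage=1",
        "-d",            # decompress
        "-k",            # keep the input file
        "-c",            # assumption: send output to stdout instead of a file
        f"-p{threads}",  # number of processors to use
        archive,
    ]


def probe(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it exceeded timeout_s."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left behind
        return None
    return result.returncode
```

Calling `probe(pbzip2_cmd("gcc-4.4.5.tbz2", n), 60)` for `n` from 1 to 4 would then report, per thread count, either an exit code or `None` for a run that did not finish within 60 seconds. A `None` only means the time limit was hit, so the limit should be well above the normal run time (about 2 seconds in the successful `-p2` run above).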

Jeff Gilchrist (jeff-gilchrist) wrote : Re: [Bug 740502] [NEW] pbzip2 -d --ignore-trailing-garbage=1 hangs on certain archives with higher #CPUs (-p# >2)

On Tue, Mar 22, 2011 at 5:26 PM, Yavor Nikolov
<email address hidden> wrote:

> Issue initially reported on
> https://bugs.gentoo.org/show_bug.cgi?id=320313:

Thanks for letting me know. This week is totally insane for me with
several deadlines. As soon as we can get a fix and test it, we can
issue a new version with the fix.

Jeff.

Yavor Nikolov (yavor-nikolov) wrote :

I'm attaching a patch which is supposed to resolve the problem (tested against the test-case archive; it worked fine for me).

Changed in pbzip2:
assignee: nobody → Yavor Nikolov (yavor-nikolov)
milestone: none → 1.1.3
status: Confirmed → In Progress
Changed in pbzip2:
status: In Progress → Fix Committed
Changed in pbzip2:
status: Fix Committed → Fix Released