busybox 1.30.1 crashes bzip2 test case with glibc 2.29, always

Bug #1828282 reported by Dimitri John Ledkov
This bug affects 1 person
Affects                  Status        Importance  Assigned to  Milestone
BusyBox                  Fix Released  Medium
Ubuntu on IBM z Systems  Invalid       High        bugproxy
busybox (Ubuntu)         Fix Released  Undecided   Unassigned
glibc (Ubuntu)           Invalid       Undecided   Unassigned

Bug Description

Steps to reproduce:

1) Get a system with glibc 2.29

2) Get busybox 1.30.1 installed (e.g. eoan, or download busybox package from https://launchpad.net/ubuntu/+source/busybox/1:1.30.1-4ubuntu3/+build/16724246 and use $ apt install ./busybox*.deb to install)

3) Get busybox 1.30.1 source code, e.g. $ pull-lp-source busybox
Or download the orig tarball from https://launchpad.net/ubuntu/+source/busybox/1:1.30.1-4ubuntu3

4) Run the bunzip2 testsuite:

cd testsuite/
ECHO=/bin/echo ./bunzip2.tests

Observe that with glibc 2.29 the test case:
PASS: bunzip2: bz2_issue_11.bz2 corrupted example

is XFAIL or FAIL on s390x, whereas it passes on all other arches.

If one uses glibc 2.28 (i.e. use Cosmic, install busybox, and use the matching test suite from eoan via the links above), one can observe that the test case always passes.

We suspect this might be a glibc 2.29 s390x-specific setjmp regression. Probably due to setjmp usage in ./archival/libarchive/decompress_bunzip2.c

The tests were done on a z13 machine.

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: Undecided → High
assignee: nobody → bugproxy (bugproxy)
tags: added: reverse-proxy-bugzilla s390x
bugproxy (bugproxy)
tags: added: architecture-s39064 bugnameltc-177501 severity-high targetmilestone-inin1910
tags: added: id-5cc732a8910db44841cff9f0
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2019-05-13 07:27 EDT-------
Hi xnox,

this issue has nothing to do with the s390x-specific setjmp/longjmp implementation!
setjmp/longjmp is just used for error handling inside the bunzip2 implementation in busybox.
But due to an issue in the busybox implementation, longjmp is called on s390x but not on e.g. x86.
Please report this bug to busybox with the detailed information below!

According to bunzip2.tests:
bunzip2: bunzip error -5 => PASS
bunzip2: bunzip error -3 => XFAIL

As a side note:
Error -3 also occurs on s390x Ubuntu 18.04.2 LTS!

According to archival/libarchive/decompress_bunzip2.c:
62 #define RETVAL_UNEXPECTED_INPUT_EOF (dbg("%d", __LINE__), -3)
64 #define RETVAL_DATA_ERROR (dbg("%d", __LINE__), -5)

RETVAL_UNEXPECTED_INPUT_EOF is used only in get_bits():
128 bd->inbufCount = read(bd->in_fd, bd->inbuf, IOBUF_SIZE);
129 if (bd->inbufCount <= 0)
130 longjmp(*bd->jmpbuf, RETVAL_UNEXPECTED_INPUT_EOF);
If you start gdb and set a breakpoint there ...:
busybox-1.30.1/testsuite$ gdb ../busybox_unstripped
(gdb) b decompress_bunzip2.c:130
(gdb) run bunzip2 <bz2_issue_11.bz2 2>&1 >/dev/null
... it will be hit, bd->inbufCount will be zero and the longjmp jumps back to setjmp in unpack_bz2_stream().
i will be -3 and "bunzip error -3" will be reported:
788 i = setjmp(jmpbuf);
789 if (i == 0)
790 i = start_bunzip(&jmpbuf, &bd, xstate->src_fd, outbuf + 2, len);
791
792 if (i == 0) {
793 while (1) { /* "Produce some output bytes" loop */
794 i = read_bunzip(bd, outbuf, IOBUF_SIZE);
795 if (i < 0) /* error? */
796 break;
...
808 if (i != RETVAL_LAST_BLOCK
809 /* Observed case when i == RETVAL_OK:
810 * "bzcat z.bz2", where "z.bz2" is a bzipped zero-length file
811 * (to be exact, z.bz2 is exactly these 14 bytes:
812 * 42 5a 68 39 17 72 45 38 50 90 00 00 00 00).
813 */
814 && i != RETVAL_OK
815 ) {
816 bb_error_msg("bunzip error %d", i);
817 break;
818 }
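
For readers unfamiliar with the pattern, the error path described above can be sketched in isolation. This is a minimal illustration, not busybox code: `struct reader`, `next_byte()`, `decode()` and `unpack()` are hypothetical names, the input comes from a memory buffer instead of a file descriptor, and only the two error values (-3 and -5) are taken from the listing above. It also uses `switch (setjmp(...))` instead of assigning the result to a variable, which is a choice made here for strict standard conformance rather than something the source does.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical error codes; the values mirror the two quoted above. */
#define ERR_UNEXPECTED_INPUT_EOF (-3)
#define ERR_DATA_ERROR           (-5)

/* Hypothetical reader state: an in-memory buffer instead of a file descriptor. */
struct reader {
    const unsigned char *buf;
    size_t len;
    size_t pos;
    jmp_buf *jmpbuf; /* where to unwind to when the input runs out */
};

/* Return the next byte, or unwind to the setjmp() point on end of input. */
static unsigned next_byte(struct reader *r)
{
    assert(r->jmpbuf != NULL);
    if (r->pos >= r->len)
        longjmp(*r->jmpbuf, ERR_UNEXPECTED_INPUT_EOF);
    return r->buf[r->pos++];
}

/* Toy decoder: consumes 4 bytes and rejects a wrong first byte as a data error. */
static int decode(struct reader *r)
{
    if (next_byte(r) != 'B')
        return ERR_DATA_ERROR;
    (void)next_byte(r);
    (void)next_byte(r);
    (void)next_byte(r);
    return 0;
}

/* Returns 0 on success or a negative error code. */
static int unpack(const unsigned char *data, size_t len)
{
    jmp_buf jmpbuf;
    struct reader r = { data, len, 0, &jmpbuf };

    switch (setjmp(jmpbuf)) {
    case 0:
        /* Normal path: setjmp() returned directly. */
        return decode(&r);
    case ERR_UNEXPECTED_INPUT_EOF:
        /* We arrived here via longjmp() from next_byte(). */
        return ERR_UNEXPECTED_INPUT_EOF;
    default:
        return ERR_DATA_ERROR;
    }
}
```

The point of the sketch is that the same corrupted input can end in either error: which one is reported depends only on whether the decoder notices bad data before it runs out of bytes to read.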

The difference between reporting -5 or -3 depends on uninitialized values on the stack while calling read_bunzip()->get_next_block().
There you have the array mtfSymbol on the stack:
156 /* Unpacks the next block and sets up for the inverse Burrows-Wheeler step. */
157 static int get_next_block(bunzip_data *bd)
158{
159 int groupCount, selector,
160 i, j, symCount, symTotal, nSelectors, byteCount[256];
161 uint8_t uc, symToByte[256], mtfSymbol[256], *selectors;
...

The groupCount is read and values in mtfSymbol are initialized:
...
219 /* How many different Huffman coding groups does this block use? */
220 groupCount = get_bits(bd, 3);
221 if (groupCount < 2 || groupCount > MAX_GROUPS)
222 return RETVAL_DATA_ERROR;
...
228 for (i = ...
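
The general hazard described above (a stack array of which only some entries get initialized, later indexed by a value derived from corrupted input) can be illustrated with a small sketch. Everything here is hypothetical and is neither the busybox code nor the upstream fix: `lookup_unchecked()` and `lookup_checked()` are invented names, and the `garbage` parameter deliberately emulates stale stack contents with `memset()` so that the example stays well-defined instead of actually reading uninitialized memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ERR_DATA_ERROR (-5)

/* Hypothetical lookup: only the first `count` slots get meaningful values.
 * `garbage` stands in for whatever bytes an earlier call left on the stack. */
static int lookup_unchecked(uint8_t garbage, unsigned count, unsigned index)
{
    uint8_t table[256];

    assert(count <= 256);
    memset(table, garbage, sizeof(table)); /* emulate stale stack contents */
    for (unsigned i = 0; i < count; i++)
        table[i] = (uint8_t)i;

    /* If index >= count, the result is whatever "garbage" was there. */
    return table[index & 0xff];
}

/* Same lookup, but input that points outside the initialized range is rejected. */
static int lookup_checked(uint8_t garbage, unsigned count, unsigned index)
{
    if (count > 256 || index >= count)
        return ERR_DATA_ERROR;
    return lookup_unchecked(garbage, count, index);
}
```

With `count` 4 and `index` 10, the unchecked variant returns 0 for a `garbage` of 0x00 and 255 for 0xff, while the checked variant returns -5 in both cases. That is the kind of platform-dependent versus deterministic behaviour this comment is about.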

Dimitri John Ledkov (xnox) wrote :

Will do! Thanks for digging into this even though it's, well, a busybox issue.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-15 08:26 EDT-------
@Xnox: since this bugzilla is a busybox problem, can I close it on my side? It would then still be open in LP for your tracking.

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Invalid
Dimitri John Ledkov (xnox) wrote :

@IBM
Please close your LTC bugzilla entry. We will continue to use this LP issue to pursue busybox upstream. Thanks a lot for your input!

Changed in busybox (Ubuntu):
status: New → Triaged
Changed in glibc (Ubuntu):
status: New → Invalid
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-15 10:19 EDT-------
IBM Bugzilla status -> closed; tracking will be done via LP only.

In , Dimitri John Ledkov (xnox) wrote :

Originally reported at https://bugs.launchpad.net/ubuntu/+source/busybox/+bug/1828282 with initial suspicion on glibc, but later diagnosed to be a busybox issue.

The full analysis is at https://bugs.launchpad.net/ubuntu/+source/busybox/+bug/1828282/comments/1

In short, the bz2_issue_11.bz2 test case always fails on s390x because bunzip2 depends on uninitialised values, which happen to always be "wrong" on s390x.

This is observable with valgrind too:

# valgrind busybox bunzip2 <bz2_issue_11.bz2 2>&1 >/dev/null
==40965== Memcheck, a memory error detector
==40965== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==40965== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==40965== Command: busybox bunzip2
==40965==
==40965== Conditional jump or move depends on uninitialised value(s)
==40965== at 0x17C1D4: get_next_block (decompress_bunzip2.c:393)
==40965== by 0x17C37F: get_next_block (decompress_bunzip2.c:419)
==40965==
bunzip2: bunzip error -5
==40965==
==40965== HEAP SUMMARY:
==40965== in use at exit: 0 bytes in 0 blocks
==40965== total heap usage: 7 allocs, 7 frees, 4,539,696 bytes allocated
==40965==
==40965== All heap blocks were freed -- no leaks are possible
==40965==
==40965== For counts of detected and suppressed errors, rerun with: -v
==40965== Use --track-origins=yes to see where uninitialised values come from
==40965== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

For the time being we are skipping the bz2_issue_11.bz2 test case in Ubuntu.

Changed in busybox:
importance: Unknown → Medium
status: Unknown → Confirmed
In , Vda-linux (vda-linux) wrote :

Fixed in git, lots of thanks!

In , Dimitri John Ledkov (xnox) wrote :

Nice!

It is valgrind clean now, but the testsuite fails:

$ ./bunzip2.tests
PASS: bunzip2: doesnt exist
PASS: bunzip2: unknown suffix
PASS: bunzip2: already exists
PASS: bunzip2: stream unpack
PASS: bunzip2: delete src
PASS: bunzip2: test_bz2 file
PASS: bunzip2: pbzip_4m_zeros file
PASS: bunzip2: bz2_issue_11.bz2 corrupted example
FAIL: bunzip2: bz2_issue_12.bz2 corrupted example

Maybe, now that this is fixed, the issue_12 expectation should be changed?

It currently expects "bunzip2: bunzip error -3:1", yet we now generate "bunzip2: bunzip error -5:1" (just like issue_11 corrupted example)

Changed in busybox (Ubuntu):
status: Triaged → Fix Committed
Changed in busybox:
status: Confirmed → Fix Released
In , Dimitri John Ledkov (xnox) wrote :

Test suite got fixed in master too, all is good:
https://git.busybox.net/busybox/commit/?id=b2c123d484dbe261758f27ced213f4649173803b

Thanks a lot for the quick fixes! Included in Ubuntu devel series.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package busybox - 1:1.30.1-4ubuntu4

---------------
busybox (1:1.30.1-4ubuntu4) eoan; urgency=medium

  * Revert previous upload, cherrypick upstream fix for the issue. LP:
    #1828282
  * Adjust testsuite expectations.

 -- Dimitri John Ledkov <email address hidden> Thu, 23 May 2019 14:37:05 +0100

Changed in busybox (Ubuntu):
status: Fix Committed → Fix Released