losetup -f broken in 2.0.6-1ubuntu2

Bug #1850184 reported by Balint Reczey on 2019-10-28
This bug affects 1 person
Affects: klibc (Ubuntu), status tracked in Focal
  Eoan: Importance Undecided, Unassigned
  Focal: Importance Undecided, Unassigned

Bug Description

[Impact]

 * sudo /usr/lib/klibc/bin/losetup -vf appears to be misbuilt: argc in main() is unexpectedly reset to zero after ioctl() operations in a function call.

[Test Case]

 * $ sudo /usr/lib/klibc/bin/losetup -vf
Loop device is /dev/loop20
loop: can't get info on device /dev/loop20: No such device or address

is bad.

Note that the ioctl() must succeed, so the loop0 device must already be configured to trigger the bug.

[Regression Potential]

 * klibc is quite special, as it uses Linux kernel headers/assembly. There seems to be an incompatibility between the klibc sources and gcc-9 with linux-5.3 when they are used to build userspace programs.

 * disabling cf-protection and stack-clash-protection did not help.

 * building with gcc-8 does not exhibit the problem.

 * the workaround in the code is quite simple: keep a copy of argc and compare against that copy later.

[Other Info]

 * Original bug report

http://autopkgtest.ubuntu.com/packages/c/casper/focal/amd64

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-focal/focal/amd64/c/casper/20191025_214555_df8b8@/log.gz

...
[ 11.751912] EXT4-fs (sda1): mounting ext2 file system using the ext4 subsystem
[ 11.761441] EXT4-fs (sda1): mounted filesystem without journal. Opts: (null)
loop: can't get info on device /dev/loop1: No such device or address

BusyBox v1.30.1 (Ubuntu 1:1.30.1-4ubuntu4) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) + mkdir result
+ set -x
+ read LINE
+ grep -e '^--OUT .* BEGIN-- .* --END--$' qemu-output.txt
++ grep -q /rofs result/lsblk.txt
grep: result/lsblk.txt: No such file or directory
autopkgtest [21:45:45]: test boot: -----------------------]
autopkgtest [21:45:45]: test boot: - - - - - - - - - - results - - - - - - - - - -
boot FAIL non-zero exit status 2
autopkgtest [21:45:45]: @@@@@@@@@@@@@@@@@@@@ summary
boot FAIL non-zero exit status 2
...

Michael Hudson-Doyle (mwhudson) wrote :

This is a bit strange. FWIW, this test has never passed properly on focal: it doesn't really test anything until the cloud images appear, and they weren't there for the first couple of tests.

Michael Hudson-Doyle (mwhudson) wrote :

This is the failure:

(initramfs) losetup -f
loop: can't get info on device /dev/loop1: No such device or address

And it reflects behaviour in the focal live-server images too, i.e. they are completely broken. Fun.

Michael Hudson-Doyle (mwhudson) wrote :

So this was somehow caused by the ftbfs fix for klibc. On my eoan system:

mwhudson@anduril:~/images$ sudo /usr/lib/klibc/bin/losetup -f
loop: can't get info on device /dev/loop30: No such device or address
mwhudson@anduril:~/images$ sudo apt install klibc-utils=2.0.6-1ubuntu1 libklibc=2.0.6-1ubuntu1
[...]
mwhudson@anduril:~/images$ sudo /usr/lib/klibc/bin/losetup -f
/dev/loop30

affects: casper (Ubuntu) → klibc (Ubuntu)
summary: - casper >= 1.427 autopkgtest is failing
+ losetup -f broken in 2.0.6-1ubuntu2
Dimitri John Ledkov (xnox) wrote :

Rebuilding with gcc-8 => losetup works
Rebuilding with gcc-9 => losetup does not work
Rebuilding with gcc-9 & -fcf-protection=none & -fno-stack-clash-protection does not work (and double checked that there are no gcc invocations in the build log without those two options set)

Dimitri John Ledkov (xnox) wrote :

device = find_unused_loop_device();
if (device == NULL)
 return -1;
if (verbose)
 printf("Loop device is %s\n", device);
if (argc == optind) {
 printf("%s\n", device);
 return 0;
}
file = argv[optind];

Somehow... argc == optind condition is false, and instead of exiting the program, we go into "show" mode on the just detected loop device, which fails, as it is a free one and there is nothing to show.
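The branch described above can be sketched in isolation. This is a minimal illustration, not the klibc code: the enum and the function name choose_path are hypothetical, and the value 2 for optind assumes a single option word such as "-f" was parsed. It only shows how a clobbered argc flips the `argc == optind` test.

```c
#include <assert.h>

/* Hypothetical labels for the two outcomes of the check quoted above;
 * these names do not appear in the klibc source. */
enum losetup_path {
    PATH_PRINT_FREE_DEVICE, /* argc == optind: print the device, return 0 */
    PATH_FALL_THROUGH       /* otherwise: continue to file = argv[optind] */
};

/* Mirrors the `argc == optind` test from the snippet above. */
enum losetup_path choose_path(int argc, int optind_value)
{
    /* optind is never negative after option parsing. */
    assert(optind_value >= 0);
    return (argc == optind_value) ? PATH_PRINT_FREE_DEVICE
                                  : PATH_FALL_THROUGH;
}
```

With argc intact, choose_path(2, 2) selects the early exit. With argc overwritten to 0, as in the gdb session later in this report, choose_path(0, 2) falls through instead.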

Dimitri John Ledkov (xnox) wrote :

so calling an ioctl seems to clear the global argc in klibc built with gcc-9

163 if(ioctl (fd, LOOP_GET_STATUS, &loopinfo) == 0)
(gdb) bt
#0 find_unused_loop_device () at usr/utils/losetup.c:163
#1 0x0000000000401135 in main (argc=2, argv=0x7fffffffe618) at usr/utils/losetup.c:454
(gdb) n
164 someloop++; /* in use */
(gdb) bt
#0 find_unused_loop_device () at usr/utils/losetup.c:164
#1 0x0000000000401135 in main (argc=0, argv=0x7fffffffe618) at usr/utils/losetup.c:454

Thorsten Glaser (mirabilos) wrote :

Fun: this works with 2.0.7-1 built with gcc-9 in Debian.

Dimitri John Ledkov (xnox) wrote :

So, I'm not sure if this is a kernel headers/assembly bug (as ioctl is used from there), gcc-9 bug, or the combination of the two.

I'm going to save argc, and use a saved copy for now, but this needs deeper analysis. This sounds like a retpoline mitigation.

Dimitri John Ledkov (xnox) wrote :
description: updated

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1850184

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Eoan):
status: New → Incomplete
Adam Conrad (adconrad) wrote :

The testcase here doesn't seem to be working (or, rather, failing) for me, which makes it harder to investigate this. It passes on both sid and eoan for me:

(sid-amd64)root@nosferatu:~# dpkg -l \*klibc\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==============-============-============-===============================================
ii klibc-utils 2.0.7-1 amd64 small utilities built with klibc for early boot
ii libklibc:amd64 2.0.7-1 amd64 minimal libc subset for use with initramfs
(sid-amd64)root@nosferatu:~# /usr/lib/klibc/bin/losetup -vf
Loop device is /dev/loop0
/dev/loop0
(sid-amd64)root@nosferatu:~#

(eoan-amd64)root@nosferatu:~# dpkg -l \*klibc\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==============-==============-============-===============================================
ii klibc-utils 2.0.6-1ubuntu2 amd64 small utilities built with klibc for early boot
ii libklibc:amd64 2.0.6-1ubuntu2 amd64 minimal libc subset for use with initramfs
(eoan-amd64)root@nosferatu:~# /usr/lib/klibc/bin/losetup -vf
Loop device is /dev/loop0
/dev/loop0
(eoan-amd64)root@nosferatu:~#

description: updated
Adam Conrad (adconrad) wrote :

Ah-ha. If loop0 is in use, then the test-case appropriately fails in both unstable and eoan, which is comforting, as I didn't look forward to figuring out why this works in Debian (it doesn't).

tags: added: patch
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package klibc - 2.0.7-1ubuntu1

---------------
klibc (2.0.7-1ubuntu1) focal; urgency=low

  * Merge from Debian unstable. Remaining changes:
    * Fix FTBFS on eoan and later with new gcc
      - cf-protection.patch: Disable cf-protection for syscalls stub.

  * save-argc.patch: when build with gcc-9 linux-5.3, calling ioctl,
    clears global argc, thus save it, to compare to it later. Otheriwse
    losetup -f is broken LP: #1850184

klibc (2.0.7-1) unstable; urgency=medium

  [ Ben Hutchings ]
  * New upstream version:
    - klcc: Enable stripping even if CONFIG_DEBUG_INFO is enabled
    - run-init: Allow the initramfs to be persisted across root changes
      (thanks to Matthew Garrett)
    - ipconfig: Implement support -d ...:dns0:dns1 options (Closes: #931416)
    - Kbuild: Work around broken "ar s" in binutils 2.32 (see #941921)
  * debian/rules: Reorganise make flags variables
  * debian/rules: Define ARCH for klibc, for all architectures
  * debian/rules: Delete redundant architecture mappings
  * debian/rules: Delete redundant export
  * klibc-utils: Trigger update-initramfs on install/upgrade
  * initramfs-tools: Don't install commands that already exist in /sbin
  * initramfs-tools: Exclude kinit and zcat commands earlier
  * initramfs-tools: Exclude gzip command
  * Drop "resume: Backward compatibility for resume_offset", which will
    not be needed in the next release
  * [klibc] fstype: Drop obsolete support for "ext4dev" (Closes: #932926)
  * debian/control: Set Maintainer to Debian Kernel Team; move maks to
    Uploaders

  [ James Clarke ]
  * debian/control: Restrict m4 build dependency to just sparc

  [ Helmut Grohne ]
  * Honour DEB_BUILD_OPTIONS=nocheck. (Closes: #922814)

 -- Dimitri John Ledkov <email address hidden> Thu, 31 Oct 2019 11:50:44 +0000

Changed in klibc (Ubuntu Focal):
status: New → Fix Released
Dimitri John Ledkov (xnox) wrote :

This is my replacement "minimized" losetup.c that exhibits the problem.

Simply drop this into the klibc sources and run:

./debian/rules build; sudo ./usr/utils/static/losetup
1 Argc before find_unused_loop_devices()
0 Argc after find_unused_loop_devices()
Argc values before and after did not match.

Note that the same source works fine as a glibc binary:

gcc ./usr/utils/losetup.c; sudo ./a.out
1 Argc before find_unused_loop_devices()
1 Argc after find_unused_loop_devices()

no longer affects: gcc-9 (Ubuntu)
no longer affects: gcc-9 (Ubuntu Eoan)
no longer affects: gcc-9 (Ubuntu Focal)
no longer affects: linux (Ubuntu Focal)
no longer affects: linux (Ubuntu Eoan)
no longer affects: linux (Ubuntu)
Adam Conrad (adconrad) on 2019-10-31
Changed in klibc (Ubuntu Focal):
status: Fix Released → Confirmed
Changed in klibc (Ubuntu Eoan):
status: New → Confirmed
Thorsten Glaser (mirabilos) wrote :

How is this even a fix?

Also, does this affect other applications built against klibc?

We found the issue -- I think xnox is going to report it via proper channels. But basically it's this:

<mwhudson> sizeof(dev_t) = 4 sizeof(__kernel_old_dev_t) = 8

This makes the kernel's loop_info 8 bytes bigger than klibc's and so the kernel is writing past the end of the loopinfo on the stack, which until now has avoided causing problems by sheer luck.
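A rough way to picture the overrun, as a simulation only: the two structs below are simplified, hypothetical stand-ins (not the real loop_info layout, so their sizes differ from the real ones), and the "kernel write" is a plain memcpy into a buffer that is deliberately large enough, so nothing here actually overflows. The canary bytes play the role of whatever sits after loopinfo on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical layouts: only the device-number width differs. */
struct user_loop_info {      /* userspace view: 4-byte device numbers */
    int32_t  number;
    uint32_t device;
    uint32_t rdevice;
};

struct kernel_loop_info {    /* kernel view: 8-byte device numbers */
    int32_t  number;
    uint64_t device;
    uint64_t rdevice;
};

#define CANARY 0x5A

/* Counts the bytes beyond the userspace-sized struct that the simulated
 * kernel write modifies. */
size_t bytes_clobbered(void)
{
    /* Stand-in for the stack frame; sized for the larger struct so the
     * simulation itself stays within bounds. */
    unsigned char frame[sizeof(struct kernel_loop_info)];
    struct kernel_loop_info from_kernel;
    size_t i, clobbered = 0;

    assert(sizeof(struct kernel_loop_info) > sizeof(struct user_loop_info));

    memset(frame, CANARY, sizeof frame);          /* neighbouring stack data */
    memset(&from_kernel, 0, sizeof from_kernel);  /* what the kernel returns */

    /* Simulated ioctl: the kernel copies its full-size struct out. */
    memcpy(frame, &from_kernel, sizeof from_kernel);

    for (i = sizeof(struct user_loop_info); i < sizeof frame; i++)
        if (frame[i] != CANARY)
            clobbered++;
    return clobbered;
}
```

In this sketch every byte between the end of the smaller struct and the end of the larger one is overwritten with zeros, which is at least consistent with argc reading 0 after the ioctl in the gdb session above.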

Thorsten Glaser (mirabilos) wrote :

Yeah, I saw the mail; much better. I fixed one of these in dietlibc the other day…

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package klibc - 2.0.7-1ubuntu4

---------------
klibc (2.0.7-1ubuntu4) focal; urgency=medium

  * Fix losetup, by switching to kernel uapi header, instead of buggy
    klibc one. LP: #1850184

 -- Dimitri John Ledkov <email address hidden> Thu, 07 Nov 2019 01:08:13 +0000

Changed in klibc (Ubuntu Focal):
status: Confirmed → Fix Released

Hello Balint, or anyone else affected,

Accepted klibc into eoan-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/klibc/2.0.6-1ubuntu3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-eoan to verification-done-eoan. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-eoan. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in klibc (Ubuntu Eoan):
status: Confirmed → Fix Committed
tags: added: verification-needed verification-needed-eoan
Adam Conrad (adconrad) wrote :

(eoan-amd64)root@nosferatu:~# apt-get install klibc-utils=2.0.6-1ubuntu2 libklibc=2.0.6-1ubuntu2
(eoan-amd64)root@nosferatu:~# /sbin/losetup -vf
/dev/loop6
(eoan-amd64)root@nosferatu:~# /usr/lib/klibc/bin/losetup -vf
Loop device is /dev/loop6
loop: can't get info on device /dev/loop6: No such device or address
(eoan-amd64)root@nosferatu:~# apt-get install klibc-utils=2.0.6-1ubuntu3 libklibc=2.0.6-1ubuntu3
(eoan-amd64)root@nosferatu:~# /sbin/losetup -vf
/dev/loop6
(eoan-amd64)root@nosferatu:~# /usr/lib/klibc/bin/losetup -vf
Loop device is /dev/loop6
/dev/loop6

tags: added: verification-done verification-done-eoan
removed: verification-needed verification-needed-eoan
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package klibc - 2.0.6-1ubuntu3

---------------
klibc (2.0.6-1ubuntu3) eoan; urgency=medium

  * Pull upstream fixes for losetup issues raised with gcc-9 (LP: #1850184):
    - loop-header.patch: Switch to using the kernel's UAPI exported loop.h
    - loop-fixes.patch: Fix some type mismatch warnings from above change.
    - loop-fixes-2.patch: Fix last type mismatch in code dropped upstream.

 -- Adam Conrad <email address hidden> Wed, 06 Nov 2019 23:15:53 -0700

Changed in klibc (Ubuntu Eoan):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for klibc has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
