[SRU] mountall crashes on udev node with missing devname

Bug #1807077 reported by Ryan Finnie on 2018-12-06
20
This bug affects 3 people
Affects Status Importance Assigned to Milestone
mountall (Ubuntu)
Undecided
Unassigned
Trusty
High
Unassigned
Xenial
Undecided
Unassigned

Bug Description

[Impact]

 * udev block nodes without a devname will crash mountall, resulting in an unbootable system (emergency root shell)

 * While this is not likely to happen in a matched distro/kernel environment (it was discovered while needing to run a bionic 4.15 kernel on trusty), it is possible.

 * The code in try_udev_device() assumes a block subsystem will always have a devname; the SRU patches explicitly check for a devname and return if null.

[Test Case]

 * HPE DL385 Gen10, Samsung a822 NVMe controller, trusty install

 * Kernels <4.15 (or possibly 4.14) do not expose nvme0c33n1 and do not trigger the bug. Tested on 4.13, 4.4 and 3.13.

 * Kernel 4.15 exposes nvme0c33n1 in udev but does not have a devname, mountall crash ensues.

[Regression Potential]

 * Patch might ignore legitimate block devices on existing installations. Unlikely, since the logic path for null devname leads directly to a program crash.

[Other Info]

 * Additional context for Canonical employees: PS4.5 is a trusty backend cloud, but we now have Gen10 hardware incoming (this was discovered while adding new nova-compute hardware). Older kernels are not usable because Gen10 requires ilorest, which requires a >4.4 kernel (at least artful 4.13 is known good). So trusty+4.15 is the only viable combination for continued support of the cloud while adding new hardware. This is done via apt pinning of bionic for the kernel packages, and, mountall notwithstanding, is working fine so far.

TEST CASE:
1. Enable -proposed
2. apt-get install mountall=2.53ubuntu1
3. update-initramfs -k all -u
4. Reboot

VERIFICATION DONE
Rebooted successfully on affected Gen10 systems. Confirmed no regression on unaffected systems.

Original description:

Running bionic's 4.15 kernel on trusty on an HPE DL385 Gen10 results in a device node for the NVMe controller, /devices/pci0000:40/0000:40:03.1/0000:43:00.0/nvme/nvme0/nvme0c33n1 which itself does not have a devname. When mountall gets to it:

fsck_update: updating check priorities
try_mount: /srv/nova/instances waiting for device
try_udev_device: ignored /dev/loop5 (not yet ready?)
try_udev_device: ignored /dev/loop6 (not yet ready?)
try_udev_device: ignored /dev/loop1 (not yet ready?)
try_udev_device: ignored /dev/loop0 (not yet ready?)
try_udev_device: block (null) (null) (null)

and then crashes, leaving the boot at an emergency root shell. A successful scan looks like this for comparison:

try_udev_device: block /dev/sdb (null) (null)
try_udev_device: block /dev/sdb (null) (null)
try_udev_device: block /dev/sda (null) (null)
try_udev_device: block /dev/nvme0n1 ed56e3a9-60f7-4636-85a2-b53137b598e7 (null)
try_udev_device: block /dev/bcache0 756cb2c6-b999-4905-a021-c2e688e81a86 instances

The debdiffs check for a null devname in try_udev_device() and will not attempt to process it.

Ryan Finnie (fo0bar) wrote :
Ryan Finnie (fo0bar) on 2018-12-06
description: updated
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mountall (Ubuntu):
status: New → Confirmed
Ryan Finnie (fo0bar) on 2018-12-06
tags: added: patch trusty xenial
summary: - mountall crashes on udev node with missing devname
+ [SRU] mountall crashes on udev node with missing devname
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mountall (Ubuntu Trusty):
status: New → Confirmed
Changed in mountall (Ubuntu Xenial):
status: New → Confirmed
Steve Langasek (vorlon) wrote :

mountall is not used at boot in xenial, xenial is systemd-only for the system init and upstart is only used for user sessions in xenial (so the mountall dep is vestigial). So we should not spend time fixing this bug in xenial, it's not relevant.

Changed in mountall (Ubuntu Xenial):
status: Confirmed → Won't Fix
Steve Langasek (vorlon) wrote :

And for devel, this is invalid because mountall does not exist in bionic and later.

Changed in mountall (Ubuntu):
status: Confirmed → Invalid
Ryan Finnie (fo0bar) wrote :

Thanks. I saw xenial had mountall and figured maybe there was transition path where it was used, but if it's vestigial, I agree this should be trusty-only.

tags: removed: xenial
Ryan Finnie (fo0bar) on 2018-12-06
Changed in mountall (Ubuntu Trusty):
importance: Undecided → High

Hello Ryan, or anyone else affected,

Accepted mountall into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/mountall/2.53ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-trusty to verification-done-trusty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-trusty. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in mountall (Ubuntu Trusty):
status: Confirmed → Fix Committed
tags: added: verification-needed verification-needed-trusty
Ryan Finnie (fo0bar) on 2018-12-25
tags: added: verification-done verification-done-trusty
removed: verification-needed verification-needed-trusty
Ryan Finnie (fo0bar) on 2018-12-25
description: updated
Łukasz Zemczak (sil2100) wrote :

What packages and package version have been used for verification? We need at least the basic info regarding the verification performed as part of the SRU.

Changed in mountall (Ubuntu Trusty):
status: Fix Committed → Incomplete
Ryan Finnie (fo0bar) wrote :

Per https://wiki.ubuntu.com/QATeam/PerformingSRUVerification I had added the procedure and results to the description:

TEST CASE:
1. Enable -proposed
2. apt-get install mountall=2.53ubuntu1
3. update-initramfs -k all -u
4. Reboot

VERIFICATION DONE
Rebooted successfully on affected Gen10 systems. Confirmed no regression on unaffected systems.

Changed in mountall (Ubuntu Trusty):
status: Incomplete → Fix Committed

The verification of the Stable Release Update for mountall has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mountall - 2.53ubuntu1

---------------
mountall (2.53ubuntu1) trusty; urgency=medium

  * try_udev_device: Ignore udev block device nodes which are missing
    devnames (LP: #1807077).

 -- Ryan Finnie <email address hidden> Thu, 06 Dec 2018 03:29:27 +0000

Changed in mountall (Ubuntu Trusty):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers