hwe-edge kernel 5.3.0-23.25 kernel does not boot on Precision 5720 AIO

Bug #1852581 reported by Harm van Bakel on 2019-11-14
This bug affects 5 people
Affects                  Status        Importance  Assigned to   Milestone
linux (Ubuntu)           Fix Released  Critical    Seth Forshee
  Bionic                 Invalid       Undecided   Unassigned
  Eoan                   Fix Released  Critical    Seth Forshee
linux-hwe-edge (Ubuntu)  Invalid       Critical    Seth Forshee
  Bionic                 Fix Released  Critical    Seth Forshee
  Eoan                   Invalid       Undecided   Unassigned

Bug Description

SRU Justification

Impact: The fix for bug 1850234 does not function as intended in bionic, as a result of modinfo not knowing about module signatures. This results in no modules being signed in hwe kernels based on 5.3, rendering systems with secure boot enabled unbootable.

Fix: Check for the module signature at the end of modules instead of relying on modinfo. This can be done without any external tools needing to be aware of module signatures.
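
The check described in the fix can be sketched roughly as follows. This is only an illustration in Python, not the packaging change itself. It assumes the kernel convention that a signed module ends with the marker string "~Module signature appended~" followed by a newline; the function name has_appended_signature is hypothetical.

```python
import os

# Marker the kernel expects at the very end of a signed module
# (28 bytes, including the trailing newline).
MODULE_SIG_MAGIC = b"~Module signature appended~\n"


def has_appended_signature(path):
    """Return True if the file at `path` ends with the module signature marker."""
    size = os.path.getsize(path)
    if size < len(MODULE_SIG_MAGIC):
        return False
    with open(path, "rb") as f:
        # Only the tail of the file is inspected, so no module-aware
        # tool such as modinfo is involved.
        f.seek(size - len(MODULE_SIG_MAGIC))
        return f.read() == MODULE_SIG_MAGIC
```

Because only the last bytes of the file are read, the result should not depend on which kmod version happens to be installed in the build environment.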

Test Case: Check that all built modules contain signatures, except for those in staging which have not been whitelisted.
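
One possible way to automate this test case is sketched below, assuming an unpacked module tree on disk. The function names and the staging_allowlist parameter are hypothetical stand-ins (the real whitelist lives in the kernel packaging), and the signature test again relies on the trailing "~Module signature appended~" marker.

```python
import os

MODULE_SIG_MAGIC = b"~Module signature appended~\n"


def is_signed(path):
    """Return True if the module file ends with the signature marker."""
    with open(path, "rb") as f:
        return f.read().endswith(MODULE_SIG_MAGIC)


def find_unsigned_modules(root, staging_allowlist=()):
    """List unsigned .ko files under `root`, as paths relative to `root`.

    Staging modules are expected to be unsigned and are skipped, unless
    their file name appears in `staging_allowlist` (hypothetical stand-in
    for the real whitelist), in which case they must be signed too.
    """
    unexpected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".ko"):
                continue
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            in_staging = "staging" in rel.split(os.sep)
            if in_staging and name not in staging_allowlist:
                continue  # unsigned staging modules are the expected case
            if not is_signed(full):
                unexpected.append(rel)
    return sorted(unexpected)
```

An empty result would indicate that every module that should be signed is signed; any path returned is a module that secure boot systems would refuse to load.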

Regression Potential: I can think of two possible regression situations. We could regress to the behavior prior to the fix for bug 1850234, or the eoan 5.3 kernel could also end up with all modules unsigned. I've done test builds of both the eoan 5.3 kernel and the bionic 5.3 hwe-edge kernel with this patch and checked that the results are as intended. We should also check this again once new kernels have been built, before copying them out to -proposed.

---

The latest hwe-edge kernel 5.3.0-23.25 fails to boot with the message that it cannot find the UUID associated with the root partition. The user gets dropped to a busybox shell with an initramfs prompt. The standard hwe kernel does not have this issue and the last hwe-edge kernel that does work is 5.3.0-19.20.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-generic-hwe-18.04-edge 5.3.0.23.90
ProcVersionSignature: Ubuntu 5.3.0-19.20~18.04.2-generic 5.3.1
Uname: Linux 5.3.0-19-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Thu Nov 14 08:19:57 2019
EcryptfsInUse: Yes
InstallationDate: Installed on 2019-09-01 (73 days ago)
InstallationMedia: Ubuntu 18.04.3 LTS "Bionic Beaver" - Release amd64 (20190805)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-meta-hwe-edge
UpgradeStatus: No upgrade log present (probably fresh install)

Harm van Bakel (hvbakel) on 2019-11-14
summary: - hwe-edge kernel 5.3.0-32.25 kernel does not boot on Precision 5720 AIO
+ hwe-edge kernel 5.3.0-23.25 kernel does not boot on Precision 5720 AIO
description: updated
il_santo (ivanhoe-p) wrote :

Same problem for me on a Lenovo ideapad 320-15ABR since 5.3.0-22.24.
All kernel versions up to 5.3.0-19.20 work fine.

I inspected /proc/modules and /dev after being dropped to busybox:
- /proc/modules is empty
- neither hard disks nor partitions are present under /dev

John Doe (somerandomjohndoe) wrote :

I'm also getting the same issue on an ASUS VivoBook K451LB (x86_64, UEFI boot). 5.3.0-19.20 is the last kernel that boots on this system; all subsequent ones fail and drop to the initramfs prompt.

Unpacking /boot/initrd.img-5.3.0-19-generic and /boot/initrd.img-5.3.0-23-generic (using unmkinitramfs) and performing a diff doesn't appear to yield much, with only one file missing (an RTC driver that was intentionally removed).

However, I did notice that the kernel modules were just slightly smaller in the 5.3.0-23 initrd, and that eventually led me to this:

$ modinfo '/run/shm/initrd.img-5.3.0-19-generic/main/lib/modules/5.3.0-19-generic/kernel/drivers/ata/libahci.ko'
filename: /run/shm/initrd.img-5.3.0-19-generic/main/lib/modules/5.3.0-19-generic/kernel/drivers/ata/libahci.ko
license: GPL
description: Common AHCI SATA low-level routines
author: Jeff Garzik
srcversion: 20FB7D717055C5AA0AF9896
depends:
retpoline: Y
intree: Y
name: libahci
vermagic: 5.3.0-19-generic SMP mod_unload
signat: PKCS#7
signer:
sig_key:
sig_hashalgo: md4
parm: skip_host_reset:skip global host reset (0=don't skip, 1=skip) (int)
parm: ignore_sss:Ignore staggered spinup flag (0=don't ignore, 1=ignore) (int)
parm: ahci_em_messages:AHCI Enclosure Management Message control (0 = off, 1 = on) (bool)
parm: devslp_idle_timeout:device sleep idle timeout (int)

$ modinfo '/run/shm/initrd.img-5.3.0-23-generic/main/lib/modules/5.3.0-23-generic/kernel/drivers/ata/libahci.ko'
filename: /run/shm/initrd.img-5.3.0-23-generic/main/lib/modules/5.3.0-23-generic/kernel/drivers/ata/libahci.ko
license: GPL
description: Common AHCI SATA low-level routines
author: Jeff Garzik
srcversion: EF173B0C134561F058A30A9
depends:
retpoline: Y
intree: Y
name: libahci
vermagic: 5.3.0-23-generic SMP mod_unload
parm: skip_host_reset:skip global host reset (0=don't skip, 1=skip) (int)
parm: ignore_sss:Ignore staggered spinup flag (0=don't ignore, 1=ignore) (int)
parm: ahci_em_messages:AHCI Enclosure Management Message control (0 = off, 1 = on) (bool)
parm: devslp_idle_timeout:device sleep idle timeout (int)

Notice that the output of modinfo for the 5.3.0-19 module contains the following lines (which are missing in the 5.3.0-23 modinfo output):
signat: PKCS#7
signer:
sig_key:
sig_hashalgo: md4

My hypothesis is that the newer kernel modules being unsigned is the root cause behind these boot failures, since I currently have Secure Boot enabled. I'll attempt a boot with Secure Boot turned off and report back again.
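
The comparison above can also be done mechanically. The following is a small Python sketch that takes two captured modinfo outputs as text and reports which signature-related fields disappeared; the function names are hypothetical, and it assumes the signature fields all start with "sig", as they do in the output shown. Note that the result depends on the installed modinfo actually printing signature fields.

```python
def modinfo_fields(output):
    """Return the set of field names found in `modinfo` text output."""
    fields = set()
    for line in output.splitlines():
        # Split at the first colon only, so values such as
        # "parm: name:description" keep the field name "parm".
        name, sep, _value = line.partition(":")
        if sep and name.strip():
            fields.add(name.strip())
    return fields


def missing_signature_fields(old_output, new_output):
    """Signature fields present in `old_output` but absent from `new_output`."""
    gone = modinfo_fields(old_output) - modinfo_fields(new_output)
    return sorted(f for f in gone if f.startswith("sig"))
```

Fed the two libahci.ko outputs above, this would be expected to report the four signature fields and nothing else.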

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-meta-hwe-edge (Ubuntu):
status: New → Confirmed
John Doe (somerandomjohndoe) wrote :

I can confirm that disabling Secure Boot does indeed allow the system to boot normally.

$ uname -srvp
Linux 5.3.0-23-generic #25~18.04.1-Ubuntu SMP Tue Nov 12 10:58:57 UTC 2019 x86_64

$ sudo od -An -t u1 /sys/firmware/efi/vars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c/data
   0

$ inxi -v 1
System: Host: HOSTNAME Kernel: 5.3.0-23-generic x86_64 bits: 64 Desktop: MATE 1.20.1
           Distro: Ubuntu 18.04.3 LTS
CPU: Dual core Intel Core i5-4200U (-MT-MCP-) speed/max: 1044/2600 MHz
Graphics: Card-1: Intel Haswell-ULT Integrated Graphics Controller
           Card-2: NVIDIA GK208M [GeForce GT 740M]
           Display Server: x11 (X.Org 1.20.4 ) drivers: modesetting (unloaded: fbdev,vesa)
           Resolution: 1366x768@60.00hz
           OpenGL: renderer: Mesa DRI Intel Haswell Mobile version: 4.5 Mesa 19.2.1
Drives: HDD Total Size: 250.1GB (5.5% used)
Info: Processes: 208 Uptime: 2 min Memory: 487.5/7845.5MB Client: Shell (bash) inxi: 2.3.56

As an additional data point, the output of dmesg when dropped to the initramfs prompt reveals:
...
Run /init as init process
Lockdown: systemd-udevd: Loading of unsigned module is restricted; see man kernel_lockdown.7
...

FWIW, 5.0.0-36-generic (i.e. the newest 18.04 HWE, non-edge kernel) is *not* affected by this issue and boots normally with Secure Boot enabled.
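
For anyone wanting to script the Secure Boot check shown above, here is a minimal Python sketch of how the variable's bytes can be interpreted. The function name and the has_attribute_prefix parameter are hypothetical. It assumes the usual UEFI convention that the first data byte is 1 when Secure Boot is enabled and 0 when it is disabled, and that files under the newer efivarfs interface carry four attribute bytes before the data, whereas the legacy "data" file used above holds the value alone.

```python
def secure_boot_enabled(data, has_attribute_prefix=False):
    """Interpret the raw bytes of the SecureBoot EFI variable.

    Set `has_attribute_prefix` to True for bytes read from an efivarfs
    file, which places a 4-byte attribute word before the value.
    """
    if has_attribute_prefix:
        data = data[4:]
    if not data:
        raise ValueError("SecureBoot variable has no data")
    return data[0] == 1


# Possible usage with the legacy sysfs path from the comment above:
# path = ("/sys/firmware/efi/vars/"
#         "SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c/data")
# with open(path, "rb") as f:
#     print(secure_boot_enabled(f.read()))
```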

Lynn (griffin-ld) wrote :

Same problem with Dell Precision 7540

Unfortunately, disabling secure boot is not an option.

Seth Forshee (sforshee) wrote :

The problem is that during the build we're using modinfo to determine if a module is signed before adding a .gnu_debuglink section, and if it is we then re-sign the module after adding the section. This works fine in eoan, but in bionic it appears that modinfo doesn't know about module signatures and so none of the resulting modules end up signed.

I'll get this fixed.
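
To illustrate what "signed" means at the file level for the build step described above, here is a rough Python sketch that separates a module's payload from its appended signature. It is not the build code. It assumes the trailer layout from the kernel's module_signature.h: signature data, then a 12-byte struct module_signature (five one-byte fields, three padding bytes, a big-endian 32-bit sig_len), then the marker string. The function name split_signed_module is hypothetical.

```python
import struct

MODULE_SIG_MAGIC = b"~Module signature appended~\n"
# struct module_signature: algo, hash, id_type, signer_len, key_id_len,
# 3 padding bytes, then sig_len as a big-endian u32 (12 bytes in total).
SIG_INFO = struct.Struct(">5B3xI")


def split_signed_module(data):
    """Split module bytes into (payload, signature).

    `signature` is None when the marker is absent, i.e. the module is unsigned.
    """
    if not data.endswith(MODULE_SIG_MAGIC):
        return data, None
    info_end = len(data) - len(MODULE_SIG_MAGIC)
    info_start = info_end - SIG_INFO.size
    if info_start < 0:
        raise ValueError("truncated signature trailer")
    *_ids, sig_len = SIG_INFO.unpack(data[info_start:info_end])
    sig_start = info_start - sig_len
    if sig_start < 0:
        raise ValueError("signature length exceeds file size")
    return data[:sig_start], data[sig_start:info_start]
```

A step that modifies the ELF payload, such as adding a section, would invalidate any existing signature, which is why a module has to be re-signed afterwards.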

affects: linux-meta-hwe-edge (Ubuntu) → linux-hwe-edge (Ubuntu)
Changed in linux-hwe-edge (Ubuntu):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → Critical
Changed in linux-hwe-edge (Ubuntu Eoan):
status: New → Invalid
Changed in linux (Ubuntu Bionic):
status: New → Invalid
Changed in linux-hwe-edge (Ubuntu Bionic):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → Critical
status: New → In Progress
Changed in linux-hwe-edge (Ubuntu):
status: Confirmed → Invalid
Changed in linux (Ubuntu):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → Critical
status: New → In Progress
Changed in linux (Ubuntu Eoan):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → Critical
status: New → In Progress
Seth Forshee (sforshee) on 2019-11-18
description: updated
Changed in linux (Ubuntu Eoan):
status: In Progress → Fix Committed
Changed in linux-hwe-edge (Ubuntu Bionic):
status: In Progress → Fix Committed
Harm van Bakel (hvbakel) wrote :

I gave 5.3.0-23.25~18.04.2 from the canonical-kernel-team PPA a try, but it looks like the signing issue still persists.

Adil Hussain (ao7) wrote :

I got hit too...

This is a Precision 5520; had to revert kernel to 5.3.0-19-generic to boot system.

I see segfaults in /dev/mapper/control and /dev/pts/ptmx when I'm dropped to initramfs shell.
My /etc/fstab is empty. /dev doesn't contain any device files of my hard drive.

Adil Hussain (ao7) wrote :

missed adding another comment...

I don't have hwe-edge installed.

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Harm van Bakel (hvbakel) on 2019-11-25
tags: added: verification-failed-bionic
removed: verification-needed-bionic

Hi Harm van Bakel,

Can you please try the same version of the linux-hwe-edge (5.3.0-23.25~18.04.2) but installing it from the -proposed pocket? The build in the canonical-kernel-team ppa doesn't have the complete signing bits, so the binaries are not the same.

Thank you.

tags: added: verification-needed-bionic
removed: verification-failed-bionic
Harm van Bakel (hvbakel) wrote :

Ah, my apologies. After reinstalling from the -proposed pocket the 5.3.0-23.25~18.04.2 kernel is indeed booting properly.

Hello, I have an HP x360.
Same situation here: the laptop does not boot with kernel 5.3.0-23 and fails with an error.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-hwe-edge - 5.3.0-23.25~18.04.2

---------------
linux-hwe-edge (5.3.0-23.25~18.04.2) bionic; urgency=medium

  * bionic/linux-hwe-edge: 5.3.0-23.25~18.04.2 -proposed tracker (LP: #1853459)

  * hwe-edge kernel 5.3.0-23.25 kernel does not boot on Precision 5720 AIO
    (LP: #1852581)
    - [Packaging] Fix module signing with older modinfo

 -- Kleber Sacilotto de Souza <email address hidden> Thu, 21 Nov 2019 15:36:45 +0100

Changed in linux-hwe-edge (Ubuntu Bionic):
status: Fix Committed → Fix Released
il_santo (ivanhoe-p) wrote :

Well, 5.3.0-23 does boot now. But it doesn't shut down, which is even worse than not booting at all... I tried three times; every time it got stuck during the shutdown process, unable to kill process #1 (systemd), leaving the CPU stalled on that process. I had to long-press the power button and kill my PC the hard way. Since I really don't like the idea of ending up with a corrupted filesystem, I stopped trying to boot kernel 5.3.0-23. I will stay on 5.3.0-19 until a new working kernel is released. BTW, two major bugs in the same kernel is not that nice from a QA perspective (OK, it's an "edge" kernel, but anyway...).

il_santo (ivanhoe-p) wrote :

When I try to shut down my PC with kernel 5.3.0-23 I get:

[54592.092003] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! (systemd:1)
[54620.092003] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! (systemd:1)
[54648.092003] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! (systemd:1)
[54676.092003] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! (systemd:1)
[54704.092003] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! (systemd:1)
[54732.092003] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! (systemd:1)
[54746.232003] INFO: rcu_sched self-detected stall on CPU

This goes on forever and my PC never shuts down.

Harm van Bakel (hvbakel) wrote :

@il_santo: you should probably submit a separate bug report for this issue as it is likely unrelated to the module signing issue that was addressed here.

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-eoan' to 'verification-done-eoan'. If the problem still exists, change the tag 'verification-needed-eoan' to 'verification-failed-eoan'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-eoan
Bertrand Presles (bpresles) wrote :

On an HP EliteBook 840 G6 I had the same issue with kernels 5.3.0-20 through 5.3.0-23.25~18.04.1, which was solved with 5.3.0-23.25~18.04.2.

However, a later update, version 5.3.0-24.26~19.04.2, broke the boot again with a kernel panic. I couldn't find the log of the kernel panic, so sadly I can't provide it right now (I've looked in /var/log/syslog, /var/log/kern.log and /var/log/boot.log).

Bertrand Presles (bpresles) wrote :

I can confirm that the proposed version 5.3.0-25.27~18.04.1 of the kernel works fine (no KP, boot OK, shutdown OK, everything works).

Thank you :)

Khaled El Mously (kmously) wrote :

Marked as verified based on comments #22 and #14

tags: added: verification-done-eoan
removed: verification-needed-eoan
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.3.0-26.28

---------------
linux (5.3.0-26.28) eoan; urgency=medium

  * eoan/linux: 5.3.0-26.28 -proposed tracker (LP: #1856807)

  * nvidia-435 is in eoan, linux-restricted-modules only builds against 430,
    ubiquity gives me the self-signed modules experience instead of using the
    Canonical-signed modules (LP: #1856407)
    - Add nvidia-435 dkms build

linux (5.3.0-25.27) eoan; urgency=medium

  * eoan/linux: 5.3.0-25.27 -proposed tracker (LP: #1854762)

  * CVE-2019-14901
    - SAUCE: mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()

  * CVE-2019-14896 // CVE-2019-14897
    - SAUCE: libertas: Fix two buffer overflows at parsing bss descriptor

  * CVE-2019-14895
    - SAUCE: mwifiex: fix possible heap overflow in mwifiex_process_country_ie()

  * [CML] New device id's for CMP-H (LP: #1846335)
    - mmc: sdhci-pci: Add another Id for Intel CML
    - i2c: i801: Add support for Intel Comet Lake PCH-H
    - mtd: spi-nor: intel-spi: Add support for Intel Comet Lake-H SPI serial flash
    - mfd: intel-lpss: Add Intel Comet Lake PCH-H PCI IDs

  * i915: Display flickers (monitor loses signal briefly) during "flickerfree"
    boot, while showing the BIOS logo on a black background (LP: #1836858)
    - [Config] FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER=y

  * Please add patch fixing RK818 ID detection (LP: #1853192)
    - SAUCE: mfd: rk808: Fix RK818 ID template

  * Kernel build log filled with "/bin/bash: line 5: warning: command
    substitution: ignored null byte in input" (LP: #1853843)
    - [Debian] Fix warnings when checking for modules signatures

  * Lenovo dock MAC Address pass through doesn't work in Ubuntu (LP: #1827961)
    - r8152: Add macpassthru support for ThinkPad Thunderbolt 3 Dock Gen 2

  * Dell XPS 13 9350/9360 headphone audio hiss (LP: #1654448) // [XPS 13 9360,
    Realtek ALC3246, Black Headphone Out, Front] High noise floor (LP: #1845810)
    - ALSA: hda/realtek: Reduce the Headphone static noise on XPS 9350/9360

  * no HDMI video output since GDM greeter after linux-oem-osp1 version
    5.0.0-1026 (LP: #1852386)
    - drm/i915: Add new CNL PCH ID seen on a CML platform
    - SAUCE: drm/i915: Fix detection for a CMP-V PCH

  * [broadwell-rt286, playback] Since Linux 5.2rc2 audio playback no longer
    works on Dell Venue 11 Pro 7140 (LP: #1846539)
    - [Config] Drop snd-sof-intel-bdw build
    - SAUCE: ASoC: SOF: Intel: Broadwell: clarify mutual exclusion with legacy
      driver

  * [CML-S62] Need enable turbostat patch support for Comet lake- S 6+2
    (LP: #1847451)
    - SAUCE: tools/power turbostat: Add Cometlake support

  * External microphone can't work on some dell machines with the codec alc256
    or alc236 (LP: #1853791)
    - SAUCE: ALSA: hda/realtek - Move some alc256 pintbls to fallback table
    - SAUCE: ALSA: hda/realtek - Move some alc236 pintbls to fallback table

  * Memory leak in net/xfrm/xfrm_state.c - 8 pages per ipsec connection
    (LP: #1853197)
    - xfrm: Fix memleak on xfrm state destroy

  * CVE-2019-18660: patches for Ubuntu (LP: #1853142) // CVE-2019-18660
    - powerpc/64s: support nospectre_v2 cmdline option
    - powerp...

Changed in linux (Ubuntu Eoan):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-9.12

---------------
linux (5.4.0-9.12) focal; urgency=medium

  * alsa/hda/realtek: the line-out jack doens't work on a dell AIO
    (LP: #1855999)
    - SAUCE: ALSA: hda/realtek - Line-out jack doesn't work on a Dell AIO

  * scsi: hisi_sas: Check sas_port before using it (LP: #1855952)
    - scsi: hisi_sas: Check sas_port before using it

  * CVE-2019-19078
    - ath10k: fix memory leak

  * cifs: DFS Caching feature causing problems traversing multi-tier DFS setups
    (LP: #1854887)
    - cifs: Fix retrieval of DFS referrals in cifs_mount()

  * Support DPCD aux brightness control (LP: #1856134)
    - SAUCE: drm/i915: Fix eDP DPCD aux max backlight calculations
    - SAUCE: drm/i915: Assume 100% brightness when not in DPCD control mode
    - SAUCE: drm/i915: Fix DPCD register order in intel_dp_aux_enable_backlight()
    - SAUCE: drm/i915: Auto detect DPCD backlight support by default
    - SAUCE: drm/i915: Force DPCD backlight mode on X1 Extreme 2nd Gen 4K AMOLED
      panel
    - USUNTU: SAUCE: drm/i915: Force DPCD backlight mode on Dell Precision 4K sku

  * The system cannot resume from S3 if user unplugs the TB16 during suspend
    state (LP: #1849269)
    - PCI: pciehp: Do not disable interrupt twice on suspend
    - PCI: pciehp: Prevent deadlock on disconnect

  * change kconfig of the soundwire bus driver from y to m (LP: #1855685)
    - [Config]: SOUNDWIRE=m

  * alsa/sof: change to use hda hdmi codec driver to make hdmi audio on the
    docking station work (LP: #1855666)
    - ALSA: hda/hdmi - implement mst_no_extra_pcms flag
    - ASoC: hdac_hda: add support for HDMI/DP as a HDA codec
    - ASoC: Intel: skl-hda-dsp-generic: use snd-hda-codec-hdmi
    - ASoC: Intel: skl-hda-dsp-generic: fix include guard name
    - ASoC: SOF: Intel: add support for snd-hda-codec-hdmi
    - ASoC: Intel: bxt-da7219-max98357a: common hdmi codec support
    - ASoC: Intel: glk_rt5682_max98357a: common hdmi codec support
    - ASoC: intel: sof_rt5682: common hdmi codec support
    - ASoC: Intel: bxt_rt298: common hdmi codec support
    - ASoC: SOF: enable sync_write in hdac_bus
    - [config]: SND_SOC_SOF_HDA_COMMON_HDMI_CODEC=y

  * Fix unusable USB hub on Dell TB16 after S3 (LP: #1855312)
    - SAUCE: USB: core: Make port power cycle a seperate helper function
    - SAUCE: USB: core: Attempt power cycle port when it's in eSS.Disabled state

  * Focal update: v5.4.3 upstream stable release (LP: #1856583)
    - rsi: release skb if rsi_prepare_beacon fails
    - arm64: tegra: Fix 'active-low' warning for Jetson TX1 regulator
    - arm64: tegra: Fix 'active-low' warning for Jetson Xavier regulator
    - perf scripts python: exported-sql-viewer.py: Fix use of TRUE with SQLite
    - sparc64: implement ioremap_uc
    - lp: fix sparc64 LPSETTIMEOUT ioctl
    - time: Zero the upper 32-bits in __kernel_timespec on 32-bit
    - mailbox: tegra: Fix superfluous IRQ error message
    - staging/octeon: Use stubs for MIPS && !CAVIUM_OCTEON_SOC
    - usb: gadget: u_serial: add missing port entry locking
    - serial: 8250-mtk: Use platform_get_irq_optional() for optional irq
    - tty: serial: fsl_lpuart: use the sg ...


Changed in linux (Ubuntu):
status: In Progress → Fix Released