5.10 kernel fails to boot with secure boot disabled

Bug #1904906 reported by bugproxy
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
The Ubuntu-power-systems project | Fix Released | High | Ubuntu on IBM Power Systems Bug Triage |
linux (Ubuntu) | Fix Released | High | bugproxy |

Bug Description

Canonical requested secure boot testing for the 5.10 kernel, but the kernel fails to boot even with secure boot disabled.

The 5.10 kernel can be found in:
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/bootstrap

They can be installed by installing the linux-generic-wip package with
this PPA enabled. As usual, they are only signed using a key specific to
that PPA. This key can be retrieved from the signing tarballs for the
kernels, e.g.:

http://ppa.launchpad.net/canonical-kernel-team/bootstrap/ubuntu/dists/hirsute/main/signed/linux-5.10-ppc64el/5.10.0-2.3/signed.tar.gz

Our tester installed the 5.10 kernel via aptitude.
When booting directly from the boot menu, it gets stuck at:
"kexec_core: Starting new kernel"

When booting the recovery kernel for 5.10.0, it proceeds further, past kexec_core, and then fails at:
"
[ 0.029830] LSM: Security Framework initializing
[ 0.029916] Yama: b
"

We made two further attempts with a different scenario: running the 5.8 kernel and booting 5.10 from the command line:
kexec -l /boot/vmlinux-5.10.0-0-generic --initrd=/boot/initrd.img-5.10.0-0-generic --append="root=UUID=49d000cb-dba2-4d70-809e-38f2b31d0f09 ro quiet splash"
kexec -e

Both attempts also failed while rebooting: one with the same error as when booting from the boot menu, the other much earlier.

Which new CONFIG options and/or features were enabled for the 5.10 kernel?

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-189504 severity-medium targetmilestone-inin2010
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Seth Forshee (sforshee) wrote :

I've attached our configs for the Ubuntu 5.8 and 5.10 kernels along with a diff between these configs. I didn't see anything in the diff which looked like an obvious candidate for causing the boot problems. Please let me know if you see something there which should be changed.
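For anyone who wants to repeat this comparison locally, here is a minimal sketch. It assumes two config files in the usual /boot/config-<version> format; the function name `config_changes` and any file paths are illustrative and are not taken from the attachments.

```shell
#!/bin/sh
# Hypothetical helper: print CONFIG_ lines that are new or changed in the
# second config relative to the first. Both "CONFIG_FOO=y" lines and
# "# CONFIG_FOO is not set" lines are compared.
config_changes() {
    old_cfg=$1
    new_cfg=$2
    tmp_old=$(mktemp) || return 1
    tmp_new=$(mktemp) || return 1
    grep -E '^(# )?CONFIG_' "$old_cfg" | LC_ALL=C sort > "$tmp_old"
    grep -E '^(# )?CONFIG_' "$new_cfg" | LC_ALL=C sort > "$tmp_new"
    # comm -13 suppresses lines unique to the first file and lines common
    # to both, leaving only lines unique to the second file.
    LC_ALL=C comm -13 "$tmp_old" "$tmp_new"
    rm -f "$tmp_old" "$tmp_new"
}

# Example (file names are placeholders):
# config_changes config-5.8 config-5.10
```

The kernel source tree also ships scripts/diffconfig, which serves the same purpose with a more compact output format.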

Seth Forshee (sforshee) wrote :

I also note that you are testing with 5.10.0-0, which was based on 5.10-rc1. We have 5.10.0-4 in the ppa now based on -rc4, it is built for hirsute and not groovy though. I'd recommend trying that version to see if there was an upstream bug which has already been fixed.

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-11-24 02:07 EDT-------
Hi,

Ok, I have been experimenting with the hirsute kernel: 5.10.0-4-generic #5-Ubuntu

It boots without issue on a pseries guest with secure boot both on and off. However, this doesn't exercise booting out of kexec.

Booting on a machine with no OS secure-boot support also fails coming out of kexec, so something is going wrong with kexec. I will see if this can be replicated under qemu and try a bisect over the next couple of days.

In the meantime, here's the log of a boot with `earlyprintk` and without `quiet`.

Kind regards,
Daniel

[ 182.160030] kexec_core: Starting new kernel
[ 0.000000] dt-cpu-ftrs: setup for ISA 3000
[ 0.000000] dt-cpu-ftrs: final cpu/mmu features = 0x0001f86b8f5fb1a7 0x3c007041
[ 0.000000] radix-mmu: Page sizes from device-tree:
[ 0.000000] radix-mmu: Page size shift = 12 AP=0x0
[ 0.000000] radix-mmu: Page size shift = 16 AP=0x5
[ 0.000000] radix-mmu: Page size shift = 21 AP=0x1
[ 0.000000] radix-mmu: Page size shift = 30 AP=0x2
[ 0.000000] radix-mmu: Activating Kernel Userspace Execution Prevention
[ 0.000000] radix-mmu: Activating Kernel Userspace Access Prevention
[ 0.000000] radix-mmu: Mapped 0x0000000000000000-0x0000000040000000 with 1.00 GiB pages (exec)
[ 0.000000] radix-mmu: Mapped 0x0000000040000000-0x0000000800000000 with 1.00 GiB pages
[ 0.000000] radix-mmu: Mapped 0x0000200000000000-0x0000200800000000 with 1.00 GiB pages
[ 0.000000] radix-mmu: Initializing Radix MMU
[ 0.000000] Linux version 5.10.0-4-generic (buildd@bos02-ppc64el-008) (gcc (Ubuntu 10.2.0-17ubuntu1) 10.2.0, GNU ld (GNU Binutils for Ubuntu) 2.35.1) #5-Ubuntu SMP Mon Nov 16 09:41:59 UTC 2020 (Ubuntu 5.10.0-4.5-generic 5.10.0-rc4)
[ 0.000000] Secure boot mode disabled
[ 0.000000] Found initrd at 0xc000000004900000:0xc0000000066f0cf4
[ 0.000000] OPAL: Found memory mapped LPC bus on chip 0
[ 0.000000] Using PowerNV machine description
[ 0.000000] printk: bootconsole [udbg0] enabled
[ 0.000000] CPU maps initialized for 4 threads per core
[ 0.000000] -----------------------------------------------------
[ 0.000000] phys_mem_size = 0x1000000000
[ 0.000000] dcache_bsize = 0x80
[ 0.000000] icache_bsize = 0x80
[ 0.000000] cpu_features = 0x0001f86b8f5fb1a7
[ 0.000000] possible = 0x000ffbfbcf5fb1a7
[ 0.000000] always = 0x00000003800081a1
[ 0.000000] cpu_user_features = 0xdc0065c2 0xaef00000
[ 0.000000] mmu_features = 0xbc007441
[ 0.000000] firmware_features = 0x0000000110000000
[ 0.000000] vmalloc start = 0xc008000000000000
[ 0.000000] IO start = 0xc00a000000000000
[ 0.000000] vmemmap start = 0xc00c000000000000
[ 0.000000] -----------------------------------------------------
[ 0.000000] kvm_cma_reserve: reserving 3276 MiB for global area
[ 0.000000] cma: Reserved 3280 MiB at 0x000020072f000000
[ 0.000000] numa: NODE_DATA [mem 0x7ffd44900-0x7ffd4bfff]
[ 0.000000] numa: NODE_DATA [mem 0x2007ff438900-0x2007ff43ffff]
[ 0.000000] rfi-flush: mttrig type flush available
[ 0.000000] count-cache-flush: fl...

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-11-25 00:34 EDT-------
Hi,

Looks like it fails to boot on a p9 qemu/kvm guest even out of grub: hangs trying to bring up SMP. That's probably what we saw in bare-metal too, the console probably just didn't catch up.

I will continue investigating, but I'm not sure what kernel tree you're using: git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/hirsute has something based on linux-5.8. What tree are you building from?

Kind regards,
Daniel

Loading Linux 5.10.0-4-generic ...
Loading initial ramdisk ...
OF stdout device is: /vdevice/vty@30000000
Preparing to boot Linux version 5.10.0-4-generic (buildd@bos02-ppc64el-008) (gcc (Ubuntu 10.2.0-17ubuntu1) 10.2.0, GNU ld (GNU Binutils for Ubuntu) 2.35.1) #5-Ubuntu SMP Mon Nov 16 09:41:59 UTC 2020 (Ubuntu 5.10.0-4.5-generic 5.10.0-rc4)
Detected machine type: 0000000000000101
command line: BOOT_IMAGE=/boot/vmlinux-5.10.0-4-generic root=UUID=19b72275-8385-4e0e-8001-62baacf410e3 ro console=hvc0 earlyprintk xmon=rw
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
memory_limit : 0000000000000000 (16 MB aligned)
alloc_bottom : 0000000006570000
alloc_top : 0000000010000000
alloc_top_hi : 0000000400000000
rmo_top : 0000000010000000
ram_top : 0000000400000000
instantiating rtas at 0x000000000daf0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000006580000 -> 0x0000000006580b32
Device tree struct 0x0000000006590000 -> 0x00000000065a0000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000002000000 ...
[ 0.000000] radix-mmu: Page sizes from device-tree:
[ 0.000000] radix-mmu: Page size shift = 12 AP=0x0
[ 0.000000] radix-mmu: Page size shift = 16 AP=0x5
[ 0.000000] radix-mmu: Page size shift = 21 AP=0x1
[ 0.000000] radix-mmu: Page size shift = 30 AP=0x2
[ 0.000000] radix-mmu: Activating Kernel Userspace Execution Prevention
[ 0.000000] radix-mmu: Activating Kernel Userspace Access Prevention
[ 0.000000] radix-mmu: Mapped 0x0000000000000000-0x0000000002000000 with 2.00 MiB pages (exec)
[ 0.000000] radix-mmu: Mapped 0x0000000002000000-0x0000000400000000 with 2.00 MiB pages
[ 0.000000] lpar: Using radix MMU under hypervisor
[ 0.000000] Linux version 5.10.0-4-generic (buildd@bos02-ppc64el-008) (gcc (Ubuntu 10.2.0-17ubuntu1) 10.2.0, GNU ld (GNU Binutils for Ubuntu) 2.35.1) #5-Ubuntu SMP Mon Nov 16 09:41:59 UTC 2020 (Ubuntu 5.10.0-4.5-generic 5.10.0-rc4)
[ 0.000000] Secure boot mode disabled
[ 0.000000] Found initrd at 0xc000000004700000:0xc00000000656fbfa
[ 0.000000] Using pSeries machine description
[ 0.000000] printk: bootconsole [udbg0] enabled
[ 0.000000] Partition configured for 24 cpus.
[ 0.000000] CPU maps initialized for 1 thread per core
[ 0.000000] -----------------------------------------------------
[ 0.000000] phys_mem_size = 0x400000000
[ 0.000000] dcache_bsize = 0x80
[ 0.000000] icache_bsize = 0x80
[ 0.000000] cpu_features = 0x0001c07b8f4f91a7
[ 0.000000]...

Frank Heimes (fheimes) wrote :

Hi Daniel, yes, 5.8 is - as of today - still the current kernel in hirsute (main and proposed).
But the kernel team plans to migrate to 5.10 soon, hence the effort (a prerequisite for the migration) to test secureboot lock-down upfront. This is to make sure that potential issues in the secureboot lock-down area will not affect any (production) keys.
The keys used for signing are bound to the archive where the kernel is built and published, and the PPA has a separate one.
Inside of the PPA you will also find the 5.10 source code as tar.gz file.

If you follow this PPA link:
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/bootstrap/+packages
and click on the kernel that was used for the testing (and open the twistie or section), or at least on a kernel that is close enough (since these kernels get updated pretty frequently), like:
"linux-5.10 - 5.10.0-0.1",
you will find the sources in the (I think only) tar.gz file that is listed there, like:
linux-5.10_5.10.0-0.1.tar.gz (176.3 MiB)
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/bootstrap/+sourcefiles/linux-5.10/5.10.0-0.1/linux-5.10_5.10.0-0.1.tar.gz

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-11-25 06:47 EDT-------
Hi,

Thanks, I'll look at the sources tarball, hopefully tomorrow. (I'm in AU, so no Thanksgiving here!)

Have you tested this on any of your local systems? I can't get it to work much on P9, even on stock hardware/qemu without any secure-boot features. Indeed, it even fails on qemu TCG (so you don't actually need a Power system at all!):

qemu-system-ppc64 -M pseries -m 1G -nographic -vga none -smp 4 -cpu power9 -kernel dbg/usr/lib/debug/boot/vmlinux-5.10.0-4-generic

Actually, the failure matrix is really interesting:

Power8 host + KVM + grub -> boots
Power9 host bare metal (kexec) -> fails
Power9 host + KVM + grub -> fails
Power9 host + KVM + qemu -kernel -> boots
qemu TCG + power9 cpu -> fails
qemu TCG + power8 cpu -> fails

I'm assuming the tarball includes the debian/patches directory, in which case it should be easy to apply and git bisect.

Kind regards,
Daniel

(IBMers: is there someone outside the security team that we should pull in? It doesn't seem at all to be a security-related issue.)

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-11-25 21:40 EDT-------
Ok, so sadly I cannot find a tarball with the patches not already applied, which is very frustrating. However, it turns out I don't need that because, as it turns out...

Doing an upstream checkout of v5.10-rc4 and building with config-5.10.0-4-generic also fails to boot under qemu. Previously I hadn't tested upstream with a Canonical config, so I thought it was an Ubuntu-specific bug, which clearly it is not. Apologies about that.

It also affects 5.10-rc5 and powerpc/fixes, so I'm trying to get some eyes on it internally.

Kind regards,
Daniel

Frank Heimes (fheimes) wrote :

Thx for the update, Daniel

Changed in ubuntu-power-systems:
assignee: Canonical Kernel Team (canonical-kernel-team) → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → bugproxy (bugproxy)
Daniel Axtens (daxtens) wrote :

I cannot yet explain this, but after bisecting the config, I can repro this with pseries_le_defconfig + CONFIG_RCU_SCALE_TEST=m
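A rough sketch of how that reproducer config could be regenerated follows. It assumes a checkout of the upstream kernel source and a powerpc toolchain (set CROSS_COMPILE when building on another architecture); the function name `make_repro_config` is made up for illustration.

```shell
#!/bin/sh
# Sketch only: rebuild the reproducer config (pseries_le_defconfig plus
# CONFIG_RCU_SCALE_TEST=m) inside a kernel source tree.
make_repro_config() {
    tree=$1
    # scripts/config is the kernel's own helper for editing .config.
    if [ ! -x "$tree/scripts/config" ]; then
        echo "not a kernel source tree: $tree" >&2
        return 1
    fi
    (
        cd "$tree" || exit 1
        make ARCH=powerpc pseries_le_defconfig &&
        ./scripts/config --module RCU_SCALE_TEST &&
        # Let kconfig resolve any dependencies of the changed option.
        make ARCH=powerpc olddefconfig
    )
}

# Example (the path is a placeholder):
# make_repro_config "$HOME/src/linux"
```

The resulting .config can then be built and booted with the qemu command line from the earlier comment.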

That's weird to me, and I'll continue to investigate.

Seth Forshee (sforshee) wrote :

That config shouldn't be on at all, it's just a performance test that isn't likely to be of interest to users. I agree it's weird though -- it shouldn't cause boot problems, as it shouldn't be loaded automatically (the module doesn't even have any modaliases). Regardless, I will disable the option.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-12-01 23:08 EDT-------
No worries, let me know if you'd like me to test another spin.

We continue to look into the issue upstream: https://<email address hidden>/

Changed in ubuntu-power-systems:
importance: Undecided → High
Frank Heimes (fheimes)
Changed in linux (Ubuntu):
importance: Undecided → High
Seth Forshee (sforshee) wrote :

Our latest 5.10 update is building with CONFIG_RCU_SCALE_TEST disabled; I will post an update once the build is complete.

Seth Forshee (sforshee) wrote :

The 5.10.0-7.8 build with CONFIG_RCU_SCALE_TEST disabled has completed and is available in https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/bootstrap as before.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-12-10 08:44 EDT-------
Hi,

We've had some good progress with debugging upstream: https://<email address hidden>/t/#u fixes the issue properly.

Would you prefer to take that and leave the config unchanged? It'll almost certainly end up in stable trees soon anyway...

Kind regards,
Daniel

Seth Forshee (sforshee) wrote :

That's not the sort of config we usually turn on, as it's more of a developer feature than something end users are interested in. So I don't see any need to turn the option back on. We can go ahead and grab the patch though.

In any case I'd like to get some secure boot testing with the current build, as we'd like to try and get a 5.10 build into hirsute-proposed next week.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-12-14 19:40 EDT-------
Hi,

Thanks for your patience.

I tested 5.10.0-8-generic. /boot/config-5.10.0-8-generic contains:
# CONFIG_RCU_SCALE_TEST is not set
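The same check can be scripted. The sketch below assumes the Ubuntu /boot/config-<version> layout; `config_state` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the state of one option in a kernel config file.
# Prints the value (y, m, a number or a string), "not set" when the option is
# explicitly disabled, or "absent" when the option is not mentioned at all.
config_state() {
    cfg=$1
    opt=$2
    if line=$(grep -E "^${opt}=" "$cfg"); then
        # Strip everything up to and including the first "=".
        printf '%s\n' "${line#*=}"
    elif grep -q "^# ${opt} is not set\$" "$cfg"; then
        echo "not set"
    else
        echo "absent"
    fi
}

# Example on a running system:
# config_state "/boot/config-$(uname -r)" CONFIG_RCU_SCALE_TEST
```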

It boots fine in a P9 kvm guest, both when loaded by kexec and when loaded by grub. There is no secure-boot in these tests.

Please let me know if you need anything else.

Kind regards,
Daniel

Frank Heimes (fheimes) wrote :

Hi Daniel, thx for testing the special kernel build and verifying that it now boots again on KVM.
Since the initial request from the kernel team was to verify that secureboot and lockdown works with 5.10 on ppc64el, it would be important to test that, too - maybe you or Nayna may try secureboot lock-down testing as well?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-12-17 23:45 EDT-------
Squeezing in right before the end of the year! I tested this with my pseries secure boot setup. I built the key from the PPA into grub and signed grub with the testing key which I built into SLOF.

I was then able to boot 5.10.0-9-generic in secure boot mode under P8 KVM.

The kernel correctly detected secure boot mode and entered lockdown:

[ 0.000000] Secure boot mode enabled
[ 0.000000] Kernel is locked down from PowerNV Secure Boot mode; see man kernel_lockdown.7

(The text is a bit of a misnomer, but that's of no consequence.)

Lockdown appears to work as expected; I can't open /dev/mem, for example.
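For a scripted check, the lockdown state can also be read from securityfs. This is only a sketch: it assumes securityfs is mounted at /sys/kernel/security and that the file uses the upstream format, where the active mode is shown in square brackets. `active_lockdown_mode` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: extract the bracketed (active) mode from lockdown-style
# text such as "none [integrity] confidentiality".
active_lockdown_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# On a running system (assumption: securityfs is mounted, lockdown LSM enabled):
# active_lockdown_mode "$(cat /sys/kernel/security/lockdown)"
```

A result other than "none" indicates that the kernel is locked down.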

Given LP: #1903288 / BZ 189099, I didn't test kexec.

In summary, I don't see anything from booting with secure boot on or off that would prevent you promoting 5.10 for hirsute.

Enjoy your end of year break!
Kind regards,
Daniel

Frank Heimes (fheimes)
Changed in linux (Ubuntu):
status: New → Fix Committed
Changed in ubuntu-power-systems:
status: New → Fix Committed
Frank Heimes (fheimes) wrote :

Many thx, Daniel, for squeezing in this additional secureboot lock-down test.
Glad that it worked so far. Wishing you an enjoyable break, too!

Andrew Cloke (andrew-cloke) wrote :

Waiting to close as "Fix Released" once the 5.10 kernel has landed in hirsute. The 5.10 kernel is currently in hirsute-proposed.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2021-02-16 20:13 EDT-------
I retested this with my pseries secure boot setup. I built the key from the PPA into grub and signed grub with the testing key which I built into SLOF.

I was then able to boot 5.11.0-9-generic in secure boot mode and without secure boot under P8 KVM.

The kernel correctly detected secure boot mode and entered lockdown.

Lockdown appears to work as expected; I can't open /dev/mem, for example.

In summary, I don't see anything from booting with secure boot on or off that would prevent you promoting 5.11 for hirsute.

Kind regards,
Daniel

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.10.0-14.15

---------------
linux (5.10.0-14.15) hirsute; urgency=medium

  * hirsute/linux: 5.10.0-14.15 -proposed tracker (LP: #1913724)

  * Restore palm ejection on multi-input devices (LP: #1913520)
    - HID: multitouch: Apply MT_QUIRK_CONFIDENCE quirk for multi-input devices

  * intel-hid is not loaded on new Intel platform (LP: #1907160)
    - platform/x86: intel-hid: add Rocket Lake ACPI device ID

  * Hirsute update: v5.10.11 upstream stable release (LP: #1913430)
    - scsi: target: tcmu: Fix use-after-free of se_cmd->priv
    - mtd: rawnand: gpmi: fix dst bit offset when extracting raw payload
    - mtd: rawnand: nandsim: Fix the logic when selecting Hamming soft ECC engine
    - i2c: tegra: Wait for config load atomically while in ISR
    - i2c: bpmp-tegra: Ignore unknown I2C_M flags
    - platform/x86: ideapad-laptop: Disable touchpad_switch for ELAN0634
    - ALSA: seq: oss: Fix missing error check in snd_seq_oss_synth_make_info()
    - ALSA: hda/realtek - Limit int mic boost on Acer Aspire E5-575T
    - ALSA: hda/via: Add minimum mute flag
    - crypto: xor - Fix divide error in do_xor_speed()
    - dm crypt: fix copy and paste bug in crypt_alloc_req_aead
    - ACPI: scan: Make acpi_bus_get_device() clear return pointer on error
    - btrfs: don't get an EINTR during drop_snapshot for reloc
    - btrfs: do not double free backref nodes on error
    - btrfs: fix lockdep splat in btrfs_recover_relocation
    - btrfs: don't clear ret in btrfs_start_dirty_block_groups
    - btrfs: send: fix invalid clone operations when cloning from the same file
      and root
    - fs: fix lazytime expiration handling in __writeback_single_inode()
    - pinctrl: ingenic: Fix JZ4760 support
    - mmc: core: don't initialize block size from ext_csd if not present
    - mmc: sdhci-of-dwcmshc: fix rpmb access
    - mmc: sdhci-xenon: fix 1.8v regulator stabilization
    - mmc: sdhci-brcmstb: Fix mmc timeout errors on S5 suspend
    - dm: avoid filesystem lookup in dm_get_dev_t()
    - dm integrity: fix a crash if "recalculate" used without "internal_hash"
    - dm integrity: conditionally disable "recalculate" feature
    - drm/atomic: put state on error path
    - drm/syncobj: Fix use-after-free
    - drm/amdgpu: remove gpu info firmware of green sardine
    - drm/amd/display: DCN2X Find Secondary Pipe properly in MPO + ODM Case
    - drm/i915/gt: Prevent use of engine->wa_ctx after error
    - drm/i915: Check for rq->hwsp validity after acquiring RCU lock
    - ASoC: Intel: haswell: Add missing pm_ops
    - ASoC: rt711: mutex between calibration and power state changes
    - SUNRPC: Handle TCP socket sends with kernel_sendpage() again
    - HID: sony: select CONFIG_CRC32
    - dm integrity: select CRYPTO_SKCIPHER
    - x86/hyperv: Fix kexec panic/hang issues
    - scsi: ufs: Relax the condition of UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL
    - scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback
    - scsi: qedi: Correct max length of CHAP secret
    - scsi: scsi_debug: Fix memleak in scsi_debug_init()
    - scsi: sd: Suppress spurious errors when WRITE SAME is being disabled
    - riscv: ...

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Frank Heimes (fheimes) wrote :

Since kernel 5.10 migrated to hirsute (release):
linux-generic | 5.10.0.14.16 | hirsute
this bug can now be closed as Fix Released.

Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2023-03-27 11:56 EDT-------
This was fixed and closed on the Canonical side. Moving to closed on ours.
