Ampere AltraMax sometimes hangs after "EFI stub: Exiting boot services..."
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
grub2 (Ubuntu) | Invalid | Undecided | Unassigned |
linux (Ubuntu) | Invalid | Undecided | Taihsiang Ho |
Bug Description
A kernel trace occurs while bringing up a CPU with the focal HWE kernel 5.15.0-43-generic.
Steps to reproduce: enable earlycon and run a reboot loop.
Expected result: the system boots and is ready to use.
Actual result: a kernel trace occurs while bringing up a CPU. See the call trace below [1].
[1] (copy from comment #7)
[ 3.321372] arch_timer: Enabling local workaround for ARM erratum 1418040
[ 3.321394] CPU245: Booted secondary processor 0x0100310100 [0x413fd0c1]
[ 8.536939] CPU246: failed to come online
[ 8.550338] Detected PIPT I-cache on CPU246
[ 8.555817] CPU246: failed in unknown state : 0x0
[ 8.555987] GICv3: CPU246: found redistributor 100370000 region 1:0x0000500100f
[ 8.562727] ------------[ cut here ]------------
[ 8.562809] GICv3: CPU246: using allocated LPI pending table @0x0000080002680000
[ 8.569364] Dying CPU not properly vacated!
[ 16.668834] WARNING: CPU: 0 PID: 1 at kernel/
[ 16.681496] Modules linked in:
[ 16.684597] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-43-generic #46~20.04.1-Ubuntu
[ 16.693010] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 16.700095] pc : sched_cpu_
[ 16.704432] lr : sched_cpu_
[ 16.708770] sp : ffff80000881bbe0
[ 16.712132] x29: ffff80000881bbe0 x28: ffffcca930eada80 x27: 0000000000000000
[ 16.719390] x26: ffff7395107ce000 x25: 00000000000000f6 x24: ffff403e40ed6728
[ 16.726651] x23: ffffcca930eb2618 x22: 0000000000000000 x21: 00000000000000f6
[ 16.733907] x20: ffffcca930719100 x19: ffff403e40ee7100 x18: 0000000000000014
[ 16.741165] x17: 3836323030303830 x16: 3030303078304020 x15: 656c62617420676e
[ 16.748425] x14: 69646e6570204950 x13: 3030303038363230 x12: 3030383030303030
[ 16.755683] x11: 78304020656c6261 x10: 7420676e69646e65 x9 : ffffcca92e7e39a0
[ 16.762940] x8 : 6c6c6120676e6973 x7 : 205d393038323635 x6 : c0000000fffeffff
[ 16.770198] x5 : 000000000017ffe8 x4 : 0000000000000000 x3 : ffff80000881b8c8
[ 16.777457] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
[ 16.784717] Call trace:
[ 16.787197] sched_cpu_
[ 16.791182] cpuhp_invoke_
[ 16.795699] cpuhp_invoke_
[ 16.800568] _cpu_up+0x234/0x360
[ 16.803844] cpu_up+0xb8/0x110
[ 16.806945] bringup_
[ 16.811195] smp_init+0x3c/0x98
[ 16.814385] kernel_
[ 16.818813] kernel_
[ 16.822359] ret_from_
[ 16.825992] ---[ end trace bb9ca924bc23dd10 ]---
[ 16.830685] CPU246 enqueued tasks (0 total):
[ 16.835356] ------------[ cut here ]------------
[ 16.835431] GICv3: CPU246: found redistributor 100370000 region 1:0x0000500100f
[ 16.840045] Dying CPU not properly vacated!
[ 16.847925] WARNING: CPU: 0 PID: 1 at kernel/
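When many boots are captured during a reboot loop, it can help to scan each console log for the failure signature shown in the trace above. The following is a minimal sketch, assuming the console output has been saved to a file; the function name `failed_cpus` and the log file path are hypothetical and not part of any tooling mentioned in this bug.

```shell
#!/bin/sh
# Hypothetical helper: print the number of every CPU that the kernel
# reported as "failed to come online" in a saved console log.
# $1 is the path to the captured log (an assumption for illustration).
failed_cpus() {
    # Match lines such as "[ 8.536939] CPU246: failed to come online"
    # and print only the CPU number.
    sed -n 's/.*CPU\([0-9][0-9]*\): failed to come online.*/\1/p' "$1"
}
```

Running `failed_cpus console.log` on a log containing the excerpt above would list the CPU that failed to come up; an empty result means the signature was not seen in that boot.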
== Original Description ==
When the kernel test rebooted into the 5.15.0-43-generic HWE kernel, no output appeared on the console after the EFI stub messages:
Checkpoint 92
Checkpoint 92
Checkpoint 92
Checkpoint 92
Checkpoint 92
Checkpoint A0
Checkpoint 92
Checkpoint 92
Checkpoint 92
Checkpoint 92
Checkpoint 92
Checkpoint 92
Checkpoint 92
Checkpoint AD
EFI stub: Booting Linux Kernel...
EFI stub: ERROR: FIRMWARE BUG: kernel image not aligned on 64k boundary
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services...
no longer affects: ubuntu
tags: added: foundations-triage-discuss
Changed in grub2 (Ubuntu):
status: New → Invalid
description: updated
I put our system in a reboot loop to see if I could reproduce and, after 16 boots of 5.15.0-43-generic, it hung again. Seems we have a real issue here.
As a next step, I suggest installing the same OS (focal) and downgrading to the same kernel (5.15.0-43) on the same system (papat), adding "earlycon" to the kernel command line, and then starting the reboot loop again. If the system still hangs with earlycon enabled, we might get more information out of the kernel.
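One common way to add "earlycon" persistently on Ubuntu is to edit the default kernel command line in the GRUB defaults file. The sketch below assumes the file uses a double-quoted `GRUB_CMDLINE_LINUX_DEFAULT="..."` line, as the stock `/etc/default/grub` does; the function name `add_earlycon` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append "earlycon" to GRUB_CMDLINE_LINUX_DEFAULT
# in the GRUB defaults file given as $1, unless it is already there.
add_earlycon() {
    file=$1
    # Leave the file alone if earlycon is already on the default command line.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*earlycon' "$file"; then
        return 0
    fi
    # Insert " earlycon" just before the closing quote of the setting,
    # writing to a temporary file first and then replacing the original.
    sed 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 earlycon"/' "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

After running this as root against `/etc/default/grub`, `sudo update-grub` regenerates the GRUB configuration so the change takes effect on the next boot.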
= Quick way to set up a reboot loop =
reboot-loop.sh
sudo apt-add-repository -y ppa:dannf/dannf && sudo apt install -y dannf
dannf-setup-
sudo reboot
To stop the reboot loop, interrupt GRUB and add "stop" to the kernel command line.
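The stop mechanism described above can be illustrated with a small check against the kernel command line. This is only a sketch of the idea, not the actual script from the PPA; the function names `cmdline_has_stop` and `loop_action` are hypothetical, and the wiring that actually reboots is left commented out.

```shell
#!/bin/sh
# Hypothetical sketch of a boot-time reboot-loop decision.

# Return 0 if the kernel command line in $1 contains the bare word "stop".
cmdline_has_stop() {
    case " $1 " in
        *" stop "*) return 0 ;;
    esac
    return 1
}

# Print "stop" or "reboot" for the kernel command line given in $1.
loop_action() {
    if cmdline_has_stop "$1"; then
        echo stop
    else
        echo reboot
    fi
}

# Example wiring (commented out so that sourcing this file never reboots):
#   [ "$(loop_action "$(cat /proc/cmdline)")" = reboot ] && systemctl reboot
```

Matching "stop" as a whole word avoids false positives from parameters that merely contain those letters.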