Kernel 5.15.0-71 fails to boot on Ubuntu 22.04 (possibly specific to Ryzen APUs)

Bug #2017929 reported by Lin Manfu
This bug affects 16 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Since upgrading to 5.15.0-71 a couple of days ago, my PC sporadically fails to boot. I see the following error messages:

"error: cannot allocate kernel buffer
error: you need to load the kernel first.

Press any key to continue..._ "

I am then returned to the GRUB boot menu.

This has also been experienced by two other AskUbuntu users: https://askubuntu.com/questions/1465460/after-kernel-update-error-cannot-allocate-kernel-buffer-you-need-to-load-kern . The common theme appears to be Ryzen APUs; two of us have 3400Gs and one has a 7600.

In my case, the issue appears to be more likely to occur when I have to force-reboot because the system is frozen (due to a separate bug which I will link when I can find it again). But I have not yet reproduced it.

Also the log files attached to this report may not contain relevant information as the last couple of boots worked correctly. I will try to capture a relevant kernel log now that I know this is a kernel issue.

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-5.15.0-71-generic 5.15.0-71.78
ProcVersionSignature: Ubuntu 5.15.0-71.78-generic 5.15.92
Uname: Linux 5.15.0-71-generic x86_64
ApportVersion: 2.20.11-0ubuntu82.4
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: matthew 1288 F.... pulseaudio
 /dev/snd/controlC0: matthew 1288 F.... pulseaudio
CasperMD5CheckResult: pass
CurrentDesktop: KDE
Date: Thu Apr 27 17:49:11 2023
InstallationDate: Installed on 2022-01-14 (467 days ago)
InstallationMedia: Kubuntu 21.10 "Impish Indri" - Release amd64 (20211012)
MachineType: Micro-Star International Co., Ltd. MS-7B89
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-71-generic root=UUID=7348d288-7351-45cd-8da2-592c59a4f7dd ro quiet splash
RelatedPackageVersions:
 linux-restricted-modules-5.15.0-71-generic N/A
 linux-backports-modules-5.15.0-71-generic N/A
 linux-firmware 20220329.git681281e4-0ubuntu3.12
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: Upgraded to jammy on 2022-05-26 (336 days ago)
dmi.bios.date: 09/28/2021
dmi.bios.release: 5.17
dmi.bios.vendor: American Megatrends International, LLC.
dmi.bios.version: 2.E3
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: B450M MORTAR MAX (MS-7B89)
dmi.board.vendor: Micro-Star International Co., Ltd.
dmi.board.version: 1.0
dmi.chassis.asset.tag: To be filled by O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: Micro-Star International Co., Ltd.
dmi.chassis.version: 1.0
dmi.modalias: dmi:bvnAmericanMegatrendsInternational,LLC.:bvr2.E3:bd09/28/2021:br5.17:svnMicro-StarInternationalCo.,Ltd.:pnMS-7B89:pvr1.0:rvnMicro-StarInternationalCo.,Ltd.:rnB450MMORTARMAX(MS-7B89):rvr1.0:cvnMicro-StarInternationalCo.,Ltd.:ct3:cvr1.0:skuTobefilledbyO.E.M.:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: MS-7B89
dmi.product.sku: To be filled by O.E.M.
dmi.product.version: 1.0
dmi.sys.vendor: Micro-Star International Co., Ltd.

Lin Manfu (linmanfu) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Lin Manfu (linmanfu) wrote :

The bug (or possibly class of bug) in amdgpu that precedes this issue for me is this one: https://gitlab.freedesktop.org/drm/amd/-/issues/934. However, it is unlikely that they are related as the linked graphics bug has existed ever since I bought the hardware, but this kernel booting bug only appeared this week.

Thomas Gay (thomasga) wrote :

I have the same issue on one of my Mac minis. I have three of them with Ubuntu 22.04.2 LTS, all with the Intel Core i5-4278U, but only one of them has the issue. The one with the problem has a 1TB Fusion drive while the other two that work have SSDs.

EDIT: It turns out that I hadn't rebooted the other two yet after installing this package, so I don't know whether all three are affected or not.

Daniele C (got3nks) wrote :

Same issues here.

Gigabyte B650 AORUS ELITE AX (Firmware F5b)
AMD Ryzen 9 7900 12-Core Processor

After a few tries/reboots it manages to boot with kernel 5.15.0-71-generic #78-Ubuntu.

Lin Manfu (linmanfu) wrote :

I can reproduce this bug by force powering off my machine (holding down the power button) and immediately starting it again. If I delay starting it, the bug does not seem to occur (though I have only tried this a handful of times).

I get the bug whether I power off from kernel 5.15.0-71 or 5.15.0-70. I also got the bug when I powered off Clonezilla's version of Lunar Lobster. Therefore I believe the bug is to do with booting into kernel 5.15.0-71, not with which system was running before the power-off. In all cases I am able to boot into kernel 5.15.0-70 successfully.

It might be helpful to know whether other affected users can reproduce the bug by this method.

As far as I can tell these failed boots are not registering in the journal as they do not appear to be listed in the output of "journalctl --list-boots". As GRUB itself does not seem to keep logs, I am not sure what other diagnostic information to provide.
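
One way to check whether a failed attempt left any trace is to compare the number of boots the journal knows about before and after it. The sketch below is not from the report: the helper name `count_boots` is made up for illustration, and it assumes that each boot line printed by `journalctl --list-boots` starts with an integer index (0 for the current boot, negative numbers for earlier ones).

```shell
# Hypothetical helper: count boot entries in `journalctl --list-boots` output
# read from standard input. Assumes every boot line starts with an integer
# index; any other line (such as a header) is ignored.
count_boots() {
    awk '$1 ~ /^-?[0-9]+$/ { n++ } END { print n + 0 }'
}

# Example (run on the affected machine):
#   journalctl --list-boots | count_boots
```

If the count is unchanged after a failed attempt, the failure happened before the kernel started logging, which would be consistent with the error being printed by GRUB itself.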

Focal's kernel 5.4.0-148.165 also contains one of the two changes that are in 5.15.0-71. I might be able to try to reproduce this bug there.

Luke Steinberg (the-fool) wrote :

With my server, the bug happens every single reboot. It is only after many (10+) consecutive attempts to boot into the 5.15.0-71 kernel that it actually loads correctly. I had no problems with previous kernels.

AMD Ryzen 5 7600 3.8GHz 6-core Processor
Gigabyte B650I AORUS ULTRA Mini ITX

Thadeu Lima de Souza Cascardo (cascardo) wrote :

Can you try to reproduce the issue with only 5.15.0-70, given that the changes in 5.15.0-71 would not by themselves explain these symptoms?

Thank you very much.
Cascardo.

Andres Oscar Ochoa (andochoa72) wrote :

Same here.
5.15.0-70 booted correctly.
Since 5.15.0-71, GRUB stops with the same error.

Motherboard: ASRock A320M-HDV R4.0
BIOS (UEFI): version P4.10, dated 11/27/2020
CPU: AMD Ryzen 3 3200G with Radeon Vega Graphics

I retried booting several times and it never boots.
I had to roll back to 5.15.0-70.

Linux Mint 21.1 Vera
Also, 5.15.0-70 gives the same error very occasionally, so I didn't notice it before.

Reinhard Munz (rmgradient0) wrote :

AMD Ryzen 9 7950X, 16x 4.50GHz, 64MB L3-Cache (Zen4) | AM5
Gigabyte X670 Aorus Elite AX | AMD X670 | AM5
NVIDIA GeForce RTX 4090 24GB | MSI Gaming X Trio
64GB DDR5-5200 CL40 Corsair Vengeance LPX | 2x 32GB
1TB Samsung 980 PRO PCIe 4.0 | M.2 SSD
4TB Seagate IronWolf | 5.900 RPM | SATA3
1000W - Corsair HX Series HX1000i 2022 | 80Plus Platinum

Ubuntu 22.04 LTS (jammy) [22.04.2]
GNU/Linux 5.15.0-71-generic x86_64

The system would consistently hang at the GRUB menu with the error from this bug report.
It would consistently boot fine after a single CTRL-ALT-DEL.

I did the following things and now it boots normally again:
apport-collect -p linux 2017929 [this fails to send for me with some bug]
sudo nano /etc/default/grub [I changed one line "GRUB_TIMEOUT=0" to "GRUB_TIMEOUT=5"]
sudo update-grub
sudo update-initramfs -uk 'all'
sudo shutdown -h now
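
The `GRUB_TIMEOUT` edit in the steps above can also be scripted rather than made by hand in an editor. The following is a minimal sketch, not something posted in this report: the function name `set_grub_timeout` is hypothetical, and it assumes the file already contains an uncommented `GRUB_TIMEOUT=` line, as the stock `/etc/default/grub` does.

```shell
# Hypothetical helper (not from the report): set GRUB_TIMEOUT in a GRUB
# defaults file. Usage: set_grub_timeout FILE SECONDS
set_grub_timeout() {
    file=$1
    seconds=$2
    # Refuse anything that is not a non-negative integer.
    case $seconds in
        ''|*[!0-9]*) echo "timeout must be a non-negative integer" >&2; return 1 ;;
    esac
    # Only rewrite an existing, uncommented GRUB_TIMEOUT= line.
    # The trailing "=" keeps GRUB_TIMEOUT_STYLE= lines untouched.
    if ! grep -q '^GRUB_TIMEOUT=' "$file"; then
        echo "no GRUB_TIMEOUT= line in $file" >&2
        return 1
    fi
    # Keep a backup next to the file, then rewrite the file from the backup.
    cp -- "$file" "$file.bak" &&
        sed "s/^GRUB_TIMEOUT=.*/GRUB_TIMEOUT=$seconds/" "$file.bak" > "$file"
}
```

When applied to `/etc/default/grub` (as root), `sudo update-grub` is still needed afterwards, as in the steps above.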

Jonas (joney) wrote :

I can confirm the issue as well. 5.15.0-70 boots fine, -71 stops with above error.
System details:

sudo inxi -v 2
System:
  Host: xxx Kernel: 5.15.0-70-generic x86_64 bits: 64
    Desktop: GNOME 42.5 Distro: Ubuntu 22.04.2 LTS (Jammy Jellyfish)
Machine:
  Type: Desktop Mobo: Micro-Star model: B450M MORTAR (MS-7B89) v: 1.0
    serial: I816012785 UEFI: American Megatrends LLC. v: 1.I0 date: 07/27/2022
CPU:
  Info: quad core AMD Ryzen 5 2400G with Radeon Vega Graphics [MT MCP]
    speed (MHz): avg: 1600 min/max: 1600/3600

Rolfe Schmidt (rolfeschmidt) wrote :

I can also confirm that 5.15.0-70 boots fine, while -71 stops with the above error.

sudo inxi -v 2
System:
  Host: *** Kernel: 5.15.0-70-generic x86_64 bits: 64
    Desktop: GNOME 42.5 Distro: Ubuntu 22.04.2 LTS (Jammy Jellyfish)
Machine:
  Type: Desktop System: Micro-Star product: MS-7B09 v: 1.0 serial: N/A
  Mobo: Micro-Star model: X399 GAMING PRO CARBON AC (MS-7B09) v: 1.0
    serial: I316363281 UEFI: American Megatrends v: 1.C0 date: 11/14/2018
CPU:
  Info: 8-core AMD Ryzen Threadripper 2950X [MT MCP MCM] speed (MHz):
    avg: 2300 min/max: 2200/3500

Lin Manfu (linmanfu) wrote :

I can now consistently reproduce this error message on -71 by hitting <Enter> to choose "Ubuntu" on the GRUB menu as soon as it appears. If I leave the GRUB menu to time out and then start -71, then it does not occur.

So it looks as though it might be a race condition?

I cannot reproduce this when I choose -70. But that requires three keypresses and that perhaps takes too long to trigger the condition.

Daniele C (got3nks) wrote :

I confirm what Lin and Reinhard said. My system can boot -71 if I let the 30-second timer run down to zero after rebooting with CTRL+ALT+DEL.

Stephane Hockenhull (rv6502) wrote :

I have the same issue with -71.
If I hit CTRL-ALT-DEL after the error and then hit Enter, it boots. (EDIT: not if I hit Enter immediately, within 2 seconds.)
Waiting for the 30-second timeout also works.

System:
  Host: --- Kernel: 5.15.0-71-generic x86_64 bits: 64 Desktop: Xfce 4.16.0
    Distro: Ubuntu 22.04.2 LTS (Jammy Jellyfish)
Machine:
  Type: Desktop System: Micro-Star product: MS-7B09 v: 1.0 serial: N/A
  Mobo: Micro-Star model: X399 GAMING PRO CARBON AC (MS-7B09) v: 1.0
    serial: ---- UEFI: American Megatrends v: 1.C0 date: 11/14/2018
CPU:
  Info: 16-core AMD Ryzen Threadripper 1950X [MT MCP MCM] speed (MHz):
    avg: 3400 min/max: 2200/3400
Graphics:
  Device-1: NVIDIA TU106 [GeForce RTX 2070 Rev. A] driver: vfio-pci v: N/A
  Device-2: NVIDIA GA104 [GeForce RTX 3070 Ti] driver: nvidia v: 470.182.03
  Device-3: Logitech StreamCam type: USB
    driver: hid-generic,snd-usb-audio,usbhid,uvcvideo
  Display: server: X.Org v: 1.21.1.4 driver: X: loaded: nvidia
    gpu: vfio-pci,nvidia resolution: 1: 1920x1080~60Hz 2: 3840x2160
    3: 1920x1080~60Hz
  OpenGL: renderer: NVIDIA GeForce RTX 3070 Ti/PCIe/SSE2
    v: 4.6.0 NVIDIA 470.182.03

No AMD iGPU / APU in this system.

However my System76 Kudu6 laptop with Ryzen 9 5900HX, Radeon RX Vega 8 APU boots -71 just fine on the first try, no issue at all.

EDIT: both systems boot off a Samsung SSD 970 EVO Plus 2TB

The laptop has the AMD Cezanne iGPU/APU as the primary graphics device, connected only to the LCD panel. The NVIDIA RTX 3060M is the secondary GPU, connected only to the external video ports and not in use at this point in the laptop's boot process.

Stephane Hockenhull (rv6502) wrote :

I edited /boot/grub/grub.cfg directly to change the timeout, in case something else that happens when GRUB is updated the correct way with update-grub was a factor.

My experiments, with a full shutdown every time, in order:

timeout=0 FAIL
timeout=5 boots
timeout=0 FAIL (to confirm the error was still present)
timeout=1 FAIL
press CTRL-ALT-DEL then hit enter immediately: FAIL
press CTRL-ALT-DEL then hit enter at the 28s left mark (after 2 seconds) : boots
timeout=2 boots

It looks like GRUB needs a delay of at least 2 seconds (at least on my system) before it can load kernel -71 properly.
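
To see which timeout values a generated configuration actually contains, something like the following could be used. It is only a sketch: the function name `check_grub_timeouts` is hypothetical, the 2-second threshold is simply the value observed on the system above and may differ elsewhere, and the pattern only matches plain `set timeout=N` lines with a non-negative integer N.

```shell
# Hypothetical helper: report every "set timeout=N" value in a grub.cfg-style
# file that is below MIN seconds. Returns non-zero if any such value is found.
# Usage: check_grub_timeouts FILE MIN
check_grub_timeouts() {
    cfg=$1
    min=$2
    status=0
    # Extract the number from lines such as "  set timeout=0".
    # "set timeout_style=..." lines do not match and are skipped.
    for t in $(sed -n 's/^[[:space:]]*set timeout=\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$cfg"); do
        if [ "$t" -lt "$min" ]; then
            echo "timeout=$t is below ${min}s"
            status=1
        fi
    done
    return $status
}

# Example:
#   check_grub_timeouts /boot/grub/grub.cfg 2
```

A generated grub.cfg may set the timeout in more than one branch, so more than one line can be reported.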

csaleman (csaleman-yahoo) wrote :

This worked for me: https://askubuntu.com/questions/1466789/my-5-15-0-71-kernel-on-ubuntu-22-04-2-lts-on-amd-ryzen-3-2200g-can-never-boot

Uncomment this line in /etc/default/grub:

GRUB_GFXMODE=1024x768

and then run:
$ sudo update-grub
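
The same change can be scripted. This is a sketch under assumptions rather than the method from the linked answer: the function name `set_grub_gfxmode` is hypothetical, it assumes the mode string contains no `/` or `&` characters (true for values such as `1024x768`), and it appends the setting if no `GRUB_GFXMODE=` line, commented or not, is present.

```shell
# Hypothetical helper: enable GRUB_GFXMODE in a GRUB defaults file.
# Usage: set_grub_gfxmode FILE MODE   (for example MODE=1024x768)
set_grub_gfxmode() {
    file=$1
    mode=$2
    # Keep a backup next to the file before touching it.
    cp -- "$file" "$file.bak" || return 1
    if grep -q '^#\{0,1\}[[:space:]]*GRUB_GFXMODE=' "$file.bak"; then
        # Uncomment the existing line (if needed) and set the requested mode.
        sed "s/^#\{0,1\}[[:space:]]*GRUB_GFXMODE=.*/GRUB_GFXMODE=$mode/" "$file.bak" > "$file"
    else
        # No such line yet: append one.
        echo "GRUB_GFXMODE=$mode" >> "$file"
    fi
}
```

As with the manual edit, the change only takes effect after `sudo update-grub`.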
