Kernels 5.15.0-57-generic and 5.15.0-58-generic kernel panic

Bug #2002779 reported by Preetham Manjunatha
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

I have a GPU workstation with an AMD Ryzen™ Threadripper 2950X processor, an ASUS Zenith Extreme motherboard, and four Nvidia GeForce RTX 2080 Ti GPUs (Nvidia driver: 525). My OS is Ubuntu 22.04.1. I tried updating the system with sudo apt update and sudo apt upgrade. Kernels 5.15.0-57-generic and 5.15.0-58-generic both result in a kernel panic. The only option was to switch to a different kernel, i.e., 5.15.0-56-generic, under Advanced Options while booting.
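
For anyone who needs the same workaround to persist across reboots, here is a minimal sketch of building a GRUB default entry for the older kernel instead of picking it by hand each time. The menu titles used ("Advanced options for Ubuntu", "Ubuntu, with Linux ...") are assumptions based on a stock Ubuntu install, and the helper names grub_default_for and list_installed_kernels are made up for illustration; check the real titles in /boot/grub/grub.cfg before using the value.

```shell
#!/bin/sh

# Build a GRUB_DEFAULT value ("submenu title>entry title") for a kernel.
# Assumption: stock Ubuntu menu titles. Verify them first with:
#   grep -E "^[[:space:]]*(menuentry|submenu) " /boot/grub/grub.cfg
grub_default_for() {
    printf 'Advanced options for Ubuntu>Ubuntu, with Linux %s\n' "$1"
}

# List kernel image packages that dpkg reports as installed (status "ii").
# Read-only; it changes nothing on the system.
list_installed_kernels() {
    dpkg-query -W -f='${db:Status-Abbrev} ${Package}\n' 'linux-image-*' |
        awk '$1 == "ii" { print $2 }'
}

grub_default_for 5.15.0-56-generic
# prints: Advanced options for Ubuntu>Ubuntu, with Linux 5.15.0-56-generic
```

To apply it, set GRUB_DEFAULT to the printed value in /etc/default/grub and run sudo update-grub. On derivatives such as Linux Mint the menu titles may differ, so the string should be adjusted to match.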

Please help and fix this.

Preetham Manjunatha (preethamam) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 2002779

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Preetham Manjunatha (preethamam) wrote :

I have restarted multiple times and the issue persists, so I am marking the bug as confirmed.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Danilo Silva (danilosilva) wrote :

I can confirm the bug. In my case the workstation has an Intel(R) Core(TM) i7-10700 processor, a Gigabyte B460M AORUS PRO motherboard, and a single NVIDIA GeForce RTX 2080 Ti GPU. The steps to reproduce are:
1. Install Ubuntu 22.04.1 (kernel 5.15.0-58-generic).
2. Install CUDA (https://developer.nvidia.com/cuda-downloads):
   $ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
   $ sudo dpkg -i cuda-keyring_1.0-1_all.deb
   $ sudo apt-get update
   $ sudo apt-get -y install cuda
3. Reboot.
Downgrading to kernel 5.15.0-56-generic solved it.
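
After rebooting, it helps to confirm which kernel actually came up before concluding that the downgrade worked. A minimal sketch, assuming only the two releases named in this report count as affected; the function name is_reported_kernel is hypothetical:

```shell
#!/bin/sh

# Return 0 if the given kernel release is one of the two reported in this
# bug as panicking (5.15.0-57-generic, 5.15.0-58-generic), otherwise 1.
is_reported_kernel() {
    case "$1" in
        5.15.0-57-generic|5.15.0-58-generic) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the kernel that is running right now.
running=$(uname -r)
if is_reported_kernel "$running"; then
    echo "running $running, which is reported as affected in this bug"
else
    echo "running $running, which is not one of the two reported kernels"
fi
```

This only compares version strings; it says nothing about whether other kernels are affected on a given machine.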

Danilo Silva (danilosilva) wrote :

However, I have another workstation with an NVIDIA GeForce RTX 3080 Ti, and today I also tried installing Ubuntu Server 22.04.1 with CUDA on it (same steps as above). In this case no kernel worked. I tried 5.15.0-58, -57, -56, -50 and -25. All of them resulted in a kernel panic.

Danilo Silva (danilosilva) wrote :

A workaround is to install Ubuntu Desktop 22.04 (minimal).

However, if I install the ubuntu-server package on top of that installation, then the bug appears again.

Removing that package (after booting with a different kernel that did not have CUDA installed) removed the bug.

Thus, it must have something to do with one of the following packages:

bcache-tools byobu cloud-guest-utils cloud-initramfs-copymods cloud-initramfs-dyn-netconf cryptsetup
cryptsetup-initramfs dmeventd fonts-ubuntu-console kpartx landscape-common libaio1
libdevmapper-event1.02.1 libevent-core-2.1-7 libisns0 liblvm2cmd2.03 libmspack0 libopeniscsiusr libsgutils2-2
liburcu8 libutempter0 libxmlsec1 libxmlsec1-openssl lvm2 lxd-agent-loader motd-news-config multipath-tools
open-iscsi open-vm-tools overlayroot pastebinit pollinate python3-attr python3-automat python3-bcrypt
python3-configobj python3-constantly python3-hamcrest python3-hyperlink python3-incremental python3-magic
python3-newt python3-pyasn1 python3-pyasn1-modules python3-service-identity python3-twisted python3-zope.interface
run-one sg3-utils sg3-utils-udev sosreport thin-provisioning-tools tmux ubuntu-server zerofree
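
One way to narrow this list down is to bisect it: install half of the packages on the working Desktop installation, reboot, and keep halving whichever half reproduces the panic. A minimal sketch of the splitting step only; the helper names first_half and second_half are made up for illustration, and nothing here installs or removes anything:

```shell
#!/bin/sh

# Print the first half (rounded up) of the arguments, one per line.
first_half() {
    n=$(( ($# + 1) / 2 ))
    i=0
    for p in "$@"; do
        if [ "$i" -lt "$n" ]; then printf '%s\n' "$p"; fi
        i=$((i + 1))
    done
}

# Print the remaining arguments, one per line.
second_half() {
    n=$(( ($# + 1) / 2 ))
    i=0
    for p in "$@"; do
        if [ "$i" -ge "$n" ]; then printf '%s\n' "$p"; fi
        i=$((i + 1))
    done
}

# Example with a few names taken from the list above.
first_half bcache-tools byobu cryptsetup lvm2 multipath-tools
second_half bcache-tools byobu cryptsetup lvm2 multipath-tools
```

Since apt may pull in packages from the other half as dependencies, it is worth checking what would really be installed first, for example with sudo apt-get install -s followed by the package names (the -s flag only simulates).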

Bruno Bernardino (brunobernardino) wrote :

I'm also having this issue after upgrading from `5.15.0-57-generic` to `5.15.0-58-generic` (Linux Mint, though). If I revert to `5.15.0-57-generic`, I get no kernel panic.

This is on Intel, not AMD:

```
System:
  Kernel: 5.15.0-57-generic x86_64 bits: 64 compiler: gcc v: 11.3.0 Desktop: Cinnamon 5.6.7
    tk: GTK 3.24.33 wm: muffin dm: LightDM Distro: Linux Mint 21.1 Vera base: Ubuntu 22.04 jammy
Machine:
  Type: Laptop System: TUXEDO product: TUXEDO InfinityBook Pro Gen7 (MK1) v: Standard
    serial: <superuser required>
  Mobo: NB02 model: PHxARX1_PHxAQF1 v: Standard serial: <superuser required>
    UEFI: American Megatrends LLC. v: N.1.05A07 date: 11/07/2022
Battery:
  ID-1: BAT0 charge: 89.3 Wh (90.0%) condition: 99.2/99.2 Wh (100.0%) volts: 16.5 min: 15.5
    model: standard serial: <filter> status: Discharging
CPU:
  Info: 14-core (6-mt/8-st) model: 12th Gen Intel Core i7-12700H bits: 64 type: MST AMCP
    arch: Alder Lake rev: 3 cache: L1: 1.2 MiB L2: 11.5 MiB L3: 24 MiB
  Speed (MHz): avg: 1609 high: 3300 min/max: 400/4679:4700:3500 cores: 1: 3300 2: 3300 3: 400
    4: 3300 5: 2153 6: 1003 7: 461 8: 997 9: 812 10: 2775 11: 2453 12: 2679 13: 526 14: 895 15: 1292
    16: 2502 17: 463 18: 829 19: 1135 20: 921 bogomips: 107520
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
  Device-1: Intel Alder Lake-P Integrated Graphics vendor: Tongfang Hongkong driver: i915
    v: kernel ports: active: eDP-1 empty: DP-1, DP-2, DP-3, DP-4, HDMI-A-1 bus-ID: 00:02.0
    chip-ID: 8086:46a6
  Device-2: NVIDIA GA107M [GeForce RTX 3050 Ti Mobile] vendor: Tongfang Hongkong driver: N/A
    pcie: speed: 16 GT/s lanes: 4 bus-ID: 01:00.0 chip-ID: 10de:25a0
  Device-3: Chicony FHD Webcam type: USB driver: uvcvideo bus-ID: 3-6:4 chip-ID: 04f2:b75c
  Display: x11 server: X.Org v: 1.21.1.3 driver: X: loaded: modesetting unloaded: fbdev,vesa
    gpu: i915 display-ID: :0 screens: 1
  Screen-1: 0 s-res: 2880x1800 s-dpi: 96
  Monitor-1: eDP-1 res: 2880x1800 dpi: 242 diag: 356mm (14")
  OpenGL: renderer: Mesa Intel Graphics (ADL GT2) v: 4.6 Mesa 22.0.5 direct render: Yes
Audio:
  Device-1: Intel Alder Lake PCH-P High Definition Audio vendor: Tongfang Hongkong
    driver: snd_hda_intel v: kernel bus-ID: 00:1f.3 chip-ID: 8086:51c8
  Sound Server-1: ALSA v: k5.15.0-57-generic running: yes
  Sound Server-2: PulseAudio v: 15.99.1 running: yes
  Sound Server-3: PipeWire v: 0.3.48 running: yes
Network:
  Device-1: Intel Alder Lake-P PCH CNVi WiFi driver: iwlwifi v: kernel bus-ID: 00:14.3
    chip-ID: 8086:51f0
  IF: wlo1 state: up mac: <filter>
Bluetooth:
  Device-1: Intel AX201 Bluetooth type: USB driver: btusb v: 0.8 bus-ID: 3-10:5 chip-ID: 8087:0026
  Report: hciconfig ID: hci0 rfk-id: 0 state: down bt-service: enabled,running rfk-block:
    hardware: no software: yes address: <filter>
Drives:
  Local Storage: total: 1.82 TiB used: 183.33 GiB (9.8%)
  ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 980 PRO 2TB size: 1.82 TiB speed: 63.2 Gb/s
    lanes: 4 serial: <filter> temp: 33.9 C
Partition:
  ID-1: / size: 1.79 TiB used: 183.06 GiB (10.0%) fs: ext4 dev: /dev/dm-1 mapped: vgmint-root
  ID-2: /boot size: 1.61 GiB used: 268.6 MiB (16.3%) ...

```

description: updated