Package linux-crashdump in 20.04 configures kernel cmdline crashkernel incorrectly causing lock-up on kernel dump

Bug #1918085 reported by Deniz Eren
This bug affects 6 people
Affects: linux-meta (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Package linux-crashdump in 20.04 configures kernel cmdline crashkernel incorrectly causing lock-up on kernel dump.

It is very simple to replicate. I tested under QEMU, running a 20.04 install inside another 20.04 install.

Within the virtualised install, simply install the package "linux-crashdump":
$ sudo apt install linux-crashdump

Answer Yes to both of the questions asked:

 |------------------------| Configuring kexec-tools |------------------------|
 | |
 | |
 | If you choose this option, a system reboot will trigger a restart into a |
 | kernel loaded by kexec instead of going through the full system boot |
 | loader process. |
 | |
 | Should kexec-tools handle reboots (sysvinit only)? |
 | |
 | <Yes> <No> |
 | |
 |---------------------------------------------------------------------------|

 |------------------------| Configuring kdump-tools |------------------------|
 | |
 | |
 | If you choose this option, the kdump-tools mechanism will be enabled. A |
 | reboot is still required in order to enable the crashkernel kernel |
 | parameter. |
 | |
 | Should kdump-tools be enabled be default? |
 | |
 | <Yes> <No> |
 | |
 |---------------------------------------------------------------------------|

Check the kernel command line that GRUB was configured with by the package install:
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.4.0-66-lowlatency root=/dev/mapper/rootlvm-rootpart ro crashkernel=512M-:192M

As you can see "crashkernel=512M-:192M" is definitely a syntax error.

Furthermore, I tested this default configuration by forcing a crash. First enable SysRq:
$ sudo sysctl -w kernel.sysrq=1
Then become root, because prefixing the command with sudo is not sufficient (the redirection is performed by the unprivileged shell), and issue:
# echo c > /proc/sysrq-trigger

Once the "echo c > /proc/sysrq-trigger" command is issued as root, the virtual machine under test locks up at 100% CPU indefinitely. After a forced shutdown and reboot there is no crash file in the /var/crash folder; I can only see the files "kexec_cmd" and "kdump_lock".

To manually fix this issue I changed "crashkernel=512M-:192M" to "crashkernel=384M-:512M" (i.e. put the smaller and larger numbers in the correct order) by editing:
$ sudo vim /etc/default/grub.d/kdump-tools.cfg
GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=384M-:512M"

After a reboot and a retest of the forced crash commands, kdump works and all the needed files are present after the automatic reboot:
$ ls /var/crash/
202103081345 kdump_lock kexec_cmd linux-image-5.4.0-66-lowlatency-202103081345.crash
$ ls /var/crash/202103081345/
dmesg.202103081345 dump.202103081345

In summary, the default kernel command-line parameter "crashkernel=512M-:192M" is faulty in some way or other and causes the kernel to lock up at 100% CPU indefinitely when kdump is triggered.

This can be worked around manually, but future users will suffer until the default installation configuration is fixed.

Tags: focal
Deniz Eren (deniz-eren314) wrote :

Package linux-crashdump is faulty

Deniz Eren (deniz-eren314) wrote :

Package linux-crashdump version 5.4.0.66.69 installed by Ubuntu 20.04 is faulty.

no longer affects: linux-meta (Ubuntu)
affects: apport → ubuntu-ubuntu-server
no longer affects: focal (Ubuntu)
no longer affects: ubuntu-ubuntu-server
affects: focal (Ubuntu) → ubuntu
Paul White (paulw2u)
affects: ubuntu → linux-meta (Ubuntu)
tags: added: focal
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-meta (Ubuntu):
status: New → Confirmed
Amos (a-storkey) wrote :

Also in 22.04

Aaahh Ahh (woohoomoo2u) wrote :

Also in 22.10

Rovano (rovano) wrote :

Yeah, 22.04.

sudo dmesg | grep 'command line'
[ 0.045410] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.19.0-45-generic root=UUID=xxx ro crashkernel=512M-:192M
[ 0.045459] Unknown kernel command line parameters "BOOT_IMAGE=/boot/vmlinuz-5.19.0-45-generic", will be passed to user space.

Dmitry-a-durnev (dmitry-a-durnev) wrote :

I see the same message in 22.04.3:

[ 0.049372] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-91-generic root=UUID=... ro quiet splash crashkernel=512M-:192M vt.handoff=7
[ 0.049461] Unknown kernel command line parameters "splash BOOT_IMAGE=/boot/vmlinuz-5.15.0-91-generic", will be passed to user space.

But for me (16 GB RAM), changing to crashkernel=192M-:512M does not fix anything. It just increases kexec_crash_size to 512 MB, which is too much, and does not make the crash dump appear (for me it is still missing). The message about "unknown ...parameters" is not relevant in itself; can it be ignored?
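
For anyone comparing results, the reservation can be read back from sysfs after boot. This is only a sketch: it assumes the running kernel exposes /sys/kernel/kexec_crash_size (bytes reserved) and /sys/kernel/kexec_crash_loaded (1 once a crash kernel has been loaded). The function name kdump_status and its optional directory argument are made up for illustration and are not part of kdump-tools.

```shell
# Report how much memory is reserved for the crash kernel and whether a
# crash kernel is currently loaded.
# Optional argument: directory holding the two sysfs files (hypothetical
# hook so the function can be tried against copies of the files).
kdump_status() {
    dir=${1:-/sys/kernel}
    size=$(cat "$dir/kexec_crash_size")       # reserved bytes
    loaded=$(cat "$dir/kexec_crash_loaded")   # 1 = crash kernel loaded
    echo "reserved: $((size / 1024 / 1024)) MiB, crash kernel loaded: $loaded"
}
```

Calling kdump_status with no argument on the affected machine reads the live values; a reserved size of 0 would mean the crashkernel= parameter did not result in any reservation.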

"crashkernel=512M-:192M" is correct by itself (it is NOT a syntax error or any other kind of error) and means: for a RAM size of 512M or more, reserve 192M for the crash kernel.

See dmesg (example for the case of 192M-:512M):

Reserving 512MB of memory at 2096MB for crashkernel (System RAM: 16323MB)

The syntax is correct: "512M-:192M" is <range>:<size>, where <range> is start-[end]:

  'start' is inclusive and 'end' is exclusive

512M is 'start'; [end] is optional and is not specified here.

192M is the size. Read the docs:

https://ubuntu.com/server/docs/kernel-crash-dump
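
The range rule described above can be sketched in a few lines of shell. This is an illustration only, not the kernel's parser: it handles just the <range>:<size>[,<range>:<size>...] form, ignores any @offset suffix, and the function names to_bytes and crashkernel_pick are invented for this example.

```shell
# Convert a size such as 512M, 2G or 1024K to bytes (bare numbers are bytes).
to_bytes() {
    num=${1%[KkMmGg]}
    case $1 in
        *[Kk]) echo $((num * 1024)) ;;
        *[Mm]) echo $((num * 1024 * 1024)) ;;
        *[Gg]) echo $((num * 1024 * 1024 * 1024)) ;;
        *)     echo "$num" ;;
    esac
}

# Print the size selected by a crashkernel= value for a given amount of RAM.
# Prints nothing and returns 1 if no range matches.
# Usage: crashkernel_pick <value> <system-ram>
crashkernel_pick() {
    ram=$(to_bytes "$2")
    old_ifs=$IFS
    IFS=,
    set -- $1                  # split the value into range:size entries
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}     # e.g. "512M-"
        size=${entry#*:}       # e.g. "192M"
        start=${range%%-*}     # inclusive lower bound
        end=${range#*-}        # exclusive upper bound, empty if omitted
        [ "$ram" -ge "$(to_bytes "$start")" ] || continue
        if [ -z "$end" ] || [ "$ram" -lt "$(to_bytes "$end")" ]; then
            echo "$size"
            return 0
        fi
    done
    return 1
}
```

For example, crashkernel_pick 512M-:192M 16323M selects 192M, while the same value on a machine with 256M of RAM selects nothing, because the RAM size falls below 'start'.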
