kdump cannot generate coredump file on bluefield with 5.4 and 5.15 kernel
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-bluefield (Ubuntu) | New | Undecided | Unassigned |
Bug Description
kdump cannot generate coredump file on bluefield with 5.4 kernel
Bug description:
Following the instruction in https:/
Bluefield is running 5.4 kernel
bf2:~$ uname -a
Linux sw-mtx-008-bf2 5.4.0-1060-
crashkernel parameter is configured
bf2:~$ cat /proc/cmdline
BOOT_IMAGE=
bf2:~$ dmesg | grep -i crash
[ 0.000000] crashkernel reserved: 0x00000000cfe00000 - 0x00000000efe00000 (512 MB)
[ 0.000000] Kernel command line: BOOT_IMAGE=
[ 8.070921] pstore: Using crash dump compression: deflate
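As a quick sanity check on the reservation reported above, the size can be recomputed from the two addresses in the dmesg line. This is only a sketch: the helper name `reserved_mib` is hypothetical, and the regular expression assumes the line format shown in this report.

```python
import re

# Assumed format: the "crashkernel reserved" line as it appears in this report.
RESERVED_RE = re.compile(
    r"crashkernel reserved: (0x[0-9a-fA-F]+) - (0x[0-9a-fA-F]+) \((\d+) MB\)"
)

def reserved_mib(dmesg_line):
    """Return (size computed from the addresses in MiB, size the kernel printed)."""
    m = RESERVED_RE.search(dmesg_line)
    if m is None:
        raise ValueError("not a crashkernel reservation line")
    start, end = int(m.group(1), 16), int(m.group(2), 16)
    return (end - start) // (1024 * 1024), int(m.group(3))

line = ("[ 0.000000] crashkernel reserved: "
        "0x00000000cfe00000 - 0x00000000efe00000 (512 MB)")
computed, reported = reserved_mib(line)
print(computed, reported)
```

Both values agree for the line above, so the 512 MB region was reserved as requested.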
kdump-config is as below:
bf2:~$ kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.
KDUMP_COREDIR: /var/crash
crashkernel addr: 0x
/var/
kdump initrd:
/var/
current state: ready to kdump
kexec command:
/sbin/kexec -p --command-
sysrq:
bf2:/# cat /proc/sys/
176
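For reference, assuming the 176 above is the SysRq bitmask (the path in the report is cut off, so this is an assumption), it can be decoded with the bit values from the kernel's sysrq documentation. The function name `decode_sysrq` is hypothetical.

```python
# Bit values as listed in the kernel's sysrq admin guide.
SYSRQ_BITS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}

def decode_sysrq(value):
    """Return the SysRq function groups enabled by a bitmask value."""
    if value == 0:
        return []
    if value == 1:
        return ["all functions enabled"]
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]

enabled = decode_sysrq(176)
print(enabled)
```

Per the same kernel documentation, this mask only restricts keyboard-invoked SysRq; operations written to the sysrq trigger file by root are always allowed, so the mask should not be what blocks the manual crash.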
After triggering the crash manually with "echo c > /proc/sysrq-
With the default 512M, it hangs at "Killed process 674":
[ 8.718188] systemd-
[ 30.252513] Out of memory: Killed process 651 (systemd-resolve) total-vm:24380kB, anon-rss:3812kB, file-rss:1828kB, shmem-rss:0kB, UID:101 pgtables:80kB o0
...
[ 34.651927] Out of memory: Killed process 674 (dbus-daemon) total-vm:7884kB, anon-rss:552kB, file-rss:1380kB, shmem-rss:0kB, UID:103 pgtables:52kB oom_sco0
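To see how much memory the OOM killer actually recovered in the capture kernel, the kill lines can be parsed and summed. A rough sketch: the regular expression assumes the message format shown above, and `parse_oom_kills` is a hypothetical helper.

```python
import re

# Assumed format: "Out of memory: Killed process ..." lines as in this report.
OOM_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)

def parse_oom_kills(log_text):
    """Return one dict per OOM kill with pid, name and resident size in kB."""
    kills = []
    for m in OOM_RE.finditer(log_text):
        kills.append({
            "pid": int(m["pid"]),
            "name": m["name"],
            "rss_kb": int(m["anon_rss"]) + int(m["file_rss"]),
        })
    return kills

log = """\
[ 30.252513] Out of memory: Killed process 651 (systemd-resolve) total-vm:24380kB, anon-rss:3812kB, file-rss:1828kB, shmem-rss:0kB, UID:101 pgtables:80kB
[ 34.651927] Out of memory: Killed process 674 (dbus-daemon) total-vm:7884kB, anon-rss:552kB, file-rss:1380kB, shmem-rss:0kB, UID:103 pgtables:52kB
"""
kills = parse_oom_kills(log)
total_kb = sum(k["rss_kb"] for k in kills)
print(kills, total_kb)
```

For the two lines quoted here, the victims together hold only a few MB of resident memory, so killing them frees very little.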
With 1024M, it hangs at the following:
[ 8.733323] systemd-
After a soft reboot of the Bluefield, no coredump file has been generated.
bf2:~$ ls /var/crash/ -la
total 52
drwxrwxrwt 3 root root 4096 May 31 01:43 .
drwxr-xr-x 14 root root 4096 Apr 30 11:26 ..
drwxrwxr-x 2 ubuntu ubuntu 4096 May 31 01:43 202305310143
-rw-r----- 1 root root 34307 May 31 01:18 _usr_share_
-rw-r--r-- 1 root root 0 May 31 03:47 kdump_lock
-rw-r--r-- 1 root root 358 May 31 03:48 kexec_cmd
bf2:~$ ls /var/crash/
total 8
drwxrwxr-x 2 ubuntu ubuntu 4096 May 31 01:43 .
drwxrwxrwt 3 root root 4096 May 31 01:43 ..
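The same check can be scripted, which helps when repeating the test with different crashkernel sizes. A minimal sketch, assuming that dumps land in all-digit timestamped subdirectories of the core directory as in the listing above; `find_dumps` is a hypothetical helper.

```python
import os
import tempfile

def find_dumps(coredir):
    """Map each timestamped subdirectory of coredir to its sorted file list."""
    result = {}
    for entry in sorted(os.listdir(coredir)):
        path = os.path.join(coredir, entry)
        # Only all-digit directory names are treated as dump directories.
        if entry.isdigit() and os.path.isdir(path):
            result[entry] = sorted(os.listdir(path))
    return result

# Demo against a throwaway directory that mimics the listing in this report.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "202305310143"))
    open(os.path.join(tmp, "kdump_lock"), "w").close()
    demo = find_dumps(tmp)
    print(demo)
```

An empty list for a timestamp means the dump directory was created but nothing was written into it, which is the state shown above.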
This issue also happens on 5.4.0-1049-
I also tested on 5.15.0-1031-bluefield, and it fails there as well.
Configurations:
root@bu-oob:~# kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_COREDIR: /var/crash
crashkernel addr: 0xbd000000
/var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinuz-5.15.0-1031-bluefield
kdump initrd:
/var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-5.15.0-1031-bluefield
current state: ready to kdump
kexec command:
/sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-5.15.0-1031-bluefield root=UUID=8e8b38a6-7d3d-4a29-b7a0-99761624f941 ro console=hvc0 console=ttyAMA0 earlycon=pl011,0x13010000 fixrtc net.ifnames=0 biosdevname=0 iommu.passthrough=1 console=tty1 console=ttyS0 reset_devices systemd.unit=kdump-tools-dump.service nr_cpus=1" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
root@bu-lab60v3-oob:~#
###################################
root@bu-oob:~# dmesg |grep -i crash
[ 0.000000] crashkernel reserved: 0x00000000bd000000 - 0x00000000fd000000 (1024 MB)
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-1031-bluefield root=UUID=8e8b38a6-7d3d-4a29-b7a0-99761624f941 ro console=hvc0 console=ttyAMA0 earlycon=pl011,0x13010000 fixrtc net.ifnames=0 biosdevname=0 iommu.passthrough=1 console=tty1 console=ttyS0 crashkernel=2G-4G:320M,4G-32G:1024M,32G-64G:1536M,64G-128G:2048M,128G-:4096M
[ 5.230439] pstore: Using crash dump compression: deflate
root@bu-oob:~#
################
root@bu-oob:~# cat /etc/default/grub.d/kdump-tools.cfg
GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=2G-4G:320M,4G-32G:1024M,32G-64G:1536M,64G-128G:2048M,128G-:4096M"
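To make the range syntax in this crashkernel= setting concrete: each entry is `start-end:size`, and the kernel reserves the size of the first range that contains the system RAM (start inclusive, end exclusive). The sketch below mirrors that rule in Python; it is not the kernel's parser, it ignores the optional `@offset` suffix, and the names `parse_size` and `crashkernel_for` are hypothetical.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

def parse_size(text):
    """Convert a size such as '320M' or '4G' to bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(text[:-1]) * UNITS[text[-1].upper()]
    return int(text)

def crashkernel_for(spec, system_ram):
    """Return the reservation in bytes for system_ram, or 0 if no range matches."""
    for entry in spec.split(","):
        ram_range, size = entry.split(":")
        start, _, end = ram_range.partition("-")
        lo = parse_size(start)
        hi = parse_size(end) if end else None  # an open range like "128G-"
        if system_ram >= lo and (hi is None or system_ram < hi):
            return parse_size(size)
    return 0

SPEC = "2G-4G:320M,4G-32G:1024M,32G-64G:1536M,64G-128G:2048M,128G-:4096M"
size_16g = crashkernel_for(SPEC, 16 << 30)
print(size_16g >> 20)
```

The 1024 MB reservation reported by dmesg on this system corresponds to the 4G-32G entry.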
root@bu-lab60v3-oob:~# grep -e "CRASH" -e "KEXEC" /boot/config-5.15.0-1031-bluefield
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
CONFIG_KEXEC_SIG=y
CONFIG_KEXEC_IMAGE_VERIFY_SIG=y
CONFIG_CRASH_DUMP=y
CONFIG_CRASH_CORE=y
CONFIG_KEXEC_CORE=y
CONFIG_HAVE_IMA_KEXEC=y
CONFIG_IMA_KEXEC=y
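The grep output can also be checked mechanically against a list of options. A small sketch: the `REQUIRED` list is an assumption about what kdump needs rather than something taken from this report, and the helper names are hypothetical.

```python
# Assumed minimum set for kdump; adjust to the actual requirements.
REQUIRED = ("CONFIG_KEXEC", "CONFIG_KEXEC_CORE", "CONFIG_CRASH_DUMP")

def enabled_options(config_text):
    """Return the set of CONFIG_* options that are built in (=y)."""
    opts = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and line.endswith("=y"):
            opts.add(line[:-2])
    return opts

def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to y in config_text."""
    enabled = enabled_options(config_text)
    return [opt for opt in required if opt not in enabled]

grep_output = """\
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
CONFIG_KEXEC_SIG=y
CONFIG_CRASH_DUMP=y
CONFIG_CRASH_CORE=y
CONFIG_KEXEC_CORE=y
CONFIG_IMA_KEXEC=y
"""
missing = missing_options(grep_output)
print(missing)
```

Note that the grep pattern used above only shows options with CRASH or KEXEC in their names, so an option such as CONFIG_PROC_VMCORE cannot be confirmed from this output and would need a check against the full config file.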
*** How to reproduce ***
When the crash is triggered manually with "echo c > /proc/sysrq-trigger", the system just hangs without showing any message or log.