Description : Ubuntu 10.04.2
Release : 10.04
When a server is configured with /boot as a separate partition, which is the default configuration when LVM installation is selected, the kdump mechanism fails systematically.
This happens because the ./scripts/init-bottom/0_kdump script that is loaded into the initrd.img file assumes that /boot is _ALWAYS_ a directory which contains the vmcoreinfo-$KVER file. The bug is in the following code :
KVER="`uname -r`"
INFO="$rootmnt/boot/vmcoreinfo-$KVER"
CRASHFILE="$rootmnt/var/crash/vmcore"
MAKEDUMPFILE="$rootmnt/usr/bin/makedumpfile"
LOG="$rootmnt/var/crash/vmcore.log"
VMCORE="/proc/vmcore"
# Check that this is a kexec kernel.
grep -q kdump_needed /proc/cmdline || exit 0
# Do NOT exit the script after this point, or the system will start
# booting inside the crash kernel.
. ./scripts/functions
# Make sure makedumpfile assumptions are satisfied.
while ! test -e "$INFO"; do
panic "kdump: Missing $INFO"
done
while ! test -x "$MAKEDUMPFILE"; do
panic "kdump: Missing $MAKEDUMPFILE"
done
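The test 'while ! test -e "$INFO"; do' fails if /boot is a separate partition, presumably because only the root filesystem is mounted under $rootmnt at that stage, which leaves $rootmnt/boot as an empty mount point.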
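The failing test can be mimicked outside the initramfs with a minimal sketch. It assumes that only the root filesystem is mounted under $rootmnt when the script runs, so a separate /boot shows up as an empty mount point. The helper name info_present and the temporary directories are illustrative and not part of the real 0_kdump script.

```shell
#!/bin/sh
# Sketch: mimic the "$INFO" existence test from 0_kdump against a fake root tree.
# info_present is a hypothetical helper, not part of the real script.
info_present() {
    rootmnt="$1"
    kver="$2"
    test -e "$rootmnt/boot/vmcoreinfo-$kver"
}

# Case 1: /boot is a plain directory on the root filesystem.
same_fs="$(mktemp -d)"
mkdir -p "$same_fs/boot"
: > "$same_fs/boot/vmcoreinfo-2.6.32-28-server"

# Case 2 (assumption): /boot is a separate partition that is not mounted,
# so only the empty mount point exists under the root tree.
separate="$(mktemp -d)"
mkdir -p "$separate/boot"

for tree in "$same_fs" "$separate"; do
    if info_present "$tree" 2.6.32-28-server; then
        echo "vmcoreinfo found"
    else
        echo "vmcoreinfo missing"
    fi
done
```

The second tree corresponds to the situation in which the real script would loop on panic "kdump: Missing $INFO".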
Reproducible: 100%
How to Reproduce :
Pre-requisite : a system or VM installed with LVM and /boot as a separate partition (default option for LVM installation)
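To confirm that a machine matches this pre-requisite, a check along the following lines may help. It is only a sketch: boot_is_separate is a hypothetical helper, and it assumes the mount table has the usual /proc/mounts layout with the mount point in the second field.

```shell
#!/bin/sh
# Sketch: report whether a mount table lists /boot as its own mount point.
# boot_is_separate is a hypothetical helper; the table defaults to /proc/mounts.
boot_is_separate() {
    mounts="${1:-/proc/mounts}"
    # Field 2 of each line is the mount point; exit 0 only if /boot is listed.
    awk '$2 == "/boot" { found = 1 } END { exit !found }' "$mounts"
}

if boot_is_separate; then
    echo "/boot is a separate partition: affected configuration"
else
    echo "/boot is a directory on the root filesystem"
fi
```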
1) Install the linux-crashdump package & dependencies
2) Increase the crashkernel= parameter to 128M if the RAM is below 2048M (LP Bug#785394) in /etc/grub.d/10_linux
3) Run sudo update-grub
4) Reboot the system
5) Force a panic with "echo c > /proc/sysrq-trigger"
The system will reboot to the kexec kernel with complete network access enabled :
# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-2.6.32-28-server root=/dev/mapper/Lucid--lvmS-root ro kdump_needed maxcpus=1 irqpoll reset_devices memmap=exactmap memmap=640K@0K memmap=130412K@33408K elfcorehdr=163820K
Workaround:
Copy the content of the /boot partition into the /boot directory. This is only valid until the next upgrade of the "linux-image-{version}" package.
How to apply the workaround :
6) Reboot the system
7) Copy the content of the /boot partition into the /boot directory
# df /boot
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vda1 233191 17563 203187 8% /boot
# sudo umount /boot
# sudo mount /dev/vda1 /mnt
# sudo cp -pr /mnt/* /boot
# sudo umount /mnt
# sudo mount -a
# sudo echo c > /proc/sysrq-trigger
The system will correctly generate a crash dump
# find /var/crash
/var/crash
/var/crash/linux-image-2.6.32-28-server.0.crash
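The copy step of the workaround could be wrapped as in the sketch below. It is an illustration, not a tested fix: copy_boot_contents is a hypothetical helper, and the demonstration runs on temporary directories that stand in for the partition mounted on /mnt and for the unmounted /boot directory.

```shell
#!/bin/sh
# Sketch of the copy step: duplicate the files of the mounted /boot partition
# into the /boot directory of the root filesystem.
# copy_boot_contents is a hypothetical helper; SRC and DST are placeholders.
copy_boot_contents() {
    src="$1"
    dst="$2"
    # "$src"/. copies the directory contents, including dot files;
    # -p preserves mode, ownership and timestamps where permitted.
    cp -pr "$src"/. "$dst"/
}

# Demonstration on temporary directories instead of real mount points.
part="$(mktemp -d)"      # stands in for the partition mounted on /mnt
rootboot="$(mktemp -d)"  # stands in for the unmounted /boot directory
: > "$part/vmcoreinfo-2.6.32-28-server"
copy_boot_contents "$part" "$rootboot"
ls "$rootboot"
```

On a real system this would correspond to running copy_boot_contents /mnt /boot as root, between the "mount /dev/vda1 /mnt" and "umount /mnt" commands shown above. As noted, the copy goes stale at the next kernel package upgrade.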
ProblemType: Bug
DistroRelease: Ubuntu 10.04.02
Package: kexec-tools-1-2.0.1-1ubuntu3
Uname: Linux 2.6.32-28-server x86_64
Architecture: amd64