Comment 0 for bug 828731

Louis Bouchard (louis) wrote :

Description : Ubuntu 10.04.2
Release : 10.04

When a server is configured with the /boot as a separate partition, which is the default configuration when LVM installation is selected, the kdump mechanism fails systematically.

This happens because the ./scripts/init-bottom/0_kdump script, which is included in the initrd.img file, assumes that /boot is _ALWAYS_ a directory that contains the vmcoreinfo-$KVER file. The bug is in the following code :

   KVER="`uname -r`"
   INFO="$rootmnt/boot/vmcoreinfo-$KVER"
   CRASHFILE="$rootmnt/var/crash/vmcore"
   MAKEDUMPFILE="$rootmnt/usr/bin/makedumpfile"
   LOG="$rootmnt/var/crash/vmcore.log"
   VMCORE="/proc/vmcore"

   # Check that this is a kexec kernel.
   grep -q kdump_needed /proc/cmdline || exit 0

   # Do NOT exit the script after this point, or the system will start
   # booting inside the crash kernel.

   . ./scripts/functions

   # Make sure makedumpfile assumptions are satisfied.
   while ! test -e "$INFO"; do
           panic "kdump: Missing $INFO"
   done
   while ! test -x "$MAKEDUMPFILE"; do
           panic "kdump: Missing $MAKEDUMPFILE"
   done

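The test 'while ! test -e "$INFO"; do' fails if /boot is a separate partition: when the script runs, only the root filesystem is mounted under $rootmnt, so $rootmnt/boot is an empty mount point and the vmcoreinfo-$KVER file is never found.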
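One possible direction for a fix, given only as a sketch and not taken from the kexec-tools package: before testing for $INFO, the script could look up the /boot entry in the root filesystem's fstab and mount that device under $rootmnt/boot. The helper name boot_device_from_fstab is hypothetical, and the sketch assumes the fstab entry names a device node directly; an entry written as UUID=... or LABEL=... would need to be resolved first.

```shell
# Hypothetical helper (not part of kexec-tools): print the device that
# the fstab file given as $1 lists for the /boot mount point.
# Returns non-zero if the file is unreadable or has no /boot entry.
boot_device_from_fstab() {
    [ -r "$1" ] || return 1
    while read -r dev mnt rest; do
        # Skip comment lines and blank lines.
        case "$dev" in "#"*|"") continue ;; esac
        if [ "$mnt" = "/boot" ]; then
            echo "$dev"
            return 0
        fi
    done < "$1"
    return 1
}

# Possible use inside the initramfs script (assumption: $rootmnt is the
# mounted root filesystem, as in the code quoted above):
#   dev="$(boot_device_from_fstab "$rootmnt/etc/fstab")" &&
#       mount -o ro "$dev" "$rootmnt/boot"
```

When fstab has no /boot entry the helper returns non-zero, so a system with /boot as a plain directory would skip the mount and behave as it does today.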

Reproducible: 100%

How to Reproduce :

Pre-requisite : a system or VM installed with LVM and /boot as a separate partition (default option for LVM installation)

1) Install the linux-crashdump package & dependencies
2) Increase the crashkernel= parameter to 128M if the RAM is below 2048M (LP Bug#785394) in /etc/grub.d/10_linux
3) Run sudo update-grub
4) Reboot the system
5) Force a panic with "echo c > /proc/sysrq-trigger"
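Before forcing the panic in step 5, it can help to confirm that the crashkernel= change from step 2 actually reached the running kernel. The following is a minimal sketch; the function name crashkernel_value is hypothetical, and the exact value printed depends on how the parameter was written in /etc/grub.d/10_linux.

```shell
# Hypothetical helper: print the value of the crashkernel= parameter found
# in a kernel command-line file (default: /proc/cmdline).
# Returns non-zero if the parameter is absent.
crashkernel_value() {
    for word in $(cat "${1:-/proc/cmdline}"); do
        case "$word" in
            crashkernel=*)
                echo "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Possible use after the reboot in step 4:
#   crashkernel_value || echo "crashkernel= is not set, kdump will not load"
```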

The system will reboot to the kexec kernel with complete network access enabled :

 # cat /proc/cmdline
 BOOT_IMAGE=/vmlinuz-2.6.32-28-server root=/dev/mapper/Lucid--lvmS-root ro kdump_needed maxcpus=1 irqpoll reset_devices memmap=exactmap memmap=640K@0K memmap=130412K@33408K elfcorehdr=163820K

Workaround:
Copy the content of the /boot partition into the /boot directory. This is only valid until the next upgrade of the "linux-image-{version}" package.

How to work around :

6) Reboot the system
7) Copy the content of the /boot partition into the /boot directory
   # df /boot
   Filesystem 1K-blocks Used Available Use% Mounted on
   /dev/vda1 233191 17563 203187 8% /boot
   # sudo umount /boot
   # sudo mount /dev/vda1 /mnt
   # sudo cp -pr /mnt/* /boot
   # sudo umount /mnt
   # sudo mount -a
   # echo c | sudo tee /proc/sysrq-trigger

The system will then correctly generate a crash dump :
   # find /var/crash
    /var/crash
    /var/crash/linux-image-2.6.32-28-server.0.crash

ProblemType: Bug
DistroRelease: Ubuntu 10.04.2
Package: kexec-tools-1-2.0.1-1ubuntu3
Uname: Linux 2.6.32-28-server x86_64
Architecture: amd64