0_kdump uses dynamic makedumpfile(8) binary, which fails horribly

Bug #785425 reported by Daniel Richard G. on 2011-05-19
This bug affects 4 people
Affects               Importance  Assigned to
kexec-tools (Ubuntu)  Medium      Canonical Foundations Team
Lucid                 Undecided   Unassigned
Oneiric               Undecided   Unassigned

Bug Description

SRU Request for Lucid/Maverick/Natty/Oneiric:

[Impact] - When a server is configured with /boot on a separate partition, which is the default when an LVM installation is selected, the kdump mechanism fails every time.

[Development/Stable Fix] - This has been fixed in the development version with the addition of the following:

  * Backport changes to fix kdump functionality. LP: #828731.
    - debian/kdump.initramfs: call /usr/bin/makedumpfile via a chroot command,
      so that if makedumpfile is statically linked, we get proper library
      resolution. Thanks to Louis Bouchard <email address hidden> for
      the patch. LP: #785425.
    - debian/kdump.initramfs: handle the possibility that /usr, /boot, or
      /var is on a separate filesystem and needs to be manually mounted before
      calling makedumpfile. LP: #828731.
    - Depend on makedumpfile, without which the initramfs script doesn't work.
    - Fix an unnecessary bashism.
    - Only install the kdump initramfs script and depend on makedumpfile on
      architectures that makedumpfile supports.

[Test Case] - See 'how to reproduce' below.
[Regression Potential] - Need to ensure kdump works correctly in backported versions.

--

Binary package hint: kexec-tools

This concerns kexec-tools 1:2.0.1-2ubuntu4 in Ubuntu Natty.

I am trying to get a kernel crash dump. Instead of getting a nice vmcore file in /var/crash/, I get a vmcore.log file with the following:

/root/usr/bin/makedumpfile: error while loading shared libraries: libdw.so.1: cannot open shared object file: No such file or directory

Problem number 1: /usr/bin/makedumpfile is a dynamically-linked executable. Not the kind of thing you want to run in an initramfs environment.

There is, however, a "makedumpfile-static" package, which not only provides /bin/makedumpfile-static, but also an initramfs-tools hook so that this statically-linked binary is copied in as /bin/makedumpfile in the initrd.

Problem number 2: The 0_kdump script is hard-coded to use the dynamic binary in root:/usr/bin/. It is not at all aware of initrd:/bin/makedumpfile .
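
For illustration, the selection logic that the script lacks could look something like the following. This is only a sketch: pick_makedumpfile is a hypothetical helper name, not something in kexec-tools, and the two paths in the usage comment are simply the ones named in this report.

```sh
# Hypothetical helper: print the first candidate path that is an
# executable regular file; return non-zero if none qualifies.
pick_makedumpfile() {
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Possible use inside an initramfs script (assumption, not the shipped code):
# MAKEDUMPFILE=$(pick_makedumpfile /bin/makedumpfile "$rootmnt/usr/bin/makedumpfile") \
#     || panic "kdump: no makedumpfile found"
```

Listing the initrd copy first means the statically-linked binary wins whenever makedumpfile-static is installed.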

Louis Bouchard (louis) wrote :

The root cause for this issue is not related to kexec-tools, but is a makedumpfile issue.

The upstream version of makedumpfile that was introduced with 11.04 Natty now ships a dynamically linked makedumpfile and sets the static version aside in its own separate package.

I will look to see if something can be done about that and will keep you posted. FYI, this also happens in Oneiric.

Daniel Richard G. (skunk) wrote :

Is your intent to get rid of the "makedumpfile" and "makedumpfile-static" duo, replacing it with a single "makedumpfile" package containing a statically-linked binary?

Louis Bouchard (louis) wrote :

Since it was introduced upstream, it might be a longer route to take. Right now, I am more inclined to ask for makedumpfile-static to be included in main, installed by linux-crashdump and add the adequate call in 0_kdump. But this is not my decision.

Daniel Richard G. (skunk) wrote :

One thing I've been wondering is what purpose the dynamically-linked makedumpfile(8) is intended to serve. The program only ever runs in a Linux crash environment, where two kernels are loaded in memory, and everything is running out of the second kernel's initrd---is the 800kB difference in the binary sizes that big a deal, or something?

Could be worth filing a wishlist bug on the Debian side....

Daniel Richard G. (skunk) wrote :

This bug is still present in Oneiric.

Louis Bouchard (louis) wrote :

The following patch uses the statically linked version of makedumpfile as provided by the makedumpfile-static package.

The makedumpfile-static package which is currently in the Universe archive will need to be included into the Main archive as well as its makedumpfile counterpart.

Louis Bouchard (louis) wrote :

The following patch adds a dependency to the linux-crashdump metapackage so the makedumpfile-static package is also installed.

This dependency is required in order for the previous patch to be functional.

Changed in kexec-tools (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in kexec-tools (Ubuntu):
assignee: Canonical Kernel Team (canonical-kernel-team) → Canonical Foundations Team (canonical-foundations)

The attachment "kdump.initramfs_makedumpfile_static-LP785425.patch" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch, you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of ubuntu-sponsors, please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Daniel Richard G. (skunk) wrote :

On the second patch: Why not drop the "makedumpfile" package dependency? No need to install two binaries of the same program, especially when one of them isn't even fit for purpose.

Louis Bouchard (louis) wrote :

@Richard: mostly because makedumpfile can be used in contexts other than the kdump realm. I've seen situations where it was used to reduce the size of an existing vmcore file.

A longer term solution might also be to go back to one single package with statically linked executable. I still have to discuss the issue with the Debian upstream maintainer.

Daniel Richard G. (skunk) wrote :

Sure, the makedumpfile(8) program has other uses, but we're talking about the dependencies of the linux-crashdump metapackage. There's really only one usage context that's relevant for that.

I wouldn't suggest dropping the makedumpfile package altogether in favor of makedumpfile-static, but dropping makedumpfile-static and putting a static binary in makedumpfile makes all the sense in the world. I've yet to hear any reason why one would need/want the dynamic binary when they already have a static one. This is a textbook example of a program that belongs in /bin rather than /usr/bin.

Colin Watson (cjwatson) wrote :

We have plenty of dynamically-linked executables in the initramfs already. I don't see why we shouldn't just fix this one, rather than introducing more static linkage (which is a maintenance pain). I agree that there are problems here - dynamic linking just isn't intrinsically one of them.

Louis Bouchard (louis) wrote :

@cjwatson

thanks for clarifying this for me.

Indeed, when looking in the initramfs context I see this :

(initramfs) . /root/usr/bin/ldd /root/usr/bin/makedumpfile
        linux-vdso.so.1 => (0x00007fff0f9ff000)
        libdw.so.1 => not found
        libelf.so.1 => not found
        libz.so.1 => not found
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f419e171000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f419e514000)
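
As a side note, the unresolved libraries can be pulled out of ldd output like the above with a one-line filter. This is a sketch only; missing_libs is a hypothetical name, and it assumes awk is available in the environment where it runs.

```sh
# Hypothetical helper: read ldd output on stdin and print the name of
# every library that ldd reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Possible use (assumption):
# ldd /usr/bin/makedumpfile | missing_libs
```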

So I ran a new test, reverting both proposed patches and using the following change in /usr/share/initramfs-tools/scripts/init-bottom/0_kdump :

--- 0_kdump.orig 2011-11-23 14:32:29.113580047 +0100
+++ 0_kdump 2011-11-23 14:22:35.232067873 +0100
@@ -19,6 +19,7 @@
 MAKEDUMPFILE="$rootmnt/usr/bin/makedumpfile"
 LOG="$rootmnt/var/crash/vmcore.log"
 VMCORE="/proc/vmcore"
+export LD_LIBRARY_PATH="/root/lib:/root/usr/lib:/root/lib/x86_64-linux-gnu"

 # Check that this is a kexec kernel.
 grep -q kdump_needed /proc/cmdline || exit 0

Using the LD_LIBRARY_PATH here does allow for the dynamically-linked makedumpfile to work adequately.

So this confirms Colin's statement. I have yet to verify if this is the best place to make use of the LD_LIBRARY_PATH and to make it more generic.
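
One way the hard-coded path could be made more generic is to build it from the directories that actually exist under the root mount point. The following is a minimal sketch under that assumption: build_lib_path is a hypothetical helper, and the "*-linux-gnu*" glob is a guess at how the multiarch directories are named on the other architectures.

```sh
# Hypothetical helper: print a colon-separated library search path made of
# the library directories that exist under the given root mount point.
# POSIX sh has no "local", so the variables below are global.
build_lib_path() {
    root=$1
    path=
    for dir in "$root/lib" "$root/usr/lib" \
               "$root"/lib/*-linux-gnu* "$root"/usr/lib/*-linux-gnu*; do
        # An unmatched glob stays literal and is skipped by this test.
        [ -d "$dir" ] || continue
        path="${path:+$path:}$dir"
    done
    printf '%s\n' "$path"
}

# Possible use (assumption):
# LD_LIBRARY_PATH=$(build_lib_path "$rootmnt"); export LD_LIBRARY_PATH
```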

So I will remove the previous patches and post the new one once ready.

Thanks Colin for steering me in this direction.

Steve Langasek (vorlon) wrote :

On Wed, Nov 23, 2011 at 01:39:47PM -0000, Louis Bouchard wrote:
> So I ran a new test, reverting both proposed patches and using the
> following change in
> /usr/share/initramfs-tools/scripts/init-bottom/0_kdump :

> --- 0_kdump.orig 2011-11-23 14:32:29.113580047 +0100
> +++ 0_kdump 2011-11-23 14:22:35.232067873 +0100
> @@ -19,6 +19,7 @@
> MAKEDUMPFILE="$rootmnt/usr/bin/makedumpfile"
> LOG="$rootmnt/var/crash/vmcore.log"
> VMCORE="/proc/vmcore"
> +export LD_LIBRARY_PATH="/root/lib:/root/usr/lib:/root/lib/x86_64-linux-gnu"

> # Check that this is a kexec kernel.
> grep -q kdump_needed /proc/cmdline || exit 0

> Using the LD_LIBRARY_PATH here does allow for the dynamically-linked
> makedumpfile to work adequately.

> So this confirms Colin's statement. I have yet to verify if this is the
> best place to make use of the LD_LIBRARY_PATH and to make it more
> generic.

> So I will remove the previous patches and post the new one once ready.

It's somewhat unusual to call programs from the root filesystem via the
initramfs; instead of setting an LD_LIBRARY_PATH, I would suggest modifying
this script to chroot into the root mount point and running directly from
there.

(I would normally suggest copying the binary and its libraries into the
initramfs, but that may not be worthwhile here simply because the script
needs to write to the rootfs.)

If that's not practical for whatever reason, please at least use the
$rootmnt variable instead of hard-coding the /root path.

BTW, this is not very safe either:

  mount $rootmnt -o remount,rw

because there's no fsck before mounting, which is always done in the normal
boot case.

And there's no guarantee that /var/crash will be on the root
filesystem at all, as opposed to a submount for /var, which could make it
very difficult to find this data after reboot. But I suppose these are
comparatively minor bugs.
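
To illustrate the submount concern: a script could consult the root filesystem's fstab to see whether /var (or /usr, /boot) has its own entry. This is a rough sketch, not the code that was eventually committed; fstab_device is a hypothetical helper and assumes awk is available.

```sh
# Hypothetical helper: print the device that an fstab file lists for a
# mount point, or nothing if the mount point has no entry of its own.
# Usage: fstab_device MOUNTPOINT FSTAB_FILE
fstab_device() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { print $1; exit }' "$2"
}

# Possible use (assumption):
# dev=$(fstab_device /var "$rootmnt/etc/fstab")
# [ -n "$dev" ] && echo "/var is a separate filesystem on $dev"
```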

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

Louis Bouchard (louis) wrote :

@Steve

The reason chroot is not possible is that the intent of this script is to copy the content of /proc/vmcore to the root filesystem. Using a chroot would make things more complex, as part of /proc would need to be made visible.

Using $rootmnt is also my intent. This was only a quick test to see if using LD_LIBRARY_PATH would work, following cjwatson's advice.

Regarding the other comments, they pertain to the existing code that is not from me but from the current 0_kdump (kdump.initramfs) script which is part of the kexec-tools package.

As a final note, if chosen, the LD_LIBRARY_PATH option needs to be adapted to serve each architecture for which this can be used (i386, x86_64, arm & powerpc).

Kind regards,
--
Louis

Steve Langasek (vorlon) wrote :

On Thu, Nov 24, 2011 at 02:34:56PM -0000, Louis Bouchard wrote:

> As a final note, if chosen, the LD_LIBRARY_PATH option needs to be
> adapted to serve each architecture for which this can be used
> (i386, x86_64, arm & powerpc).

Yes, this is another reason that it would be simpler to mount /proc from the
script and chroot instead of passing LD_LIBRARY_PATH.

INFO="/boot/vmcoreinfo-$KVER"
CRASHFILE="/var/crash/vmcore"
MAKEDUMPFILE="/usr/bin/makedumpfile"
LOG="/var/crash/vmcore.log"

[...]

mount $rootmnt -o remount,rw
mount -n -o move /proc ${rootmnt}/proc
chroot $rootmnt $MAKEDUMPFILE -E -d 31 -i $INFO $VMCORE $CRASHFILE > $LOG 2>&1 || \
        rm -f $rootmnt/$CRASHFILE

chmod 400 $rootmnt/$CRASHFILE

mount $rootmnt -o remount,ro
reboot


Louis Bouchard (louis) wrote :

@steve

Agreed, using chroot in that context makes much more sense and simplifies the arch diffs.

I'll test a patch with that in mind shortly & will try to post it here.

Louis Bouchard (louis) wrote :

@steve

Here is the new patch I've just tested on 32 & 64 bit Oneiric, which is close to what you suggested. It does a chroot on makedumpfile and verifies that the files are present in $rootmnt.

One difference from your example is the use of 'mount -n --bind' instead of -o move /proc, as the latter mount command failed to locate /proc/mounts.
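
For readers without access to the attachment, the bind-mount variant might be sketched roughly as follows. This is not the attached patch: save_vmcore is a hypothetical function name, the umount step and the return code are additions for illustration, and only the paths and makedumpfile options are taken from this thread.

```sh
# Sketch only: save /proc/vmcore by running makedumpfile inside the real root.
# Usage: save_vmcore ROOTMNT KERNEL_VERSION
save_vmcore() {
    rootmnt=$1
    kver=$2
    crashfile=/var/crash/vmcore

    mount "$rootmnt" -o remount,rw
    # Bind-mount rather than move /proc, so /proc stays usable outside the chroot.
    mount -n --bind /proc "$rootmnt/proc"

    # The redirection happens outside the chroot, hence the $rootmnt prefix
    # on the log file but not on the arguments passed to makedumpfile.
    if chroot "$rootmnt" /usr/bin/makedumpfile -E -d 31 \
            -i "/boot/vmcoreinfo-$kver" /proc/vmcore "$crashfile" \
            > "$rootmnt/var/crash/vmcore.log" 2>&1; then
        chmod 400 "$rootmnt$crashfile"
        rc=0
    else
        # Drop a partial dump so it is not mistaken for a complete one.
        rm -f "$rootmnt$crashfile"
        rc=1
    fi

    umount "$rootmnt/proc"
    mount "$rootmnt" -o remount,ro
    return "$rc"
}
```

Because the binary runs inside the chroot, the dynamic loader resolves libraries from the real root filesystem and no per-architecture LD_LIBRARY_PATH is needed.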

Steve Langasek (vorlon) wrote :

Hi Louis,

+
+log_begin_msg "Saving vmcore from kernel crash"
+
+mount $rootmnt -o remount,rw
+
 # Make sure makedumpfile assumptions are satisfied.
-while ! test -e "$INFO"; do
+while ! test -e "$rootmnt/$INFO"; do
        panic "kdump: Missing $INFO"
 done
-while ! test -x "$MAKEDUMPFILE"; do
- panic "kdump: Missing $MAKEDUMPFILE"
+
+while ! test -x "$rootmnt/$MAKEDUMPFILE"; do
+ panic "kdump: Missing $rootmnt/$MAKEDUMPFILE"
 done

-log_begin_msg "Saving vmcore from kernel crash"
-
-mount $rootmnt -o remount,rw

This doesn't look like a good change to me. Remounting the rootfs read-write before calling panic just increases the chance that we'll have to fsck on the next reboot due to an unclean unmount. Was there a particular reason for this earlier mounting?

Louis Bouchard (louis) wrote :

Hello Steve,

Mounting the root filesystem Read/Write is required in order to save the vmcore information on the root filesystem, which is the sole intent of this script.

This is not introduced by the patch, but is present in the original script (Line 41). And panic is not called anywhere in there either. Once the vmcore gets copied to $rootmnt/var/crash, the root FS is remounted Read Only just before the end of the script which terminates by invoking reboot :

 mount $rootmnt -o remount,ro
 reboot

I hope I'm being clear enough.

Steve Langasek (vorlon) wrote :

On Fri, Dec 02, 2011 at 07:29:50AM -0000, Louis Bouchard wrote:
> Mounting the root filesystem Read/Write is required in order to save the
> vmcore information on the root filesystem, which is the sole intent of
> this script.

Yes, I understand that. I was referring to the reordering done in your
patch, which moved the rw remount call ahead of two checks that, if they
fail, call panic().

I've dropped this particular change in committing to bzr, because I don't
see any reason it's needed. Sounds like you don't either. :)

Thanks,

Louis Bouchard (louis) wrote :

caribou silently makes note of not commenting before 3rd cup of coffee in the morning...

You're right, there is no need to have it R/W in order to test existence of those files. And I missed the panic calls altogether.
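
The ordering being discussed, checking first while the root filesystem is still read-only and remounting only afterwards, could be sketched like this. check_kdump_prereqs is a hypothetical helper, not part of the committed script; the messages mirror the panic strings quoted above.

```sh
# Hypothetical helper: verify the files makedumpfile needs while the root
# filesystem is still mounted read-only. Prints the first missing item to
# stderr and returns non-zero, so the caller can panic before any remount.
# Usage: check_kdump_prereqs ROOTMNT INFO_PATH MAKEDUMPFILE_PATH
check_kdump_prereqs() {
    rootmnt=$1
    info=$2
    makedumpfile=$3
    if [ ! -e "$rootmnt$info" ]; then
        echo "kdump: Missing $info" >&2
        return 1
    fi
    if [ ! -x "$rootmnt$makedumpfile" ]; then
        echo "kdump: Missing $makedumpfile" >&2
        return 1
    fi
    return 0
}

# Possible use (assumption): run the checks, then remount read-write.
# check_kdump_prereqs "$rootmnt" "$INFO" "$MAKEDUMPFILE" || panic "kdump: prerequisites missing"
# mount "$rootmnt" -o remount,rw
```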

So this is officially my first Ubuntu patch :-D

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package kexec-tools - 1:2.0.2-3ubuntu2

---------------
kexec-tools (1:2.0.2-3ubuntu2) precise; urgency=low

  * debian/kdump.initramfs: call /usr/bin/makedumpfile via a chroot command,
    so that if makedumpfile is statically linked, we get proper library
    resolution. Thanks to Louis Bouchard <email address hidden> for
    the patch. LP: #785425.
  * debian/kdump.initramfs: handle the possibility that /usr, /boot, or
    /var is on a separate filesystem and needs to be manually mounted before
    calling makedumpfile. LP: #828731.
  * Depend on makedumpfile, without which the initramfs script doesn't work.
 -- Steve Langasek <email address hidden> Mon, 02 Jan 2012 22:43:33 -0800

Changed in kexec-tools (Ubuntu):
status: Triaged → Fix Released
Daniel Richard G. (skunk) wrote :

Excellent! Thank you, Louis, for getting this in.

Just one nit---that should read "... so that if makedumpfile is *dynamically* linked, we get ..."

Louis Bouchard (louis) wrote :

@daniel

I really wanted to get the whole kexec/kdump functionality working for 12.04, which will be an LTS release. Thanks to Steve for helping me out on this. Regarding the wording of the changelog, I'm not sure if this can be changed. Maybe Steve will want to comment.

Steve Langasek (vorlon) wrote :

On Tue, Jan 03, 2012 at 09:25:25AM -0000, Louis Bouchard wrote:
> I really wanted to get the whole kexec/kdump functionality working for
> 12.04 which will be LTS. thanks to Steve for helping me out on this.
> Regarding the wording of the changelog, not sure if this can be changed.
> Maybe Steve will want to comment.

Sure, it's possible to fix the changelog, but that's not something that's
worth an upload all on its own. :) I've committed the change to the UDD
branch for the package so that it can be picked up next time kexec-tools
needs to be uploaded.


Chris J Arges (arges) on 2012-01-31
description: updated
George Politis (gpolitis) wrote :

I see this is fixed with version kexec-tools 1:2.0.2-3ubuntu2. However, Oneiric is stuck with kexec-tools 1:2.0.2-1ubuntu3. Any chance to have this package updated?

Louis Bouchard (louis) wrote :

@666f6f

You might want to have a look at bug LP: #828731, which has the request for backporting this package to all previous releases. Hopefully this should become available soon.

George Politis (gpolitis) wrote :

Great, thank you Louis!

Hello Daniel, or anyone else affected,

Accepted kexec-tools into oneiric-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

tags: added: verification-needed
Changed in kexec-tools (Ubuntu Oneiric):
status: New → Fix Committed
Martin Pitt (pitti) wrote :

Hello Daniel, or anyone else affected,

Accepted kexec-tools into lucid-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Clint Byrum (clint-fewbar) wrote :

Can somebody explain what happened to version 1:2.0.1-1ubuntu4? It is not in lucid-proposed, but is sort of referenced in the upload of 1:2.0.1-1ubuntu5

?

On Fri, Mar 09, 2012 at 07:09:26PM -0000, Clint Byrum wrote:
> can somebody explain what happened to version 1:2.0.1-1ubuntu4 ? It is
> not in lucid-proposed, but is sort of referenced in the upload of
> 1:2.0.1-1ubuntu5

Was removed from lucid-proposed to avoid accidental promotion given the
verification-done tag on the bug.

Louis Bouchard (louis) wrote :

Hi Clint,

On 09/03/2012 20:09, Clint Byrum wrote:
> can somebody explain what happened to version 1:2.0.1-1ubuntu4 ? It is
> not in lucid-proposed, but is sort of referenced in the upload of
> 1:2.0.1-1ubuntu5
>
> ?
>

There was one issue specific to Lucid that made the -proposed version
fail. I have re-submitted a debdiff for a working version which is
waiting to be rebuilt & resubmitted. The "lucid-proposed" debdiff is the
most current one.

HTH,

...Louis
--
Louis Bouchard
Backline Support Analyst
Canonical Ltd
Ubuntu support: http://landscape.canonical.com

Hello Daniel, or anyone else affected,

Accepted kexec-tools into lucid-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in kexec-tools (Ubuntu Lucid):
status: New → Fix Committed
Louis Bouchard (louis) wrote :

@clint

I can confirm that the new kexec-tools package does work correctly on Lucid, though this specific problem with the dynamic makedumpfile did not apply to Lucid.

So no regression on this one, everything works fine.

Steve Langasek (vorlon) on 2012-03-15
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package kexec-tools - 1:2.0.1-1ubuntu5

---------------
kexec-tools (1:2.0.1-1ubuntu5) lucid-proposed; urgency=low

  * Backport changes to fix kdump functionality. LP: #828731.
    - debian/kdump.initramfs: call /usr/bin/makedumpfile via a chroot command,
      so that if makedumpfile is statically linked, we get proper library
      resolution. Thanks to Louis Bouchard <email address hidden> for
      the patch. LP: #785425.
    - debian/kdump.initramfs: handle the possibility that /usr, /boot, or
      /var is on a separate filesystem and needs to be manually mounted before
      calling makedumpfile. LP: #828731.
    - Depend on makedumpfile, without which the initramfs script doesn't work.
    - Fix an unnecessary bashism.
    - Only install the kdump initramfs script and depend on makedumpfile on
      architectures that makedumpfile supports.
 -- Louis Bouchard <email address hidden> Mon, 05 Mar 2012 15:10:26 -0000

Changed in kexec-tools (Ubuntu Lucid):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package kexec-tools - 1:2.0.2-1ubuntu4

---------------
kexec-tools (1:2.0.2-1ubuntu4) oneiric-proposed; urgency=low

  * Backport changes to fix kdump functionality. LP: #828731.
    - debian/kdump.initramfs: call /usr/bin/makedumpfile via a chroot command,
      so that if makedumpfile is statically linked, we get proper library
      resolution. Thanks to Louis Bouchard <email address hidden> for
      the patch. LP: #785425.
    - debian/kdump.initramfs: handle the possibility that /usr, /boot, or
      /var is on a separate filesystem and needs to be manually mounted before
      calling makedumpfile. LP: #828731.
    - Depend on makedumpfile, without which the initramfs script doesn't work.
    - Fix an unnecessary bashism.
    - Only install the kdump initramfs script and depend on makedumpfile on
      architectures that makedumpfile supports.
 -- Chris J Arges <email address hidden> Wed, 18 Jan 2012 14:52:58 -0600

Changed in kexec-tools (Ubuntu Oneiric):
status: Fix Committed → Fix Released