Hard-coded crashkernel=... memory reservation in /etc/grub.d/10_linux is insufficient

Bug #785394 reported by Daniel Richard G.
This bug affects 8 people
Affects               Status         Importance   Assigned to     Milestone
grub (Ubuntu)         Fix Released   Undecided    Unassigned
  Precise             Won't Fix      Undecided    Unassigned
kexec-tools (Ubuntu)  Fix Released   Medium       Chris J Arges

Bug Description

Binary package hint: grub-pc

This concerns grub-pc 1.99~rc1-13ubuntu3 in Ubuntu Natty.

The /etc/grub.d/10_linux file contains this snippet:

    # add crashkernel option if we have the required tools
    if [ -x "/usr/bin/makedumpfile" ] && [ -x "/sbin/kexec" ]; then
        GRUB_CMDLINE_EXTRA="$GRUB_CMDLINE_EXTRA crashkernel=384M-2G:64M,2G-:128M"
    fi

I am on a system with 2GB of RAM (reported as 2038MB), and according to the kernel startup messages, 64MB is reserved for the crash kernel.

Unfortunately, this does not appear to be enough memory for the regular Ubuntu kernel to boot. I am attaching a kernel log obtained via serial cable; it shows the initial boot, a crash in the kernel's video-driver-related code, the subsequent crashkernel boot, and then an apparent "out of memory" kernel panic. (A side effect of the "double crash" is that the system is left unresponsive, requiring a manual reset instead of rebooting itself automatically.)

If I double the memory numbers in the crashkernel=... argument, so that the reservation is 128MB, the system correctly goes on to attempt a vmcore dump and reboot.
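
For reference, a ranged crashkernel= value selects the first range that contains the system's RAM size. The following is a minimal sketch of that selection in POSIX shell, not the kernel's own parser. It assumes only M and G suffixes, an inclusive lower bound and an exclusive upper bound; the function names to_mib and crashkernel_reservation are made up for illustration.

```sh
# Convert a size such as 384M or 2G to MiB (only M and G are handled here).
to_mib() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# Print the size that a ranged crashkernel= value selects for the given
# amount of system RAM (in MiB). Prints nothing if no range matches.
# Assumption: lower bound inclusive, upper bound exclusive.
crashkernel_reservation() {
    spec=$1 ram_mib=$2
    old_ifs=$IFS
    IFS=,
    set -- $spec          # split the value into range:size entries
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        start=$(to_mib "${range%%-*}") || return 1
        end=${range#*-}   # empty for an open-ended range such as 2G-
        if [ -n "$end" ]; then
            end=$(to_mib "$end") || return 1
        fi
        if [ "$ram_mib" -ge "$start" ] && { [ -z "$end" ] || [ "$ram_mib" -lt "$end" ]; }; then
            echo "$size"
            return 0
        fi
    done
    return 0
}

crashkernel_reservation '384M-2G:64M,2G-:128M' 2038   # prints 64M
```

With 2038MB reported, the machine falls just under the 2G boundary, which is why it ends up with the 64M reservation rather than 128M.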

Daniel Richard G. (skunk) wrote :
Colin Watson (cjwatson) wrote :

Hmm. 128MiB seems like an awful lot to reserve, particularly towards the lower end of that memory range. It would be nice to not need quite so much.

Colin Watson (cjwatson) wrote :

It looks like it's OOMing while unpacking the initramfs. What does this output?

  ls -l /boot/initrd.img-$(uname -r)
  zcat /boot/initrd.img-$(uname -r) | wc -c

Andy Whitcroft (apw) wrote :

It seems that the default is to use the current kernel and initrd for the kexec kernel. The initrd is going to consume some 36MB of this 64MB window and the kernel itself another 4.5MB, so we are down to close to 20MB of memory left to boot in, which seems to be asking a lot.

It seems we could be a little more targeted and use a special initrd for this purpose.
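
The budget described above can be written out as a small calculation. This is only a rough sketch: the function name crash_headroom_kib is hypothetical, and the accounting ignores everything except the unpacked initrd and the kernel image.

```sh
# Print the memory (in KiB) left in a crashkernel reservation once the
# unpacked initrd and the kernel image are subtracted.
# All three arguments are in KiB; this is a deliberately crude estimate.
crash_headroom_kib() {
    reservation_kib=$1 initrd_kib=$2 kernel_kib=$3
    echo $(( reservation_kib - initrd_kib - kernel_kib ))
}

# Figures quoted in the comment: 64MB window, about 36MB of initrd,
# about 4.5MB (4608 KiB) of kernel.
crash_headroom_kib $((64 * 1024)) $((36 * 1024)) 4608   # prints 24064
```

24064 KiB is about 23.5MB, in line with the "close to 20MB" figure once other early-boot allocations are taken into account.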

Changed in kexec-tools (Ubuntu):
status: New → In Progress
assignee: nobody → Andy Whitcroft (apw)
importance: Undecided → Medium
Andy Whitcroft (apw) wrote :

It is possible to change the initramfs used for kexec relatively easily; see kdump.init.d in the source. We could therefore consider building an initrd with MODULES=dep; on my machine here that shrinks the initrd by 10MB compressed and nearly 30MB uncompressed:

  $ zcat /boot/initrd.img-2.6.39-3-generic | wc -c
  36718592
  $ zcat /boot/initrd.img-2.6.39-3-generic.dep | wc -c
  7216640

Daniel Richard G. (skunk) wrote :

Hello gentlemen,

# ls -l /boot/initrd.img-2.6.38-8-generic
-rw-r--r-- 1 root root 14389163 2011-05-19 17:29 /boot/initrd.img-2.6.38-8-generic

# zcat /boot/initrd.img-2.6.38-8-generic | wc -c
32537600

I use a "generic" (rather than "targeted") initrd because the system is installed using an imaging mechanism, so the same initrd needs to work for different systems.

Is it possible to build a crash-initrd that only contains the modules that would be needed (disk I/O, etc.), without it being specific to the system?

Daniel Richard G. (skunk) wrote :

This bug is still present in Oneiric. The 3.0.0-12-generic kernel, like its predecessors, fails to crash dump with 64M and succeeds with 128M.

The failure mode with 64M is a bit less clear, however: the log shows something about a bad IRQ.

Daniel Richard G. (skunk) wrote :

Note that even with enough memory reserved, a crash dump is still not produced due to bug #785425.

(Why won't Launchpad allow me to upload more than one attachment at a time?)

Louis Bouchard (louis) wrote :

Daniel, I have talked with Andy and Colin about the matter. I will be working with them on this bug as well as on #785425 and #828731 in order to get the kdump functionality fully working.

Daniel Richard G. (skunk) wrote :

Louis, thank you; that will be very much appreciated.

You may also want to look at bug #885071, against linux-crashdump, which consolidates some of these issues.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu):
status: New → Confirmed
Louis Bouchard (louis) wrote :

@apw

After our meeting at UDS-P, you suggested changing the following in /etc/initramfs-tools/initramfs.conf:

#
# MODULES: [ most | netboot | dep | list ]
#
# most - Add most filesystem and all harddrive drivers.
#
# dep - Try and guess which modules to load.
#
# netboot - Add the base modules, network modules, but skip block devices.
#
# list - Only include modules from the 'additional modules' list
#

MODULES=dep

from MODULES=most, and then rebuilding the initramfs.

I have tested this and I still see the OOM killer kicking in while the kexec booted kernel is trying to start.

One thing that I noticed is that the memory information displayed while the OOM killer is active is the following :
[ 2.810045] active_anon:5339 inactive_anon:33 isolated_anon:0
[ 2.810046] active_file:0 inactive_file:0 isolated_file:0
[ 2.810046] unevictable:1635 dirty:0 writeback:0 unstable:0
[ 2.810047] free:280 slab_reclaimable:1007 slab_unreclaimable:1363
[ 2.810048] mapped:428 shmem:40 pagetables:727 bounce:0
[ 2.813342] Node 0 DMA free:208kB min:4kB low:4kB high:4kB active_anon:328kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:320kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[ 2.817647] lowmem_reserve[]: 0 51 51 51
[ 2.818158] Node 0 DMA32 free:912kB min:912kB low:1140kB high:1368kB active_anon:21028kB inactive_anon:132kB active_file:0kB inactive_file:0kB unevictable:6540kB isolated(anon):0kB isolated(file):0kB present:52780kB mlocked:0kB dirty:0kB writeback:0kB mapped:1712kB shmem:160kB slab_reclaimable:4028kB slab_unreclaimable:5444kB kernel_stack:776kB pagetables:2908kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[ 2.823024] lowmem_reserve[]: 0 0 0 0
[ 2.823937] Node 0 DMA: 0*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 208kB
[ 2.826354] Node 0 DMA32: 3*4kB 0*8kB 0*16kB 0*32kB 0*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 908kB
[ 2.828162] 1675 total pagecache pages
[ 2.828597] 0 pages in swap cache
[ 2.828978] Swap cache stats: add 0, delete 0, find 0/0
[ 2.829585] Free swap = 0kB
[ 2.829919] Total swap = 0kB
[ 2.831199] 61419 pages RAM
[ 2.831580] 49505 pages reserved
[ 2.831957] 7209 pages shared
[ 2.832340] 11055 pages non-shared

There are 49505 reserved pages, compared with only 22984 pages on the running kernel when the VM is booted.
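
To put the page counts into more familiar units, here is a small conversion sketch. It assumes 4 KiB pages (the usual x86 page size); the helper name pages_to_mib is made up for this example.

```sh
# Convert a page count to MiB, rounding down.
# The page size defaults to 4 KiB, which is an assumption about this system.
pages_to_mib() {
    pages=$1 page_kib=${2:-4}
    echo $(( pages * page_kib / 1024 ))
}

pages_to_mib 61419   # "pages RAM" in the log above: prints 239
pages_to_mib 49505   # "pages reserved" in the log above: prints 193
pages_to_mib 22984   # reserved pages on the normally booted kernel: prints 89
```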

Of course, this issue goes away if the system/VM memory goes above 2G, but it is still an issue with smaller amounts of memory.

Bryan Quigley (bryanquigley) wrote :

This appears to be fixed in 13.04+. Any chance we can get a fix backported to 12.04?

If not, can we increase the memory by default?

Dave Chiluk (chiluk) wrote :

Yeah, it looks like the minimum amount of RAM required to finish writing a dump with a generic image and the default initrd is roughly 109-110M with the 3.8 kernel (I just tested it).

My $.02 on this matter is that the default values should work for default installs. Right now that is not the case, even if they have enough RAM to complete a dump.

My guess is that 64M works just fine for virtual images, but it's not enough for generic images. So perhaps we should make this value depend on whether "virtual" or "generic" is in the kernel name.

So how about something like the patch below? I haven't fully vetted it, but I want to see what people think.

--- 10_linux.orig 2013-12-11 14:33:00.384344265 -0600
+++ 10_linux 2013-12-11 15:13:42.485036921 -0600
@@ -73,7 +73,7 @@ done

 # add crashkernel option if we have the required tools
 if [ -x "/usr/bin/makedumpfile" ] && [ -x "/sbin/kexec" ]; then
-    GRUB_CMDLINE_EXTRA="$GRUB_CMDLINE_EXTRA crashkernel=384M-2G:92M,2G-:128M"
+    CRASH="on"
 fi

 linux_entry ()
@@ -120,6 +120,14 @@ EOF
  echo '$message'
 EOF
   fi
+  if [ "x${CRASH}" = "xon" ]; then
+    if [ "x${basename%%generic}" != "x${basename}" ]; then
+      args="$args crashkernel=384M-2G:110M,2G-:128M"
+    elif [ "x${basename%%virtual}" != "x${basename}" ]; then
+      args="$args crashkernel=384M-2G:64M,2G-:128M"
+    fi
+  fi
+
   if test -d /sys/firmware/efi && test -e "${linux}.efi.signed"; then
     cat << EOF
  linux ${rel_dirname}/${basename}.efi.signed root=${linux_root_device_thisversion} ro ${args}
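
The flavour test in the patch can be tried outside of 10_linux. The sketch below expresses the same suffix check with a case statement. The function name crashkernel_for_kernel is hypothetical, and the sizes are simply the ones proposed in the patch, not vetted values.

```sh
# Print the crashkernel= argument the patch would pick for a kernel image
# name, based on its flavour suffix. Prints nothing for other flavours,
# which matches the patch (no crashkernel= argument is added at all).
crashkernel_for_kernel() {
    case "$1" in
        *generic) echo "crashkernel=384M-2G:110M,2G-:128M" ;;
        *virtual) echo "crashkernel=384M-2G:64M,2G-:128M" ;;
    esac
}

crashkernel_for_kernel vmlinuz-3.8.0-19-generic   # prints crashkernel=384M-2G:110M,2G-:128M
crashkernel_for_kernel vmlinuz-3.8.0-19-virtual   # prints crashkernel=384M-2G:64M,2G-:128M
```

One consequence worth noting: with this logic, any flavour other than generic or virtual gets no reservation at all.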

Daniel Richard G. (skunk) wrote :

Bryan: Could you elaborate on how this issue appears to be fixed in 13.04? Was the memory reservation increased to 128MB, or is the kernel now capable of booting in 64MB? Given the lack of any updates here, I'm doubtful that any progress has been made at all.

Dave: Have you tried crash-booting a *-virtual kernel in 64MB, experimentally? The possibility is interesting, but it should be more than just a guess.

Louis Bouchard (louis) wrote :

I systematically change the setting on all my VMs to 128M, as I have had repeated failures with 64M. I think that the default should be raised to 128MB, which is the default on Debian anyway.

Daniel Richard G. (skunk) wrote :

Agreed. It's not clear that there is *any* standard Ubuntu kernel configuration that can boot in 64MB. And having that as a default is worse than useless, because the crash-kernel's OOM prevents the system from recovering automatically after a kernel crash.

Bryan Quigley (bryanquigley) wrote :

@Daniel
I was wrong; this is not fixed in 13.04+.

I ran some tests:
@Chiluk First up, VM images do need only 64 MB reserved:
Trusty 512 MB (m1.tiny) image on OpenStack: allocated 64 MB of RAM for the crash dump and it worked (dump size of 23 MB).

Desktop images fail:
Trusty 2047 MB (Vagrant/VirtualBox desktop image): allocated 64 MB of RAM for the crash dump and it hung.
Reran with 100 MB for the crash dump and it worked (dump size of 32 MB).

Is there any way to stop guessing here and generate the right number for the kernel you are using and the modules that have been loaded? (In other words, get the current kernel size and add a percentage buffer.) Per another private case, we may have situations where 128 MB is not enough.

If we can't auto-detect this (building upon chiluk's comment), I would suggest something like the following (the numbers can change, but I think we do need more levels):
< 1.5 GB of RAM: if it is a virtual kernel, use 64 MB; if not, use 128 MB
1.5 GB to 4 GB: 192 MB for everything
4 GB+: 256 MB
12 GB+: 512 MB (is there a point where we should stop?)
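
The tiers suggested above could be sketched as a shell function like the one below. The name proposed_crash_mib is hypothetical, the thresholds are the tentative ones from this comment, and treating each boundary value as belonging to the higher tier is an assumption.

```sh
# Print the suggested crashkernel reservation (in MiB) for a given amount
# of RAM (in MiB) and a kernel flavour ("virtual" or anything else).
# Assumption: a boundary value (1536, 4096, 12288) falls into the higher tier.
proposed_crash_mib() {
    ram_mib=$1 flavour=$2
    if [ "$ram_mib" -ge 12288 ]; then
        echo 512
    elif [ "$ram_mib" -ge 4096 ]; then
        echo 256
    elif [ "$ram_mib" -ge 1536 ]; then
        echo 192
    elif [ "$flavour" = virtual ]; then
        echo 64
    else
        echo 128
    fi
}

proposed_crash_mib 512 virtual    # the OpenStack m1.tiny case: prints 64
proposed_crash_mib 2047 generic   # the VirtualBox desktop case: prints 192
```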

Chris J Arges (arges)
no longer affects: grub2 (Ubuntu)
no longer affects: grub2 (Ubuntu Precise)
no longer affects: grub2 (Ubuntu Raring)
no longer affects: grub2 (Ubuntu Quantal)
no longer affects: grub2 (Ubuntu Saucy)
no longer affects: grub2 (Ubuntu Trusty)
Chris J Arges (arges)
no longer affects: kexec-tools (Ubuntu)
no longer affects: kexec-tools (Ubuntu Precise)
no longer affects: kexec-tools (Ubuntu Quantal)
no longer affects: kexec-tools (Ubuntu Raring)
no longer affects: kexec-tools (Ubuntu Saucy)
no longer affects: kexec-tools (Ubuntu Trusty)
Changed in grub (Ubuntu):
assignee: nobody → Chris J Arges (arges)
importance: Undecided → Medium
status: New → In Progress
Chris J Arges (arges) wrote :

The Ubuntu P/Q/R/S/T versions of the source package all show the same options:

./trusty/grub-0.97/debian/update-grub: extra_opts="$extra_opts crashkernel=384M-2G:64M,2G-:128M"
./raring/grub-0.97/debian/update-grub: extra_opts="$extra_opts crashkernel=384M-2G:64M,2G-:128M"
./quantal/grub-0.97/debian/update-grub: extra_opts="$extra_opts crashkernel=384M-2G:64M,2G-:128M"
./saucy/grub-0.97/debian/update-grub: extra_opts="$extra_opts crashkernel=384M-2G:64M,2G-:128M"
./precise/grub-0.97/debian/update-grub: extra_opts="$extra_opts crashkernel=384M-2G:64M,2G-:128M"

Which are from this changelog entry:
grub (0.97-29ubuntu24) intrepid; urgency=low

  * update-grub: Use a range of sizes when setting up crashkernel=
    - Less then 512M of physical memory, no crashkernel areas
    - 512M -> 2G use 64M for crashkernel
    - > 2G use 128M for crashkernel

 -- Ben Collins <email address hidden> Thu, 19 Jun 2008 11:21:26 -0400

Debian's grub source package doesn't have this change.

Louis Bouchard (louis) wrote :

BTW, this bug will need to be re-targeted to kexec-tools; the crashkernel= definition has recently been moved to kexec-tools:

http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/trusty/kexec-tools/trusty-proposed/revision/63

kexec-tools (1:2.0.3-4ubuntu2) trusty; urgency=low

  * Add "crashkernel=384M-2G:64M,2G-:128M" to GRUB command line for
    non-recovery entries. This supersedes an Ubuntu-specific patch
    previously carried in grub2 itself.

 -- Colin Watson <email address hidden> Tue, 12 Nov 2013 17:03:24 +0000

Louis Bouchard (louis) wrote :

== Test on a VM with 384MB of memory ==

1) Original setup - crashkernel=384M-2G:64M,2G-:128M
ubuntu@SaucyS:~$ free
             total       used       free     shared    buffers     cached
Mem:        306572     144412     162160          0      21900      73716
-/+ buffers/cache:      48796     257776
Swap:      1044476          0    1044476
ubuntu@SaucyS:~$ dmesg | grep crash
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.11.0-12-generic root=/dev/mapper/SaucyS--vg-root ro crashkernel=384M-2G:64M,2G-:128M console=ttyS0,115200
[ 0.000000] Reserving 64MB of memory at 288MB for crashkernel (System RAM: 384MB)

2) Proposed setup - crashkernel=384M-:128M
root@SaucyS:~# free
             total       used       free     shared    buffers     cached
Mem:        241036     146528      94508          0      21888      74100
-/+ buffers/cache:      50540     190496
Swap:      1044476          0    1044476
root@SaucyS:~# dmesg | grep crash
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.11.0-12-generic root=/dev/mapper/SaucyS--vg-root ro crashkernel=384M-:128M console=ttyS0,115200
[ 0.000000] Reserving 128MB of memory at 224MB for crashkernel (System RAM: 384MB)

3) Proposed setup on a 285MB VM
cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.11.0-12-generic root=/dev/mapper/SaucyS--vg-root ro crashkernel=384M-:128M console=ttyS0,115200

part of boot log:
 * Loading crashkernel... Memory for crashkernel is not reserved
Please reserve memory by passing "crashkernel=X@Y" parameter to the kernel
Then try loading kdump kernel [fail]

== Alternative tests ==

1) Forcing 128MB reservation on 285MB VM

root@SaucyS:~# free
             total       used       free     shared    buffers     cached
Mem:        140684     125108      15576          0       7988      67296
-/+ buffers/cache:      49824      90860
Swap:      1044476          0    1044476
root@SaucyS:~# dmesg | grep crash
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.11.0-12-generic root=/dev/mapper/SaucyS--vg-root ro crashkernel=128M console=ttyS0,115200
[ 0.000000] Reserving 128MB of memory at 128MB for crashkernel (System RAM: 284MB)
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.11.0-12-generic root=/dev/mapper/SaucyS--vg-root ro crashkernel=128M console=ttyS0,115200

2) Forcing 128MB reservation on 255MB VM
root@SaucyS:~# free
             total       used       free     shared    buffers     cached
Mem:        113048      99292      13756          0       6624      44132
-/+ buffers/cache:      48536      64512
Swap:      1044476          0    1044476
root@SaucyS:~# dmesg | grep crash
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.11.0-12-generic root=/dev/mapper/SaucyS--vg-root ro crashkernel=128M console=ttyS0,115200
[ 0.000000] Reserving 128MB of memory at 96MB for crashkernel (System RAM: 255MB)
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.11.0-12-generic root=/dev/mapper/SaucyS--vg-root ro crashkernel=128M console=ttyS0,115200

So even a VM with 256MB of RAM could boot with crashkernel=128M, but I think it is safe to re...
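
As a quick sanity check on the numbers above, the drop in the "total" column of free between the original and the proposed setup on the 384MB VM can be compared with the extra memory reserved. A minimal sketch; the helper name kib_delta_mib is made up here.

```sh
# Print the difference between two KiB figures, in MiB (rounded down).
kib_delta_mib() {
    echo $(( ($1 - $2) / 1024 ))
}

# "Mem: total" with the 64M reservation vs. with the 128M reservation.
kib_delta_mib 306572 241036   # prints 64
```

The 64 MiB difference is exactly the additional memory set aside when the reservation goes from 64M to 128M.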


Chris J Arges (arges)
Changed in kexec-tools (Ubuntu):
assignee: nobody → Chris J Arges (arges)
importance: Undecided → Medium
status: New → In Progress
Chris J Arges (arges) wrote :

Pushed changes into kexec-tools trusty. Let's test these changes first, then modify the grub packages in earlier releases via SRU if trusty is working fine.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package kexec-tools - 1:2.0.3-4ubuntu3

---------------
kexec-tools (1:2.0.3-4ubuntu3) trusty; urgency=low

  * Increase memory parameter crashkernel command line to 128M to avoid
    OOM kernel panic. (LP: #785394)
 -- Chris J Arges <email address hidden> Wed, 18 Dec 2013 10:54:08 -0600

Changed in kexec-tools (Ubuntu):
status: In Progress → Fix Released
Chris J Arges (arges)
no longer affects: kexec-tools (Ubuntu Precise)
no longer affects: kexec-tools (Ubuntu Quantal)
no longer affects: kexec-tools (Ubuntu Raring)
no longer affects: kexec-tools (Ubuntu Saucy)
Changed in grub (Ubuntu):
status: In Progress → Fix Released
assignee: Chris J Arges (arges) → nobody
importance: Medium → Undecided
Changed in grub (Ubuntu Precise):
importance: Undecided → Medium
assignee: nobody → Chris J Arges (arges)
Changed in grub (Ubuntu Quantal):
assignee: nobody → Chris J Arges (arges)
Changed in grub (Ubuntu Raring):
assignee: nobody → Chris J Arges (arges)
Changed in grub (Ubuntu Saucy):
assignee: nobody → Chris J Arges (arges)
Changed in grub (Ubuntu Precise):
status: New → In Progress
Changed in grub (Ubuntu Quantal):
status: New → In Progress
Changed in grub (Ubuntu Raring):
status: New → In Progress
Changed in grub (Ubuntu Saucy):
status: New → In Progress
Changed in grub (Ubuntu Quantal):
importance: Undecided → Medium
Changed in grub (Ubuntu Raring):
importance: Undecided → Medium
Changed in grub (Ubuntu Saucy):
importance: Undecided → Medium
Bryan Quigley (bryanquigley) wrote :

I have at least one example where I had to bump it above 128M (I picked 256M off the top of my head).
Summary of hardware: 16 total cores, 24G RAM, 37 drives, mix of xfs/nfs, drive size mostly 2 TB.
I'll see if we can get some internal examples.

Daniel Richard G. (skunk) wrote :

Bryan: Are you saying 256MB was needed in order for the crash kernel to boot, that 128MB was not enough?

(I'm not sure that there is any advantage to reserving more memory than needed, aside from the kernel one day growing to need 129MB)

Bryan Quigley (bryanquigley) wrote :

128MB didn't work
256MB did
Nothing else was tested on this machine.

Daniel Richard G. (skunk) wrote :

Ah, okay, that's an issue. Not only do we not have an easy way of measuring how much memory a kernel needs to boot, we don't know how that requirement varies depending on the system configuration...

Peter Matulis (petermatulis) wrote :

RFE

Let there be a separate tool/command that would provide an estimate of the memory required for the localhost.

Louis Bouchard (louis) wrote :

Please keep in mind that 128MB is a kind of catch-all that was defined when memory sizes did not go much higher than 16GB. I will ask on the mailing list that covers makedumpfile whether there is a way to come up with an estimate of the required memory, and will report back.

In the meantime, I don't see any issue with having to set the value manually for servers that have large amounts of memory.

Chris J Arges (arges)
no longer affects: grub (Ubuntu Raring)
no longer affects: grub (Ubuntu Quantal)
no longer affects: grub (Ubuntu Saucy)
Changed in grub (Ubuntu Precise):
assignee: Chris J Arges (arges) → nobody
status: In Progress → Confirmed
importance: Medium → Undecided
Mohammed Naser (mnaser) wrote :

I'd just like to report that I couldn't get it to work; it only worked after I raised the limit to 256M, on a server with 256GB of memory.

Steve Langasek (vorlon) wrote :

Precise Pangolin has reached end of life, so this bug will not be fixed for that release.

Changed in grub (Ubuntu Precise):
status: Confirmed → Won't Fix