kexec fails with OOM killer with the current crashkernel=128 value

Bug #1496317 reported by Louis Bouchard on 2015-09-16
This bug affects 2 people
Affects                Status        Importance  Assigned to  Milestone
kexec-tools (Ubuntu)   Fix Released  High        Unassigned
  Wily                 Fix Released  High        Unassigned
makedumpfile (Ubuntu)  Fix Released  Undecided   Unassigned
  Wily                 Fix Released  Undecided   Unassigned

Bug Description

[SRU justification]
This change fixes a memory shortage in the kexec-booted kernel that prevents kernel crash dumps from being created.

[Impact]
Without this fix, the kernel crash dump mechanism is unusable unless a workaround is applied.

[Fix]
Use smaller initrd.img files, stored in /var/lib/kdump and selected through the KDUMP_KERNEL and KDUMP_INITRD variables in the config file (/etc/default/kdump-tools).
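
As an illustration only, the relevant part of /etc/default/kdump-tools might look like the fragment below once the fix is in place. The two paths are the symbolic links named in the package changelog later in this report; the exact layout of the shipped file is an assumption.

```shell
# /etc/default/kdump-tools (fragment, illustrative sketch)
# Enable the kdump mechanism.
USE_KDUMP=1
# Point kdump at the symbolic links maintained under /var/lib/kdump
# (paths taken from the changelog in this report).
KDUMP_KERNEL=/var/lib/kdump/vmlinuz
KDUMP_INITRD=/var/lib/kdump/initrd.img
```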

[Test Case]
- Create a system running the standard Wily release
- Install the linux-crashdump metapackage
- Enable the kdump mechanism by setting USE_KDUMP=1 in /etc/default/kdump-tools
- Reboot to activate the crashkernel= kernel parameter
- Execute the following as root to trigger a kernel crash:
  echo c > /proc/sysrq-trigger

Without the fix, the kernel crash dump procedure will not complete (see Original description) and will be stopped by the OOM killer.

With the fix, the kernel crash dump procedure will complete normally.
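
The test case above can be sketched as a pair of small shell helpers. This is a minimal sketch, not part of the package: the function names enable_kdump and crashkernel_active are hypothetical, and sed -i assumes GNU sed as shipped on Ubuntu. The destructive steps (reboot, crash trigger) are left as comments.

```shell
# Set USE_KDUMP=1 in the kdump-tools defaults file (hypothetical helper).
enable_kdump() {
    kdump_conf="${1:-/etc/default/kdump-tools}"
    if grep -q '^USE_KDUMP=' "$kdump_conf"; then
        # Rewrite the existing assignment in place.
        sed -i 's/^USE_KDUMP=.*/USE_KDUMP=1/' "$kdump_conf"
    else
        # No assignment yet: append one.
        echo 'USE_KDUMP=1' >> "$kdump_conf"
    fi
}

# After the reboot, confirm that crashkernel= is on the kernel command line
# (hypothetical helper).
crashkernel_active() {
    cmdline_file="${1:-/proc/cmdline}"
    grep -q 'crashkernel=' "$cmdline_file"
}

# Typical sequence, run as root:
#   apt-get install linux-crashdump
#   enable_kdump
#   reboot
#   crashkernel_active && echo c > /proc/sysrq-trigger
```

Checking /proc/cmdline before triggering the crash avoids panicking a machine that has no memory reserved for the crash kernel.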

[Regression]
This is a new implementation, taken from upstream version 1.5.9. If the package is installed without the new variable definitions in /etc/default/kdump-tools, the old method is used and the original failure occurs.
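
A quick way to tell whether a system would fall back to the old method is to check for both variables in the config file. The helper below is a sketch with a hypothetical name (uses_small_initrd); it only tests that uncommented assignments are present, not that they point at valid files.

```shell
# Succeeds only if both KDUMP_KERNEL and KDUMP_INITRD are assigned
# (uncommented) in the given config file. Hypothetical helper.
uses_small_initrd() {
    kdump_conf="${1:-/etc/default/kdump-tools}"
    grep -q '^KDUMP_KERNEL=' "$kdump_conf" &&
        grep -q '^KDUMP_INITRD=' "$kdump_conf"
}

# Example:
#   uses_small_initrd || echo "old method in use: expect the OOM failure"
```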

[Original description of the problem]
On a very basic Wily image, install and set up linux-crashdump, then trigger a crash with sysrq-trigger. The kexec-booted kernel fails with the following:

[ 0.334592] Trying to unpack rootfs image as initramfs...
[ 0.649440] swapper/0 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[ 0.650332] swapper/0 cpuset=/ mems_allowed=0
[ 0.650856] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.2.0-7-generic #7-Ubuntu
[ 0.651788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 0.652691] ffff88002fd906c8 ffff88002fcf76a8 ffffffff817b0465 000000000000000f
[ 0.653556] ffff88002fd90000 ffff88002fcf7738 ffffffff817ae2b5 ffff88002ffc0e28
[ 0.654418] ffff880033c164c0 ffff880033c16530 ffff88002fd90068 ffff88002fcf7748
...
[ 0.682339] Mem-Info: [756/1897]
[ 0.682599] active_anon:0 inactive_anon:0 isolated_anon:0
[ 0.682599] active_file:2023 inactive_file:0 isolated_file:0
[ 0.682599] unevictable:14845 dirty:0 writeback:0 unstable:0
[ 0.682599] slab_reclaimable:1722 slab_unreclaimable:632
[ 0.682599] mapped:0 shmem:0 pagetables:0 bounce:0
[ 0.682599] free:0 free_pcp:2 free_cma:0
[ 0.685982] Node 0 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:540kB isolated(anon):0kB isolated(file):0kB present:632kB managed:548kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:8kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:540 all_unreclaimable? yes
[ 0.690450] lowmem_reserve[]: 0 0 0 0
[ 0.690928] Node 0 DMA32 free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:8092kB inactive_file:0kB unevictable:58840kB isolated(anon):0kB isolated(file):0kB present:130420kB managed:78064kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:6880kB slab_unreclaimable:2528kB kernel_stack:464kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:8kB local_pcp:8kB free_cma:0kB writeback_tmp:0kB pages_scanned:58840 all_unreclaimable? yes
[ 0.695547] lowmem_reserve[]: 0 0 0 0
[ 0.696049] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
[ 0.697335] Node 0 DMA32: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
[ 0.698641] 16868 total pagecache pages
[ 0.699075] 0 pages in swap cache
[ 0.699448] Swap cache stats: add 0, delete 0, find 0/0
[ 0.700029] Free swap = 0kB
[ 0.700356] Total swap = 0kB
[ 0.700679] 32763 pages RAM
[ 0.700991] 0 pages HighMem/MovableOnly
[ 0.701418] 13110 pages reserved
[ 0.701778] 0 pages cma reserved
[ 0.702139] 0 pages hwpoisoned
[ 0.702481] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[ 0.703423] Kernel panic - not syncing: Out of memory and no killable processes...
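
To put the page counts in the log above into perspective, they can be converted to kB. The sketch below assumes 4 kB pages (consistent with the 4kB smallest block in the free lists) and uses a hypothetical helper name, pages_to_kib. The unevictable figure works out to 59380 kB, which matches the 540kB + 58840kB reported for the two zones; these are likely the pages holding the initramfs being unpacked, and they take up most of the memory the crash kernel manages.

```shell
# Assumption: 4 kB pages.
PAGE_KIB=4

# Convert a page count from the Mem-Info dump to kB (hypothetical helper).
pages_to_kib() {
    echo $(( $1 * PAGE_KIB ))
}

pages_to_kib 32763   # "pages RAM"      -> 131052 (about 128 MB)
pages_to_kib 14845   # "unevictable"    -> 59380
pages_to_kib 13110   # "pages reserved" -> 52440
```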

Louis Bouchard (louis) on 2015-09-16
Changed in kexec-tools (Ubuntu):
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Louis Bouchard (louis-bouchard)
Louis Bouchard (louis) wrote :

With crashkernel=145, I get a bit more information:
[ 1.351621] Mem-Info:
[ 1.351897] active_anon:3939 inactive_anon:12 isolated_anon:0
[ 1.351897] active_file:1120 inactive_file:1335 isolated_file:0
[ 1.351897] unevictable:18546 dirty:0 writeback:0 unstable:0
[ 1.351897] slab_reclaimable:2235 slab_unreclaimable:1684
[ 1.351897] mapped:1440 shmem:16 pagetables:248 bounce:0
[ 1.351897] free:472 free_pcp:5 free_cma:0
[ 1.355454] Node 0 DMA free:480kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:632kB managed:548kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:56kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[ 1.359904] lowmem_reserve[]: 0 121 121 121
[ 1.360451] Node 0 DMA32 free:1408kB min:1408kB low:1760kB high:2112kB active_anon:15756kB inactive_anon:48kB active_file:4480kB inactive_file:5340kB unevictable:74184kB isolated(anon):0kB isolated(file):0kB present:147828kB managed:126836kB mlocked:1716kB dirty:0kB writeback:0kB mapped:5760kB shmem:64kB slab_reclaimable:8940kB slab_unreclaimable:6680kB kernel_stack:1008kB pagetables:992kB unstable:0kB bounce:0kB free_pcp:20kB local_pcp:20kB free_cma:0kB writeback_tmp:0kB pages_scanned:72640 all_unreclaimable? yes
[ 1.365436] lowmem_reserve[]: 0 0 0 0
[ 1.365914] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 472kB
[ 1.367484] Node 0 DMA32: 2*4kB (UM) 1*8kB (U) 1*16kB (M) 3*32kB (UEM) 0*64kB 2*128kB (EM) 2*256kB (UE) 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 1408kB
[ 1.369238] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 1.370177] 20930 total pagecache pages
[ 1.370604] 0 pages in swap cache
[ 1.370984] Swap cache stats: add 0, delete 0, find 0/0
[ 1.371563] Free swap = 0kB
[ 1.371887] Total swap = 0kB
[ 1.372222] 37115 pages RAM
[ 1.372542] 0 pages HighMem/MovableOnly
[ 1.372970] 5269 pages reserved
[ 1.373322] 0 pages cma reserved
[ 1.373684] 0 pages hwpoisoned
[ 1.374028] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[ 1.374968] [ 97] 0 97 1169 217 8 3 0 0 udev
[ 1.375899] [ 99] 0 99 6870 791 17 3 0 -1000 systemd-udevd
[ 1.376932] [ 105] 0 105 6612 641 18 3 0 0 udevadm
[ 1.377895] [ 106] 0 106 6738 774 16 3 0 0 systemd-udevd
[ 1.378920] [ 107] 0 107 6804 818 16 3 0 0 systemd-udevd
[ 1.379934] [ 108] 0 108 6804 848 17 3 0 0 systemd-udevd
[ 1.380967] [ 109] 0 109 6804 821 16 3 0 0 systemd-udevd
[ 1.381990] [ 110] 0 110 6804 843 16 3 ...

Louis Bouchard (louis) wrote :

One workaround is to edit /etc/initramfs-tools/initramfs.conf and change:

MODULES=most

to:

MODULES=dep
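
The workaround can be applied from a shell as sketched below. The helper name use_dep_modules is hypothetical, and sed -i assumes GNU sed. The initramfs has to be regenerated afterwards for the change to take effect; update-initramfs -u is the usual way to do that on Ubuntu, though this step is not spelled out in the comment above.

```shell
# Switch initramfs-tools from MODULES=most to MODULES=dep (hypothetical helper).
# Only an exact "MODULES=most" line is rewritten; anything else is left alone.
use_dep_modules() {
    initramfs_conf="${1:-/etc/initramfs-tools/initramfs.conf}"
    sed -i 's/^MODULES=most$/MODULES=dep/' "$initramfs_conf"
}

# Run as root, then rebuild the initramfs for the running kernel:
#   use_dep_modules
#   update-initramfs -u
```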

Louis Bouchard (louis) on 2015-09-23
Changed in kexec-tools (Ubuntu):
status: Confirmed → In Progress
Louis Bouchard (louis) wrote :

First proposal for a fix to be discussed

tags: added: patch
Louis Bouchard (louis) on 2015-11-30
Changed in kexec-tools (Ubuntu Wily):
status: New → In Progress
assignee: nobody → Louis Bouchard (louis-bouchard)
importance: Undecided → High
Changed in kexec-tools (Ubuntu):
status: In Progress → Fix Released
assignee: Louis Bouchard (louis-bouchard) → nobody
Louis Bouchard (louis) on 2015-11-30
description: updated
Chris J Arges (arges) wrote :

Accepted https://launchpad.net/ubuntu/+source/makedumpfile/1:1.5.8-4ubuntu1 into wily-proposed. Please verify and mark verification-done if this upload fixes this issue. Thanks.

Changed in makedumpfile (Ubuntu):
status: New → Fix Released
tags: added: verification-needed
Changed in makedumpfile (Ubuntu Wily):
status: New → Fix Committed
Louis Bouchard (louis) on 2015-12-08
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.5.8-4ubuntu1

---------------
makedumpfile (1:1.5.8-4ubuntu1) wily; urgency=medium

  * kdump-tools now makes use of smaller initrd.img files created in
    /var/lib/kdump. This avoids a potential OOM when the size of the
    initrd.img becomes larger (LP: #1496317). Implementation details are:

      - New kernel hooks are added to create smaller initrd.img files when new
        kernel packages are installed :
          /etc/kernel/postrm.d/kdump-tools
          /etc/kernel/postinst.d/kdump-tools

      - kdump-config is responsible for the maintenance of symbolic links used
        to point to the appropriate vmlinuz & initrd files. Link maintenance
        is done at each boot by kdump-config. New links will point to the
        files named after the running kernel.

      - The KDUMP_KERNEL and KDUMP_INITRD configuration variables are
        now enabled and use the symbolic links :
          /var/lib/kdump/vmlinuz
          /var/lib/kdump/initrd.img

      - kdump-config has been adapted to show the symbolic link information

 -- Louis Bouchard <email address hidden> Thu, 26 Nov 2015 15:38:15 +0100
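
To confirm that the two symbolic links described in the changelog exist and resolve to real files, a check along these lines can be used. This is a sketch with a hypothetical helper name (check_kdump_links); kdump-config show is the tool's own way to display the same information.

```shell
# Verify that vmlinuz and initrd.img in the kdump directory are symbolic
# links whose targets exist. Hypothetical helper.
check_kdump_links() {
    kdump_dir="${1:-/var/lib/kdump}"
    for name in vmlinuz initrd.img; do
        link="$kdump_dir/$name"
        # -L: the path is a symlink; -e: the target it resolves to exists.
        if [ -L "$link" ] && [ -e "$link" ]; then
            echo "$name -> $(readlink "$link")"
        else
            echo "$name: missing or dangling" >&2
            return 1
        fi
    done
}

# Example:
#   check_kdump_links && kdump-config show
```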

Changed in makedumpfile (Ubuntu Wily):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for makedumpfile has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Louis Bouchard (louis) on 2016-01-25
Changed in kexec-tools (Ubuntu Wily):
status: In Progress → Fix Released
assignee: Louis Bouchard (louis-bouchard) → nobody