Improve sysctl handling on kdump-tools

Bug #1845048 reported by Dan Streetman
This bug affects 1 person
Affects                 Status        Importance  Assigned to           Milestone
kdump-tools (Debian)    New           Unknown
kdump-tools (Ubuntu)    Fix Released  Undecided   Unassigned
makedumpfile (Ubuntu)
  Xenial                Opinion       Medium      Guilherme G. Piccoli
  Bionic                In Progress   Medium      Guilherme G. Piccoli
  Eoan                  Won't Fix     Medium      Guilherme G. Piccoli
  Focal                 In Progress   Medium      Guilherme G. Piccoli
  Groovy                Fix Released  Medium      Guilherme G. Piccoli

Bug Description

[impact]

According to the documentation, and matching past behavior, the KDUMP_SYSCTL variable in the /etc/default/kdump-tools file should be applied to the running system's kernel params at kdump 'load'. This is no longer the case: those params are not applied to the system's kernel param settings.

[test case]

Install linux-crashdump (and kdump-tools).

Edit the /etc/default/kdump-tools file to set the KDUMP_SYSCTL param to something other than default, e.g.:

KDUMP_SYSCTL="kernel.panic_on_oops=1 kernel.panic_on_warn=1"

Reboot, or unload/reload kdump, to pick up the changes to the file.

Check if the panic_on_warn param is set:

$ cat /proc/sys/kernel/panic_on_warn
0

The problem does not seem to be with sysctl itself, as calling it manually does work:

$ KDUMP_SYSCTL="kernel.panic_on_oops=1 kernel.panic_on_warn=1"
$ cat /proc/sys/kernel/panic_on_warn
0
$ sudo sysctl -w $KDUMP_SYSCTL
kernel.panic_on_oops = 1
kernel.panic_on_warn = 1
$ cat /proc/sys/kernel/panic_on_warn
1
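
The manual check above can be generalized to every pair listed in KDUMP_SYSCTL. The following is only a sketch: the function name check_kdump_sysctl and the PROC_SYS_ROOT override are made up for illustration and are not part of kdump-tools. It also assumes simple keys such as kernel.panic_on_warn, where replacing each dot with a slash gives the path under /proc/sys.

```shell
#!/bin/sh
# Sketch: report which KDUMP_SYSCTL settings are not active on the running kernel.
# PROC_SYS_ROOT is a hypothetical override so the check can run against a test tree.
PROC_SYS_ROOT="${PROC_SYS_ROOT:-/proc/sys}"

# Print one line per "key=value" pair in $1 whose live value differs.
# Returns 0 when everything matches, 1 otherwise.
check_kdump_sysctl() {
    rc=0
    for pair in $1; do
        key=${pair%%=*}
        want=${pair#*=}
        # Simplification: kernel.panic_on_warn -> kernel/panic_on_warn
        path="$PROC_SYS_ROOT/$(printf '%s' "$key" | tr . /)"
        have=$(cat "$path" 2>/dev/null) || have="<unreadable>"
        if [ "$have" != "$want" ]; then
            echo "MISMATCH $key: want=$want have=$have"
            rc=1
        fi
    done
    return $rc
}
```

One possible use is to source /etc/default/kdump-tools and then run check_kdump_sysctl "$KDUMP_SYSCTL" after a reboot or a kdump reload.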

[regression potential]

TBD

Tags: sts
Dan Streetman (ddstreet) wrote :

Also to clarify, manually unloading/reloading kdump doesn't set the params properly:

ubuntu@makedumpfile-e:~$ grep . /proc/sys/kernel/panic_on_warn
0
ubuntu@makedumpfile-e:~$ sudo kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1 kernel.panic_on_warn=1
KDUMP_COREDIR: /var/crash
crashkernel addr: 0xaa000000
   /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinuz-5.3.0-10-generic
kdump initrd:
   /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-5.3.0-10-generic
current state: ready to kdump

kexec command:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-5.3.0-10-generic root=PARTUUID=7975ae09-1bd6-4c5f-b460-5f5218cc2327 ro console=tty1 console=ttyS0 systemd.unit=kdump-tools-dump.service nr_cpus=1 irqpoll nousb ata_piix.prefer_ms_hyperv=0" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
ubuntu@makedumpfile-e:~$ sudo kdump-config unload
 * unloaded kdump kernel
ubuntu@makedumpfile-e:~$ sudo kdump-config load
 * Creating symlink /var/lib/kdump/vmlinuz
 * Creating symlink /var/lib/kdump/initrd.img
 * loaded kdump kernel
ubuntu@makedumpfile-e:~$ grep . /proc/sys/kernel/panic_on_warn
0

Thadeu Lima de Souza Cascardo (cascardo) wrote :

This was somehow removed between kdump-tools 1.5.5 and 1.5.9. I'll see if I can pinpoint the exact version in history and try to find some justification. But this is an old regression.

Thadeu Lima de Souza Cascardo (cascardo) wrote :

commit 9a77ef188b24742b595f69891b97d48dc997f5e2
Author: Louis Bouchard <email address hidden>
Date: Thu Jan 29 16:03:04 2015 +0100

    [debian] Fix panic_on_oops faulty handling

    Signed-off-by: Louis Bouchard <email address hidden>

diff --git a/debian/kdump-config b/debian/kdump-config
index 7ea26cab1c95..64489a74757f 100755
--- a/debian/kdump-config
+++ b/debian/kdump-config
@@ -310,9 +310,10 @@ function fadump_register()
        logger -t $NAME "fadump registered successfully"

        # Last step: make sure panic_on_oops is enabled
-       if [ -x /sbin/sysctl -a "$KDUMP_SYSCTL" != " " ] ; then
-               sysctl -w $KDUMP_SYSCTL >/dev/null
-       fi
+       PANIC_ON_OOPS=`/sbin/sysctl kernel.panic_on_oops | cut -d" " -f3`
+       if [ $PANIC_ON_OOPS != "1" ] ; then
+               sysctl -w kernel.panic_on_oops=1 >/dev/null
+       fi
 }

 # Returns: none. prints warnings or exit
@@ -393,9 +394,11 @@ function kdump_load()
        fi

        # Last step: make sure panic_on_oops is enabled
-       if [ -x /sbin/sysctl -a "$KDUMP_SYSCTL" != " " ] ; then
-               sysctl -w $KDUMP_SYSCTL >/dev/null
-       fi
+       PANIC_ON_OOPS=`/sbin/sysctl kernel.panic_on_oops | cut -d" " -f3`
+       if [ $PANIC_ON_OOPS != "1" ] ; then
+               sysctl -w kernel.panic_on_oops=1 >/dev/null
+       fi
+
 }
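
Reading the two hunks side by side: the removed lines passed every pair in $KDUMP_SYSCTL to sysctl -w, while the added lines only force kernel.panic_on_oops=1 and ignore the variable. As a rough sketch of what a load-time step honoring the whole variable could look like (this is not the code that was later merged; apply_kdump_sysctl and SYSCTL_BIN are hypothetical names used only here):

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# SYSCTL_BIN can be overridden, e.g. for a dry run.
SYSCTL_BIN="${SYSCTL_BIN:-/sbin/sysctl}"

apply_kdump_sysctl() {
    # Normalize whitespace so an empty or blank setting is skipped.
    settings=$(printf '%s' "$1" | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')
    [ -n "$settings" ] || return 0
    if ! command -v "$SYSCTL_BIN" >/dev/null 2>&1; then
        echo "sysctl not found: $SYSCTL_BIN" >&2
        return 1
    fi
    # Intentionally unquoted: each key=value pair must be its own argument.
    "$SYSCTL_BIN" -w $settings >/dev/null
}
```

Called as apply_kdump_sysctl "$KDUMP_SYSCTL" (as root), this would apply both kernel.panic_on_oops=1 and kernel.panic_on_warn=1 from the test case above.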

Changed in makedumpfile (Ubuntu):
status: New → Triaged
importance: Undecided → Low
Changed in makedumpfile (Ubuntu Xenial):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in makedumpfile (Ubuntu Bionic):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in makedumpfile (Ubuntu Eoan):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in makedumpfile (Ubuntu Focal):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
importance: Low → Medium
Changed in makedumpfile (Ubuntu Eoan):
importance: Undecided → Medium
Changed in makedumpfile (Ubuntu Bionic):
importance: Undecided → Medium
Changed in makedumpfile (Ubuntu Xenial):
importance: Undecided → Medium
status: New → Confirmed
Changed in makedumpfile (Ubuntu Bionic):
status: New → Confirmed
Changed in makedumpfile (Ubuntu Eoan):
status: New → Confirmed
Changed in makedumpfile (Ubuntu Focal):
status: Triaged → Confirmed
Guilherme G. Piccoli (gpiccoli) wrote :

Thanks Dan for the report and Cascardo for finding the offending commit that introduced this regression!

I think we have room for improvement here. My considerations are:
a) KDUMP_SYSCTL is a misleading name: it suggests sysctls that are set _when_ dumping, which is also an important thing to control. In fact, the option was meant for the sysctls that _trigger_ kdump.

b) Even if we re-enable this option, we still have no fine-grained configuration of sysctls during the dump.

So, my idea to fix/improve the situation is: let's have a KDUMP_PANIC_TRIGGERS variable (defaulting to panic_on_oops) and a KDUMP_SYSCTL_ON_DUMP variable that overrides the sysctls set on the normal kernel when the kdump kernel boots (and which should default to disabling hugepages, since hugepages "consuming" the system memory is a known problem when kdumping).

Cheers,

Guilherme
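
For illustration, the proposal in the comment above might translate into an /etc/default/kdump-tools fragment along these lines. This is a sketch of the idea as worded in the comment, not necessarily what was merged. The two variable names come from the comment; the values are assumptions, in particular the use of vm.nr_hugepages=0 as the way to disable hugepages.

```shell
# Sketch of the proposed split; values are illustrative assumptions.

# Sysctls applied to the normal kernel at kdump load time,
# controlling which events trigger a panic (and therefore a dump).
KDUMP_PANIC_TRIGGERS="kernel.panic_on_oops=1"

# Sysctls applied only once the kdump (crash) kernel has booted,
# overriding whatever the normal system configuration sets.
KDUMP_SYSCTL_ON_DUMP="vm.nr_hugepages=0"
```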

Guilherme G. Piccoli (gpiccoli) wrote :

I've reported a Debian bug with the proposed fixes; the merge request has the information about the approach used to deal with sysctl in kdump: https://salsa.debian.org/debian/makedumpfile/-/merge_requests/2

Cheers,

Guilherme

summary: - /etc/default/kdump-tools KDUMP_SYSCTL does not set sysctl params
+ Improve sysctl handling on kdump-tools
Changed in makedumpfile (Debian):
status: Unknown → New
Changed in makedumpfile (Ubuntu):
status: Confirmed → In Progress
Changed in makedumpfile (Ubuntu Xenial):
status: Confirmed → Opinion
Changed in makedumpfile (Ubuntu Bionic):
status: Confirmed → In Progress
Changed in makedumpfile (Ubuntu Eoan):
status: Confirmed → In Progress
Changed in makedumpfile (Ubuntu Focal):
status: Confirmed → In Progress
tags: added: sts
Brian Murray (brian-murray) wrote :

The Eoan Ermine has reached end of life, so this bug will not be fixed for that release.

Changed in makedumpfile (Ubuntu Eoan):
status: In Progress → Won't Fix
Mathew Hodson (mhodson)
affects: makedumpfile (Debian) → kdump-tools (Debian)
no longer affects: kdump-tools (Ubuntu Xenial)
no longer affects: kdump-tools (Ubuntu Bionic)
no longer affects: kdump-tools (Ubuntu Eoan)
no longer affects: kdump-tools (Ubuntu Focal)
no longer affects: kdump-tools (Ubuntu Groovy)
dann frazier (dannf) wrote :

Guilherme's MR was merged in 1.6.8-1, so marking Fix Released.

Changed in kdump-tools (Ubuntu):
status: New → Fix Released
Changed in makedumpfile (Ubuntu):
status: In Progress → Invalid
dann frazier (dannf) wrote :

Correction: looks like it was 1:1.6.7-4. git-debrebase confused me.

Changed in makedumpfile (Ubuntu Groovy):
status: In Progress → Fix Released
Mathew Hodson (mhodson)
no longer affects: makedumpfile (Ubuntu)