ISST-LTE:KVM:Ubuntu1804:BostonLC:boslcp3g4:ubuntu1604:'kdump-config show' shows Not ready to kdump even after doing kdump setup.

Bug #1769927 reported by bugproxy
Affects                           Status      Importance  Assigned to            Milestone
The Ubuntu-power-systems project  Incomplete  Critical    Canonical Kernel Team
makedumpfile (Ubuntu)             Incomplete  Critical    Canonical Kernel Team

Bug Description

Problem Description:
===================

'kdump-config show' shows 'Not ready to kdump' even after doing kdump setup on an Ubuntu 16.04.4 guest.

Steps to re-create:
==================

1. boslcp3 is up with BMC:1.20 & PNOR: 20180420 levels
2. Installed p8 compat guest with Ubuntu 16.04.4

root@boslcp3g4:~# uname -a
Linux boslcp3g4 4.4.0-122-generic #146-Ubuntu SMP Mon Apr 23 15:33:25 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
root@boslcp3g4:~# uname -r
4.4.0-122-generic

3. Installed linux-crashdump crash dump utility.

4. root@boslcp3g4:~# cat /etc/default/grub.d/kexec-tools.cfg
GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=4096M"
root@boslcp3g4:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.4.0-116-generic root=UUID=8be749b1-a5a8-4ab8-986c-ee4fcc61165e ro
root@boslcp3g4:~#

5. Rebooted the guest and checked kdump status.

root@boslcp3g4:~# /etc/init.d/kdump-tools status
● kdump-tools.service - Kernel crash dump capture service
  Loaded: loaded (/lib/systemd/system/kdump-tools.service; enabled; vendor preset: enabled)
  Active: active (exited) since Sun 2018-05-06 04:18:37 EDT; 7min ago
 Process: 2317 ExecStart=/etc/init.d/kdump-tools start (code=exited, status=0/SUCCESS)
Main PID: 2317 (code=exited, status=0/SUCCESS)
   Tasks: 0
  Memory: 0B
     CPU: 0
  CGroup: /system.slice/kdump-tools.service

May 06 04:18:37 boslcp3g4 systemd[1]: Starting Kernel crash dump capture service...
May 06 04:18:37 boslcp3g4 kdump-tools[2317]: Starting kdump-tools: * no crashkernel= parameter in the kernel cmdline
May 06 04:18:37 boslcp3g4 systemd[1]: Started Kernel crash dump capture service.
root@boslcp3g4:~# sed -i 's/USE_KDUMP=0/USE_KDUMP=1/g' /etc/default/kdump-tools

6. kdump state

root@boslcp3g4:~# kdump-config show
* no crashkernel= parameter in the kernel cmdline
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr:
  /var/lib/kdump/vmlinuz
kdump initrd:
  /var/lib/kdump/initrd.img
current state: Not ready to kdump

kexec command:
 no kexec command recorded

--> It still says 'Not ready to kdump'

--> Because of this state, a dump cannot be taken when the guest hangs or crashes

> Tried again after changing the crashkernel parameter on the Ubuntu 16.04.4 guest (p8 compat), but still facing the same issue

root@boslcp3g4:/kte/tools/setup.d# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.4.0-122-generic root=UUID=8be749b1-a5a8-4ab8-986c-ee4fcc61165e ro ro splash quiet xmon=on crashkernel=512M crashkernel=4096M
root@boslcp3g4:/kte/tools/setup.d#

root@boslcp3g4:/kte/tools/setup.d# /etc/init.d/kdump-tools status
● kdump-tools.service - Kernel crash dump capture service
   Loaded: loaded (/lib/systemd/system/kdump-tools.service; enabled; vendor preset: enabled)
   Active: active (exited) since Sun 2018-05-06 04:59:51 EDT; 14min ago
 Main PID: 2330 (code=exited, status=0/SUCCESS)
    Tasks: 0
   Memory: 0B
      CPU: 0
   CGroup: /system.slice/kdump-tools.service

May 06 04:59:50 boslcp3g4 systemd[1]: Starting Kernel crash dump capture se.....
May 06 04:59:51 boslcp3g4 kdump-tools[2330]: Starting kdump-tools: * Invali...g
May 06 04:59:51 boslcp3g4 kdump-tools[2330]: * Creating symlink /var/lib/kd...g
May 06 04:59:51 boslcp3g4 kdump-tools[2330]: * Invalid symlink : /var/lib/k...z
May 06 04:59:51 boslcp3g4 kdump-tools[2330]: * Creating symlink /var/lib/kd...z
May 06 04:59:51 boslcp3g4 kdump-tools[2330]: * /etc/default/kdump-tools: KD...g
May 06 04:59:51 boslcp3g4 systemd[1]: Started Kernel crash dump capture service.
Hint: Some lines were ellipsized, use -l to show in full.

root@boslcp3g4:/kte/tools/setup.d# kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr:
   /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.4.0-122-generic
kdump initrd:
   /var/lib/kdump/initrd.img: broken symbolic link to /var/lib/kdump/initrd.img-4.4.0-122-generic
current state: Not ready to kdump

kexec command:
  no kexec command recorded

--> Still kdump state shows 'Not ready to kdump'

Regards,
Indira
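
For readers following along, here is a minimal sketch for listing the crashkernel= values present on a kernel command line. The first /proc/cmdline above carries none, while the second carries two. The function name crashkernel_values is hypothetical and not part of kdump-tools; it only parses the string and makes no claim about which value the kernel honours.

```shell
# Print every crashkernel= value found in a kernel command line, one per line.
# With no argument, the running kernel's /proc/cmdline is read instead.
# NOTE: crashkernel_values is a hypothetical helper, not a kdump-tools command.
crashkernel_values() {
    cmdline=${1-$(cat /proc/cmdline)}
    # Unquoted on purpose: split the command line on whitespace.
    for word in $cmdline; do
        case $word in
            crashkernel=*) printf '%s\n' "${word#crashkernel=}" ;;
        esac
    done
}
```

Calling crashkernel_values with no argument inspects the running system; empty output corresponds to the "no crashkernel= parameter in the kernel cmdline" message seen earlier.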

> > > Thanks Breno! You point out that 4.4.0-113 is in use. Where did you notice
> > > that? In the initial description, Indira states the guest has:
> > >
> > > root@boslcp3g4:~# uname -r
> > > 4.4.0-122-generic
> > >
> > > yet oddly /proc/cmdline shows:
> > >
> > > root@boslcp3g4:~# cat /proc/cmdline
> > > BOOT_IMAGE=/boot/vmlinux-4.4.0-116-generic
> > > root=UUID=8be749b1-a5a8-4ab8-986c-ee4fcc61165e ro
> >
> > That is where I see this -116 kernel. Is -116 a symlink to -122?
>
> Hi Breno,
>
> I installed Ubuntu 16.04.4 from an ISO file, which came up with
> '4.4.0-116-generic', and then did apt-get update/upgrade, which upgraded
> the kernel to '4.4.0-122-generic'.
>
> I installed 'linux-crashdump', updated the crashkernel parameter in the
> /etc/default/grub file, and then issued a reboot. Once the system came up, the
> 'kdump-config show' command showed 'Not ready to kdump'.

This is an interesting sequence that is probably not covered in kdump scripts.
Post installation scripts are triggered to build kdump initrd, in /var/lib/kdump
directory, for the booted kernel version while installing kdump-tools and for
every kernel installation after kdump-tools package is installed.

Since kernel version 4.4.0-122-generic was installed before kdump-tools was
installed and kdump-tools package was installed while the system was up
with 4.4.0-116-generic kernel version, kdump-tools postinstall script was
triggered only for kernel version 4.4.0-116-generic generating

    /var/lib/kdump/initrd.img-4.4.0-116-generic
    but NOT
    /var/lib/kdump/initrd.img-4.4.0-122-generic

Subsequently, when the system booted with kernel version 4.4.0-122-generic,
/var/lib/kdump/initrd.img-4.4.0-122-generic was missing and, AFAICS, kdump-tools
doesn't seem to have a mechanism to work around this. I manually triggered the
kdump-tools post install script for kernel version 4.4.0-122-generic

    $ /etc/kernel/postinst.d/kdump-tools 4.4.0-122-generic

which created /var/lib/kdump/initrd.img-4.4.0-122-generic. After running the
above command, I was able to load the kdump kernel on 4.4.0-122-generic with

    $ systemctl restart kdump-tools.service
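
The diagnosis above can be checked with a small sketch like the following, which reports whether a per-release kdump initrd exists. The function name kdump_initrd_status is hypothetical; the /var/lib/kdump directory and the initrd.img-<release> naming are taken from the paths quoted in this bug.

```shell
# Report whether the kdump initrd for a kernel release exists.
# Usage: kdump_initrd_status [release] [kdump_dir]
# Defaults: the running kernel (uname -r) and /var/lib/kdump.
# NOTE: kdump_initrd_status is a hypothetical helper, not part of kdump-tools.
kdump_initrd_status() {
    release=${1:-$(uname -r)}
    kdump_dir=${2:-/var/lib/kdump}
    initrd="$kdump_dir/initrd.img-$release"
    # -e follows symlinks, so a broken symlink also counts as missing.
    if [ -e "$initrd" ]; then
        echo "present: $initrd"
    else
        echo "missing: $initrd"
        return 1
    fi
}
```

On the guest described here, running kdump_initrd_status before the manual post install step would be expected to report the 4.4.0-122-generic initrd as missing.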

To work around this problem:
    A) Trigger kdump-tools post install script on every available kernel
       while installing kdump-tools package.
    B) Document the behavior and provide the workaround commands.
    C) Install kdump-tools as the basic package to avoid the issue altogether.
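
As an illustration of option A only, and not the fix that was eventually shipped, the hook could be run for every kernel image found in /boot. This is a sketch under assumptions: the function names and the KDUMP_HOOK override are hypothetical, and kernel images are assumed to be named vmlinux-<release> or vmlinuz-<release>.

```shell
# List kernel releases that have an image in a boot directory.
# Matches both vmlinux-* (as on this ppc64le guest) and vmlinuz-*.
# NOTE: both function names below are hypothetical helpers.
installed_kernel_releases() {
    boot_dir=${1:-/boot}
    for image in "$boot_dir"/vmlinux-* "$boot_dir"/vmlinuz-*; do
        [ -e "$image" ] || continue      # skip glob patterns that matched nothing
        name=${image##*/}                # strip the directory part
        printf '%s\n' "${name#vmlinu?-}" # strip the vmlinux-/vmlinuz- prefix
    done | sort -u
}

# Run the kdump-tools kernel hook once per installed release (needs root).
# KDUMP_HOOK is a hypothetical override, handy for a dry run with "echo".
rebuild_all_kdump_initrds() {
    hook=${KDUMP_HOOK:-/etc/kernel/postinst.d/kdump-tools}
    installed_kernel_releases "${1:-/boot}" | while read -r release; do
        echo "building kdump initrd for $release"
        "$hook" "$release" || echo "hook failed for $release" >&2
    done
}
```

After rebuild_all_kdump_initrds completes, restarting kdump-tools.service as shown above would still be needed to load the kdump kernel.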

Please mirror the bug to Canonical to pick an appropriate approach to resolve
this problem.

Thanks
Hari

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-167483 severity-critical targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → makedumpfile (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: New → Triaged
importance: Undecided → Critical
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
tags: added: triage-g
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-05-09 12:56 EDT-------
(In reply to comment #17)
> (In reply to comment #16)
> > Hi Luciano,
> >
> > The bug is set to NEEDINFO state. May I know what the pending item is from my end?
> >
> > Regards,
> > Indira
>
> Hi Indira,
>
> Yes, please confirm that the workaround Hari provided indeed allows kdump to
> generate a vmcore successfully. A couple of other bugs depend on this working
> so that we can gather data for the reported crashes.
>
> echo c > /proc/sysrq-trigger
>
> should tell us.

Hi,

I have tried the workaround:

1. root@boslcp3g4:~# /etc/kernel/postinst.d/kdump-tools 4.4.0-122-generic
root@boslcp3g4:~# systemctl restart kdump-tools.service
[ 216.964332] kdump-tools[8867]: Stopping kdump-tools: * unloaded kdump kernel
[ 217.271290] kdump-tools[8895]: Starting kdump-tools: Modified cmdline:BOOT_IMAGE=/boot/vmlinux-4.4.0-122-generic root=UUID=8be749b1-a5a8-4ab8-986c-ee4fcc61165e ro ro splash quiet xmon=on irqpoll noirqdistrib nr_cpus=1 nousb systemd.unit=kdump-tools.service elfcorehdr=156800K
[ 218.122135] kdump-tools[8895]: * loaded kdump kernel

2. root@boslcp3g4:~# kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr:
   /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.4.0-122-generic
kdump initrd:
   /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.4.0-122-generic
current state: ready to kdump

kexec command:
/sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinux-4.4.0-122-generic root=UUID=8be749b1-a5a8-4ab8-986c-ee4fcc61165e ro ro splash quiet xmon=on irqpoll noirqdistrib nr_cpus=1 nousb systemd.unit=kdump-tools.service" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz

3. root@boslcp3g4:~# kdump-config status
current state : ready to kdump

root@boslcp3g4:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.4.0-122-generic root=UUID=8be749b1-a5a8-4ab8-986c-ee4fcc61165e ro ro splash quiet crashkernel=512M crashkernel=4096M

4. Triggered a crash, and crash logs were written:

root@boslcp3g4:~# ls -lr /var/crash
total 364
-rw-r----- 1 root root 345515 May 7 06:19 _usr_bin_rsync.0.crash
-rw-r----- 1 root root 13690 May 9 12:45 linux-image-4.4.0-122-generic-201805091244.crash
-rw-r--r-- 1 root root 269 May 9 12:45 kexec_cmd
drwxr-xr-x 2 root root 4096 May 9 12:44 201805091244
root@boslcp3g4:~# date
Wed May 9 12:46:40 EDT 2018

Regards,
Indira

Manoj Iyer (manjo)
Changed in makedumpfile (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → Critical
status: New → Triaged
Manoj Iyer (manjo) wrote :

Could you please try your testing with linux-hwe kernel (4.13) ?

Thadeu Lima de Souza Cascardo (cascardo) wrote :

The kernel version is not the issue with this use case. This is already fixed with kdump-tools on bionic. Please test it on bionic and confirm it's fixed there.

Thanks.
Cascardo.

Frank Heimes (fheimes)
Changed in makedumpfile (Ubuntu):
status: Triaged → Incomplete
Changed in ubuntu-power-systems:
status: Triaged → Incomplete
Thadeu Lima de Souza Cascardo (cascardo) wrote :

I have reported this and fixed it on bionic. Now, it should have just been made available on xenial and artful.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-07-04 04:33 EDT-------
(In reply to comment #26)
> I have reported this and fixed it on bionic. Now, it should have just been made
> available on xenial and artful.

Thanks, Cascardo.
With kdump-tools version 1:1.6.3-2ubuntu1 on bionic,
I don't hit this bug anymore.

Thanks
Hari

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-03-18 17:02 EDT-------
No response from submitter in a long time. Rejecting.
