Out of memory running crashkernel in ubuntu18.04.1. (Regression/qla2xxx/ubuntu18.04.1/BostonLC)(Documentation?)

Bug #1860519 reported by bugproxy
Affects: The Ubuntu-power-systems project
Status: Fix Released
Importance: Medium
Assigned to: Ubuntu on IBM Power Systems Bug Triage

Bug Description

== Comment: #0 - Naresh Bannoth <email address hidden> - 2018-06-06 01:49:02 ==
---Problem Description---
The following kernel panic occurs while trying to capture a crash dump to a local directory on Ubuntu 18.04.1:

"Kernel panic - not syncing: Out of memory and no killable processes..."

This is a regression: a local dump worked fine on Ubuntu 18.04.

Configuration details are as follows:

root@ltciofvtr-bostonlc1:~# free -g
              total        used        free      shared  buff/cache   available
Mem:            123          20          97           0           5         101
Swap:             1           0           1
root@ltciofvtr-bostonlc1:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             59G     0   59G   0% /dev
tmpfs            13G   21M   13G   1% /run
/dev/sde2       5.5T   98G  5.1T   2% /
tmpfs            62G     0   62G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            62G     0   62G   0% /sys/fs/cgroup
tmpfs            13G     0   13G   0% /run/user/0
tmpfs           128K     0  128K   0% /var/lib/lxd/shmounts
tmpfs           128K     0  128K   0% /var/lib/lxd/devlxd
root@ltciofvtr-bostonlc1:~#

root@ltciofvtr-bostonlc1:~# service kdump-tools restart
root@ltciofvtr-bostonlc1:~# service kdump-tools status
● kdump-tools.service - Kernel crash dump capture service
   Loaded: loaded (/lib/systemd/system/kdump-tools.service; enabled; vendor preset: enabled)
   Active: active (exited) since Tue 2018-06-05 06:51:45 EDT; 6s ago
  Process: 18478 ExecStop=/etc/init.d/kdump-tools stop (code=exited, status=0/SUCCESS)
  Process: 18519 ExecStart=/etc/init.d/kdump-tools start (code=exited, status=0/SUCCESS)
 Main PID: 18519 (code=exited, status=0/SUCCESS)

Jun 05 06:51:44 ltciofvtr-bostonlc1 systemd[1]: Starting Kernel crash dump capture service...
Jun 05 06:51:44 ltciofvtr-bostonlc1 kdump-tools[18519]: Starting kdump-tools: * Creating symlink /var/lib/kdump/vmlinuz
Jun 05 06:51:44 ltciofvtr-bostonlc1 kdump-tools[18519]: * Creating symlink /var/lib/kdump/initrd.img
Jun 05 06:51:44 ltciofvtr-bostonlc1 kdump-tools[18519]: Modified cmdline:root=UUID=6e1afd6a-a199-4bc5-a324-b65d5607d03b ro quiet splash nr_cpus=1 systemd.unit=kdump-tools.service irqpoll noirqdistrib nousb
Jun 05 06:51:45 ltciofvtr-bostonlc1 kdump-tools[18519]: * loaded kdump kernel
Jun 05 06:51:45 ltciofvtr-bostonlc1 systemd[1]: Started Kernel crash dump capture service.
root@ltciofvtr-bostonlc1:~#
root@ltciofvtr-bostonlc1:~# cat /proc/cmdline
root=UUID=6e1afd6a-a199-4bc5-a324-b65d5607d03b ro quiet splash crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M@128M
root@ltciofvtr-bostonlc1:~#
root@ltciofvtr-bostonlc1:~# kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr:
   /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.15.0-23-generic
kdump initrd:
   /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.15.0-23-generic
current state: ready to kdump

kexec command:
  /sbin/kexec -p --command-line="root=UUID=6e1afd6a-a199-4bc5-a324-b65d5607d03b ro quiet splash nr_cpus=1 systemd.unit=kdump-tools.service irqpoll noirqdistrib nousb" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
root@ltciofvtr-bostonlc1:~#
root@ltciofvtr-bostonlc1:~# cat /sys/kernel/kexec_crash_loaded
1
root@ltciofvtr-bostonlc1:~#
root@ltciofvtr-bostonlc1:~# cat /etc/default/kdump-tools | grep -i crash
KDUMP_COREDIR="/var/crash"
KDUMP_COREDIR="/var/crash"
# the crash dump. The syntax must be {HOSTNAME}:{MOUNTPOINT}
# (e.g. remote:/var/crash)
root@ltciofvtr-bostonlc1:~#
root@ltciofvtr-bostonlc1:~#
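
The crashkernel= value shown in /proc/cmdline above uses the kernel's range syntax, where the reservation depends on how much RAM the system has. The following is a minimal sketch of how such a value could be evaluated, assuming the form `start-[end]:size[,...][@offset]` with an inclusive start and an exclusive end, as described in the kernel's kdump documentation. The function names are hypothetical, and only the range form is handled. The report does not state which RAM total the kernel matched against, and the total shown by `free -g` may not equal that figure, so 123 GiB and 128 GiB are both evaluated purely as illustrations.

```python
import re

# Size suffixes, as binary multiples.
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Convert a size such as '512M' or '4G' to bytes."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def select_reservation(crashkernel, system_ram):
    """Return (size, offset) in bytes for the first range containing system_ram.

    Returns None when no range matches. Assumes the range form
    'start-[end]:size[,...][@offset]', start inclusive, end exclusive.
    """
    spec, _, offset_text = crashkernel.partition("@")
    offset = parse_size(offset_text) if offset_text else None
    for entry in spec.split(","):
        mem_range, _, size_text = entry.partition(":")
        start_text, _, end_text = mem_range.partition("-")
        start = parse_size(start_text)
        end = parse_size(end_text) if end_text else None  # open-ended range
        if system_ram >= start and (end is None or system_ram < end):
            return parse_size(size_text), offset
    return None


# Value copied from the /proc/cmdline output in this report.
REPORTED = ("2G-4G:320M,4G-32G:512M,32G-64G:1024M,"
            "64G-128G:2048M,128G-:4096M@128M")

if __name__ == "__main__":
    for ram_gib in (123, 128):  # illustrative totals, not confirmed by the report
        size, offset = select_reservation(REPORTED, ram_gib << 30)
        print(f"{ram_gib} GiB RAM -> reserve {size >> 20} MiB at offset {offset >> 20} MiB")
```

Under these assumptions a system just below the 128G boundary gets half the reservation of one at or above it, which is worth keeping in mind when comparing machines or releases.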

==============>>>>>>>> Snippet of the error logs:

[ OK ] Reached target System Time Synchronized.
[ 32.167186] Out of memory: Kill process 2958 (systemd-udevd) score 1 or sacrifice child
[ 32.167270] Killed process 2958 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1856kB, shmem-rss:0kB
[ 32.180509] Out of memory: Kill process 432 (systemd-network) score 1 or sacrifice child
[ 32.180565] Killed process 432 (systemd-network) total-vm:19520kB, anon-rss:0kB, file-rss:3520kB, shmem-rss:0kB
[ 32.188216] Out of memory: Kill process 2942 (systemd-udevd) score 1 or sacrifice child
[ 32.188278] Killed process 2942 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1408kB, shmem-rss:0kB
[ 32.196253] Out of memory: Kill process 2975 (systemd-udevd) score 1 or sacrifice child
[ 32.196317] Killed process 2975 (systemd-udevd) total-vm:23744kB, anon-rss:0kB, file-rss:1984kB, shmem-rss:0kB
[ 32.204652] Out of memory: Kill process 2949 (systemd-udevd) score 1 or sacrifice child
[ 32.204721] Killed process 2949 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1984kB, shmem-rss:0kB
[ 32.212555] Out of memory: Kill process 2956 (systemd-udevd) score 1 or sacrifice child
[ 32.212621] Killed process 2956 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1984kB, shmem-rss:0kB
[ 32.220553] Out of memory: Kill process 2944 (systemd-udevd) score 1 or sacrifice child
[ 32.220625] Killed process 2944 (systemd-udevd) total-vm:23744kB, anon-rss:0kB, file-rss:1664kB, shmem-rss:0kB
[ 32.229344] Out of memory: Kill process 2966 (systemd-udevd) score 1 or sacrifice child
[ 32.229403] Killed process 2966 (systemd-udevd) total-vm:23744kB, anon-rss:0kB, file-rss:1600kB, shmem-rss:0kB
[ 32.237213] Out of memory: Kill process 2950 (systemd-udevd) score 1 or sacrifice child
[ 32.237270] Killed process 3054 (sg_inq) total-vm:3776kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.245328] Out of memory: Kill process 2950 (systemd-udevd) score 1 or sacrifice child
[ 32.245387] Killed process 2950 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1664kB, shmem-rss:0kB
[ 32.252264] Out of memory: Kill process 2967 (systemd-udevd) score 1 or sacrifice child
[ 32.252323] Killed process 2967 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1600kB, shmem-rss:0kB
[ 32.260355] Out of memory: Kill process 397 (systemd-journal) score 1 or sacrifice child
[ 32.260416] Killed process 397 (systemd-journal) total-vm:33088kB, anon-rss:0kB, file-rss:3136kB, shmem-rss:0kB
[ 32.268815] Out of memory: Kill process 3055 (systemd-udevd) score 1 or sacrifice child
[ 32.268874] Killed process 3055 (systemd-udevd) total-vm:23744kB, anon-rss:0kB, file-rss:512kB, shmem-rss:0kB
[ 32.276999] Out of memory: Kill process 2844 (systemd-timesyn) score 0 or sacrifice child
[ 32.277062] Killed process 2844 (systemd-timesyn) total-vm:90624kB, anon-rss:0kB, file-rss:448kB, shmem-rss:0kB
[ 32.285044] Out of memory: Kill process 307 (plymouthd) score 0 or sacrifice child
[ 32.285119] Killed process 307 (plymouthd) total-vm:6272kB, anon-rss:0kB, file-rss:2432kB, shmem-rss:0kB
[ 32.293051] Out of memory: Kill process 2725 (openibd) score 0 or sacrifice child
[ 32.293109] Killed process 2725 (openibd) total-vm:9344kB, anon-rss:0kB, file-rss:2176kB, shmem-rss:0kB
[ 32.301016] Out of memory: Kill process 2697 (lvmetad) score 0 or sacrifice child
[ 32.301074] Killed process 2697 (lvmetad) total-vm:154816kB, anon-rss:0kB, file-rss:1664kB, shmem-rss:0kB
[ 32.309056] Out of memory: Kill process 2925 (find) score 0 or sacrifice child
[ 32.309115] Killed process 2925 (find) total-vm:9536kB, anon-rss:0kB, file-rss:448kB, shmem-rss:0kB
[ 32.317071] Out of memory: Kill process 2723 (apparmor) score 0 or sacrifice child
[ 32.317129] Killed process 2924 (apparmor) total-vm:3520kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 32.325087] Out of memory: Kill process 2723 (apparmor) score 0 or sacrifice child
[ 32.325147] Killed process 2723 (apparmor) total-vm:3520kB, anon-rss:0kB, file-rss:704kB, shmem-rss:0kB
[ 32.331819] Out of memory: Kill process 608 (modprobe) score 0 or sacrifice child
[ 32.331882] Killed process 608 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.340009] Out of memory: Kill process 781 (modprobe) score 0 or sacrifice child
[ 32.340082] Killed process 781 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.347991] Out of memory: Kill process 1828 (modprobe) score 0 or sacrifice child
[ 32.348054] Killed process 1828 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.355660] Out of memory: Kill process 561 (modprobe) score 0 or sacrifice child
[ 32.355726] Killed process 561 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.364656] Out of memory: Kill process 679 (modprobe) score 0 or sacrifice child
[ 32.364715] Killed process 679 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.371720] Out of memory: Kill process 1130 (modprobe) score 0 or sacrifice child
[ 32.371782] Killed process 1130 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.379795] Out of memory: Kill process 2926 (wc) score 0 or sacrifice child
[ 32.379860] Killed process 2926 (wc) total-vm:6336kB, anon-rss:0kB, file-rss:384kB, shmem-rss:0kB
[ 32.380553] infiniband mlx5_5: Couldn't open port 1
[ 32.388026] Kernel panic - not syncing: Out of memory and no killable processes...
[ 32.388026]
[ 32.388098] CPU: 1 PID: 3057 Comm: kworker/u8:18 Not tainted 4.15.0-23-generic #25-Ubuntu
[ 32.388182] Workqueue: mlx5_page_allocator pages_work_handler [mlx5_core]
[ 32.388233] Call Trace:
[ 32.388257] [c0000000e9e4b760] [c000000008cdeb7c] dump_stack+0xb0/0xf4 (unreliable)
[ 32.388320] [c0000000e9e4b7a0] [c00000000810d320] panic+0x148/0x328
[ 32.388372] [c0000000e9e4b840] [c0000000082e5730] out_of_memory+0x400/0x710
[ 32.388425] [c0000000e9e4b8e0] [c0000000082ed5dc] __alloc_pages_nodemask+0xfbc/0x1070
[ 32.388509] [c0000000e9e4bad0] [c008000004e6d080] give_pages+0x2d8/0x8c0 [mlx5_core]
[ 32.388593] [c0000000e9e4bc10] [c008000004e6da80] pages_work_handler+0x58/0x110 [mlx5_core]
[ 32.388655] [c0000000e9e4bc90] [c0000000081341f8] process_one_work+0x298/0x5a0
[ 32.388716] [c0000000e9e4bd20] [c000000008134598] worker_thread+0x98/0x630
[ 32.388767] [c0000000e9e4bdc0] [c00000000813d1c8] kthread+0x1a8/0x1b0
[ 32.388819] [c0000000e9e4be30] [c00000000800b658] ret_from_kernel_thread+0x5c/0x84
[ 33.546425] Rebooting in 10 seconds..
[ 5732.108124690,5] OPAL: Reboot request...
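
To judge how much memory the OOM killer actually recovered in a log like the one above, the "Killed process" lines can be tallied. The sketch below is only an illustration: the function name is hypothetical, the regular expression assumes the exact line format seen in this log (4.15 kernel), and the sample contains just the first three kills from the report.

```python
import re
from collections import Counter

# Matches the "Killed process" lines in the format shown in this report.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"total-vm:(?P<vm>\d+)kB, anon-rss:(?P<anon>\d+)kB, "
    r"file-rss:(?P<file>\d+)kB, shmem-rss:(?P<shmem>\d+)kB"
)


def summarize_oom_kills(log_text):
    """Return (kills per command name, total resident kB of killed processes)."""
    kills = Counter()
    rss_kb = 0
    for match in KILLED_RE.finditer(log_text):
        kills[match["comm"]] += 1
        rss_kb += int(match["anon"]) + int(match["file"]) + int(match["shmem"])
    return kills, rss_kb


# First three kills copied from the log in this report.
SAMPLE = """\
[ 32.167270] Killed process 2958 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1856kB, shmem-rss:0kB
[ 32.180565] Killed process 432 (systemd-network) total-vm:19520kB, anon-rss:0kB, file-rss:3520kB, shmem-rss:0kB
[ 32.188278] Killed process 2942 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1408kB, shmem-rss:0kB
"""

if __name__ == "__main__":
    kills, rss_kb = summarize_oom_kills(SAMPLE)
    for comm, count in kills.most_common():
        print(f"{comm}: {count} kill(s)")
    print(f"resident memory of killed processes: {rss_kb} kB")
```

Every kill in the log reports anon-rss:0kB and only a few MB of file-rss, so killing userspace processes frees very little. The call trace shows the allocation that finally failed came from the mlx5_core page allocator work queue.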

---uname output---
Linux ltciofvtr-bostonlc1 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 17:59:00 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = Boston-LC

---Debugger---
A debugger is not configured

---Steps to Reproduce---
 1. Configure kdump to dump to a local directory, then trigger a crash using the following command:
 echo c > /proc/sysrq-trigger

Contact Information = <email address hidden>,<email address hidden>

Stack trace output:
 no

Oops output:
 no

Userspace tool common name: kdump

The userspace tool has the following bit modes: 64 bits

Userspace rpm: NA

System Dump Info:
  The system is not configured to capture a system dump.

Userspace tool obtained from project website: na

*Additional Instructions for <email address hidden>,<email address hidden>:
-Post a private note with access information to the machine that the bug is occurring on.
-Attach ltrace and strace of userspace application.
-Attach sysctl -a output to the bug.

== Comment: #34 - Hari Krishna Bathini <email address hidden> - 2019-10-10 05:38:59 ==

Can we get the following documented below the *Crash Kernel recommendations* section
in https://wiki.ubuntu.com/ppc64el/Recommendations

--
*Configuring Dump Capturing Support (KDump/FADump) on a system*

Since the memory required to boot the capture kernel is a moving target that depends
on many factors, such as the hardware attached to the system, the kernel and modules
in use, the packages installed, and the services enabled, there is no one-size-fits-all
value. So, please take the above recommendations with a pinch of salt and remember to
try capturing a dump a few times to confirm that the system is configured successfully
with dump capturing support. Remember to retry dump capturing whenever:

    a) a kernel is updated
    b) new packages are installed
    c) new services are enabled
    d) boot/sysctl parameters are changed

Note that the above recommendation applies to both KDump and FADump
dump capturing mechanisms.
--

Thanks
Hari
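
The proposed wiki text above asks for dump capturing to be retried after each kind of change. Before triggering a test crash, the basic preconditions shown earlier in this report (a crashkernel= parameter in /proc/cmdline and a value of 1 in /sys/kernel/kexec_crash_loaded) can be checked first. This is a minimal sketch with a hypothetical function name. It only checks preconditions and does not replace an actual test dump, since the report shows a system in the "ready to kdump" state that still ran out of memory in the capture kernel.

```python
def check_kdump_preconditions(cmdline, crash_loaded):
    """Return a list of problems; an empty list means the preconditions hold.

    cmdline      -- contents of /proc/cmdline
    crash_loaded -- contents of /sys/kernel/kexec_crash_loaded
    """
    problems = []
    if not any(arg.startswith("crashkernel=") for arg in cmdline.split()):
        problems.append("no crashkernel= parameter on the kernel command line")
    if crash_loaded.strip() != "1":
        problems.append("no capture kernel loaded (kexec_crash_loaded is not 1)")
    return problems


# Values copied from the console output in this report.
REPORTED_CMDLINE = (
    "root=UUID=6e1afd6a-a199-4bc5-a324-b65d5607d03b ro quiet splash "
    "crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,"
    "64G-128G:2048M,128G-:4096M@128M"
)
REPORTED_CRASH_LOADED = "1\n"

if __name__ == "__main__":
    problems = check_kdump_preconditions(REPORTED_CMDLINE, REPORTED_CRASH_LOADED)
    for problem in problems:
        print("WARNING:", problem)
    if not problems:
        print("preconditions met; a test dump is still needed to confirm")
```

On a live system the two arguments would be read from /proc/cmdline and /sys/kernel/kexec_crash_loaded instead of the constants used here.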

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-168595 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → ubuntu-docs (Ubuntu)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Gunnar Hjalmarsson (gunnarhj) wrote :

This is not a bug about the Ubuntu desktop guide. Please stop using ubuntu-docs for bugs like this.

no longer affects: ubuntu-docs (Ubuntu)
Frank Heimes (fheimes) wrote :

I just added the requested section to the wiki:
https://wiki.ubuntu.com/ppc64el/Recommendations#Configuring_Dump_Capturing_Support_.28KDump.2FFADump.29
and closing this bug.

no longer affects: linux (Ubuntu)
Changed in ubuntu-power-systems:
status: New → Fix Released
importance: Undecided → Medium
assignee: Canonical Kernel Team (canonical-kernel-team) → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-01-22 07:30 EDT-------
I thought 'ubuntu-docs' was for Ubuntu documentation. Please change it to generic Ubuntu for updating the documentation.

Frank Heimes (fheimes) wrote :

Since it's already done (see LP comment #2) I removed the affected package entirely
and just marked the remaining project entry as Fix Released.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-01-23 00:18 EDT-------
(In reply to comment #42)
> I just added the requested section to the wiki:
> https://wiki.ubuntu.com/ppc64el/
> Recommendations#Configuring_Dump_Capturing_Support_.28KDump.2FFADump.29
> and closing this bug.

Does it take time for the change to reflect?
I don't find the proposed text yet on
https://wiki.ubuntu.com/ppc64el/Recommendations

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-01-23 00:21 EDT-------
Moreover, the "last edited" line for the page shows this:

"ppc64el/Recommendations (last edited 2018-07-28 20:59:48 by mranweil)"

Frank Heimes (fheimes) wrote :

It took a while, but I can see it over here:
https://wiki.ubuntu.com/ppc64el/Recommendations#Configuring_Dump_Capturing_Support_.28KDump.2FFADump.29

And I also see an entry in the page history, revision #57, and the 'last edit' changed, too.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-01-23 03:00 EDT-------
thanks

Frank Heimes (fheimes) wrote :

You're welcome
