Out of memory running crashkernel in ubuntu18.04.1. (Regression/qla2xxx/ubuntu18.04.1/BostonLC)(Documentation?)

Bug #1860519 reported by bugproxy
This bug affects 1 person
Affects: The Ubuntu-power-systems project
Status: Fix Released
Importance: Medium
Assigned to: Ubuntu on IBM Power Systems Bug Triage

Bug Description

== Comment: #0 - Naresh Bannoth <email address hidden> - 2018-06-06 01:49:02 ==
---Problem Description---
The following kernel panic occurs while trying to save a crash dump to a local directory on Ubuntu 18.04.1:

"Kernel panic - not syncing: Out of memory and no killable processes..."

This is a regression: local dump worked fine on Ubuntu 18.04.

Configuration details are as follows:

root@ltciofvtr-bostonlc1:~# free -g
              total        used        free      shared  buff/cache   available
Mem:            123          20          97           0           5         101
Swap:             1           0           1
root@ltciofvtr-bostonlc1:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             59G     0   59G   0% /dev
tmpfs            13G   21M   13G   1% /run
/dev/sde2       5.5T   98G  5.1T   2% /
tmpfs            62G     0   62G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            62G     0   62G   0% /sys/fs/cgroup
tmpfs            13G     0   13G   0% /run/user/0
tmpfs           128K     0  128K   0% /var/lib/lxd/shmounts
tmpfs           128K     0  128K   0% /var/lib/lxd/devlxd
root@ltciofvtr-bostonlc1:~#

root@ltciofvtr-bostonlc1:~# service kdump-tools restart
root@ltciofvtr-bostonlc1:~# service kdump-tools status
● kdump-tools.service - Kernel crash dump capture service
   Loaded: loaded (/lib/systemd/system/kdump-tools.service; enabled; vendor preset: enabled)
   Active: active (exited) since Tue 2018-06-05 06:51:45 EDT; 6s ago
  Process: 18478 ExecStop=/etc/init.d/kdump-tools stop (code=exited, status=0/SUCCESS)
  Process: 18519 ExecStart=/etc/init.d/kdump-tools start (code=exited, status=0/SUCCESS)
 Main PID: 18519 (code=exited, status=0/SUCCESS)

Jun 05 06:51:44 ltciofvtr-bostonlc1 systemd[1]: Starting Kernel crash dump capture service...
Jun 05 06:51:44 ltciofvtr-bostonlc1 kdump-tools[18519]: Starting kdump-tools: * Creating symlink /var/lib/kdump/vmlinuz
Jun 05 06:51:44 ltciofvtr-bostonlc1 kdump-tools[18519]: * Creating symlink /var/lib/kdump/initrd.img
Jun 05 06:51:44 ltciofvtr-bostonlc1 kdump-tools[18519]: Modified cmdline:root=UUID=6e1afd6a-a199-4bc5-a324-b65d5607d03b ro quiet splash nr_cpus=1 systemd.unit=kdump-tools.service irqpoll noirqdistrib nousb
Jun 05 06:51:45 ltciofvtr-bostonlc1 kdump-tools[18519]: * loaded kdump kernel
Jun 05 06:51:45 ltciofvtr-bostonlc1 systemd[1]: Started Kernel crash dump capture service.
root@ltciofvtr-bostonlc1:~#
root@ltciofvtr-bostonlc1:~# cat /proc/cmdline
root=UUID=6e1afd6a-a199-4bc5-a324-b65d5607d03b ro quiet splash crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M@128M
root@ltciofvtr-bostonlc1:~#
root@ltciofvtr-bostonlc1:~# kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr:
   /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.15.0-23-generic
kdump initrd:
   /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.15.0-23-generic
current state: ready to kdump

kexec command:
  /sbin/kexec -p --command-line="root=UUID=6e1afd6a-a199-4bc5-a324-b65d5607d03b ro quiet splash nr_cpus=1 systemd.unit=kdump-tools.service irqpoll noirqdistrib nousb" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
root@ltciofvtr-bostonlc1:~#
root@ltciofvtr-bostonlc1:~# cat /sys/kernel/kexec_crash_loaded
1
root@ltciofvtr-bostonlc1:~#
root@ltciofvtr-bostonlc1:~# cat /etc/default/kdump-tools | grep -i crash
KDUMP_COREDIR="/var/crash"
KDUMP_COREDIR="/var/crash"
# the crash dump. The syntax must be {HOSTNAME}:{MOUNTPOINT}
# (e.g. remote:/var/crash)
root@ltciofvtr-bostonlc1:~#
root@ltciofvtr-bostonlc1:~#
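
For reference, the crashkernel= value shown in /proc/cmdline above uses the kernel's range syntax (start-end:size entries, with an optional @offset). The sketch below is not part of the original report. It only illustrates how such a string maps a RAM total to a reservation, assuming the first entry with start <= RAM < end is the one chosen. The helper names parse_size and select_reservation are hypothetical, and 123 GiB is the rounded total from free -g, which may be lower than the RAM total the kernel actually compares against.

```python
import re

# Multipliers for the size suffixes used in the crashkernel= string.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Convert a size such as '512M' or '4G' to bytes."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def select_reservation(crashkernel, system_ram):
    """Return (size, offset) in bytes for the first range containing system_ram.

    Handles only the range form (start-end:size,...[@offset]).
    Returns None when no range matches.
    """
    spec, _, offset_text = crashkernel.partition("@")
    offset = parse_size(offset_text) if offset_text else None
    for entry in spec.split(","):
        ram_range, _, size_text = entry.partition(":")
        start_text, _, end_text = ram_range.partition("-")
        start = parse_size(start_text)
        end = parse_size(end_text) if end_text else None  # open-ended range
        if system_ram >= start and (end is None or system_ram < end):
            return parse_size(size_text), offset
    return None


# Value copied from /proc/cmdline in this report.
CRASHKERNEL = "2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M@128M"
GIB = 1 << 30
MIB = 1 << 20

for ram_gib in (123, 128):  # rounded free -g total, and an assumed 128 GiB total
    size, offset = select_reservation(CRASHKERNEL, ram_gib * GIB)
    print(f"{ram_gib} GiB RAM: reserve {size // MIB} MiB at offset {offset // MIB} MiB")
```

With the rounded 123 GiB total the 64G-128G entry applies (2048M); at 128 GiB or more the last entry applies (4096M). Reading /sys/kernel/kexec_crash_size on the affected machine would show how much was actually reserved.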

Snippet of the error logs:

[ OK ] Reached target System Time Synchronized.
[ 32.167186] Out of memory: Kill process 2958 (systemd-udevd) score 1 or sacrifice child
[ 32.167270] Killed process 2958 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1856kB, shmem-rss:0kB
[ 32.180509] Out of memory: Kill process 432 (systemd-network) score 1 or sacrifice child
[ 32.180565] Killed process 432 (systemd-network) total-vm:19520kB, anon-rss:0kB, file-rss:3520kB, shmem-rss:0kB
[ 32.188216] Out of memory: Kill process 2942 (systemd-udevd) score 1 or sacrifice child
[ 32.188278] Killed process 2942 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1408kB, shmem-rss:0kB
[ 32.196253] Out of memory: Kill process 2975 (systemd-udevd) score 1 or sacrifice child
[ 32.196317] Killed process 2975 (systemd-udevd) total-vm:23744kB, anon-rss:0kB, file-rss:1984kB, shmem-rss:0kB
[ 32.204652] Out of memory: Kill process 2949 (systemd-udevd) score 1 or sacrifice child
[ 32.204721] Killed process 2949 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1984kB, shmem-rss:0kB
[ 32.212555] Out of memory: Kill process 2956 (systemd-udevd) score 1 or sacrifice child
[ 32.212621] Killed process 2956 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1984kB, shmem-rss:0kB
[ 32.220553] Out of memory: Kill process 2944 (systemd-udevd) score 1 or sacrifice child
[ 32.220625] Killed process 2944 (systemd-udevd) total-vm:23744kB, anon-rss:0kB, file-rss:1664kB, shmem-rss:0kB
[ 32.229344] Out of memory: Kill process 2966 (systemd-udevd) score 1 or sacrifice child
[ 32.229403] Killed process 2966 (systemd-udevd) total-vm:23744kB, anon-rss:0kB, file-rss:1600kB, shmem-rss:0kB
[ 32.237213] Out of memory: Kill process 2950 (systemd-udevd) score 1 or sacrifice child
[ 32.237270] Killed process 3054 (sg_inq) total-vm:3776kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.245328] Out of memory: Kill process 2950 (systemd-udevd) score 1 or sacrifice child
[ 32.245387] Killed process 2950 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1664kB, shmem-rss:0kB
[ 32.252264] Out of memory: Kill process 2967 (systemd-udevd) score 1 or sacrifice child
[ 32.252323] Killed process 2967 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1600kB, shmem-rss:0kB
[ 32.260355] Out of memory: Kill process 397 (systemd-journal) score 1 or sacrifice child
[ 32.260416] Killed process 397 (systemd-journal) total-vm:33088kB, anon-rss:0kB, file-rss:3136kB, shmem-rss:0kB
[ 32.268815] Out of memory: Kill process 3055 (systemd-udevd) score 1 or sacrifice child
[ 32.268874] Killed process 3055 (systemd-udevd) total-vm:23744kB, anon-rss:0kB, file-rss:512kB, shmem-rss:0kB
[ 32.276999] Out of memory: Kill process 2844 (systemd-timesyn) score 0 or sacrifice child
[ 32.277062] Killed process 2844 (systemd-timesyn) total-vm:90624kB, anon-rss:0kB, file-rss:448kB, shmem-rss:0kB
[ 32.285044] Out of memory: Kill process 307 (plymouthd) score 0 or sacrifice child
[ 32.285119] Killed process 307 (plymouthd) total-vm:6272kB, anon-rss:0kB, file-rss:2432kB, shmem-rss:0kB
[ 32.293051] Out of memory: Kill process 2725 (openibd) score 0 or sacrifice child
[ 32.293109] Killed process 2725 (openibd) total-vm:9344kB, anon-rss:0kB, file-rss:2176kB, shmem-rss:0kB
[ 32.301016] Out of memory: Kill process 2697 (lvmetad) score 0 or sacrifice child
[ 32.301074] Killed process 2697 (lvmetad) total-vm:154816kB, anon-rss:0kB, file-rss:1664kB, shmem-rss:0kB
[ 32.309056] Out of memory: Kill process 2925 (find) score 0 or sacrifice child
[ 32.309115] Killed process 2925 (find) total-vm:9536kB, anon-rss:0kB, file-rss:448kB, shmem-rss:0kB
[ 32.317071] Out of memory: Kill process 2723 (apparmor) score 0 or sacrifice child
[ 32.317129] Killed process 2924 (apparmor) total-vm:3520kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 32.325087] Out of memory: Kill process 2723 (apparmor) score 0 or sacrifice child
[ 32.325147] Killed process 2723 (apparmor) total-vm:3520kB, anon-rss:0kB, file-rss:704kB, shmem-rss:0kB
[ 32.331819] Out of memory: Kill process 608 (modprobe) score 0 or sacrifice child
[ 32.331882] Killed process 608 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.340009] Out of memory: Kill process 781 (modprobe) score 0 or sacrifice child
[ 32.340082] Killed process 781 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.347991] Out of memory: Kill process 1828 (modprobe) score 0 or sacrifice child
[ 32.348054] Killed process 1828 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.355660] Out of memory: Kill process 561 (modprobe) score 0 or sacrifice child
[ 32.355726] Killed process 561 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.364656] Out of memory: Kill process 679 (modprobe) score 0 or sacrifice child
[ 32.364715] Killed process 679 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.371720] Out of memory: Kill process 1130 (modprobe) score 0 or sacrifice child
[ 32.371782] Killed process 1130 (modprobe) total-vm:7680kB, anon-rss:0kB, file-rss:320kB, shmem-rss:0kB
[ 32.379795] Out of memory: Kill process 2926 (wc) score 0 or sacrifice child
[ 32.379860] Killed process 2926 (wc) total-vm:6336kB, anon-rss:0kB, file-rss:384kB, shmem-rss:0kB
[ 32.380553] infiniband mlx5_5: Couldn't open port 1
[ 32.388026] Kernel panic - not syncing: Out of memory and no killable processes...
[ 32.388026]
[ 32.388098] CPU: 1 PID: 3057 Comm: kworker/u8:18 Not tainted 4.15.0-23-generic #25-Ubuntu
[ 32.388182] Workqueue: mlx5_page_allocator pages_work_handler [mlx5_core]
[ 32.388233] Call Trace:
[ 32.388257] [c0000000e9e4b760] [c000000008cdeb7c] dump_stack+0xb0/0xf4 (unreliable)
[ 32.388320] [c0000000e9e4b7a0] [c00000000810d320] panic+0x148/0x328
[ 32.388372] [c0000000e9e4b840] [c0000000082e5730] out_of_memory+0x400/0x710
[ 32.388425] [c0000000e9e4b8e0] [c0000000082ed5dc] __alloc_pages_nodemask+0xfbc/0x1070
[ 32.388509] [c0000000e9e4bad0] [c008000004e6d080] give_pages+0x2d8/0x8c0 [mlx5_core]
[ 32.388593] [c0000000e9e4bc10] [c008000004e6da80] pages_work_handler+0x58/0x110 [mlx5_core]
[ 32.388655] [c0000000e9e4bc90] [c0000000081341f8] process_one_work+0x298/0x5a0
[ 32.388716] [c0000000e9e4bd20] [c000000008134598] worker_thread+0x98/0x630
[ 32.388767] [c0000000e9e4bdc0] [c00000000813d1c8] kthread+0x1a8/0x1b0
[ 32.388819] [c0000000e9e4be30] [c00000000800b658] ret_from_kernel_thread+0x5c/0x84
[ 33.546425] Rebooting in 10 seconds..
[ 5732.108124690,5] OPAL: Reboot request...
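
The OOM lines above can be tallied mechanically. The following is a minimal sketch, not something used in the original report: it counts "Killed process" entries per command and adds up their resident memory (anon-rss plus file-rss). The function name summarize_oom is illustrative, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the "Killed process" lines printed by the OOM killer in this log.
KILLED = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"total-vm:(?P<vm>\d+)kB, anon-rss:(?P<anon>\d+)kB, file-rss:(?P<file>\d+)kB"
)


def summarize_oom(log_text):
    """Return (kills per command, resident kB held by killed processes, panicked)."""
    kills = Counter()
    resident_kb = 0
    for match in KILLED.finditer(log_text):
        kills[match["comm"]] += 1
        resident_kb += int(match["anon"]) + int(match["file"])
    panicked = "Out of memory and no killable processes" in log_text
    return kills, resident_kb, panicked


SAMPLE = """\
[ 32.167270] Killed process 2958 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1856kB, shmem-rss:0kB
[ 32.180565] Killed process 432 (systemd-network) total-vm:19520kB, anon-rss:0kB, file-rss:3520kB, shmem-rss:0kB
[ 32.188278] Killed process 2942 (systemd-udevd) total-vm:23616kB, anon-rss:0kB, file-rss:1408kB, shmem-rss:0kB
[ 32.388026] Kernel panic - not syncing: Out of memory and no killable processes...
"""

kills, resident_kb, panicked = summarize_oom(SAMPLE)
print(kills.most_common(), resident_kb, panicked)
```

In the sample, the three killed processes held under 7 MB between them, which is consistent with the OOM killer running out of candidates. The call trace in the log shows the final failing allocation coming from the mlx5_core page allocator work handler.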

---uname output---
Linux ltciofvtr-bostonlc1 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 17:59:00 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = Boston-LC

---Debugger---
A debugger is not configured

---Steps to Reproduce---
 1. Configure kdump to dump to a local directory, then trigger a crash with the following command:
 echo c > /proc/sysrq-trigger

Contact Information = <email address hidden>,<email address hidden>

Stack trace output:
 no

Oops output:
 no

Userspace tool common name: kdump

The userspace tool has the following bit modes: 64 bits

Userspace rpm: NA

System Dump Info:
  The system is not configured to capture a system dump.

Userspace tool obtained from project website: na

*Additional Instructions for <email address hidden>,<email address hidden>:
-Post a private note with access information to the machine that the bug is occurring on.
-Attach ltrace and strace of userspace application.
-Attach sysctl -a output to the bug.

== Comment: #34 - Hari Krishna Bathini <email address hidden> - 2019-10-10 05:38:59 ==

Can we get the following documented below the *Crash Kernel recommendations* section
in https://wiki.ubuntu.com/ppc64el/Recommendations

--
*Configuring Dump Capturing Support (KDump/FADump) on a system*

Since the memory required to boot the capture kernel is a moving target that depends
on many factors, such as the hardware attached to the system, the kernel and modules
in use, the packages installed and the services enabled, there is no one-size-fits-all
value. So, please take the above recommendations with a pinch of salt and remember to
try capturing a dump a few times to confirm that the system is configured successfully
with dump capturing support. Remember to retry dump capturing whenever:

    a) a kernel is updated
    b) new packages are installed
    c) new services are enabled
    d) boot/sysctl parameters are changed

Note that the above recommendation applies to both KDump and FADump
dump capturing mechanisms.
--

Thanks
Hari
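
A quick sanity check before each test dump could look like the sketch below. It is an illustration only and not part of the proposed wiki text. It reads /sys/kernel/kexec_crash_loaded (also queried earlier in this report) and /sys/kernel/kexec_crash_size; the function name kdump_status and its sysfs_kernel parameter are hypothetical.

```python
from pathlib import Path


def kdump_status(sysfs_kernel="/sys/kernel"):
    """Report whether a crash kernel is loaded and how many bytes are reserved for it.

    sysfs_kernel is a parameter only so the function can be pointed at a
    test directory; on a real system the default is what you want.
    """
    base = Path(sysfs_kernel)
    loaded = (base / "kexec_crash_loaded").read_text().strip() == "1"
    reserved = int((base / "kexec_crash_size").read_text().strip())
    return {"loaded": loaded, "reserved_bytes": reserved}
```

Calling kdump_status() on the target machine returns a small dictionary that can be logged after every kernel, package, service or boot parameter change. A positive result does not replace an actual test dump: in this report the crash kernel was loaded and kdump-config reported "ready to kdump", yet the capture kernel still ran out of memory.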

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-168595 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → ubuntu-docs (Ubuntu)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Gunnar Hjalmarsson (gunnarhj) wrote :

This is not a bug about the Ubuntu desktop guide. Please stop using ubuntu-docs for bugs like this.

no longer affects: ubuntu-docs (Ubuntu)
Frank Heimes (fheimes) wrote :

I just added the requested section to the wiki:
https://wiki.ubuntu.com/ppc64el/Recommendations#Configuring_Dump_Capturing_Support_.28KDump.2FFADump.29
and closing this bug.

no longer affects: linux (Ubuntu)
Changed in ubuntu-power-systems:
status: New → Fix Released
importance: Undecided → Medium
assignee: Canonical Kernel Team (canonical-kernel-team) → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-01-22 07:30 EDT-------
I thought 'ubuntu-docs' was for Ubuntu documentation. Please change this to generic Ubuntu for updating the documentation.

Frank Heimes (fheimes) wrote :

Since it's already done (see LP comment #2) I removed the affected package entirely
and just marked the remaining project entry as Fix Released.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-01-23 00:18 EDT-------
(In reply to comment #42)
> I just added the requested section to the wiki:
> https://wiki.ubuntu.com/ppc64el/Recommendations#Configuring_Dump_Capturing_Support_.28KDump.2FFADump.29
> and closing this bug.

Does it take time for the change to reflect?
I don't find the proposed text yet on
https://wiki.ubuntu.com/ppc64el/Recommendations

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-01-23 00:21 EDT-------
Moreover, last edited for the page shows this:

"ppc64el/Recommendations (last edited 2018-07-28 20:59:48 by mranweil)"

Frank Heimes (fheimes) wrote :

It took a while, but I can see it over here:
https://wiki.ubuntu.com/ppc64el/Recommendations#Configuring_Dump_Capturing_Support_.28KDump.2FFADump.29

And I also see an entry in the page history, revision #57, and the 'last edit' changed, too.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-01-23 03:00 EDT-------
thanks

Frank Heimes (fheimes) wrote :

you're welcome
