Ubuntu
linux package

Comment 11 for bug 1764246

C de-Avillez (hggdh2) wrote on 2019-10-01:

I had two up-to-date Azure instances (16.04 and 18.04) with linux-crashdump installed. Both failed, and the serial console output ended with the following:

[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Reached target Local Encrypted Volumes.
[   18.407975] Out of memory: Kill process 496 (cloud-init) score 127 or sacrifice child
[   18.417791] Killed process 496 (cloud-init) total-vm:66840kB, anon-rss:12720kB, file-rss:0kB, shmem-rss:0kB
[FAILED] Failed to start Initial cloud-init job (pre-networking).
See 'systemctl status cloud-init-local.service' for details.
[  OK  ] Reached target Network (Pre).
         Starting Network Service...
[  OK  ] Started Network Service.
[  OK  ] Reached target Network.
         Starting Wait for Network to be Configured...
[  OK  ] Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
[   37.269354] Out of memory: Kill process 657 (snap) score 40 or sacrifice child
[   37.282042] Killed process 657 (snap) total-vm:95160kB, anon-rss:4008kB, file-rss:0kB, shmem-rss:0kB
[*     ] (2 of 3) A start job is running for… to be Configured (32s / no limit)[   47.413761] Out of memory: Kill process 656 (snap) score 41 or sacrifice child
[   47.422837] Killed process 656 (snap) total-vm:242624kB, anon-rss:4072kB, file-rss:0kB, shmem-rss:0kB
[  OK  ] Found device Virtual_Disk 1.
         Starting File System Check on /dev/disk/cloud/azure_resource-part1...
[ ***  ] (3 of 3) A start job is running for…re_resource-part1 (44s / no limit)[   56.870810] Out of memory: Kill process 662 (snap) score 41 or sacrifice child
[   56.887294] Killed process 662 (snap) total-vm:242880kB, anon-rss:4116kB, file-rss:0kB, shmem-rss:0kB
[  OK  ] Started File System Check Daemon to report status.
[   71.643108] Out of memory: Kill process 660 (snap) score 42 or sacrifice child
[   71.652187] Killed process 660 (snap) total-vm:242880kB, anon-rss:4220kB, file-rss:0kB, shmem-rss:0kB
[  OK  ] Started File System Check on /dev/disk/cloud/azure_resource-part1.
[ TIME ] Timed out waiting for device dev-disk-by\x2dlabel-UEFI.device.
[DEPEND] Dependency failed for /boot/efi.
[DEPEND] Dependency failed for Local File Systems.
[  OK  ] Started Emergency Shell.
[  OK  ] Reached target Emergency Mode.
[  114.872426] Out of memory: Kill process 655 (snap) score 49 or sacrifice child
[  114.881579] Killed process 655 (snap) total-vm:243936kB, anon-rss:4844kB, file-rss:0kB, shmem-rss:0kB
         Starting Tell Plymouth To Write Out Runtime Data...
         Starting Create Volatile Files and Directories...
         Starting AppArmor initialization...
You are in emergency mode. After logging in, type "journalctl -xb" to view
system logs, "systemctl reboot" to reboot, "systemctl default" or "exit"
to boot into default mode.
Press Enter for maintenance
(or press Control-D to continue):

So, no more kernel panics, but
  (1) still no kdump saved;
  (2) the servers end up waiting for user intervention (which is quite bad for a cloud instance).

Both servers had booted with "crashkernel=512M-:192M" on the kernel command line. I edited /etc/default/grub.d/kdump-tools.cfg, changed the crashkernel reservation to 256M, and ran update-grub.

Re-tested, and it worked.
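
For readers unfamiliar with the range syntax in "crashkernel=512M-:192M", here is a minimal Python sketch of how such a value maps to a reservation. It assumes the documented kernel form range:size[,range:size...][@offset], where each range is start-[end] with an inclusive start and an exclusive end. The function names parse_size and crashkernel_reservation are hypothetical and written only for illustration; this is not the kernel's own parser.

```python
import re

# Binary multipliers for the size suffixes used on the kernel command line.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '192M' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if not match:
        raise ValueError(f"bad size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def crashkernel_reservation(value, ram_bytes):
    """Return the bytes reserved for the crash kernel (0 if no range matches).

    Illustrative helper, not a kernel or kdump-tools API.
    """
    value = value.split("@", 1)[0]        # drop the optional @offset
    if ":" not in value:
        return parse_size(value)          # plain form, e.g. crashkernel=256M
    for entry in value.split(","):
        ram_range, size = entry.split(":", 1)
        start, _, end = ram_range.partition("-")
        low = parse_size(start)
        high = parse_size(end) if end else None   # open-ended range: "512M-"
        if ram_bytes >= low and (high is None or ram_bytes < high):
            return parse_size(size)
    return 0


# A machine with 4 GiB of RAM and the setting from this comment:
print(crashkernel_reservation("512M-:192M", 4 << 30) >> 20, "MiB")
```

Read this way, "512M-:192M" reserves 192M on any machine with at least 512M of RAM. On a machine with that much RAM, both "512M-:256M" and a plain "256M" would reserve the 256M that worked here.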
