Core dump is not consistently generated for processes killed in a container

Bug #1892951 reported by Ghada Khalil
Affects    Status        Importance  Assigned to            Milestone
StarlingX  Fix Released  Medium      Michel Thebeau [WIND]

Bug Description

Brief Description
-----------------
Core dump is not consistently generated for processes running in a container.

Severity
--------
Minor

Steps to Reproduce
------------------
1. run a pod
kubectl run ng --image=nginx
2. log in to the console of the pod
kubectl exec ng -it -- /bin/bash
3. find an nginx process and kill it.
root@ng:/# ls -l /proc | grep nginx
dr-xr-xr-x 9 nginx nginx 0 Jul 28 15:54 60
root@ng:/# kill -6 60
4. check whether a core dump file is generated under '/var/lib/systemd/coredump/' on the host.
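
Step 4 can be made less error-prone by counting the files in the coredump directory before and after the kill. A minimal sketch, assuming the directory named in step 4; the function name count_cores is hypothetical, and it counts every regular file in the directory rather than filtering by process name:

```shell
# count_cores [DIR]: print the number of regular files directly under DIR.
# DIR defaults to the systemd-coredump directory named in step 4.
count_cores() {
    dir="${1:-/var/lib/systemd/coredump}"
    # find prints one line per file and wc -l counts the lines;
    # tr strips the padding that some wc implementations add.
    find "$dir" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' '
}
```

On the host, run before=$(count_cores) ahead of step 3 and after=$(count_cores) afterwards; a dump was written if the second number is larger.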

Expected Behavior
------------------
A coredump file is consistently generated

Actual Behavior
----------------
A coredump file is not always generated

Reproducibility
---------------
Intermittent

System Configuration
--------------------
any

Branch/Pull Time/Commit
-----------------------
stx master as of 2020-06-28, but likely a day 1 issue

Last Pass
---------
N/A - new test-case

Timestamp/Logs
--------------
N/A

Test Activity
-------------
Other

Workaround
----------
For each kubernetes node (host):
echo '|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %e' > /proc/sys/kernel/core_pattern

Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Michel Thebeau [WIND] (mthebeau)
Ghada Khalil (gkhalil) wrote :

stx.5.0 / medium priority - would be good to fix to help w/ debugging container issues

Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
tags: added: stx.5.0 stx.distro.other
Michel Thebeau [WIND] (mthebeau) wrote :

The fix seems straightforward: use %P instead of %p in /proc/sys/kernel/core_pattern, so that systemd-coredump looks up the correct process from the host's perspective. The intermittent reproduction, which I have not witnessed, probably comes from systemd-coredump gathering process information for a host PID that happens to match the PID in the container's namespace (%p). The package to fix is 'stx-extensions'; see /etc/sysctl.d/50-coredump.conf

The workaround for debugging sessions is to fix /proc/sys/kernel/core_pattern on the hosts (kubernetes nodes) when one is debugging:
<code>
# cat /proc/sys/kernel/core_pattern
|/usr/lib/systemd/systemd-coredump %p %u %g %s %t %e
# echo '|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %e' > /proc/sys/kernel/core_pattern
</code>
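
The same substitution can be applied to whatever pattern a node currently has, instead of retyping the whole line. This is only a sketch: the function name use_host_pid is hypothetical, and the replacement is deliberately naive, so a literal "%%p" (an escaped percent sign followed by "p") would also be rewritten. The pattern quoted above contains no such sequence.

```shell
# use_host_pid PATTERN: print PATTERN with every %p replaced by %P.
# Existing %P specifiers are left alone because sed matches case-sensitively.
use_host_pid() {
    printf '%s\n' "$1" | sed 's/%p/%P/g'
}
```

As root on each node, compute new=$(use_host_pid "$(cat /proc/sys/kernel/core_pattern)") and then write "$new" back to /proc/sys/kernel/core_pattern with echo, as in the workaround above.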

description: updated
Michel Thebeau [WIND] (mthebeau) wrote :

On the subject of why systemd-coredump appears to succeed only intermittently: systemd-coredump itself crashes (dumps core) when the PID indicated by %p exists on the host. When a pod first runs, the PIDs in the pod's namespace are all low and are more likely to match PIDs in the host's namespace. systemd-coredump is probably crashing when it tries to make a stack trace.

When the coredump does succeed, %p does not match a PID on the host. dmesg still shows "Failed to get COMM" and "Failed to get EXE"; both comm (process name) and exe (executable file path) are in the /proc filesystem. The core file looks ok.

When using %P, the PID refers to the host's view of the process, so it exists, and systemd-coredump does not fail or print errors in dmesg.
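
To see the two views of one process side by side, the NSpid line of /proc/<pid>/status can be read on the host. This is a sketch under the assumption that the node's kernel exposes NSpid (mainline Linux 4.1 and later); the function name ns_pids is hypothetical. When read from the host's procfs, the first value corresponds to %P and the last to %p.

```shell
# ns_pids STATUS_FILE: print "outer_pid inner_pid" from the NSpid line of a
# /proc/<pid>/status file. The first value is the PID in the namespace of the
# procfs mount being read (the host, when run on the host); the last value is
# the PID in the innermost namespace (the container). For a process that is
# not in a nested PID namespace, both values are the same.
ns_pids() {
    awk '/^NSpid:/ { print $2, $NF }' "$1"
}
```

For example, ns_pids /proc/<host pid of the nginx process>/status on the node prints the host PID followed by the PID seen inside the pod.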

OpenStack Infra (hudson-openstack) wrote : Fix proposed to utilities (master)

Fix proposed to branch: master
Review: https://review.opendev.org/748569

Changed in starlingx:
status: Triaged → In Progress
Michel Thebeau [WIND] (mthebeau) wrote :

Sample segfault in the kernel log, when the PID from %p matches a process in the host's namespace:
2020-08-28T12:47:59.803 worker-0 kernel: info [48389.206167] systemd-coredum[800292]: segfault at 0 ip 000055c9546d9b98 sp 00007fff58e9c400 error 6 in systemd-coredump[55c9546cd000+1e000]
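
Whether a given container-side PID would hit this case can be checked on the node by testing for a matching entry in the host's /proc. A minimal sketch; the function name pid_on_host is hypothetical, and the check reflects only the moment it is run:

```shell
# pid_on_host PID: succeed if PID names a live process in the PID namespace
# of the procfs being read (the host, when run on the host).
pid_on_host() {
    # Reject an empty argument so that "/proc/" alone never counts as a match.
    [ -n "$1" ] && [ -d "/proc/$1" ]
}
```

With the PID from the reproduction steps, pid_on_host 60 run on the node indicates whether killing that nginx process with the old %p pattern would be expected to crash systemd-coredump rather than produce a core file.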

OpenStack Infra (hudson-openstack) wrote : Fix merged to utilities (master)

Reviewed: https://review.opendev.org/748569
Committed: https://git.openstack.org/cgit/starlingx/utilities/commit/?id=584b23bd1527bb2666952c6427013ce4e7728934
Submitter: Zuul
Branch: master

commit 584b23bd1527bb2666952c6427013ce4e7728934
Author: Michel Thebeau <email address hidden>
Date: Wed Aug 26 16:06:12 2020 -0400

    stx-extensions: use the host's PID for coredump

    systemd-coredump running as root on the host wants to get process
    information using the process' PID, but %p refers to "the PID namespace
    in which the process resides" (the container). systemd-coredump
    coredumps. Dmesg shows "Failed to get EXE" and the segfault (sic
    'systemd-coredum').

    This segfault of systemd-coredump is intermittent, and so on occasion a
    core file may actually be dumped for the containerized process. This
    happens when the %p does not match a process from the host's
    perspective. Dmesg shows "Failed to get COMM..." and "Failed to get
    EXE". This becomes more likely as the PIDs in container's namespace
    become larger - the host's PIDs are more sparse as the numbers increase.

    Use %P instead, "as seen in the initial PID namespace" (host).

    Convert the package to use PKG_GITREVCOUNT for release increment.

    Closes-Bug: 1892951
    Change-Id: Ifa5017d5997d12891893fc97fac4487ddfbbbbb8
    Signed-off-by: Michel Thebeau <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to utilities (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/c/starlingx/utilities/+/792213
