mtcAgent seen to core dump on process exit
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Low | Eric MacDonald |
Bug Description
An mtcAgent core dump was observed during unit testing of a fix for another issue that involved rebooting the active controller:
controller-0:~$ ls -lrt /var/lib/
-rw-r----- 1 root root 275348 Jun 6 19:56 core.mtcAgent.
Debugging the core dump showed that the crash occurred inside the nodeLinkClass destructor, which runs from the exit handlers:
#0 0x00007fcf640b34af in _int_free () from /lib64/libc.so.6
#1 0x0000000000452312 in std::_List_
#2 0x00000000004a1a82 in nodeLinkClass:
#3 0x00007fcf6406bb69 in __run_exit_handlers () from /lib64/libc.so.6
#4 0x00007fcf6406bbb7 in exit () from /lib64/libc.so.6
#5 0x0000000000414411 in daemon_exit() ()
#6 0x00000000004ad4e5 in daemon_
#7 0x0000000000417a5b in daemon_
#8 0x0000000000406585 in main ()
Severity
--------
Minor: not service impacting
Steps to Reproduce
------------------
reboot the active controller
Expected Behavior
------------------
no core dump
Actual Behavior
----------------
occasional core dump
Reproducibility
---------------
Intermittent: 1 in 10
System Configuration
-------
Any
Branch/Pull Time/Commit
-------
SW_VERSION="19.01" rebase as of "2019-06-05 18:32:46"
Last Pass
---------
Unknown
Changed in starlingx:
assignee: nobody → Eric MacDonald (rocksolidmtce)
status: New → In Progress
Ran hundreds of mtcAgent process kills with -INT and -TERM overnight and did not get any core dumps.
while true
do
    sm-unmanage service mtc-agent
    pkill -term mtcAgent
    sleep 3
    date
    /usr/local/bin/mtcAgent -l -a
    s=$((1 + RANDOM % 30))
    echo "sleeping $s seconds"
    sleep $s
    ps -efL | grep mtcAgent
    ls /var/lib/systemd/coredump/
    sleep 2
done
Also ran another variation in which mtcAgent was restarted by SM.
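The check for new core files in the loop above could be automated rather than read off the ls output. A minimal sketch, assuming core files are named core.<process>.<suffix> as in the listing in the bug description; the count_cores helper and its directory argument are hypothetical and not part of the original test:

```shell
# Hypothetical helper (not from the original test loop):
# print the number of core files for a given process name in a directory.
#   $1 = directory to scan (assumption: wherever core files are written)
#   $2 = process name, e.g. mtcAgent
count_cores() {
    dir=$1
    name=$2
    # Match only regular files directly in $dir named core.<name>.<anything>
    find "$dir" -maxdepth 1 -type f -name "core.${name}.*" | wc -l | tr -d ' '
}
```

Calling it before and after each kill and comparing the two counts would flag the iteration that produced a dump, so an overnight run would not need its output scanned by hand.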