libvirtd core dump files generated after system setup

Bug #1841987 reported by Peng Peng on 2019-08-29
This bug affects 1 person
Affects: StarlingX
Importance: Medium
Assigned to: Bin Yang

Bug Description

Brief Description
-----------------
After initial setup and sanity testing of an SX system, several libvirtd core dump files were generated. The first one was generated after system setup. Two of them were generated during host unlock.

Severity
--------
Major

Steps to Reproduce
------------------
Bring up SX system lab
host lock/unlock

TC-name: installation & sanity

Expected Behavior
------------------
no coredump files

Actual Behavior
----------------
coredump files generated

Reproducibility
---------------
Seen once

System Configuration
--------------------
One node system

Lab-name: SM-3

Branch/Pull Time/Commit
-----------------------
Load: 2019-08-28_00-10-00
Job: Titanium_R6_build

Last Pass
---------
Load: 20190828T013000Z
Job: STX_build_master_master

Timestamp/Logs
--------------
controller-0:/var/lib/systemd/coredump$ ls -l
total 3596
-rw-r----- 1 root root 956760 Aug 29 15:54 core.libvirtd.0.111cc7803ce045e9bbff8d5f15c43ba8.161704.1567094052000000.xz
-rw-r----- 1 root root 909172 Aug 29 08:07 core.libvirtd.0.238c43b105d14f1787846cb334fef597.409081.1567066026000000.xz
-rw-r----- 1 root root 915116 Aug 29 08:37 core.libvirtd.0.dfb8f0b6261449a7aa67e13fe96d0673.131394.1567067829000000.xz
-rw-r-----+ 1 root root 876800 Aug 29 07:58 core.libvirtd.42425.238c43b105d14f1787846cb334fef597.713586.1567065512000000.xz

[2019-08-29 07:03:23,224] 630 INFO MainThread fresh_install_helper.run_lab_setup:: running lab_setup.sh
[2019-08-29 07:58:03,367] 1288 INFO MainThread fresh_install_helper.wait_for_hosts_ready:: Checking floating ip: 128.224.150.81 connectivity ...

[2019-08-29 08:06:12,597] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'

[2019-08-29 08:36:13,804] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'

Test Activity
-------------
Sanity

Peng Peng (ppeng) wrote :
Numan Waheed (nwaheed) on 2019-08-30
tags: added: stx.retestneeded
Frank Miller (sensfan22) wrote :

Marking as medium priority and stx.3.0 gating, as core dump files are unexpected.

Changed in starlingx:
status: New → Triaged
importance: Undecided → Medium
tags: added: stx.3.0
Frank Miller (sensfan22) wrote :

Assigning to distro.other PL; Cindy please determine who should investigate.

Cindy Xie (xxie1) on 2019-08-31
Changed in starlingx:
assignee: nobody → Bin Yang (byangintel)
Bin Yang (byangintel) wrote :

Based on etc/build.info, this load is not a CENGN build.

I tried to get a similar libvirt debuginfo from http://mirror.starlingx.cengn.ca/mirror/starlingx/release/2.0.0/centos/outputs/RPMS/std/libvirt-debuginfo-4.7.0-1.tis.101.x86_64.rpm

Based on kernel log:
./var/log/kern.log:2019-08-29T07:58:32.687 controller-0 kernel: info [ 4068.847731] libvirtd[713586]: segfault at 8 ip 00007f050e93c73c sp 00007ffea29a2130 error 4 in libvirt.so.0.4007.0[7f050e8b7000+334000]
./var/log/kern.log:2019-08-29T08:07:06.779 controller-0 kernel: info [ 4582.879492] libvirtd[409081]: segfault at 8 ip 00007f88be31c73c sp 00007ffc26addbe0 error 4 in libvirt.so.0.4007.0[7f88be297000+334000]
./var/log/kern.log:2019-08-29T08:37:09.127 controller-0 kernel: info [ 1624.857680] libvirtd[131394]: segfault at 8 ip 00007f9033c3773c sp 00007ffe948ed2b0 error 4 in libvirt.so.0.4007.0[7f9033bb2000+334000]
./var/log/kern.log:2019-08-29T15:54:12.410 controller-0 kernel: info [26047.774762] libvirtd[161704]: segfault at 8 ip 00007fc3f4d2b73c sp 00007ffd31a71cf0 error 4 in libvirt.so.0.4007.0[7fc3f4ca6000+334000]

All of them crashed at the same offset: 0x8573c in libvirt.so.0.4007.0 (ip minus the mapping base address).

Based on the disassembly of libvirt.so.0.4007.0 (it might not be exactly the same as your private build):

util/virauth.c:64
int
virAuthGetConfigFilePathURI(virURIPtr uri,
                            char **path)
{
    size_t i;
    const char *authenv = getenv("LIBVIRT_AUTH_FILE");
    VIR_AUTOFREE(char *) userdir = NULL;
... ...
    if (uri) {
        for (i = 0; i < uri->paramsCount; i++) {
            if (STREQ_NULLABLE(uri->params[i].name, "authfile") &&
                uri->params[i].value) {
                VIR_DEBUG("Using path from URI '%s'", uri->params[i].value);
                if (VIR_STRDUP(*path, uri->params[i].value) < 0) /* <== line 64 */
                    return -1;
                return 0;
            }
        }

@peng peng
Could you please provide the debuginfo rpm? I need to double-check whether the disassembly above matches your build.

Bin Yang (byangintel) wrote :

additional info:

Based on "segfault at 8 ip 00007f9033c3773c", the crash was caused by a memory access at address 8.

Looking at uri->params[i].value:
struct _virURIParam {
    char *name; /* Name (unescaped). */
    char *value; /* Value (unescaped). */
    bool ignore; /* Ignore this field in virURIFormatParams */
};

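If uri->params[i] is reached through a NULL pointer (for example, uri->params is NULL while i is 0), reading .value accesses memory at address 8, since value sits 8 bytes into the struct on x86_64.

It looks like the uri comes from remoteConnectOpen(). Did your tests use any specific libvirtd configuration?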

So far, I cannot reproduce it on my machine.

Bin Yang (byangintel) wrote :

Hi Peng Peng,
    I need your help for further debugging. Could you please provide the corresponding debuginfo rpm?

thanks

Bin Yang (byangintel) on 2019-09-17
Changed in starlingx:
status: Triaged → Incomplete
Bin Yang (byangintel) wrote :

@Peng peng

Could you please provide the necessary info for further debugging? If it cannot be reproduced, I will close this LP.

thanks

Peng Peng (ppeng) on 2019-09-30
tags: removed: stx.retestneeded
Yang Liu (yliu12) on 2019-10-02
tags: added: stx.retestneeded
Peng Peng (ppeng) wrote :

In recent SX sanity runs, such as Load: 2019-10-29_20-00-00, we did not see this issue again.

tags: removed: stx.retestneeded
Bin Yang (byangintel) wrote :

So can we close it?

Yang Liu (yliu12) on 2019-10-31
Changed in starlingx:
status: Incomplete → Invalid
Peng Peng (ppeng) wrote :

The issue was reproduced on a DX system
Lab: WCP_76_77
Load: 2019-11-04_20-00-00

Peng Peng (ppeng) wrote :
Changed in starlingx:
status: Invalid → Confirmed
summary: - libvirtd core dump files generated after SX system setup
+ libvirtd core dump files generated after system setup
Bin Yang (byangintel) wrote :

Please provide the libvirt rpms, debuginfo rpms, and srpm corresponding to this core file.

Changed in starlingx:
status: Confirmed → Incomplete
Peng Peng (ppeng) wrote :
Peng Peng (ppeng) wrote :
Peng Peng (ppeng) wrote :
Changed in starlingx:
status: Incomplete → Confirmed