libvirtd core dump files generated after system setup

Bug #1841987 reported by Peng Peng
This bug affects 1 person
Affects     Status      Importance  Assigned to  Milestone
StarlingX   Confirmed   Low         zhao.shuai

Bug Description

Brief Description
-----------------
After initial SX system setup and the sanity test, a few libvirtd core dump files were generated. The first one was generated after system setup. Two of them were generated during host unlock.

Severity
--------
Major

Steps to Reproduce
------------------
Bring up SX system lab
host lock/unlock

TC-name: installation & sanity

Expected Behavior
------------------
no coredump files

Actual Behavior
----------------
coredump files generated

Reproducibility
---------------
Seen once

System Configuration
--------------------
One node system

Lab-name: SM-3

Branch/Pull Time/Commit
-----------------------
Load: 2019-08-28_00-10-00
Job: Titanium_R6_build

Last Pass
---------
Load: 20190828T013000Z
Job: STX_build_master_master

Timestamp/Logs
--------------
controller-0:/var/lib/systemd/coredump$ ls -l
total 3596
-rw-r----- 1 root root 956760 Aug 29 15:54 core.libvirtd.0.111cc7803ce045e9bbff8d5f15c43ba8.161704.1567094052000000.xz
-rw-r----- 1 root root 909172 Aug 29 08:07 core.libvirtd.0.238c43b105d14f1787846cb334fef597.409081.1567066026000000.xz
-rw-r----- 1 root root 915116 Aug 29 08:37 core.libvirtd.0.dfb8f0b6261449a7aa67e13fe96d0673.131394.1567067829000000.xz
-rw-r-----+ 1 root root 876800 Aug 29 07:58 core.libvirtd.42425.238c43b105d14f1787846cb334fef597.713586.1567065512000000.xz
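
The file names in this listing can be decoded to recover the UID, PID, and crash time of each dump. The following is a minimal sketch, assuming the systemd-coredump naming scheme core.<comm>.<uid>.<boot id>.<pid>.<timestamp in microseconds>; the helper name parse_core_name is hypothetical and not part of any StarlingX tooling.

```python
import re
from datetime import datetime, timezone

# Assumed layout: core.<comm>.<uid>.<boot id>.<pid>.<timestamp in microseconds>[.xz]
CORE_NAME = re.compile(
    r"^core\.(?P<comm>.+)\.(?P<uid>\d+)\.(?P<boot_id>[0-9a-f]{32})"
    r"\.(?P<pid>\d+)\.(?P<usec>\d+)(?:\.xz)?$"
)


def parse_core_name(name):
    """Split a coredump file name into its fields (hypothetical helper)."""
    m = CORE_NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognized coredump name: {name}")
    # Whole seconds are enough here; the sub-second part is dropped.
    when = datetime.fromtimestamp(int(m["usec"]) // 1_000_000, tz=timezone.utc)
    return {
        "comm": m["comm"],
        "uid": int(m["uid"]),
        "boot_id": m["boot_id"],
        "pid": int(m["pid"]),
        "time_utc": when,
    }


info = parse_core_name(
    "core.libvirtd.42425.238c43b105d14f1787846cb334fef597.713586.1567065512000000.xz"
)
print(info["pid"], info["uid"], info["time_utc"].isoformat())
# prints: 713586 42425 2019-08-29T07:58:32+00:00
```

If the system logs are in UTC, the decoded time lines up with the kern.log segfault entry for libvirtd[713586] quoted later in this report. Note that the earliest dump was written by UID 42425, while the other three were written by UID 0.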

[2019-08-29 07:03:23,224] 630 INFO MainThread fresh_install_helper.run_lab_setup:: running lab_setup.sh
[2019-08-29 07:58:03,367] 1288 INFO MainThread fresh_install_helper.wait_for_hosts_ready:: Checking floating ip: 128.224.150.81 connectivity ...

[2019-08-29 08:06:12,597] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0

[2019-08-29 08:36:13,804] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'

Test Activity
-------------
Sanity

Peng Peng (ppeng) wrote :
Numan Waheed (nwaheed)
tags: added: stx.retestneeded
Frank Miller (sensfan22) wrote :

Marking as medium priority and stx.3.0 gating, as core dump files are unexpected.

Changed in starlingx:
status: New → Triaged
importance: Undecided → Medium
tags: added: stx.3.0
Frank Miller (sensfan22) wrote :

Assigning to distro.other PL; Cindy please determine who should investigate.

Cindy Xie (xxie1)
Changed in starlingx:
assignee: nobody → Bin Yang (byangintel)
Bin Yang (byangintel) wrote :

Based on etc/build.info, this build is not from CENGN.

I tried to get a similar libvirt debuginfo from http://mirror.starlingx.cengn.ca/mirror/starlingx/release/2.0.0/centos/outputs/RPMS/std/libvirt-debuginfo-4.7.0-1.tis.101.x86_64.rpm

Based on kernel log:
./var/log/kern.log:2019-08-29T07:58:32.687 controller-0 kernel: info [ 4068.847731] libvirtd[713586]: segfault at 8 ip 00007f050e93c73c sp 00007ffea29a2130 error 4 in libvirt.so.0.4007.0[7f050e8b7000+334000]
./var/log/kern.log:2019-08-29T08:07:06.779 controller-0 kernel: info [ 4582.879492] libvirtd[409081]: segfault at 8 ip 00007f88be31c73c sp 00007ffc26addbe0 error 4 in libvirt.so.0.4007.0[7f88be297000+334000]
./var/log/kern.log:2019-08-29T08:37:09.127 controller-0 kernel: info [ 1624.857680] libvirtd[131394]: segfault at 8 ip 00007f9033c3773c sp 00007ffe948ed2b0 error 4 in libvirt.so.0.4007.0[7f9033bb2000+334000]
./var/log/kern.log:2019-08-29T15:54:12.410 controller-0 kernel: info [26047.774762] libvirtd[161704]: segfault at 8 ip 00007fc3f4d2b73c sp 00007ffd31a71cf0 error 4 in libvirt.so.0.4007.0[7fc3f4ca6000+334000]

All of them crashed at the same position: offset 0x8573c in libvirt.so.0.4007.0 (the ip value minus the mapping base address shown in brackets).
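
The offset can be recomputed from the four kern.log lines by subtracting the library's mapping base (the first number in the brackets) from the instruction pointer. A minimal sketch follows; the regular expression and the helper names library_offset and decode_error are illustrative assumptions, and the error-code decoding covers only the low three bits of the x86 page-fault error code.

```python
import re

# The four segfault entries from kern.log, trimmed to the part that is parsed.
KERN_LINES = [
    "libvirtd[713586]: segfault at 8 ip 00007f050e93c73c sp 00007ffea29a2130 error 4 in libvirt.so.0.4007.0[7f050e8b7000+334000]",
    "libvirtd[409081]: segfault at 8 ip 00007f88be31c73c sp 00007ffc26addbe0 error 4 in libvirt.so.0.4007.0[7f88be297000+334000]",
    "libvirtd[131394]: segfault at 8 ip 00007f9033c3773c sp 00007ffe948ed2b0 error 4 in libvirt.so.0.4007.0[7f9033bb2000+334000]",
    "libvirtd[161704]: segfault at 8 ip 00007fc3f4d2b73c sp 00007ffd31a71cf0 error 4 in libvirt.so.0.4007.0[7fc3f4ca6000+334000]",
]

SEGFAULT = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp [0-9a-f]+ error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def library_offset(line):
    """Return (pid, offset of the faulting instruction inside the library)."""
    m = SEGFAULT.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    return int(m["pid"]), int(m["ip"], 16) - int(m["base"], 16)


def decode_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


offsets = dict(library_offset(line) for line in KERN_LINES)
for pid, off in offsets.items():
    print(pid, hex(off))
print(decode_error(4))
```

All four crashes resolve to 0x8573c, and error 4 decodes to a user-mode read of a page that is not present, which is consistent with a read through a near-NULL pointer.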

Based on the disassembly of libvirt.so.0.4007.0 (it might not be exactly the same as your private build):

util/virauth.c:64
int
virAuthGetConfigFilePathURI(virURIPtr uri,
                            char **path)
{
    size_t i;
    const char *authenv = getenv("LIBVIRT_AUTH_FILE");
    VIR_AUTOFREE(char *) userdir = NULL;
... ...
    if (uri) {
        for (i = 0; i < uri->paramsCount; i++) {
            if (STREQ_NULLABLE(uri->params[i].name, "authfile") &&
                uri->params[i].value) {
                VIR_DEBUG("Using path from URI '%s'", uri->params[i].value);
line64 ==> if (VIR_STRDUP(*path, uri->params[i].value) < 0)
                    return -1;
                return 0;
            }
        }

@Peng Peng
Could you please provide the debuginfo rpm? I need to double-check whether the above disassembly matches your build.

Bin Yang (byangintel) wrote :

additional info:

Based on "segfault at 8 ip 00007f9033c3773c", the crash was caused by a memory access at address 8.

Looking at uri->params[i].value:
struct _virURIParam {
    char *name; /* Name (unescaped). */
    char *value; /* Value (unescaped). */
    bool ignore; /* Ignore this field in virURIFormatParams */
};

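If the struct that uri->params[i] refers to sits at address NULL, reading its value member accesses memory at address 8, because value follows the 8-byte name pointer on x86_64. That matches the fault address.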
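
The offset argument can be checked without the libvirt headers. The sketch below mirrors struct _virURIParam with ctypes purely to inspect the layout; the class name URIParam is an assumption for illustration, and the result depends on the pointer size of the platform where it runs.

```python
import ctypes


class URIParam(ctypes.Structure):
    """Mirror of struct _virURIParam, used for layout inspection only."""

    _fields_ = [
        ("name", ctypes.c_char_p),   # char *name
        ("value", ctypes.c_char_p),  # char *value
        ("ignore", ctypes.c_bool),   # bool ignore
    ]


pointer_size = ctypes.sizeof(ctypes.c_char_p)
print("pointer size:", pointer_size)
print("offset of name:", URIParam.name.offset)
print("offset of value:", URIParam.value.offset)
```

On a 64-bit build the pointer size is 8, so value sits at offset 8. A read of value through a struct located at address 0 therefore touches address 8, as in the kernel log.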

It looks like the URI comes from remoteConnectOpen(). Did your tests use any specific libvirtd configuration?

So far, I cannot reproduce it on my machine.

Bin Yang (byangintel) wrote :

Hi Peng Peng,
    I need your help to debug this further. Could you please provide the corresponding debuginfo rpm?

thanks

Bin Yang (byangintel)
Changed in starlingx:
status: Triaged → Incomplete
Bin Yang (byangintel) wrote :

@Peng Peng

Could you please provide the information needed for further debugging? If the issue cannot be reproduced, I will close this LP.

thanks

Peng Peng (ppeng)
tags: removed: stx.retestneeded
Yang Liu (yliu12)
tags: added: stx.retestneeded
Peng Peng (ppeng) wrote :

In recent SX sanity runs, such as Load: 2019-10-29_20-00-00, we did not see this issue again.

tags: removed: stx.retestneeded
Bin Yang (byangintel) wrote :

so can we close it?

Yang Liu (yliu12)
Changed in starlingx:
status: Incomplete → Invalid
Peng Peng (ppeng) wrote :

Issue was reproduced on DX system
Lab: WCP_76_77
Load: 2019-11-04_20-00-00

Peng Peng (ppeng) wrote :
Changed in starlingx:
status: Invalid → Confirmed
summary: - libvirtd core dump files generated after SX system setup
+ libvirtd core dump files generated after system setup
Bin Yang (byangintel) wrote :

Please provide the libvirt rpms, debuginfo rpms, and srpm corresponding to this core file.

Changed in starlingx:
status: Confirmed → Incomplete
Peng Peng (ppeng) wrote :
Changed in starlingx:
status: Incomplete → Confirmed
Ghada Khalil (gkhalil)
tags: added: stx.distro.other
Ghada Khalil (gkhalil) wrote :

As per agreement with the community, moving unresolved medium priority bugs (< 100 days OR recently reproduced) from stx.3.0 to stx.4.0

tags: added: stx.4.0
removed: stx.3.0
yong hu (yhu6)
Changed in starlingx:
assignee: Bin Yang (byangintel) → Ambarish Das (hiambar)
Austin Sun (sunausti) wrote :

@Ambarish:
   Any update on this issue?

Thanks.
BR
Austin Sun.

Ghada Khalil (gkhalil) wrote :

Lowering the priority as there doesn't appear to be any system impact. This was also reported in previous stx releases and, therefore, will not hold up stx.4.0.

tags: removed: stx.4.0
Changed in starlingx:
importance: Medium → Low
Austin Sun (sunausti)
Changed in starlingx:
assignee: Ambarish Das (hiambar) → zhao.shuai (zhao.shuai.neusoft)
zhao.shuai (zhao.shuai.neusoft) wrote :

We have used the following versions of ISO for deployment testing:
    -> 20200529T033909Z
    -> 20200610 (local compiled ISO)
    -> 20200617T021347Z (latest_green_build)
No coredump files appeared in the test environment path (/var/lib/systemd/coredump).
So I recommend closing this ticket.
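
A check like the one described in this comment can be scripted so that it is repeatable across ISO versions. This is a minimal sketch, not the procedure actually used in the test; the helper name find_cores is hypothetical, and the demonstration runs against a temporary directory instead of the real coredump path.

```python
import tempfile
from pathlib import Path


def find_cores(directory="/var/lib/systemd/coredump", comm="libvirtd"):
    """Return sorted names of coredump files for one command (hypothetical helper)."""
    path = Path(directory)
    if not path.is_dir():
        return []
    return sorted(p.name for p in path.glob(f"core.{comm}.*"))


# Demonstration against a temporary directory rather than the real path.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "core.libvirtd.0.111cc7803ce045e9bbff8d5f15c43ba8.161704.1567094052000000.xz").touch()
    Path(tmp, "core.other.0.111cc7803ce045e9bbff8d5f15c43ba8.1.1567094052000000.xz").touch()
    found = find_cores(tmp)

print(found)
```

On a controller, calling find_cores() with its defaults after each lock/unlock cycle and expecting an empty list would correspond to the "no coredump files" expected behavior. Reading the directory may require root privileges.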
