stx-openstack: guestAgent core dumps on Debian

Bug #2000168 reported by Thales Elero Cervi
Affects    Status        Importance  Assigned to    Milestone
StarlingX  Fix Released  Medium      Rafael Falcão

Bug Description

Brief Description
-----------------
The first automated Sanity execution with stx-openstack on Debian showed that guestAgent core dumps [1] are generated constantly once the application is applied. This did not happen in Sanity runs without stx-openstack applied.

[1] /var/lib/systemd/coredump/core.guestAgent.*.zst

Severity
--------
Minor: System/Feature is usable, but several test results are wrongly marked as Failed because of these generated files.

Steps to Reproduce
------------------
* Install the latest_build Debian ISO
* Upload stx-openstack (Debian stx)
* Apply stx-openstack
* Run stx-openstack Sanity (test automation)

Expected Behavior
------------------
No core dumps are generated

Actual Behavior
----------------
Several guestAgent coredumps can be found at /var/lib/systemd/coredump/core.guestAgent.*.zst

Reproducibility
---------------
Reproducible

System Configuration
--------------------
AIO-DX

Branch/Pull Time/Commit
-----------------------
master:
* starlingx/master/debian/monolithic/20221218T070000Z

Last Pass
---------
N/A

Timestamp/Logs
--------------
Teardown started:
***Failure at test teardown: <...>stx-sanity-duplex/CGCSAuto/testfixtures/verify_fixtures.py:XYZ: AssertionError:
Core dump or crash found on controller-0 :
[[
'-rw-r----- 1 root root 124142 2022-12-14_21-24-26 core.guestAgent.0.b18325c0f61344409cc4f14f7da70223.2617040.1671053064000000.zst',
'-rw-r----- 1 root root 124464 2022-12-14_21-24-59 core.guestAgent.0.b18325c0f61344409cc4f14f7da70223.2621249.1671053095000000.zst'
], []]
-----------
Test Failed at test teardown
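
The file names in the teardown output follow the systemd-coredump layout core.<command>.<uid>.<boot_id>.<pid>.<timestamp_usec>.<compression>. As a minimal sketch (the helper name parse_core_dump_name and the CoreDumpName type are hypothetical and are not part of the test framework), the fields can be pulled apart like this:

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CoreDumpName(NamedTuple):
    """Fields encoded in a systemd-coredump file name (hypothetical helper type)."""
    command: str
    uid: int
    boot_id: str
    pid: int
    timestamp: datetime


def parse_core_dump_name(filename: str) -> CoreDumpName:
    """Split 'core.<command>.<uid>.<boot_id>.<pid>.<timestamp_usec>[.<ext>]'."""
    parts = filename.split(".")
    if parts[0] != "core":
        raise ValueError(f"not a core dump file name: {filename!r}")
    # Drop a trailing compression suffix such as 'zst' if one is present.
    if not parts[-1].isdigit():
        parts = parts[:-1]
    if len(parts) < 6:
        raise ValueError(f"unexpected core dump file name: {filename!r}")
    # Read the fixed fields from the right so a dot in the command is tolerated.
    timestamp_usec, pid, boot_id, uid = parts[-1], parts[-2], parts[-3], parts[-4]
    command = ".".join(parts[1:-4])
    return CoreDumpName(
        command=command,
        uid=int(uid),
        boot_id=boot_id,
        pid=int(pid),
        # The last numeric field is microseconds since the Unix epoch.
        timestamp=datetime.fromtimestamp(int(timestamp_usec) / 1_000_000, tz=timezone.utc),
    )


dump = parse_core_dump_name(
    "core.guestAgent.0.b18325c0f61344409cc4f14f7da70223.2617040.1671053064000000.zst"
)
print(dump.command, dump.pid, dump.timestamp.isoformat())
# guestAgent 2617040 2022-12-14T21:24:24+00:00
```

Both dumps in the log share the same boot ID and differ in PID, which fits a process that crashes and is started again.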

Test Activity
-------------
Sanity

Workaround
----------
Skip the core dump check at the end of each test case.
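
A narrower variant of this workaround would be to ignore only the known guestAgent dumps instead of skipping the whole check. The sketch below is an assumption about how such a filter could look; the function name unexpected_core_dumps and its ignored_commands parameter are hypothetical and do not come from the CGCSAuto code base. It takes `ls -l` style lines like the ones in the log above:

```python
def unexpected_core_dumps(ls_lines, ignored_commands=("guestAgent",)):
    """Return the listing lines whose core dump is not from an ignored command.

    Hypothetical helper: ls_lines are 'ls -l' style lines whose last
    whitespace-separated field is the file name 'core.<command>.<...>'.
    """
    unexpected = []
    for line in ls_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name_parts = fields[-1].split(".")
        # The command name is the field right after the 'core' prefix.
        is_core = len(name_parts) > 1 and name_parts[0] == "core"
        command = name_parts[1] if is_core else None
        if command not in ignored_commands:
            unexpected.append(line)
    return unexpected


listing = [
    "-rw-r----- 1 root root 124142 2022-12-14_21-24-26 "
    "core.guestAgent.0.b18325c0f61344409cc4f14f7da70223.2617040.1671053064000000.zst",
    "-rw-r----- 1 root root 124464 2022-12-14_21-24-59 "
    "core.guestAgent.0.b18325c0f61344409cc4f14f7da70223.2621249.1671053095000000.zst",
]
print(unexpected_core_dumps(listing))
# []
```

With a filter of this kind, a core dump from any other process would still fail the teardown check.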

description: updated
Changed in starlingx:
assignee: nobody → Rafael Falcão (rafaelvfalc)
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.distro.openstack
Rafael Falcão (rafaelvfalc) wrote :

We found out that the issue is related to a segfault that happens on every daemon initialization (or restart).

[sysadmin@controller-0 ~(keystone_admin)]$ dmesg | grep -i 180638
[ 1224.468971] guestAgent[180638]: segfault at 0 ip 00007fbae5f74208 sp 00007ffcfd7ec830 error 4 in libc-2.31.so[7fbae5f0f000+14b000]
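
To make the kernel message easier to read, here is a minimal sketch that decodes it. The function name parse_segfault and the returned dictionary keys are hypothetical; the bit meanings are those of the x86 page-fault error code (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Matches the kernel's "segfault at ..." message; the " in <object>[base+size]"
# tail is optional because the kernel does not always print it.
SEGFAULT_RE = re.compile(
    r"(?P<command>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<address>[0-9a-f]+)"
    r" ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
    r"(?: in (?P<object>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_segfault(line):
    """Decode one dmesg segfault line into a dict, or return None if no match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    error = int(match["error"], 16)  # the kernel prints the error code in hex
    info = {
        "command": match["command"],
        "pid": int(match["pid"]),
        "fault_address": int(match["address"], 16),
        "page_present": bool(error & 0x1),
        "access": "write" if error & 0x2 else "read",
        "user_mode": bool(error & 0x4),
        "object": match["object"],
        "offset_in_object": None,
    }
    if match["base"] is not None:
        # Offset of the faulting instruction from the start of the mapping.
        info["offset_in_object"] = int(match["ip"], 16) - int(match["base"], 16)
    return info


line = (
    "[ 1224.468971] guestAgent[180638]: segfault at 0 ip 00007fbae5f74208 "
    "sp 00007ffcfd7ec830 error 4 in libc-2.31.so[7fbae5f0f000+14b000]"
)
info = parse_segfault(line)
print(info["fault_address"], info["access"], hex(info["offset_in_object"]))
# 0 read 0x65208
```

For the line above this gives a user-mode read of address 0 on a page that is not present, at offset 0x65208 into the libc-2.31.so mapping, which is consistent with a NULL pointer being passed into a libc function.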

We will try to find a fix for this issue in the nfv service files. Another option is to deactivate the service on the system, if it is no longer needed. We tried removing the puppet provision and deprovision commands, but that was not enough to start the platform without the service. Further modifications in other repos might be needed to deactivate it.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/stx-puppet/+/869474

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nfv (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/869817

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tools (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/tools/+/869818

OpenStack Infra (hudson-openstack) wrote : Fix merged to nfv (master)

Reviewed: https://review.opendev.org/c/starlingx/nfv/+/869817
Committed: https://opendev.org/starlingx/nfv/commit/bfded2ded62263695ec37fb6214eda7b191c1cbc
Submitter: "Zuul (22348)"
Branch: master

commit bfded2ded62263695ec37fb6214eda7b191c1cbc
Author: Rafael Falcao <email address hidden>
Date: Tue Jan 10 17:18:26 2023 -0300

    Deactivate guest related services

    The guest services are currently not being used since we
    went to containerized openstack. Since the service is no
    longer being used and it is currently causing coredump
    issues [1] in debian environment we are deactivating all
    related guest services. The service can be reactivated in
    the future if needed (we created a Storyboard [2] to keep
    track of this modification).

    [1] https://bugs.launchpad.net/starlingx/+bug/2000168
    [2] https://storyboard.openstack.org/#!/story/2010520

    Test Plan:
    PASS: Generate the debian image without the code that
    adds all the guest related services.
    PASS: Perform an install with the created debian image
    and check that no guest related services are being
    installed.

    Partial-Bug: 2000168

    Signed-off-by: Rafael Falcao <email address hidden>
    Change-Id: I55ed75a7dc02cb517b1343562c84cbaa5c685dd6

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tools (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/tools/+/870433

OpenStack Infra (hudson-openstack) wrote : Fix merged to tools (master)

Reviewed: https://review.opendev.org/c/starlingx/tools/+/870433
Committed: https://opendev.org/starlingx/tools/commit/5dcb14f2ab241f4cdfc7d0b4f53c2c90fc827400
Submitter: "Zuul (22348)"
Branch: master

commit 5dcb14f2ab241f4cdfc7d0b4f53c2c90fc827400
Author: Yue Tao <email address hidden>
Date: Sun Jan 15 15:42:58 2023 +0800

    Debian: remove the mtce-guest from stx-std.lst

    The https://review.opendev.org/c/starlingx/nfv/+/869817
    deactivates the mtce-guest services. Since we haven't finished
    the job of cleaning up stx-std.lst, still need to remove them from
    stx-std.lst as well.

    Partial-Bug: 2000168

    Signed-off-by: Yue Tao <email address hidden>
    Change-Id: I99b0376258d146e2bc7b758a214e02050f82fcf4

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tools (master)

Change abandoned by "Rafael Vieira Falcão <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/tools/+/869818
Reason: Duplicate of https://review.opendev.org/c/starlingx/tools/+/870433

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/869474
Committed: https://opendev.org/starlingx/stx-puppet/commit/3df98fe5b59fc5e5e65c5c725204f1821503d746
Submitter: "Zuul (22348)"
Branch: master

commit 3df98fe5b59fc5e5e65c5c725204f1821503d746
Author: Rafael Falcao <email address hidden>
Date: Fri Jan 6 09:34:18 2023 -0300

    Deactivate provision of the guest-agent service

    The guest-agent service is currently being activated
    in setups where stx-openstack is applied but it's not
    being used since we went to containerized openstack.
    Since this service is no longer being used and it is
    currently causing coredump issues [1] in debian environment
    we are deactivating the provision of the service.
    The service can be reactivated in the future if needed (we
    created a StoryBoard [2] to keep track of this modification).

    [1] https://bugs.launchpad.net/starlingx/+bug/2000168
    [2] https://storyboard.openstack.org/#!/story/2010520

    Test Plan:
    PASS: Generate the debian image without the code that
    performs the provision of the guest-agent service.
    PASS: Verify that after applying the modification in a host
    WITH stx-openstack applied the guest-agent service is not
    being provisioned.

    Depends-on: https://review.opendev.org/c/starlingx/tools/+/870433
    Depends-on: https://review.opendev.org/c/starlingx/nfv/+/869817

    Partial-Bug: 2000168

    Signed-off-by: Rafael Falcao <email address hidden>
    Change-Id: I914ad7caa241bcd42ae1393d943b533a4ad32c06

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Low → Medium
tags: added: stx.8.0