hbsAgent cluster delete swerr log

Bug #1931911 reported by Eric MacDonald
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Low
Assigned to: Eric MacDonald
Milestone: none listed

Bug Description

Brief Description
-----------------
The mtcAgent does not track the stopped or started heartbeat state of a host. That tracking is left to the heartbeat service itself, which responds to the mtcAgent commanding heartbeat start and stop based on the current running state.

Therefore the heartbeat stop command is sometimes issued against a host that is already in the stopped state.

The stop command results in a call to delete the host from the heartbeat cluster (hbs_cluster_del). If the host is not in the cluster, i.e. it is already in the heartbeat stopped state, that call can produce a Swerr (Software Error) log.

The product should not emit Swerr logs on a success path.
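
The intended behavior can be illustrated with a minimal sketch of an idempotent delete. This is not the actual hbsAgent code: the Cluster type and the cluster_add and cluster_del functions below are hypothetical stand-ins, assuming the cluster is a simple list of hostnames. The point is only that deleting a host that is not in the list is treated as a normal outcome rather than a software error.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the heartbeat cluster membership list.
// The real hbsAgent data structures are not shown in this report.
struct Cluster {
    std::vector<std::string> hosts;
};

// Add a host to the cluster unless it is already a member.
void cluster_add(Cluster& cluster, const std::string& hostname) {
    if (std::find(cluster.hosts.begin(), cluster.hosts.end(), hostname) ==
        cluster.hosts.end()) {
        cluster.hosts.push_back(hostname);
    }
}

// Idempotent delete: returns true if the host was removed and false if it
// was not in the cluster. A missing host means heartbeat is already stopped,
// which is a success path, so no error log is emitted here.
bool cluster_del(Cluster& cluster, const std::string& hostname) {
    auto it = std::find(cluster.hosts.begin(), cluster.hosts.end(), hostname);
    if (it == cluster.hosts.end()) {
        return false;  // already absent: nothing to do, not an error
    }
    cluster.hosts.erase(it);
    return true;
}
```

With this shape, a second stop for the same host is harmless: the first cluster_del call removes the host and returns true, and a repeated call returns false without logging anything.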

Severity
--------
Minor; no customer impact

Steps to Reproduce
------------------
Issue a heartbeat stop for a host whose heartbeat is already stopped, i.e. a host that is not in the heartbeat cluster.

Expected Behavior
------------------
No success path Swerr logs

Actual Behavior
----------------
Success path Swerr log

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Any

Branch/Pull Time/Commit
-----------------------
stx 4.0

Last Pass
---------
N/A

Timestamp/Logs
--------------
<date> [96280.03388] <hostname> hbsAgent hbs hbsCluster.cpp ( 367) hbs_cluster_del :Swerr : compute-0 not found in cluster list

Test Activity
-------------
Developer Testing

Workaround
----------
None; not required

Tags: stx.metal
summary: - Remove swerr log in hbsAgent cluster delete
+ swerr log in hbsAgent cluster delete
summary: - swerr log in hbsAgent cluster delete
+ hbsAgent cluster delete swerr log
OpenStack Infra (hudson-openstack) wrote : Fix proposed to metal (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/metal/+/796324

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to metal (master)

Reviewed: https://review.opendev.org/c/starlingx/metal/+/796324
Committed: https://opendev.org/starlingx/metal/commit/d6932f49d7b57eb48ac3f281a4ce00bade6c5287
Submitter: "Zuul (22348)"
Branch: master

commit d6932f49d7b57eb48ac3f281a4ce00bade6c5287
Author: Eric MacDonald <email address hidden>
Date: Mon Jun 14 19:04:33 2021 -0400

    Remove swerr log in hbsAgent cluster delete

    The mtcAgent does not track the stopped or started
    heartbeat state of a host, that is left to the
    heartbeat service itself in response to the mtcAgent
    commanding heartbeat start and stop based on current
    running state.

    Therefore heartbeat stop command is sometimes called
    against a host that is already in the stopped state.

    The heartbeat stop command results in a call in the
    hbsAgent to delete a host from the heartbeat cluster;
    hbs_cluster_del.

    If that host is not already in the cluster then this
    call can result in a Swerr (Software Error) log.

    This update removes this success path Swerr log.

    Change-Id: Idb96a791a932827749e329a123f60006ff7c48ec
    Closes-Bug: 1931911
    Signed-off-by: Eric MacDonald <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
assignee: nobody → Eric MacDonald (rocksolidmtce)
tags: added: stx.metal