mtcClient BMC provisioning is not cleared when BMC is deprovisioned with system command

Bug #2067925 reported by Eric MacDonald
Affects: StarlingX
Status: Fix Released
Importance: Low
Assigned to: Eric MacDonald

Bug Description

Brief Description
-----------------
The mtcClient on each controller is provisioned with its peer controller's BMC address and credentials when that peer's BMC is provisioned through the system CLI.

However, when that peer's BMC is deprovisioned from the system CLI, the same mtcClient is not deprovisioned.

Impact: When an administrator provisions the BMC for a controller, the peer controller's mtcClient receives that BMC provisioning data. If the administrator later deprovisions that controller's BMC, the peer controller's mtcClient retains the previous provisioning info. If SM then experiences a peer controller issue that leads it to request a reset of that controller via the BMC through the mtcClient, the reset attempt will proceed using the stale data.

When the admin deprovisions a BMC, it should be deprovisioned everywhere, including the peer controller's mtcClient. Otherwise, in a controller failure case, the deprovisioned BMC data may be used to reset that controller when the system administrator expects it not to be.

Severity
--------
Minor

Steps to Reproduce
------------------
Provision the BMC of both controllers through the system CLI, then deprovision the BMC.

Expected Behavior
------------------
The BMC should also be deprovisioned from the peer controller's mtcClient.

Actual Behavior
----------------
The BMC is not deprovisioned from the peer controller's mtcClient.

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Any

Branch/Pull Time/Commit
-----------------------
Any load built prior to the resolution of this bug report

Last Pass
---------
Test escape

Timestamp/Logs
--------------
Deprovisioned case:

2024-06-03T15:26:11.697 [7648.00164] controller-0 mtcClient --- mtcNodeComp.cpp (2448) load_mtcInfo_msg : Info : controller-1 is my peer [host:192.168.204.3 bmc:none:none:none]
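
When checking logs for this issue, it can help to pull the peer BMC fields out of lines like the one above. The following is a minimal sketch, not part of mtce: the function name `parse_peer_bmc` and the regular expression are assumptions derived only from the single log line shown here, and the three `bmc:` fields are treated as opaque values whose meaning is not asserted. An IPv6 BMC address would contain extra colons and is not handled.

```python
import re

# Matches the "<peer> is my peer [host:<ip> bmc:<a>:<b>:<c>]" fragment.
# The pattern is inferred from one sample line, so treat it as an assumption.
PEER_RE = re.compile(
    r"(?P<peer>\S+) is my peer \[host:(?P<host>\S+) bmc:(?P<bmc>[^\]]+)\]"
)


def parse_peer_bmc(line):
    """Return peer info from an mtcClient log line, or None if absent."""
    match = PEER_RE.search(line)
    if match is None:
        return None
    fields = match.group("bmc").split(":")
    return {
        "peer": match.group("peer"),
        "host": match.group("host"),
        "bmc_fields": fields,
        # All fields reading "none" is taken to mean "deprovisioned".
        "deprovisioned": all(field == "none" for field in fields),
    }


sample = (
    "2024-06-03T15:26:11.697 [7648.00164] controller-0 mtcClient --- "
    "mtcNodeComp.cpp (2448) load_mtcInfo_msg : Info : controller-1 is my peer "
    "[host:192.168.204.3 bmc:none:none:none]"
)
info = parse_peer_bmc(sample)
print(info["peer"], info["deprovisioned"])  # prints: controller-1 True
```

A line whose `bmc:` fields are anything other than `none` would be reported with `deprovisioned` set to `False`.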

Test Activity
-------------
Normal use.

Workaround
----------
sudo pmon-restart mtcClient

OpenStack Infra (hudson-openstack) wrote : Fix proposed to metal (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/metal/+/921301

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to metal (master)

Reviewed: https://review.opendev.org/c/starlingx/metal/+/921301
Committed: https://opendev.org/starlingx/metal/commit/508b619400a7ce23ce5298bdbd5e2703e07bb4ec
Submitter: "Zuul (22348)"
Branch: master

commit 508b619400a7ce23ce5298bdbd5e2703e07bb4ec
Author: Eric MacDonald <email address hidden>
Date: Tue Jun 4 15:35:10 2024 +0000

    Deprovision mtcClient bmc info when bmc for node is deprovisioned

    A node's BMC is provisioned and deprovisioned through the system CLI.

    Maintenance shares controller node BMC provisioning info with
    the mtcClient on each controller node. The mtcClient uses this
    BMC provisioning info to reset its peer controller when it sees
    the appropriate signal from SM (a flag file).

    However, when a controller node's BMC is deprovisioned from the
    system CLI, the mtcAgent does not send the deprovisioned data
    to the mtcClient. Without getting the deprovisioning data the
    mtcClient will continue to use the previous provisioning data.
    This is incorrect and the reason for this fix.

    This update fixes this by having the mtcAgent periodically share
    controller node BMC provisioning data to each controller's mtcClient
    regardless of its provisioning state. The BMC provisioning data update
    period remains the same as it was while the BMCs were provisioned.

    This update also offers the following messaging/logging improvements.

     - restrict the updates to the management network only.
       There is no need to send the same data over the pxeboot.

     - stop logging while the BMC is deprovisioned. The absence/presence
       of the logs is sufficient to know what the provisioning state is
       without needlessly logging when the BMCs are not provisioned.

     - bypass sending the bmc provisioning data to the controller-0
       mtcClient in an SX system. The data is only needed in a DX system.

    Test Plan:

    PASS: Verify mtcClient gets BMC deprovisioning data ; fix for this bug.
    PASS: Verify mtcClient periodically logs valid BMC provisioning data.
    PASS: Verify mtcClient doesn't log unprovisioned BMC provisioning data.
    PASS: Verify mtcAgent does not send bmc provision data on SX systems.
    PASS: Verify mtcAgent does send bmc provision data on DX systems.
    PASS: Verify worker and storage never receive bmc provisioning data.

    Closes-Bug: 2067925
    Change-Id: I29e5eb0b072ee38358d99d682555c466de322f2d
    Signed-off-by: Eric MacDonald <email address hidden>
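
The behavior described in the commit message above can be modeled in a few lines. This is only an illustrative sketch under stated assumptions, not the mtce implementation (which is C++ in the metal repository): `BmcInfo`, `should_send`, `ClientPeerState`, and the `"simplex"`/`"duplex"`, `"mgmt"`/`"pxeboot"` strings are hypothetical names chosen for the example, and the `"none"` placeholder is borrowed from the log line in the bug description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BmcInfo:
    """Hypothetical container; "none" marks a deprovisioned field."""
    address: str = "none"
    username: str = "none"
    password: str = "none"

    def provisioned(self) -> bool:
        return self.address != "none"


def should_send(system_mode: str, node_type: str, network: str) -> bool:
    """Agent side: the decision does not depend on the provisioning state."""
    if system_mode == "simplex":   # SX: no peer controller to reset
        return False
    if node_type != "controller":  # worker and storage never receive it
        return False
    return network == "mgmt"       # not repeated over pxeboot


class ClientPeerState:
    """Client side: every update overwrites the stored peer BMC data."""

    def __init__(self) -> None:
        self.bmc = BmcInfo()

    def on_update(self, bmc: BmcInfo) -> bool:
        self.bmc = bmc
        # Returns True only when a log line is warranted (BMC provisioned).
        return bmc.provisioned()


client = ClientPeerState()
logged_when_provisioned = client.on_update(BmcInfo("10.10.10.5", "admin", "secret"))
logged_when_deprovisioned = client.on_update(BmcInfo())
print(client.bmc.provisioned())  # prints: False
```

The point of the model is that the client always replaces its stored data, so a deprovisioning update clears the stale credentials instead of leaving them available for a later reset attempt.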

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.10.0 stx.metal
Changed in starlingx:
assignee: nobody → Eric MacDonald (rocksolidmtce)