Patching orchestration fails due to alarm not cleared
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Jessica Castelino |
Bug Description
Brief Description
-----------------
On an AIO-DX+ system, patching orchestration patches controller-0 successfully, but when patching controller-1 it gets stuck waiting for the host to become "patch current", and the orchestration does not proceed to the other hosts.
Even manually running "sudo sw-patch host-install controller-1" and getting a success response did not change the state of controller-1, which remained at "Patch Current=NO".
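When checking whether a host is still stuck, the "Patch Current" column can be read from the host query output. The following is a minimal sketch, assuming `sw-patch query-hosts` prints a whitespace-aligned table whose first and third columns are the hostname and the Patch Current value. The helper name `hosts_not_patch_current` and the sample table (addresses, release, state) are hypothetical and for illustration only; they are not output captured from the affected system.

```python
# Hypothetical sample shaped like a host query table; NOT real output
# from the system in this report. The column layout is an assumption.
SAMPLE = """\
  Hostname      IP Address    Patch Current  Reboot Required  Release  State
============  ==============  =============  ===============  =======  =====
controller-0  192.168.204.3        Yes             No          22.12   idle
controller-1  192.168.204.4        No              No          22.12   idle
"""


def hosts_not_patch_current(table_text):
    """Return hostnames whose Patch Current column is not 'Yes'."""
    pending = []
    for line in table_text.splitlines():
        fields = line.split()
        # Skip blank or short lines, the header row, and the "====" separator.
        if len(fields) < 6 or fields[0] == "Hostname" or set(fields[0]) == {"="}:
            continue
        hostname, patch_current = fields[0], fields[2]
        if patch_current.lower() != "yes":
            pending.append(hostname)
    return pending


if __name__ == "__main__":
    print(hosts_not_patch_current(SAMPLE))
```

On a live system the text would come from the command itself rather than from `SAMPLE`; the parsing would need adjusting if the real column order differs from the one assumed here.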
Severity
-----------------
<Critical: System/Feature is not usable after the defect>
Steps to Reproduce
-----------------
Install an AIO-DX+ system (AIO-DX + 10 worker nodes)
Configure a patching orchestration strategy via the command line to patch the entire system using a reboot-required (RR) patch.
sw-manager patch-strategy create --controller-
Apply the strategy
sw-manager patch-strategy apply
Monitor the patching orchestration procedure via /var/log/
Expected Behavior
-----------------
All nodes are patched successfully
Actual Behavior
-----------------
Controller-1 is patched successfully
Controller-0 is not patched and orchestration is stuck after the sw-patch host-install command is sent to controller-0
Compute nodes are not patched
Eventually the patching strategy times out and fails
Reproducibility
-----------------
Intermittent with high frequency
It happened in 2 out of 3 attempts
System Configuration
-----------------
AIO-DX+
Load info (eg: 2022-03-
-----------------
[sysadmin@
2022-11-28_18-00-09
Last Pass
-----------------
Not sure, but this scenario passed fine on system test execution for 22.06 release.
Timestamp/Logs
-----------------
NA
Alarms
-----------------
NA
Test Activity
-----------------
System Test
Workaround
-----------------
NA
Changed in starlingx:
assignee: nobody → Jessica Castelino (jcasteli)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.8.0 stx.update
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/update/+/867311