DC Upgrade Orchestration cannot recover from failed subcloud lock or failed upgrade activate
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | High | Tee Ngo | | |
Bug Description
Brief Description
-----------------
Unable to retry DC upgrade orchestration following a failed subcloud lock or failed upgrade activation.
Severity
--------
Critical
Steps to Reproduce
------------------
Bring up a DC system with at least one subcloud
Upgrade the system controller
On the subcloud, unmanage and shut down the vim service so that a host lock will fail
Perform subcloud upgrade using dcmanager orchestration
The subcloud upgrade fails at the host-lock step
Restore vim service on the subcloud
Delete the failed upgrade strategy and try again by creating and applying a new one
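
The retry in the last step can be sketched as a small script. This is only an illustration, not part of the original report: it assumes the dcmanager client exposes the `upgrade-strategy delete`, `upgrade-strategy create` and `upgrade-strategy apply` subcommands, and the helper name `retry_upgrade_strategy` is hypothetical. By default it runs in dry-run mode and only returns the commands it would issue.

```python
import shlex
import subprocess

# Assumed dcmanager subcommands for the retry; verify them against the
# installed client before running anything for real.
RETRY_SEQUENCE = [
    ["dcmanager", "upgrade-strategy", "delete"],
    ["dcmanager", "upgrade-strategy", "create"],
    ["dcmanager", "upgrade-strategy", "apply"],
]


def retry_upgrade_strategy(dry_run=True, runner=subprocess.run):
    """Return the retry commands in order; execute them only if dry_run is False."""
    issued = []
    for argv in RETRY_SEQUENCE:
        issued.append(shlex.join(argv))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            runner(argv, check=True)
    return issued


if __name__ == "__main__":
    for line in retry_upgrade_strategy(dry_run=True):
        print(line)
```

A real script would likely need to wait for the strategy deletion to finish before creating the new strategy; that polling is omitted here.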
Expected Behavior
------------------
Upgrade orchestration can be recovered and completes
Actual Behavior
----------------
Upgrade orchestration retry failed
Reproducibility
---------------
100% reproducible
System Configuration
-------
Distributed Cloud
Branch/Pull Time/Commit
-------
April 5th StarlingX master build
Last Pass
---------
These 2 particular failure scenarios were
Timestamp/Logs
--------------
Orchestrator log on the system controller:
2021-04-07 20:48:04.579 554197 ERROR dcmanager.
2021-04-07 20:48:04.579 554197 ERROR dcmanager.
On subcloud:
[sysadmin@
-------
Alarm ID Reason Text Entity ID Severity Time Stamp
-------
900.005 System Upgrade in progress. host=controller minor 2021-04-12T
-------
Test Activity
-------------
Developer Testing
Workaround
----------
None
tags: added: stx.cherrypickneeded
tags: added: in-r-stx50; removed: stx.cherrypickneeded
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/distcloud/+/786688