RVMC timeout for power on/off step is too short
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | Medium | Li Zhu | | |
Bug Description
Brief Description
-----------------
Some ZT servers can take minutes to complete the immediate shutdown and/or power-on steps. The current rvmc timeout for these steps is too short (60 polls with a 1-second sleep in between). In addition, the error log is misleading: it should log the requested state rather than the current state.
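The polling behavior described above can be sketched roughly as follows. This is a minimal illustration, not the rvmc code: the function name `wait_for_power_state`, its parameters, and the `get_state` callback are hypothetical. It assumes a fixed poll count with a fixed sleep between polls, and on timeout it logs the requested state (with the last observed state as extra context).

```python
import logging
import time

LOG = logging.getLogger("rvmc_sketch")  # hypothetical logger name


def wait_for_power_state(get_state, requested_state,
                         max_polls=60, poll_interval=1.0,
                         sleep=time.sleep):
    """Poll get_state() until it returns requested_state.

    Returns True when the requested state is observed, False when
    max_polls polls elapse first. All names here are illustrative.
    """
    current = None
    for _ in range(max_polls):
        current = get_state()
        if current == requested_state:
            return True
        sleep(poll_interval)

    # Report the state that was requested, not only the one last seen,
    # so the log says what the operation was trying to achieve.
    LOG.error("Failed to reach power state %s within %d polls "
              "(last observed state: %s)",
              requested_state, max_polls, current)
    return False
```

With the defaults above the wait gives up after about 60 seconds. Assuming the 1-second interval is kept, a call such as `wait_for_power_state(get_state, "Off", max_polls=14 * 60)` would wait up to roughly 14 minutes. How the actual fix reaches its longer timeout is not shown in this report.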
Severity
--------
Major
Steps to Reproduce
------------------
Deploy a new Redfish-capable subcloud, or perform an orchestrated upgrade of a subcloud that involves a remote install.
Expected Behavior
------------------
Deployment/upgrade completes successfully
Actual Behavior
----------------
Subcloud remote install fails intermittently on certain ZT servers due to timeout on power on/off.
Reproducibility
---------------
Intermittent
System Configuration
-------
Distributed Cloud
Branch/Pull Time/Commit
-------
Sept 30, 2023 master build
Last Pass
---------
Never did. The remote install test would have failed before if the ZT server was in a state that exhibits a delayed shutdown or power on.
Timestamp/Logs
--------------
To be provided
Test Activity
-------------
Other - normal usage
Workaround
----------
None
Changed in starlingx:
status: New → In Progress
tags: added: stx.9.0 stx.metal
Changed in starlingx:
importance: Undecided → Medium
assignee: nobody → Li Zhu (lzhu1)
Reviewed: https://review.opendev.org/c/starlingx/metal/+/896657
Committed: https://opendev.org/starlingx/metal/commit/bfbaba57310a03e60f30d44b6dc127de5c77f960
Submitter: "Zuul (22348)"
Branch: master
commit bfbaba57310a03e60f30d44b6dc127de5c77f960
Author: Li Zhu <email address hidden>
Date: Mon Sep 25 14:04:21 2023 -0400
Set longer shutdown time and fix power state error log
1. Extended the timeout to 14 minutes to accommodate the longer shutdown time.
2. Fixed the power state error log so that it logs the requested state
instead of the current power_state.
Test Plan:
PASS: Verify logged version is 2.2
PASS: Verify success path with no FIT delay; HP and ZT servers
PASS: Verify timing of the loop with timeout of 14 minutes
PASS: Verify shutdown timeout handling when shutdown exceeds 14
minutes.
PASS: Verify install completes successfully when Power Off takes
close to but less than 14 minutes
PASS: Verify power state failure log reports proper state
Closes-Bug: 2038484
Signed-off-by: Li Zhu <email address hidden>
Change-Id: Ic99a06dca9962fcae43b20e00d8ebcb127a80560