VIM can mark all VMs in error state after a swact
Bug #1838810 reported by Frank Miller
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
StarlingX | Fix Released | Medium | Bart Wensley | |
Bug Description
After a controller swact, if the newly active controller is busy, a race condition can occur in which VIM sets VMs to the error state because of a logic bug in its audit.
Bart analyzed logs from such a scenario and determined that, as mtce comes up on the newly active controller, there is a delay before it reports that a compute host is enabled. If the audit runs during this window, it checks only whether the host state is enabled and, if it is not, sets the VMs to error. The audit should instead check whether the host is "disabled", because the host can be in the "unknown" state for a short period after the swact while the maintenance and VIM processes are starting up.
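The difference between the two checks can be illustrated with a minimal Python sketch. This is not the actual VIM audit code: the `HostState` enum and both function names are hypothetical, chosen only to show why comparing against "enabled" misfires during the startup window while comparing against "disabled" does not.

```python
from enum import Enum


class HostState(Enum):
    """Hypothetical host states, modeled on the three named in this report."""
    ENABLED = "enabled"
    DISABLED = "disabled"
    UNKNOWN = "unknown"  # transient state while mtce and VIM are starting up


def should_fail_instances_buggy(host_state):
    # Buggy logic as described: anything that is not "enabled" is treated
    # as a failed host, so "unknown" wrongly puts the VMs in error.
    return host_state != HostState.ENABLED


def should_fail_instances_fixed(host_state):
    # Proposed logic: only an explicitly "disabled" host fails its VMs;
    # "unknown" is left alone until maintenance reports a real state.
    return host_state == HostState.DISABLED
```

With a host in the `UNKNOWN` state, the first function returns `True` (VMs would be marked in error), while the second returns `False`. Both agree for `ENABLED` and `DISABLED` hosts.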
Changed in starlingx:
status: Triaged → In Progress
tags: added: in-r-stx20
Setting priority to medium - although the race condition is unlikely to occur, the impact is severe when it does, since all VMs are marked in error.