Locking a controller takes a finite amount of time, resulting in a
brief window between issuing a lock command toward the inactive
controller and the controller actually entering the locked state.
Typically, this window lasts only a few seconds. However, during
periods of high system activity, or while VM or other migrations are
in progress, it can take a minute or longer for the controller to
enter the locked state.
In some cases, initiating a 'system host-swact' command while the
inactive controller is in this 'Locking but not yet Locked' state has
led to a switch of activity to a locked controller.
The current pre-swact semantic check does not prevent this race
condition, which can leave the system with a locked active controller.
This update adds a precheck against a list of in-progress actions; a
swact request is now rejected while any of them is in progress.
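The precheck described above can be sketched roughly as follows. This is a minimal illustration, not the code from the change: the function name, the exception class, the host dictionary layout, and the contents of the blocking list are all hypothetical. Only the 'Locking' task string is taken from the commit message, and the real list of in-progress actions is defined in the reviewed change.

```python
# Hypothetical sketch of a pre-swact check; names and data layout are
# assumptions made for illustration, not the actual sysinv implementation.

# In-progress task prefixes that should block a swact. Only "Locking" comes
# from the commit message; the real list is defined in the reviewed change.
BLOCKING_TASK_PREFIXES = ("Locking",)


class SwactRejected(Exception):
    """Raised when a swact request must be refused (hypothetical)."""


def check_swact_allowed(peer_host):
    """Reject a swact if the inactive controller has a blocking task.

    peer_host is assumed to be a dict with 'hostname' and 'task' keys,
    where 'task' may be None, empty, or a string naming an in-progress
    action.
    """
    task = (peer_host.get("task") or "").strip()
    if not task:
        # No in-progress action reported: this check does not object.
        return
    for prefix in BLOCKING_TASK_PREFIXES:
        if task.startswith(prefix):
            raise SwactRejected(
                "Swact rejected: %s has an in-progress action '%s'"
                % (peer_host.get("hostname", "peer controller"), task)
            )


if __name__ == "__main__":
    # Mirrors the shape of the regression cases: empty task, a task that
    # is not 'Locking', and a 'Locking' task.
    for task in ("", "Some other task", "Locking"):
        host = {"hostname": "controller-1", "task": task}
        try:
            check_swact_allowed(host)
            print("task=%r: swact allowed by this check" % task)
        except SwactRejected as exc:
            print("task=%r: %s" % (task, exc))
```

Matching on the task string closes the window between the lock request and the locked state, because the task is assumed to be set as soon as the lock is requested, before the administrative state changes.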
Test Plan:
PASS: Verify sysinv package build.
PASS: Verify swact is rejected for any of the in-progress actions
listed in the precheck.
PASS: Verify swact reject handling and output text.
PASS: Verify pep8 of changed lines.
Regression:
PASS: Verify swact handling when task is empty.
PASS: Verify swact handling when task is not empty and not Locking.
PASS: Verify swact soak (10x).
Closes-Bug: 2064347
Change-Id: I78238fa649c330d7b908dbcf50f654c004205ee6
Signed-off-by: Eric MacDonald <email address hidden>
Reviewed: https://review.opendev.org/c/starlingx/config/+/917791
Committed: https://opendev.org/starlingx/config/commit/f29cc84ba3dcfd634e338660d0657d3ec557287e
Submitter: "Zuul (22348)"
Branch: master
commit f29cc84ba3dcfd634e338660d0657d3ec557287e
Author: Eric MacDonald <email address hidden>
Date: Tue Apr 30 21:16:01 2024 +0000
Prevent swacting to a 'Locking' controller