Reviewed: https://review.opendev.org/c/starlingx/distcloud/+/847059
Committed: https://opendev.org/starlingx/distcloud/commit/29fe24acb313cf9ff3c434dd535e314800503e4f
Submitter: "Zuul (22348)"
Branch: master
commit 29fe24acb313cf9ff3c434dd535e314800503e4f
Author: Yuxing Jiang <email address hidden>
Date: Tue Jun 21 11:57:02 2022 -0400
Enhance handling of on-going patch strategy
This commit includes:
1. Rebuild the region one patch cache during the service reload.
2. Ignore the patch in progress alarm as this could be a patch
orchestration retry.
3. Re-create workers if they are cleared after a service restart.
4. Properly handle reboot-required patching when both system
controller and subcloud patch orchestration are done in a single
strategy.
With these improvements, the patch strategy can continue after a
service reload.
Test plan (passed):
1. Verify successful patch orchestration of an RR patch when both
system controller and subclouds are patched in the same strategy.
2. Induce a 300.005 alarm (mgmt-affecting) in a subcloud, verify
that orchestrated patching fails for that subcloud.
3. Induce a 900.001 alarm by partially applying a patch in a subcloud
beforehand, verify that orchestrated patching completes for that
subcloud.
4. Induce process restart in the middle of a subcloud patch
orchestration, verify that transitional strategy steps are set to
failed and the subclouds still in "initial" state can continue.
5. Induce process restart in the middle of a system controller patch
orchestration, verify that system controller patching can resume and
complete.
Closes-Bug: 1979097
Signed-off-by: Yuxing Jiang <email address hidden>
Change-Id: I1b70d14b77c3e1be6301f011baff297502b9108b
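The alarm handling described in item 2 and exercised by test cases 2 and 3 can be illustrated with a minimal sketch. This is not the distcloud implementation: the `Alarm` record, `IGNORED_ALARM_IDS`, `blocking_alarms` and `pre_check_passes` are hypothetical names chosen for illustration, and the sketch assumes each alarm exposes an ID and a management-affecting flag. Only the alarm IDs 900.001 and 300.005 come from the test plan.

```python
from dataclasses import dataclass

# Alarm ID from the test plan: raised while a patch is partially applied.
# A retry of patch orchestration may legitimately find it already present.
PATCH_IN_PROGRESS_ALARM_ID = "900.001"

# Hypothetical ignore list; only the ID above is taken from the commit.
IGNORED_ALARM_IDS = frozenset({PATCH_IN_PROGRESS_ALARM_ID})


@dataclass(frozen=True)
class Alarm:
    """Simplified, hypothetical alarm record (not the real alarm type)."""
    alarm_id: str
    mgmt_affecting: bool


def blocking_alarms(alarms):
    """Return management-affecting alarms that are not on the ignore list."""
    return [
        alarm for alarm in alarms
        if alarm.mgmt_affecting and alarm.alarm_id not in IGNORED_ALARM_IDS
    ]


def pre_check_passes(alarms):
    """Patching may proceed only when no blocking alarm is present."""
    return not blocking_alarms(alarms)


if __name__ == "__main__":
    retry_case = [Alarm("900.001", True)]
    failing_case = [Alarm("300.005", True)]
    print(pre_check_passes(retry_case))
    print(pre_check_passes(failing_case))
```

Under these assumptions, a subcloud that only carries the 900.001 alarm passes the check, which matches test case 3, while a subcloud with the management-affecting 300.005 alarm is rejected, which matches test case 2.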