Fix stale data when checking if configuration out-of-date
When checking if the current configuration matches
the latest configuration, the values used are passed
in. Because they are passed in, they can be stale.
To mitigate this, the latest ihost values are now
always fetched before doing any configuration checking.
However, the underlying race condition still exists.
To fix the race condition a much larger change is required.
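
A minimal Python sketch of the mitigation described above. This is not the sysinv code: `Host`, `FakeDB`, `config_out_of_date_stale`, `config_out_of_date`, and the `config_applied` / `config_target` attribute names are hypothetical stand-ins, assumed here only to show why re-reading the host record right before the comparison avoids acting on a stale snapshot.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Host:
    """Hypothetical snapshot of a host record (not the real ihost model)."""
    uuid: str
    config_applied: str
    config_target: str


class FakeDB:
    """Hypothetical in-memory stand-in for the database API."""

    def __init__(self):
        self._hosts = {}

    def save(self, host):
        self._hosts[host.uuid] = host

    def get(self, uuid):
        return self._hosts[uuid]


def config_out_of_date_stale(host):
    # Compares whatever snapshot the caller passed in, however old it is.
    return host.config_applied != host.config_target


def config_out_of_date(db, host):
    # Mitigation: re-read the record immediately before comparing, so the
    # check uses the latest stored values rather than the caller's snapshot.
    latest = db.get(host.uuid)
    return latest.config_applied != latest.config_target


db = FakeDB()
db.save(Host("controller-0", config_applied="cfg-1", config_target="cfg-2"))

# A long-lived caller takes its snapshot early...
snapshot = db.get("controller-0")

# ...and the stored record changes before the snapshot is used.
db.save(replace(snapshot, config_applied="cfg-2"))

stale_result = config_out_of_date_stale(snapshot)  # based on old values
fresh_result = config_out_of_date(db, snapshot)    # based on latest values
```

In this sketch the re-read narrows the window between reading the values and acting on them, but it does not close it, which is consistent with the note that the underlying race condition still exists.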
To verify if this defect has happened:
sysadmin@controller-0:~$ grep 'system config alarm' /var/log/sysinv.log
sysinv 2023-02-15 14:35:37.982 41066 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 645903ac-2a08-41b6-a0cd-5edaaa7860f0 vs. target: 38f223d6-49eb-45d6-9fc5-b3ae40a8cea0.
sysinv 2023-02-15 14:35:40.880 41066 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 645903ac-2a08-41b6-a0cd-5edaaa7860f0 vs. target: 2ad1c950-7cd6-4d4d-a214-cafdd30456b7.
sysinv 2023-02-15 14:42:17.444 70736 INFO sysinv.conductor.manager [-] SYS_I Clear system config alarm: controller-0 target config 2ad1c950-7cd6-4d4d-a214-cafdd30456b7
sysinv 2023-02-15 14:44:17.989 87173 INFO sysinv.conductor.manager [-] SYS_I Clear system config alarm: controller-0 target config 2ad1c950-7cd6-4d4d-a214-cafdd30456b7 <--- This shouldn't be here
If there are any consecutive clears then this defect has occurred.
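
The check for consecutive clears can also be scripted. The following is a rough Python sketch, assuming the log lines keep the exact "Raise system config alarm" / "Clear system config alarm" wording shown in the excerpt above; the function name `find_consecutive_clears` and the regular expression are illustrative, not part of sysinv.

```python
import re

# Matches the two message shapes shown in the log excerpt above.
ALARM_RE = re.compile(
    r"SYS_I (?P<action>Raise|Clear) system config alarm: "
    r"(?:host )?(?P<host>\S+) "
)


def find_consecutive_clears(lines):
    """Return Clear lines that directly follow another Clear for the same host."""
    last_action = {}  # host -> last action seen ("Raise" or "Clear")
    suspicious = []
    for line in lines:
        match = ALARM_RE.search(line)
        if match is None:
            continue
        host, action = match.group("host"), match.group("action")
        if action == "Clear" and last_action.get(host) == "Clear":
            suspicious.append(line)
        last_action[host] = action
    return suspicious


# The four log lines from the excerpt above, without the trailing annotation.
sample = [
    "sysinv 2023-02-15 14:35:37.982 41066 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 645903ac-2a08-41b6-a0cd-5edaaa7860f0 vs. target: 38f223d6-49eb-45d6-9fc5-b3ae40a8cea0.",
    "sysinv 2023-02-15 14:35:40.880 41066 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 645903ac-2a08-41b6-a0cd-5edaaa7860f0 vs. target: 2ad1c950-7cd6-4d4d-a214-cafdd30456b7.",
    "sysinv 2023-02-15 14:42:17.444 70736 INFO sysinv.conductor.manager [-] SYS_I Clear system config alarm: controller-0 target config 2ad1c950-7cd6-4d4d-a214-cafdd30456b7",
    "sysinv 2023-02-15 14:44:17.989 87173 INFO sysinv.conductor.manager [-] SYS_I Clear system config alarm: controller-0 target config 2ad1c950-7cd6-4d4d-a214-cafdd30456b7",
]

hits = find_consecutive_clears(sample)
```

Because the function accepts any iterable of lines, an open file handle for /var/log/sysinv.log can be passed in place of `sample`.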
TEST PLAN
PASS: Verify fixed after bootstrap AIO-SX
* Apply fix before bootstrap, verify after unlock
PASS: Verify fixed after restore AIO-SX
* Apply fix before restore, verify after unlock
PASS: Verify fixed after optimized restore AIO-SX
* Apply fix before restore, verify after unlock
Reviewed: https://review.opendev.org/c/starlingx/config/+/874115
Committed: https://opendev.org/starlingx/config/commit/378c68fdb39c50f6d00d9c0cd4626280046af2c8
Submitter: "Zuul (22348)"
Branch: master
commit 378c68fdb39c50f6d00d9c0cd4626280046af2c8
Author: Joshua Kraitberg <email address hidden>
Date: Wed Feb 15 18:51:30 2023 -0500
Fix stale data when checking if configuration out-of-date
When checking if the current configuration matches
the latest configuration, the values used are passed
in. Because they are passed in, they can be stale.
An example of where this is also guaranteed to occur is:
https://github.com/starlingx/config/blob/2400ae204e36a68c9627e3446ee7f9e4a194dd54/sysinv/sysinv/sysinv/sysinv/conductor/manager.py#L5931-L5935
This is a long-lived function that has a long interval between
when it queries its ihost reference and when it uses it
to update the alarms.
To mitigate this, the latest ihost values are now
always fetched before doing any configuration checking.
However, the underlying race condition still exists.
To fix the race condition a much larger change is required.
To verify if this defect has happened:
sysadmin@controller-0:~$ grep 'system config alarm' /var/log/sysinv.log
sysinv 2023-02-15 14:35:37.982 41066 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 645903ac-2a08-41b6-a0cd-5edaaa7860f0 vs. target: 38f223d6-49eb-45d6-9fc5-b3ae40a8cea0.
sysinv 2023-02-15 14:35:40.880 41066 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 645903ac-2a08-41b6-a0cd-5edaaa7860f0 vs. target: 2ad1c950-7cd6-4d4d-a214-cafdd30456b7.
sysinv 2023-02-15 14:42:17.444 70736 INFO sysinv.conductor.manager [-] SYS_I Clear system config alarm: controller-0 target config 2ad1c950-7cd6-4d4d-a214-cafdd30456b7
sysinv 2023-02-15 14:44:17.989 87173 INFO sysinv.conductor.manager [-] SYS_I Clear system config alarm: controller-0 target config 2ad1c950-7cd6-4d4d-a214-cafdd30456b7 <--- This shouldn't be here
If there are any consecutive clears then this defect has occurred.
TEST PLAN
PASS: Verify fixed after bootstrap AIO-SX
* Apply fix before bootstrap, verify after unlock
PASS: Verify fixed after restore AIO-SX
* Apply fix before restore, verify after unlock
PASS: Verify fixed after optimized restore AIO-SX
* Apply fix before restore, verify after unlock
Closes-Bug: 2007455
Signed-off-by: Joshua Kraitberg <email address hidden>
Change-Id: I8910f4dd4aaf88e904550207c13107ec72a0c09a