Comment 2 for bug 2007455

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/874115
Committed: https://opendev.org/starlingx/config/commit/378c68fdb39c50f6d00d9c0cd4626280046af2c8
Submitter: "Zuul (22348)"
Branch: master

commit 378c68fdb39c50f6d00d9c0cd4626280046af2c8
Author: Joshua Kraitberg <email address hidden>
Date: Wed Feb 15 18:51:30 2023 -0500

    Fix stale data when checking if configuration out-of-date

    When checking if the current configuration matches
    the latest configuration, the values used are passed
    in. Because they are passed in, they can be stale.

    An example of where this is also guaranteed to occur is:
    https://github.com/starlingx/config/blob/2400ae204e36a68c9627e3446ee7f9e4a194dd54/sysinv/sysinv/sysinv/sysinv/conductor/manager.py#L5931-L5935
    This is a long-lived function that has a long interval between
    when it queries its ihost reference and when it uses it
    to update the alarms.

    To mitigate this, the latest ihost values are now
    always fetched before doing any configuration checking.
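
The mitigation above can be sketched in isolation as follows. This is an illustrative sketch only, not the code from the change: `Host`, `FakeDb`, and both `config_out_of_date_*` functions are hypothetical names, and the in-memory store merely stands in for the real database layer. It shows why comparing values on a passed-in reference can give a stale answer, and how re-reading the record just before the comparison narrows (but does not close) that window.

```python
from dataclasses import dataclass, replace


@dataclass
class Host:
    """Hypothetical stand-in for an ihost record."""
    uuid: str
    config_applied: str
    config_target: str


class FakeDb:
    """Hypothetical in-memory store standing in for the database API."""

    def __init__(self):
        self._hosts = {}

    def save(self, host):
        # Store a copy so callers holding older objects do not see updates.
        self._hosts[host.uuid] = replace(host)

    def get(self, uuid):
        # Return a copy, like a fresh query result.
        return replace(self._hosts[uuid])


def config_out_of_date_stale(host):
    """Old pattern: trusts whatever values the caller passed in."""
    return host.config_applied != host.config_target


def config_out_of_date_fresh(db, host):
    """Mitigated pattern: re-read the host right before comparing."""
    latest = db.get(host.uuid)
    return latest.config_applied != latest.config_target
```

A caller that queried its host long ago would get the wrong answer from the first function once the record changes, while the second reflects the stored state at the time of the check. A change landing between the re-read and the use of the result can still slip through, which matches the note that the underlying race remains.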

    However, the underlying race condition still exists.
    To fix the race condition a much larger change is required.

    To verify if this defect has happened:
    sysadmin@controller-0:~$ grep 'system config alarm' /var/log/sysinv.log
    sysinv 2023-02-15 14:35:37.982 41066 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 645903ac-2a08-41b6-a0cd-5edaaa7860f0 vs. target: 38f223d6-49eb-45d6-9fc5-b3ae40a8cea0.
    sysinv 2023-02-15 14:35:40.880 41066 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 645903ac-2a08-41b6-a0cd-5edaaa7860f0 vs. target: 2ad1c950-7cd6-4d4d-a214-cafdd30456b7.
    sysinv 2023-02-15 14:42:17.444 70736 INFO sysinv.conductor.manager [-] SYS_I Clear system config alarm: controller-0 target config 2ad1c950-7cd6-4d4d-a214-cafdd30456b7
    sysinv 2023-02-15 14:44:17.989 87173 INFO sysinv.conductor.manager [-] SYS_I Clear system config alarm: controller-0 target config 2ad1c950-7cd6-4d4d-a214-cafdd30456b7 <--- This shouldn't be here

    If there are any consecutive clears then this defect has occurred.
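
The manual check above could be automated with a small script along these lines. This is a rough sketch under stated assumptions: the function name `find_consecutive_clears` and the regular expressions are hypothetical, the patterns are derived only from the four sample log lines shown here, and "consecutive" is interpreted as two clears for the same host with no raise in between.

```python
import re

# Patterns inferred from the sample log lines; other sysinv versions may differ.
CLEAR_RE = re.compile(r"Clear system config alarm: (\S+) target config (\S+)")
RAISE_RE = re.compile(r"Raise system config alarm: host (\S+) config applied")


def find_consecutive_clears(lines):
    """Return (host, target) for every clear that directly follows
    another clear for the same host."""
    last_event = {}  # host -> "raise" or "clear"
    hits = []
    for line in lines:
        match = CLEAR_RE.search(line)
        if match:
            host, target = match.group(1), match.group(2)
            if last_event.get(host) == "clear":
                hits.append((host, target))
            last_event[host] = "clear"
            continue
        match = RAISE_RE.search(line)
        if match:
            last_event[match.group(1)] = "raise"
    return hits
```

To use it, pass the lines of /var/log/sysinv.log (for example, an open file object); a non-empty result suggests the defect has occurred.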

    TEST PLAN
    PASS: Verify fixed after bootstrap AIO-SX
      * Apply fix before bootstrap, verify after unlock
    PASS: Verify fixed after restore AIO-SX
      * Apply fix before restore, verify after unlock
    PASS: Verify fixed after optimized restore AIO-SX
      * Apply fix before restore, verify after unlock

    Closes-Bug: 2007455
    Signed-off-by: Joshua Kraitberg <email address hidden>
    Change-Id: I8910f4dd4aaf88e904550207c13107ec72a0c09a