Comment 4 for bug 2064347

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/917791
Committed: https://opendev.org/starlingx/config/commit/f29cc84ba3dcfd634e338660d0657d3ec557287e
Submitter: "Zuul (22348)"
Branch: master

commit f29cc84ba3dcfd634e338660d0657d3ec557287e
Author: Eric MacDonald <email address hidden>
Date: Tue Apr 30 21:16:01 2024 +0000

    Prevent swacting to a 'Locking' controller

    Locking a controller takes a finite amount of time, resulting in a
    brief window between issuing a lock command toward the inactive
    controller and the controller actually entering the locked state.

    Typically, this window lasts only a few seconds. However, during
    periods of high system activity, or while VM or other migrations are
    in progress, it can take a minute or longer for the controller to
    enter the locked state.

    In some cases, initiating a 'system host-swact' command while the
    inactive controller is in this 'Locking but not yet Locked' state has
    led to a switch of activity to a locked controller.

    The current pre-swact semantic check does not prevent this race
    condition, which can result in a locked active controller.

    This update adds a precheck against a list of in-progress actions;
    a swact request is now rejected if any of them is under way.

    Test Plan:

    PASS: Verify sysinv package build.
    PASS: Verify swact is rejected for any of the in-progress actions
          listed in the precheck.
    PASS: Verify swact reject handling and output text.
    PASS: Verify pep8 of changed lines.

    Regression:

    PASS: Verify swact handling when task is empty.
    PASS: Verify swact handling when task is not empty and not Locking.
    PASS: Verify swact soak (10x).

    Closes-Bug: 2064347
    Change-Id: I78238fa649c330d7b908dbcf50f654c004205ee6
    Signed-off-by: Eric MacDonald <email address hidden>
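
The precheck described in the commit message can be pictured with a minimal sketch. This is not the sysinv code from the change: the names `BLOCKING_TASKS`, `SwactRejected`, `find_blocking_task` and `precheck_swact` are hypothetical, and the commit message names only 'Locking' as an in-progress action, so the other list entries are placeholder assumptions. The sketch only illustrates the idea of rejecting a swact while the target controller's task field reports one of the listed actions.

```python
from typing import Optional

# Assumed task strings. The commit message names only 'Locking';
# the remaining entries are placeholders for illustration.
BLOCKING_TASKS = ("Locking", "Unlocking", "Rebooting")


class SwactRejected(Exception):
    """Hypothetical error raised when a swact request fails the precheck."""


def find_blocking_task(task: Optional[str]) -> Optional[str]:
    """Return the listed in-progress action that matches `task`, or None."""
    if not task:
        # An empty or missing task means no action is in progress.
        return None
    for blocked in BLOCKING_TASKS:
        if task.startswith(blocked):
            return blocked
    return None


def precheck_swact(hostname: str, task: Optional[str]) -> None:
    """Raise SwactRejected if the target host has a blocking task."""
    blocked = find_blocking_task(task)
    if blocked is not None:
        raise SwactRejected(
            f"Swact rejected: {hostname} has '{blocked}' in progress; "
            "retry once it completes."
        )
```

Under these assumptions, `precheck_swact("controller-1", "Locking")` raises `SwactRejected`, while an empty task or a task that is not in the list lets the request proceed, matching the cases in the test plan and regression list.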