Comment 13 for bug 1775934

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/756404
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c895d3e6bca562225d70e8f81255f38970f7fcda
Submitter: Zuul
Branch: stable/stein

commit c895d3e6bca562225d70e8f81255f38970f7fcda
Author: Matt Riedemann <email address hidden>
Date: Fri Sep 20 17:07:35 2019 -0400

    Sanity check instance mapping during scheduling

    mnaser reported a weird case where an instance was found
    in both cell0 (deleted there) and cell1 (not deleted there,
    but in error state from a failed build). It's unclear how
    this could happen, other than some weird clustered rabbitmq
    issue where the schedule and build request to conductor
    is perhaps processed twice for the same instance: one request
    picks a host and tries to build, and the other fails during
    scheduling and is buried in cell0.

    To avoid a split brain situation like this, we add a sanity
    check in _bury_in_cell0 to make sure the instance mapping is
    not pointing at a cell when we go to update it to cell0.
    Similarly a check is added in the schedule_and_build_instances
    flow (the code is moved to a private method to make it easier
    to test).

    Worst case, this is unnecessary but doesn't hurt anything;
    best case, this helps avoid split brain clustered rabbit
    issues.

    Closes-Bug: #1775934

    Change-Id: I335113f0ec59516cb337d34b6fc9078ea202130f
    (cherry picked from commit 5b552518e1abdc63fb33c633661e30e4b2fe775e)
    (cherry picked from commit efc35b1c5293c7c6c85f8cf9fd9d8cd8de71d1d5)
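
For readers unfamiliar with the check described in the commit message, here is a minimal sketch of the idea, not the nova code itself. The names `InstanceMapping`, `bury_in_cell0` and `claim_for_cell` are hypothetical stand-ins, and the choice to skip the update and return False is an assumption of this sketch rather than a statement about what the merged change does.

```python
from dataclasses import dataclass
from typing import Optional

CELL0 = "cell0"


@dataclass
class InstanceMapping:
    """Hypothetical stand-in for the record mapping an instance to a cell."""
    instance_uuid: str
    cell: Optional[str] = None  # None means "not mapped to any cell yet"


def bury_in_cell0(mapping: InstanceMapping) -> bool:
    """Point the mapping at cell0 unless it already points at a cell.

    Returns True if the instance was buried, False if the update was skipped.
    """
    if mapping.cell is not None:
        # Another request already mapped this instance; overwriting the
        # mapping here would leave the instance recorded in two cells.
        return False
    mapping.cell = CELL0
    return True


def claim_for_cell(mapping: InstanceMapping, cell: str) -> bool:
    """Point the mapping at the selected cell, with the same guard.

    This mirrors the check on the scheduling path: if a duplicate request
    already buried the instance in cell0, do not build it somewhere else.
    """
    if mapping.cell is not None:
        return False
    mapping.cell = cell
    return True


if __name__ == "__main__":
    # Simulate the same build request being processed twice.
    mapping = InstanceMapping("hypothetical-uuid")
    print(claim_for_cell(mapping, "cell1"))  # first request picks a host
    print(bury_in_cell0(mapping))            # duplicate fails scheduling
    print(mapping.cell)
```

Whichever request updates the mapping first wins, and the duplicate leaves it alone, so the instance ends up recorded in exactly one cell.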