Comment 12 for bug 1825537

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/666959
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=eaa1fc6159ca4437a1e0cbaa77a3da779afb8cb2
Submitter: Zuul
Branch: stable/stein

commit eaa1fc6159ca4437a1e0cbaa77a3da779afb8cb2
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 19 11:54:07 2019 -0400

    Add functional recreate test for regression bug 1825537

    Change I2d9ab06b485f76550dbbff46f79f40ff4c97d12f in Rocky
    (and backported through to Pike) added error handling to
    the resize_instance and finish_resize methods to revert
    allocations in placement when a failure occurs.

    This is OK for resize_instance, which runs on the source
    compute, as long as the instance.host/node values have not
    yet been changed to the dest host/node before RPC casting
    to the finish_resize method on the dest compute. It's OK
    because the instance is still on the source compute and the
    DB says so, so any attempt to recover the instance via hard
    reboot or rebuild will be on the source host.

    This is not OK for finish_resize because if we fail there
    and revert the allocations, the instance host/node values
    are already pointing at the dest compute and by reverting
    the allocations in placement, placement will be incorrectly
    tracking the instance usage with the old flavor against the
    source node resource provider rather than the new flavor
    against the dest node resource provider - where the instance
    is actually running and the nova DB says the instance lives.

    This change adds a simple functional regression test to
    recreate the bug with a multi-host resize. There is already
    a same-host resize functional test marked here which will
    need to be fixed as well.

    NOTE(mriedem): The test needed to be modified from Train
    since we have to rely on waiting for the task_state to
    change to None rather than the migration status changing
    to "error" since change Id6c0a0ee41520dd974052d7cdd17ca35d688f6b0
    is not in Stein.

    Change-Id: Ie9e294db7e24d0e3cbe83eee847f0fbfb7478900
    Related-Bug: #1825537
    (cherry picked from commit f4bb67210602914e1b9a678419cf22cfbeaf1431)
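
To illustrate the mismatch the commit message describes, here is a toy model in Python. It is only a sketch and is not nova or placement code: the function name, the dict layout, and the "source"/"dest" labels are all hypothetical. It assumes, as the message states, that instance.host already points at the dest compute by the time finish_resize runs.

```python
def simulate_failed_finish_resize(revert_on_failure):
    """Toy model of a multi-host resize that fails in finish_resize.

    Returns (db_host, placement), where db_host stands in for the
    instance.host value in the nova DB and placement is a hypothetical
    record of which provider holds the instance's allocation.
    """
    # Before the resize: instance lives on the source with the old flavor.
    db_host = "source"
    placement = {"provider": "source", "flavor": "old"}

    # Scheduling claims resources for the new flavor on the dest node.
    placement = {"provider": "dest", "flavor": "new"}

    # resize_instance finishes on the source; instance.host is switched
    # to the dest before casting to finish_resize.
    db_host = "dest"

    # finish_resize fails on the dest compute. The error handling under
    # discussion reverts the allocation back to the source/old flavor.
    if revert_on_failure:
        placement = {"provider": "source", "flavor": "old"}

    return db_host, placement


def is_consistent(db_host, placement):
    # Placement should track usage against the node the DB says the
    # instance is on.
    return placement["provider"] == db_host


if __name__ == "__main__":
    for revert in (True, False):
        host, alloc = simulate_failed_finish_resize(revert)
        print(revert, host, alloc, is_consistent(host, alloc))
```

With revert_on_failure=True the DB says the instance is on the dest while the allocation sits on the source with the old flavor, which is the inconsistency the functional test recreates. Without the revert, the two stay in agreement.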