finish_resize failures incorrectly revert allocations

Bug #1825537 reported by Matt Riedemann on 2019-04-19
Affects (Importance / Assigned to):
  OpenStack Compute (nova): Medium / Matt Riedemann
  Pike: Medium / Unassigned
  Queens: Medium / Unassigned
  Rocky: Medium / Matt Riedemann
  Stein: Medium / Matt Riedemann

Bug Description

While triaging bug 1821594, I got to thinking about how placement allocations are handled when something fails during a resize, which led me back to an older fix:

https://review.openstack.org/#/c/543971/6/nova/compute/manager.py@4457

Looking back on that now, I think the revert during resize_instance is OK as long as the instance host/node has not changed, but doing it when finish_resize fails was probably a mistake, because the instance.host in the nova DB won't match where the allocations exist in placement. Before Pike this was fine since the ResourceTracker would heal the allocations in the update_available_resource periodic task, but we don't have that anymore.

So this could result in an instance reported as being on the dest host in the nova database with the new flavor, which is where it will get rebuilt/rebooted/etc, but placement will be tracking the instance resource allocations using the old flavor against the source host, which is not where the instance is.

Furthermore, if finish_resize fails, the instance should be in ERROR status and the user would likely try to hard reboot the instance to correct that status, which would happen on the dest host.
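The inconsistent state described above can be illustrated with a small sketch (plain Python dicts standing in for the nova DB and placement; none of these names are real nova structures):

```python
# Hypothetical model of the bad state after finish_resize fails and the
# allocations are (incorrectly) reverted.

# nova DB view: instance.host/node were already flipped to the dest
# before the RPC cast to finish_resize on the dest compute.
instance = {
    "host": "dest",
    "node": "dest-node",
    "flavor": "new-flavor",
    "vm_state": "error",  # finish_resize failed
}

# placement view after the revert: usage is tracked with the old flavor
# against the source node resource provider.
placement_allocations = {
    "source-node-rp": {"VCPU": 2, "MEMORY_MB": 2048},  # old flavor
}

def views_consistent(instance, allocations):
    """Does placement track usage on the node where nova says the
    instance lives?"""
    return f"{instance['node']}-rp" in allocations

# The two services now disagree: nova will reboot/rebuild on the dest,
# but placement holds allocations against the source.
assert not views_consistent(instance, placement_allocations)
```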

Matt Riedemann (mriedem) wrote :

Gerrit is down but I've written a functional regression test to recreate the bug and attached it as a patch for now.

description: updated
Changed in nova:
status: New → Triaged
importance: Undecided → Medium
Matt Riedemann (mriedem) wrote :

Thinking of options to fix this when it happens:

1. Rather than revert the allocations, we just drop the allocations on the source node (which are held by the migration record since Queens). That's pretty straightforward I think and should be OK for backports to Rocky and maybe Queens. Pike gets a bit hairy though since the allocations were doubled up on the instance consumer in Pike, so we'd have to remove the allocations against the source node resource provider specifically in that case. The code might already handle that in Pike, I haven't looked yet. This does have the downside of leaving residue on the source compute when the resize fails, but that's kind of always been an issue for a failed resize.

2. Change the instance.host/node back to the source host/node if finish_resize fails. I think this is probably not really an option because finish_resize has never done this on failure and we don't really know what state the instance is in, not to mention its storage (volumes) or networking at that point, so it's likely best to just leave the DB pointing at the dest host/node, and the server can be recovered there via hard reboot (or worst case rebuild).

3. An alternative to #2 is somehow trigger a revert resize to get back to the source host. Note that you can't attempt to revert a resize on an instance in ERROR status, which is what the state of the server is in when finish_resize fails. Additionally, the revert_resize API method is looking for a migration record with status 'finished' but the migration status would actually be 'error' when finish_resize fails, so trying to revert the resize from the API wouldn't work with the code as it exists today. The compute manager finish_resize method could call the compute manager revert_resize method (avoiding the API) but it's questionable at best how graceful that would be depending on how finish_resize failed.
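Option 1 can be sketched like this (again with dicts standing in for placement state; the function name and UUIDs are illustrative, not nova's actual API). Since Queens, the source-node allocations are held by the migration record's consumer UUID, so the fix only has to delete that consumer's allocations:

```python
# Hypothetical model: consumer UUID -> {resource provider -> resources}
allocations = {
    "migration-uuid": {"source-node-rp": {"VCPU": 2, "MEMORY_MB": 2048}},
    "instance-uuid": {"dest-node-rp": {"VCPU": 4, "MEMORY_MB": 4096}},
}

def drop_source_allocations(allocations, migration_uuid):
    """Delete the old-flavor allocations held by the migration consumer,
    leaving the instance's new-flavor allocations on the dest intact."""
    allocations.pop(migration_uuid, None)

drop_source_allocations(allocations, "migration-uuid")
# Placement now only tracks the instance on the dest node, matching the
# nova DB, at the cost of possible residue on the source compute.
```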

Matt Riedemann (mriedem) wrote :

I've got a fix locally for option 1.

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
Matt Riedemann (mriedem) wrote :

Update on the functional recreate test.

Matt Riedemann (mriedem) wrote :

This is the fix.

Related fix proposed to branch: master
Review: https://review.opendev.org/654066

Changed in nova:
status: Triaged → In Progress

Reviewed: https://review.opendev.org/654066
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f4bb67210602914e1b9a678419cf22cfbeaf1431
Submitter: Zuul
Branch: master

commit f4bb67210602914e1b9a678419cf22cfbeaf1431
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 19 11:54:07 2019 -0400

    Add functional recreate test for regression bug 1825537

    Change I2d9ab06b485f76550dbbff46f79f40ff4c97d12f in Rocky
    (and backported through to Pike) added error handling to
    the resize_instance and finish_resize methods to revert
    allocations in placement when a failure occurs.

    This is OK for resize_instance, which runs on the source
    compute, as long as the instance.host/node values have not
    yet been changed to the dest host/node before RPC casting
    to the finish_resize method on the dest compute. It's OK
    because the instance is still on the source compute and the
    DB says so, so any attempt to recover the instance via hard
    reboot or rebuild will be on the source host.

    This is not OK for finish_resize because if we fail there
    and revert the allocations, the instance host/node values
    are already pointing at the dest compute and by reverting
    the allocations in placement, placement will be incorrectly
    tracking the instance usage with the old flavor against the
    source node resource provider rather than the new flavor
    against the dest node resource provider - where the instance
    is actually running and the nova DB says the instance lives.

    This change adds a simple functional regression test to
    recreate the bug with a multi-host resize. There is already
    a same-host resize functional test marked here which will
    need to be fixed as well.

    Change-Id: Ie9e294db7e24d0e3cbe83eee847f0fbfb7478900
    Related-Bug: #1825537

Reviewed: https://review.opendev.org/654067
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ea297d6ffba81c5dc982afe6519de09ff3744cad
Submitter: Zuul
Branch: master

commit ea297d6ffba81c5dc982afe6519de09ff3744cad
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 19 12:28:34 2019 -0400

    Drop source node allocations if finish_resize fails

    By the time finish_resize runs on the dest host, the instance
    host/node values are already pointing at the dest (they are
    set by resize_instance on the source compute before casting to
    finish_resize on the dest). If finish_resize fails, the instance
    is essentially stuck on the dest host so rather than revert the
    allocations (which will drop the new flavor allocations against
    the dest host where the instance now lives) we should just drop
    the old flavor allocations on the source node resource provider,
    which is what this change does.

    The functional regression recreate test is updated to show this
    working.

    Change-Id: I52c8d038118c858004e17e71b2fba9e9e2714815
    Closes-Bug: #1825537

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/666959
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=eaa1fc6159ca4437a1e0cbaa77a3da779afb8cb2
Submitter: Zuul
Branch: stable/stein

commit eaa1fc6159ca4437a1e0cbaa77a3da779afb8cb2
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 19 11:54:07 2019 -0400

    Add functional recreate test for regression bug 1825537

    Change I2d9ab06b485f76550dbbff46f79f40ff4c97d12f in Rocky
    (and backported through to Pike) added error handling to
    the resize_instance and finish_resize methods to revert
    allocations in placement when a failure occurs.

    This is OK for resize_instance, which runs on the source
    compute, as long as the instance.host/node values have not
    yet been changed to the dest host/node before RPC casting
    to the finish_resize method on the dest compute. It's OK
    because the instance is still on the source compute and the
    DB says so, so any attempt to recover the instance via hard
    reboot or rebuild will be on the source host.

    This is not OK for finish_resize because if we fail there
    and revert the allocations, the instance host/node values
    are already pointing at the dest compute and by reverting
    the allocations in placement, placement will be incorrectly
    tracking the instance usage with the old flavor against the
    source node resource provider rather than the new flavor
    against the dest node resource provider - where the instance
    is actually running and the nova DB says the instance lives.

    This change adds a simple functional regression test to
    recreate the bug with a multi-host resize. There is already
    a same-host resize functional test marked here which will
    need to be fixed as well.

    NOTE(mriedem): The test needed to be modified from Train
    since we have to rely on waiting for the task_state to
    change to None rather than the migration status changing
    to "error" since change Id6c0a0ee41520dd974052d7cdd17ca35d688f6b0
    is not in Stein.

    Change-Id: Ie9e294db7e24d0e3cbe83eee847f0fbfb7478900
    Related-Bug: #1825537
    (cherry picked from commit f4bb67210602914e1b9a678419cf22cfbeaf1431)

tags: added: in-stable-stein

Reviewed: https://review.opendev.org/667155
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e6c6178d22db43959103da1400017fa49b18f514
Submitter: Zuul
Branch: stable/stein

commit e6c6178d22db43959103da1400017fa49b18f514
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 19 12:28:34 2019 -0400

    Drop source node allocations if finish_resize fails

    By the time finish_resize runs on the dest host, the instance
    host/node values are already pointing at the dest (they are
    set by resize_instance on the source compute before casting to
    finish_resize on the dest). If finish_resize fails, the instance
    is essentially stuck on the dest host so rather than revert the
    allocations (which will drop the new flavor allocations against
    the dest host where the instance now lives) we should just drop
    the old flavor allocations on the source node resource provider,
    which is what this change does.

    The functional regression recreate test is updated to show this
    working.

    Change-Id: I52c8d038118c858004e17e71b2fba9e9e2714815
    Closes-Bug: #1825537
    (cherry picked from commit ea297d6ffba81c5dc982afe6519de09ff3744cad)

Reviewed: https://review.opendev.org/669361
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9a977cb28c1c4f2e8d950476fa373326f636dfd6
Submitter: Zuul
Branch: stable/rocky

commit 9a977cb28c1c4f2e8d950476fa373326f636dfd6
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 19 11:54:07 2019 -0400

    Add functional recreate test for regression bug 1825537

    Change I2d9ab06b485f76550dbbff46f79f40ff4c97d12f in Rocky
    (and backported through to Pike) added error handling to
    the resize_instance and finish_resize methods to revert
    allocations in placement when a failure occurs.

    This is OK for resize_instance, which runs on the source
    compute, as long as the instance.host/node values have not
    yet been changed to the dest host/node before RPC casting
    to the finish_resize method on the dest compute. It's OK
    because the instance is still on the source compute and the
    DB says so, so any attempt to recover the instance via hard
    reboot or rebuild will be on the source host.

    This is not OK for finish_resize because if we fail there
    and revert the allocations, the instance host/node values
    are already pointing at the dest compute and by reverting
    the allocations in placement, placement will be incorrectly
    tracking the instance usage with the old flavor against the
    source node resource provider rather than the new flavor
    against the dest node resource provider - where the instance
    is actually running and the nova DB says the instance lives.

    This change adds a simple functional regression test to
    recreate the bug with a multi-host resize. There is already
    a same-host resize functional test marked here which will
    need to be fixed as well.

    Conflicts:
          nova/tests/functional/test_servers.py
          nova/virt/fake.py

    NOTE(mriedem): The test_servers conflict is due to not having
    change If6aa37d9b6b48791e070799ab026c816fda4441c in Rocky. As
    a result, the new regression test also had to be modified for
    the call to assertFlavorMatchesAllocation. The fake module
    conflict is due to not having change
    Iefff121640e04abdbb6a4ae546c447f168dc8af9 in Rocky.

    Change-Id: Ie9e294db7e24d0e3cbe83eee847f0fbfb7478900
    Related-Bug: #1825537
    (cherry picked from commit f4bb67210602914e1b9a678419cf22cfbeaf1431)
    (cherry picked from commit eaa1fc6159ca4437a1e0cbaa77a3da779afb8cb2)

tags: added: in-stable-rocky

Reviewed: https://review.opendev.org/669393
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=cbf6a46d8fcd3236bff44439c8d143a77167dfe2
Submitter: Zuul
Branch: stable/rocky

commit cbf6a46d8fcd3236bff44439c8d143a77167dfe2
Author: Matt Riedemann <email address hidden>
Date: Fri Apr 19 12:28:34 2019 -0400

    Drop source node allocations if finish_resize fails

    By the time finish_resize runs on the dest host, the instance
    host/node values are already pointing at the dest (they are
    set by resize_instance on the source compute before casting to
    finish_resize on the dest). If finish_resize fails, the instance
    is essentially stuck on the dest host so rather than revert the
    allocations (which will drop the new flavor allocations against
    the dest host where the instance now lives) we should just drop
    the old flavor allocations on the source node resource provider,
    which is what this change does.

    The functional regression recreate test is updated to show this
    working.

    Conflicts:
          nova/tests/functional/regressions/test_bug_1825537.py
          nova/tests/functional/test_servers.py

    NOTE(mriedem): The functional test conflicts are due to not having
    change If6aa37d9b6b48791e070799ab026c816fda4441c in Rocky.
    The code fix is also different because we don't have change
    I0851e2d54a1fdc82fe3291fb7e286e790f121e92 in Rocky and cannot use
    the _delete_allocation_after_move method as documented inline.
    This also means we have to deal with migration-based and legacy
    tracked allocations (where the source and dest node allocations
    are both on the instance consumer). As such, similar logic to
    _delete_allocation_after_move is used to try and delete the
    migration-based allocations first and failing that, attempt
    to cleanup the source node allocations the legacy way by removing
    them from the "doubled up" allocations created by the scheduler.
    Since this is new logic in the backport, a functional test is
    added to cover the legacy cleanup flow.

    Change-Id: I52c8d038118c858004e17e71b2fba9e9e2714815
    Closes-Bug: #1825537
    (cherry picked from commit ea297d6ffba81c5dc982afe6519de09ff3744cad)
    (cherry picked from commit e6c6178d22db43959103da1400017fa49b18f514)
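The legacy "doubled up" cleanup the Rocky note describes can be sketched as follows (illustrative names only, not the backport's actual code): when the instance consumer holds both source and dest node allocations, only the source provider's entries are removed from that consumer.

```python
# Hypothetical model of Pike-style "doubled up" allocations, where the
# instance consumer holds both the source and dest node allocations.
doubled_up = {
    "instance-uuid": {
        "source-node-rp": {"VCPU": 2, "MEMORY_MB": 2048},  # old flavor
        "dest-node-rp": {"VCPU": 4, "MEMORY_MB": 4096},    # new flavor
    },
}

def drop_legacy_source_allocations(allocations, instance_uuid, source_rp):
    """Fallback cleanup: strip only the source provider's allocations
    from the instance consumer, keeping the dest-node allocations."""
    allocations.get(instance_uuid, {}).pop(source_rp, None)

drop_legacy_source_allocations(doubled_up, "instance-uuid", "source-node-rp")
```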

This issue was fixed in the openstack/nova 19.0.2 release.

This issue was fixed in the openstack/nova 18.2.2 release.
