Periodic task causes errors in _finish_resize

Bug #1321298 reported by Lance Bragstad
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Lance Bragstad

Bug Description

If an end user sets resize_confirm_window to a small value (say 1, as in this example), the periodic task can run while nova/compute/manager.py:ComputeManager._finish_resize() is between its two updates: after the migration has been updated but before the instance has been updated.

http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py#n3570

One possible solution to this would be to reverse the order, and update the instance before updating the migration, in which case the migration will get updated in _confirm_resize: http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py#n5018

Tags: compute resize
Matt Riedemann (mriedem) wrote :

If you swap the order of the migration/instance updates in _finish_resize, then if _poll_unconfirmed_resizes runs when the instance is updated but before the migration state is updated, this db query will not return the associated migration so it's a no-op in that case, which fixes the race bug:

http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py#n4989
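
To make the ordering argument concrete, here is a toy model of the two update orders. It is only a sketch and not nova code: the class names, the status strings, and the "error" outcome are placeholder assumptions chosen to mirror the description above, where the periodic task only acts on migrations it finds in the finished state.

```python
# Toy model of the ordering race. None of these names are nova's real API;
# the status strings are illustrative placeholders.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Migration:
    status: str = "post-migrating"


@dataclass
class Instance:
    vm_state: str = "active"
    task_state: Optional[str] = "resize_finish"


def poll_unconfirmed_resizes(migration, instance):
    """Stand-in for the periodic task: act only on finished migrations."""
    if migration.status != "finished":
        return "no-op"  # the query would not return this migration
    if instance.vm_state != "resized" or instance.task_state is not None:
        return "error"  # migration says finished, instance disagrees
    return "confirm"


def finish_resize(instance_first):
    """Apply both updates in the chosen order, polling between them."""
    migration, instance = Migration(), Instance()

    def save_instance():
        instance.vm_state, instance.task_state = "resized", None

    def save_migration():
        migration.status = "finished"

    if instance_first:
        steps = [save_instance, save_migration]
    else:
        steps = [save_migration, save_instance]

    steps[0]()
    midway = poll_unconfirmed_resizes(migration, instance)
    steps[1]()
    return midway, poll_unconfirmed_resizes(migration, instance)


print(finish_resize(instance_first=False))  # old order
print(finish_resize(instance_first=True))   # swapped order
```

In this model the old order lets the poll see a finished migration alongside a half-updated instance, while the swapped order makes the mid-way poll a no-op.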

Changed in nova:
status: New → Triaged
tags: added: compute resize
melanie witt (melwitt)
Changed in nova:
assignee: nobody → Melanie Witt (melwitt)
Matt Riedemann (mriedem) wrote :
Changed in nova:
status: Triaged → In Progress
assignee: Melanie Witt (melwitt) → Lance Bragstad (ldbragst)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/94474
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6dd5cc503cc05c00c5f9d831480539c67f6e2a48
Submitter: Jenkins
Branch: master

commit 6dd5cc503cc05c00c5f9d831480539c67f6e2a48
Author: Lance Bragstad <email address hidden>
Date: Tue May 20 17:58:51 2014 +0000

    Fix migration and instance resize update order

    This commit switches the order in which migrations and instances are
    updated when resizing an instance. Previously, if you set
    resize_confirm_window to some small value, your resize instance could go
    into error state because the periodic task was run after the migration
    was updated but before the instance object was updated. This change makes it
    so the instance is always updated before the migration.

    The _test_finish_resize test case has also been refactored to enforce
    saving the instance object before the migration object. In the event
    that the instance object is saved after the migration, like the previous
    implementation, the test cases will fail since the states of the
    migrations will be out of sync in the _mig_save assertions.

    Closes-Bug: #1321298
    Change-Id: I0490fc4b4e03b36eb01a06a95ea761f4cf8df469

Changed in nova:
status: In Progress → Fix Committed
Matt Riedemann (mriedem) wrote :

This caused a race failure in the gate as soon as it merged, see bug 1326778. There is a revert up now for the patch.

Looks like confirm-resize needs to be synchronized with finish-resize (revert-resize would need to be synchronized also for that matter).
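
For illustration, synchronizing the two operations on a per-instance basis could look roughly like the sketch below. This is an assumption about the general shape of such a fix, not what nova does: the lock registry, the function names, and the dict-based state are all hypothetical, and it uses plain threading locks from the standard library.

```python
import threading
from collections import defaultdict

# Hypothetical per-instance lock registry (not nova's actual mechanism).
_instance_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()


def _lock_for(instance_uuid):
    """Return the lock for one instance, creating it on first use."""
    with _registry_guard:
        return _instance_locks[instance_uuid]


def finish_resize(state, instance_uuid):
    """Update instance and migration as one critical section."""
    with _lock_for(instance_uuid):
        state["instance"] = "resized"
        state["migration"] = "finished"


def confirm_resize(state, instance_uuid):
    """Confirm only if finish_resize has fully completed."""
    with _lock_for(instance_uuid):
        finished = state.get("migration") == "finished"
        resized = state.get("instance") == "resized"
        if not (finished and resized):
            return False  # nothing to confirm yet
        state["migration"] = "confirmed"
        state["instance"] = "active"
        return True
```

Because both functions take the same lock for a given instance, confirm_resize can never observe the state between the two writes in finish_resize, whichever order they are made in.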

Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-1 → 2014.2