Restarting destination compute manager during resize migration can cause instance data loss

Bug #1330503 reported by David McNally
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: jichenjc

Bug Description

During compute manager startup, init_host is called. One of the functions it calls, _destroy_evacuated_instances, deletes instance data that doesn't belong to this host. But this function only checks whether the local instance belongs to the host. It doesn't check the task_state or vm_state.

If a resize migration is taking place at this time and the destination compute manager is restarted, it might destroy the resizing instance. Alternatively, if the resize has completed (vm_state = RESIZED) but has not been confirmed or reverted, a restart of the source compute manager might destroy the original instance.
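
To make the failure mode concrete, here is a minimal sketch of an ownership-only check. It is not nova's actual code: the Instance class, the instances_to_destroy helper, the host names and the lower-case state strings are all hypothetical, and it assumes the instance record still names the source host while the resize is in flight.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    # Hypothetical stand-in for an instance record, not nova's real object.
    uuid: str
    host: str                        # host the record says owns the instance
    vm_state: str = "active"
    task_state: Optional[str] = None


def instances_to_destroy(local: List[Instance], this_host: str) -> List[Instance]:
    """Ownership-only check: select every local instance owned by another host.

    vm_state and task_state are never consulted, which is the gap this bug
    describes.
    """
    return [inst for inst in local if inst.host != this_host]


# Assumption: while resizing from "src" to "dest", the record still names "src"
# even though instance data already exists on "dest".
resizing = Instance("uuid-1", host="src", task_state="resize_migrating")

# Restarting the compute manager on "dest" selects the resizing instance.
doomed = instances_to_destroy([resizing], this_host="dest")
print([inst.uuid for inst in doomed])
```

Under these assumptions the resizing instance is selected for deletion purely because its host field does not match the restarted host.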

A similar bug concerning just the migrating state is outlined here: https://bugs.launchpad.net/nova/+bug/1319797 and a fix is proposed here: https://review.openstack.org/#/c/93903

That fix was intended to cover instances in a resize migration as well as those in the plain migrating state. However, as pointed out in a review comment, the solution works for migrating, while a fix for resize would require further changes, so I have raised this bug to highlight that.

Mark McLoughlin (markmc)
description: updated
Changed in nova:
importance: Undecided → High
status: New → Triaged
tags: added: compute icehouse-backport-potential
jichenjc (jichenjc)
Changed in nova:
assignee: nobody → jichenjc (jichenjc)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/101803

Changed in nova:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/101803
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=dd6fb1246ff2789bd78b772b45e1fcac21eda67a
Submitter: Jenkins
Branch: master

commit dd6fb1246ff2789bd78b772b45e1fcac21eda67a
Author: jichenjc <email address hidden>
Date: Wed Jun 18 04:14:09 2014 +0800

    Keep resizing&resized instances when compute init

    During compute manager startup init_host is called. One of the
    functions there is to delete instance data that doesn't belong
    to this host i.e. _destroy_evacuated_instances.
    But this function only checks if the local instance belongs to
    the host or not. It doesn't check the task_state or vm_state.

    In Resize function, user may want to revert or confirm the resize
    operations so the instance on source and dest compute node should
    be kept. so for RESIZE_MIGRATING, RESIZE_MIGRATED task states and
    RESIZED vm state instances, they should be kept in compute node
    when the compute restart. This patch adds check for the task
    state and vm state before delete the instances.

    Closes-Bug: #1330503

    Change-Id: I723fa4a8823019391ea83aa189096531032adab1
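
The check described in the commit message above might look roughly like the sketch below. This is not the merged patch: should_destroy and the two constants are hypothetical names, and the lower-case strings are an assumed spelling of the RESIZE_MIGRATING, RESIZE_MIGRATED and RESIZED states.

```python
from typing import Optional

# Assumed string values for the states named in the commit message.
RESIZE_TASK_STATES = {"resize_migrating", "resize_migrated"}
RESIZED_VM_STATE = "resized"


def should_destroy(instance_host: str, this_host: str,
                   vm_state: str, task_state: Optional[str] = None) -> bool:
    """Decide whether local instance data may be removed at compute startup."""
    if instance_host == this_host:
        # The instance belongs to this host: always keep it.
        return False
    if task_state in RESIZE_TASK_STATES or vm_state == RESIZED_VM_STATE:
        # A resize is in flight or awaiting confirm/revert: keep the data on
        # both the source and the destination node.
        return False
    # Owned by another host and not part of a resize: safe to clean up.
    return True
```

With this check, an instance owned by another host is still cleaned up when no resize is involved, while an instance that is resizing or resized is kept on both nodes until the user confirms or reverts.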

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-3 → 2014.2