OpenStack Compute (nova)

Restarting destination compute manager during resize migration can cause instance data loss

Bug #1330503 reported by David McNally on 2014-06-16

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	OpenStack Compute (nova)	Fix Released	High	jichenjc	OpenStack Compute (nova) 2014.2 "juno"

Bug Description

During compute manager startup, init_host is called. One of its jobs, handled by _destroy_evacuated_instances, is to delete instance data that doesn't belong to this host. However, that function only checks whether the local instance belongs to the host; it doesn't check the task_state or vm_state.

If a resize migration is in progress at that time and the destination compute manager is restarted, the restart might destroy the resizing instance. Alternatively, if the resize has completed (vm_state = RESIZED) but has not been confirmed or reverted, a restart of the source compute manager might destroy the original instance.
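The failure mode described above can be sketched with a small model. This is an illustration only, not nova's code: the `Instance` class, the `instances_to_destroy_host_only` helper, and the host names are hypothetical, and the state strings are assumed to match nova's constants. It also assumes the instance record still names the source host as owner while the resize is in flight.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record (not nova's object)."""
    uuid: str
    host: str                        # host recorded as the owner
    vm_state: str = "active"
    task_state: Optional[str] = None


def instances_to_destroy_host_only(local_instances, this_host):
    """Mimic the reported behavior: only the owning host is compared."""
    return [inst for inst in local_instances if inst.host != this_host]


# Assumed scenario: a resize from "source" to "dest" is in flight. The record
# still names "source" as the owner, but data already exists on "dest".
resizing = Instance("uuid-1", host="source", task_state="resize_migrating")

# Restarting the compute manager on "dest" selects the in-flight instance.
doomed = instances_to_destroy_host_only([resizing], this_host="dest")
print([inst.uuid for inst in doomed])
```

Because neither task_state nor vm_state is consulted, the in-flight instance is indistinguishable from one that was evacuated away from the host.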

A similar bug concerning just the migrating state is outlined here: https://bugs.launchpad.net/nova/+bug/1319797 and a fix is proposed here: https://review.openstack.org/#/c/93903

That fix was intended to cover resize-migrating instances as well as those in the plain migrating state. However, as pointed out in a review comment, the solution works for migrating, but a fix for resize would require further changes, so I have raised this bug to highlight that.

Mark McLoughlin (markmc) on 2014-06-16

description:	updated
Changed in nova:
importance:	Undecided → High
status:	New → Triaged
tags:	added: compute icehouse-backport-potential

jichenjc (jichenjc) on 2014-06-20

Changed in nova:
assignee:	nobody → jichenjc (jichenjc)

OpenStack Infra (hudson-openstack) wrote on 2014-06-23: Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/101803

Changed in nova:
status:	Triaged → In Progress

OpenStack Infra (hudson-openstack) wrote on 2014-07-28: Fix merged to nova (master)

Reviewed: https://review.openstack.org/101803
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=dd6fb1246ff2789bd78b772b45e1fcac21eda67a
Submitter: Jenkins
Branch: master

commit dd6fb1246ff2789bd78b772b45e1fcac21eda67a
Author: jichenjc <email address hidden>
Date: Wed Jun 18 04:14:09 2014 +0800

    Keep resizing&resized instances when compute init

    During compute manager startup init_host is called. One of the
    functions there is to delete instance data that doesn't belong
    to this host i.e. _destroy_evacuated_instances.
    But this function only checks if the local instance belongs to
    the host or not. It doesn't check the task_state or vm_state.

    In Resize function, user may want to revert or confirm the resize
    operations so the instance on source and dest compute node should
    be kept. so for RESIZE_MIGRATING, RESIZE_MIGRATED task states and
    RESIZED vm state instances, they should be kept in compute node
    when the compute restart. This patch adds check for the task
    state and vm state before delete the instances.

    Closes-Bug: #1330503

    Change-Id: I723fa4a8823019391ea83aa189096531032adab1
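The check described in the commit message can be sketched roughly as follows. This is a simplified model under stated assumptions, not the merged patch: `Instance`, `should_keep`, and `instances_to_destroy` are hypothetical names, and the state strings are assumed to mirror nova's RESIZE_MIGRATING, RESIZE_MIGRATED, and RESIZED constants.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed to mirror nova's task_states / vm_states string values.
RESIZE_MIGRATING = "resize_migrating"
RESIZE_MIGRATED = "resize_migrated"
RESIZED = "resized"


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record (not nova's object)."""
    uuid: str
    host: str
    vm_state: str = "active"
    task_state: Optional[str] = None


def should_keep(inst, this_host):
    """Keep instances owned by this host or involved in a resize."""
    if inst.host == this_host:
        return True
    if inst.task_state in (RESIZE_MIGRATING, RESIZE_MIGRATED):
        return True   # resize in flight: data is still needed here
    if inst.vm_state == RESIZED:
        return True   # awaiting confirm/revert: both copies must survive
    return False


def instances_to_destroy(local_instances, this_host):
    """Return only the instances that are safe to clean up at startup."""
    return [i for i in local_instances if not should_keep(i, this_host)]


local = [
    Instance("in-flight", host="source", task_state=RESIZE_MIGRATING),
    Instance("unconfirmed", host="dest", vm_state=RESIZED),
    Instance("evacuated", host="other"),
]
print([i.uuid for i in instances_to_destroy(local, this_host="source")])
```

In this model only the instance with no resize-related state and a different owner is selected for cleanup, which is the behavior the commit message asks for.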

Changed in nova:
status:	In Progress → Fix Committed

Thierry Carrez (ttx) on 2014-09-05

Changed in nova:
milestone:	none → juno-3
status:	Fix Committed → Fix Released

Thierry Carrez (ttx) on 2014-10-16

Changed in nova:
milestone:	juno-3 → 2014.2
