The VM will be destroyed on source host during resizing for Hyper-V

Bug #1208301 reported by Kun Ge
This bug affects 2 people
Affects                   Status   Importance  Assigned to
OpenStack Compute (nova)  Expired  Undecided   Unassigned
compute-hyperv            Invalid  Undecided   Unassigned

Bug Description

This defect was originally found in the following scenario:

1. Deploy one VM (A) with a 100 GB disk and 1 CPU.
2. Resize it to 2 CPUs and a 200 GB disk.
3. During the resize, the host of the VM goes down (powers off).
4. Restart the host.

After investigation, I found that in the migrate_disk_and_power_off method of migrationops, which is called by the Hyper-V driver, the VM gets removed as the last step.

https://github.com/openstack/nova/blob/master/nova/virt/hyperv/driver.py
https://github.com/openstack/nova/blob/master/nova/virt/hyperv/migrationops.py

In the same scenario on KVM, this case works: the VM being resized is not removed, and after the host comes back up the resize can resume.

Since I am not familiar with the original design, I don't know why Hyper-V handles resizing like this, so I am opening this defect for tracking and discussion.

One question I can raise here is: is there a standard behavior among the hypervisors for resizing? If yes, what is it?

Tags: hyper-v
Kun Ge (gekun)
summary: - The VM will be destoried on source host during resize for Hyper-V
+ The VM will be destroyed on source host during resizing for Hyper-V
Revision history for this message
Alessandro Pilotti (alexpilotti) wrote :

Can you post a pastebin with some logs showing the exception traces?

"migrate_disk_and_power_off" copies the disk files to the target location and moves the original files to a temporary location in order to resume the migration in case of errors.

see: https://github.com/openstack/nova/blob/master/nova/virt/hyperv/migrationops.py#L47
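As a rough sketch of the flow described above (hypothetical helper names and a filesystem-level simplification, not the actual nova code):

```python
import os
import shutil

def migrate_disk_and_power_off(instance_dir, dest_dir):
    """Sketch: copy the disks to the target, keep the originals in a
    revert directory in case the resize fails. (Hypothetical
    simplification, not the real nova implementation.)"""
    revert_dir = instance_dir + "_revert"
    # This copy raises if the target is unreachable, so a down target
    # host leaves the source VM and its disks untouched.
    shutil.copytree(instance_dir, dest_dir)
    # Only after a successful copy are the originals moved aside...
    os.rename(instance_dir, revert_dir)
    # ...and then the source VM definition is removed -- the step this
    # bug report is about: a host power-off around here loses the VM.
```

The ordering is the point: the destructive steps only run after the copy to the target has succeeded.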

If the target host is not accessible, the disk copy step will throw an exception; the VM will NOT be destroyed and no data will be lost, but Nova will set the instance to an error state.

Generally speaking, error handling in the Nova compute driver operations model (any driver) is a simple binary thing: either the operation succeeds, or the instance is set in an error state.

There's a discussion going on about adding the idea of "warnings" in Nova, i.e. raising an exception without having to set the instance in a non-recoverable error state, while still signaling the issue to the user.
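This binary model can be sketched like so (hypothetical names, using a plain dict as the instance for illustration):

```python
def run_driver_operation(instance, operation):
    """Sketch of the binary error model: the operation either
    succeeds, or the instance lands in a terminal ERROR state.
    (Hypothetical illustration, not actual nova code.)"""
    try:
        operation(instance)
    except Exception:
        # There is no intermediate "warning" outcome: any failure
        # puts the instance into the error state.
        instance["vm_state"] = "error"
        raise
```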

Revision history for this message
Alessandro Pilotti (alexpilotti) wrote :

To answer your question, the resize / cold migration feature is hypervisor dependent.

Matt Riedemann (mriedem)
tags: added: hyper-v
Revision history for this message
Kun Ge (gekun) wrote :

Hi Alessandro, thanks for your response!

I still have two questions:

1) I noticed you said that the resize / cold migration feature is hypervisor dependent. Does that mean the current behaviour of destroying the VM during resize is necessary for Hyper-V? Would it be possible to keep the VM until resize_finish and destroy it at resize_confirm?

For the original issue:
1. Deploy one VM (A) with a 100 GB disk and 1 CPU.
2. Resize it to 2 CPUs and a 200 GB disk.
3. During the resize, the host of the VM goes down (powers off).
4. Restart the host.

In your opinion, is this a valid defect?
If yes, can we have a possible solution?

Revision history for this message
Kun Ge (gekun) wrote :

Alessandro, regarding your comment #1:

I can post the compute log for you.

From the log, we can see that the disk migration completed successfully and no exception was raised.

The instance name [instance-00000062], uuid [6ccaf5e1-12e8-4da7-b319-99a609acec3d]

The relevant log is between lines 46561 and 46572.

Revision history for this message
Kun Ge (gekun) wrote :

The compute log for the original defect.

Revision history for this message
Kun Ge (gekun) wrote :

Hi Alessandro

Could you please answer my previous questions?

1) I noticed you said that the resize / cold migration feature is hypervisor dependent. Does that mean the current behaviour of destroying the VM during resize is necessary for Hyper-V? Would it be possible to keep the VM until resize_finish and destroy it at resize_confirm?

2) Is the original defect a valid one? Is there any possible workaround for this test?

Many thanks!

Revision history for this message
Alessandro Pilotti (alexpilotti) wrote :

We are having a hard time trying to reproduce this issue.

What looks like a possible cause is this:

1) the target host is on, but the nova-compute service is down
2) the first step of the migration works, as the SMB copy can happen (the host is up)
3) the root.vhd is not deleted but moved to a temporary location waiting for confirm resize, and the VM is deleted

A scenario in which the target host is down would lead to an exception during the SMB copy, which in turn would avoid step 3.

In case of failure, the VM should be recovered during the revert migration. IMO this looks like another of those migration issues caused by the lack of "soft errors", so there's not much we can do on the driver side except some workaround.
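A hedged sketch of what a revert along these lines would do (hypothetical names, assuming the original disks were moved to a "&lt;dir&gt;_revert" location during the migration):

```python
import os

def finish_revert_migration(instance_dir):
    """Sketch: restore the disks that were moved aside and recreate
    the VM from them. (Hypothetical, not the real nova code.)"""
    revert_dir = instance_dir + "_revert"
    if os.path.exists(revert_dir):
        # Put the original disks back...
        os.rename(revert_dir, instance_dir)
        # ...and recreate the VM definition from them here.
    # If the host powered off after the source VM was deleted but
    # before this revert ran, neither the old nor the new VM exists:
    # the window this bug report describes.
```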

We'll have to investigate more into this, I'll write an update ASAP.

Chuck Short (zulcss)
Changed in nova:
status: New → Confirmed
Changed in nova:
importance: Undecided → Medium
Revision history for this message
Markus Zoeller (markus_z) (mzoeller) wrote : Cleanup EOL bug report

This is an automated cleanup. This bug report has been closed because it
is older than 18 months and there is no open code change to fix this.
After this time it is unlikely that the circumstances which led to
the observed issue can be reproduced.

If you can reproduce the bug, please:
* reopen the bug report (set to status "New")
* AND add the detailed steps to reproduce the issue (if applicable)
* AND leave a comment "CONFIRMED FOR: <RELEASE_NAME>"
  Only still supported release names are valid (LIBERTY, MITAKA, OCATA, NEWTON).
  Valid example: CONFIRMED FOR: LIBERTY

Changed in nova:
importance: Medium → Undecided
status: Confirmed → Expired
Changed in compute-hyperv:
status: New → Invalid