Block live migrations are broken when nova calculates live migration type by itself

Bug #1552303 reported by Pawel Koniszewski
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Critical
Assigned to: Pawel Koniszewski

Bug Description

All block live migrations are broken when I want nova to calculate live migration type by specifying {'block_migration': 'auto'} in request body. This happens because block_migration and migrate_data.block_migration flags do not have the same value.
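For illustration, a request that takes this path could be built as in the sketch below. Only the {'block_migration': 'auto'} part comes from this report. The `os-migrateLive` action name, the `host` key and the 2.25 microversion header are assumptions taken from the Nova compute API, and `build_live_migrate_request` is a hypothetical helper that performs no network call.

```python
import json


def build_live_migrate_request(host=None, block_migration="auto"):
    """Return (headers, body) for a live-migration action request.

    Assumption: the action is named 'os-migrateLive' and the 'auto' value
    requires compute API microversion 2.25 or later.
    """
    headers = {
        "Content-Type": "application/json",
        "X-OpenStack-Nova-API-Version": "2.25",  # assumed minimum microversion
    }
    body = {
        "os-migrateLive": {
            "host": host,  # None lets the scheduler pick a destination
            "block_migration": block_migration,  # 'auto' lets nova decide
        }
    }
    return headers, json.dumps(body)


headers, payload = build_live_migrate_request()
print(payload)
```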

In the conductor live migrate task we call checks on the destination and the source; these build up migrate_data in the driver and send it back to the conductor:

https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L156

Here the driver calculates the block migration type, which works correctly:

https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L5554

Then control returns to the conductor, which calls the compute manager and sends both flags (block_migration and migrate_data.block_migration), but we never change the value of block_migration to match migrate_data.block_migration:

https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L68

Down in the compute manager (and in the drivers) we then use both flags, which hold different values (here block_migration=None, migrate_data.block_migration=True). For example, at this point block_migration=None:

https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L5196

We break all block live migrations with:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
    timer()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send
    waiter.switch(result)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/opt/stack/nova/nova/utils.py", line 1160, in context_wrapper
    return func(*args, **kwargs)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6095, in _live_migration_operation
    instance=instance)
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6063, in _live_migration_operation
    CONF.libvirt.live_migration_bandwidth)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
    six.reraise(c, e, tb)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1825, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: Cannot access storage file '/opt/stack/data/nova/instances/572ad149-b7c5-4b77-85b5-34c1d2d37fcf/disk' (as uid:110, gid:116): No such file or directory

A quick workaround is to make sure at the compute manager level that block_migration == migrate_data.block_migration, but really we should clean up this mess and send only one flag, because carrying two is error-prone and hard to maintain.
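The mismatch and the workaround can be modelled with a minimal, self-contained sketch. This is not nova code: `MigrateData`, `check_can_live_migrate` and `resolve_block_migration` are hypothetical stand-ins, and the assumption that 'auto' reaches the compute manager as None is taken from the values quoted above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MigrateData:
    """Hypothetical stand-in for the migrate_data object built by the driver."""
    block_migration: Optional[bool] = None


def check_can_live_migrate(requested: Optional[bool],
                           shared_storage: bool) -> MigrateData:
    """Model of the driver check: pick the type when the caller asked for 'auto'."""
    if requested is None:
        # 'auto': block-migrate only when instance storage is not shared.
        return MigrateData(block_migration=not shared_storage)
    return MigrateData(block_migration=requested)


def resolve_block_migration(block_migration: Optional[bool],
                            migrate_data: Optional[MigrateData]) -> bool:
    """Workaround: trust migrate_data.block_migration when it is set."""
    if migrate_data is not None and migrate_data.block_migration is not None:
        return migrate_data.block_migration
    return bool(block_migration)


requested = None  # what 'auto' looks like by the time it reaches the manager
migrate_data = check_can_live_migrate(requested, shared_storage=False)

# The bug: the two flags disagree (None versus True).
print(requested, migrate_data.block_migration)

# The workaround: derive a single effective value before using it.
effective = resolve_block_migration(requested, migrate_data)
print(effective)
```

Resolving one effective value in a single place keeps later code from reading the stale flag, which is the direction the description argues for when it suggests sending only one flag.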

description: updated
Changed in nova:
importance: Undecided → Critical
status: New → In Progress
assignee: nobody → Pawel Koniszewski (pawel-koniszewski)
Andrea Rosa (andrea-rosa-m) wrote :

I agree that we should fix it properly, and it would be nice to have a tempest test for the case reported in this bug.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/287363

Eli Qiao (taget-9) wrote :

@Pawel, thanks for addressing this issue so quickly. I put some comments in the review patch; besides that, I added a tempest case to cover this scenario.

https://review.openstack.org/287605 Add new live_migration case to support block_migration=auto

-Eli.

Alex Xu (xuhj)
tags: added: mitaka-rc-potential
Changed in nova:
milestone: none → mitaka-rc1
Changed in nova:
assignee: Pawel Koniszewski (pawel-koniszewski) → John Garbutt (johngarbutt)
Changed in nova:
assignee: John Garbutt (johngarbutt) → Pawel Koniszewski (pawel-koniszewski)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/287363
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=4378b4fa43aa966aee8ddee77d5220e9c4476b5a
Submitter: Jenkins
Branch: master

commit 4378b4fa43aa966aee8ddee77d5220e9c4476b5a
Author: Pawel Koniszewski <email address hidden>
Date: Fri Mar 11 16:32:21 2016 +0100

    Use migrate_data.block_migration instead of block_migration

    Since nova can calculate block_migration we might end up with
    a case where block_migration is None while migrate_data.block_migration
    is True.

    Both drivers that support block live migration, Libvirt and Xenapi,
    use migrate_data.block_migration already, so we should switch to use
    it everywhere, instead of in drivers only.

    Change-Id: Iaa8aea3cb58ed0864a6a38d4a163649f52e32c5c
    Closes-Bug: #1552303

Changed in nova:
status: In Progress → Fix Released
Thierry Carrez (ttx) wrote : Fix included in openstack/nova 13.0.0.0rc1

This issue was fixed in the openstack/nova 13.0.0.0rc1 release candidate.

Matt Riedemann (mriedem)
tags: removed: mitaka-rc-potential