Live migrations from Mitaka to Liberty are broken

Bug #1556126 reported by Pawel Koniszewski
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Dan Smith

Bug Description

I have an environment consisting of three nodes:

Controller Mitaka, commit id: 59a07f00ad3c527e3b39712220cf9de1e68cd16f
Compute Mitaka, commit id: 59a07f00ad3c527e3b39712220cf9de1e68cd16f
Compute Stable/Liberty, commit id: 184e2552490ecfded61db3cc9ba1cd8d6aac1644

I am able to live migrate VMs from Liberty to Mitaka using this CLI command:

nova --os-compute-api-version 2.24 live-migrate --block-migrate instance

But when I try to move a VM from Mitaka to Liberty, I end up with an error:

ERROR nova.virt.libvirt.driver [req-aea14ab8-204d-4e7f-a591-e81a5b5fde4b admin demo] [instance: d509c4af-2003-4075-9c2f-2c4fbce727ea] Live Migration failure: internal error: process exited while connecting to monitor: 2016-03-11T13:36:33.803180Z qemu-system-x86_64: -vnc None:0: Failed to start VNC server on `(null)': address resolution failed for None:5900: Temporary failure in name resolution

Traceback:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
    timer()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send
    waiter.switch(result)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/opt/stack/nova/nova/utils.py", line 1145, in context_wrapper
    return func(*args, **kwargs)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6059, in _live_migration_operation
    instance=instance)
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6027, in _live_migration_operation
    CONF.libvirt.live_migration_bandwidth)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
    six.reraise(c, e, tb)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1825, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: internal error: process exited while connecting to monitor: 2016-03-11T13:20:07.165053Z qemu-system-x86_64: -vnc None:0: Failed to start VNC server on `(null)': address resolution failed for None:5900: Temporary failure in name resolution

The same happens when the VM is volume-backed or on shared storage.

This happens because pre_live_migration, which is executed on the destination (Liberty), returns a dict that looks like this:

{u'volume': {}, u'serial_listen_addr': u'127.0.0.1', u'graphics_listen_addrs': {u'vnc': u'10.0.0.3', u'spice': u'127.0.0.1'}}

When it comes back to the RPC API of the new compute node here: https://github.com/openstack/nova/blob/a3cf38a3ec0fd57679320688bd815225c2bf053f/nova/compute/rpcapi.py#L680

we just pass this data to the migrate_data object's .from_legacy_dict() method, but that method expects all of this data to be nested under the 'pre_live_migration_result' key: https://github.com/openstack/nova/blob/7832f6ea816b1b79251d06abbf38772894e74e2f/nova/objects/migrate_data.py#L195

Because of this, the data coming from the pre_live_migration phase is never actually converted, and we pass on a migrate_data object in which the fields needed for setting up VNC are None.
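
The mismatch can be illustrated with a minimal sketch. The converter below is a simplified, hypothetical stand-in for the legacy-dict handling described above, not nova's actual from_legacy_dict(); its field names are assumptions, and it only mirrors the behavior that the pre-live-migration keys are read when nested under 'pre_live_migration_result'.

```python
# Simplified stand-in for the legacy-dict conversion (hypothetical, not nova code).
def from_legacy_dict(legacy):
    """Return the fields a migrate_data-like object would carry."""
    data = {
        "graphics_listen_addr_vnc": None,
        "graphics_listen_addr_spice": None,
        "serial_listen_addr": None,
    }
    # The pre-live-migration keys are only read from this nested key.
    result = legacy.get("pre_live_migration_result")
    if result is not None:
        addrs = result.get("graphics_listen_addrs", {})
        data["graphics_listen_addr_vnc"] = addrs.get("vnc")
        data["graphics_listen_addr_spice"] = addrs.get("spice")
        data["serial_listen_addr"] = result.get("serial_listen_addr")
    return data


# Dict returned by pre_live_migration on the Liberty destination (from the report).
liberty_result = {
    u'volume': {},
    u'serial_listen_addr': u'127.0.0.1',
    u'graphics_listen_addrs': {u'vnc': u'10.0.0.3', u'spice': u'127.0.0.1'},
}

# Passing the flat dict straight through leaves every address unset.
flat = from_legacy_dict(liberty_result)
print(flat["graphics_listen_addr_vnc"])  # None
```

With the flat dict, the VNC listen address stays None, which matches the "-vnc None:0" argument seen in the qemu error.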

Revision history for this message
Pawel Koniszewski (pawel-koniszewski) wrote :

I forgot to mention that I have applied these upgrade levels on the controller and the computes:

compute=4.5
network=1.15

Dan Smith (danms)
Changed in nova:
importance: Undecided → High
tags: added: mitaka-rc-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/291759

Changed in nova:
assignee: nobody → Dan Smith (danms)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/291759
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fb6f37f7f91c47aced8b19e527e2771743911d0a
Submitter: Jenkins
Branch: master

commit fb6f37f7f91c47aced8b19e527e2771743911d0a
Author: Dan Smith <email address hidden>
Date: Fri Mar 11 07:15:38 2016 -0800

    Fix pre_live_migration result processing from legacy computes

    The objectification of the live migration data ended up with a bug in
    the compatibility path for the pre_live_migration result where we
    convert a legacy dict response back to an object. The legacy dict
    processing looks for 'pre_live_migration_result' in the input to
    trigger handling of those keys, but we weren't properly wrapping the
    result like that so it was never digesting those keys.

    Change-Id: I5142e1cb9d526c92529fc24ee0441b5730931160
    Closes-Bug: #1556126
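
The wrapping that the commit message describes can be sketched as follows. This is an illustration under stated assumptions, not the merged patch: wrap_legacy_result is a hypothetical helper name, and the real change may be structured differently.

```python
def wrap_legacy_result(result):
    """Nest a legacy pre_live_migration result under the key that the
    legacy-dict processing looks for (hypothetical helper)."""
    return {"pre_live_migration_result": result}


# Flat dict as returned by a Liberty compute (taken from the bug description).
legacy = {
    u'volume': {},
    u'serial_listen_addr': u'127.0.0.1',
    u'graphics_listen_addrs': {u'vnc': u'10.0.0.3', u'spice': u'127.0.0.1'},
}

wrapped = wrap_legacy_result(legacy)

# The listen addresses are now reachable under the expected key.
vnc = wrapped["pre_live_migration_result"]["graphics_listen_addrs"]["vnc"]
print(vnc)  # 10.0.0.3
```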

Changed in nova:
status: In Progress → Fix Released
Thierry Carrez (ttx) wrote : Fix included in openstack/nova 13.0.0.0rc1

This issue was fixed in the openstack/nova 13.0.0.0rc1 release candidate.

Matt Riedemann (mriedem)
tags: removed: mitaka-rc-potential