Block migration fails with volumes of 2 GB or larger

Bug #1368844 reported by Tushar Patil
This bug affects 4 people
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Tested on master with commit id fd72c308fc6adc1f5d07c5287c1db5bfc12328fc

volume_driver = cinder.volume.drivers.lvm.LVMISCSIDriver
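
For context, that driver is selected in cinder.conf. A minimal sketch of a devstack-style LVM/iSCSI backend; the volume_group and iscsi_helper values below are assumptions, not taken from the report:
{{{
# cinder.conf -- minimal LVM/iSCSI backend; values are illustrative
[DEFAULT]
volume_driver = cinder.volume.drivers.lvm.LVMISCSIDriver
volume_group = stack-volumes
iscsi_helper = tgtadm
}}}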

Case 1: Instance is booted from a volume

Steps to reproduce:
1. Create a bootable volume of size 2 GB from an image.
2. Boot an instance from this volume on host 1.
3. Block migrate the instance to host 2.
4. The instance is not migrated; the migration fails with the error log message quoted below (a client sketch of these steps follows this list).
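
A minimal python-cinderclient/python-novaclient sketch of case 1; credentials, endpoint, flavor, image UUID, and host names are placeholders, not taken from the report:
{{{
# Sketch only -- credentials, endpoint, flavor, image UUID and host
# names are placeholders.
from cinderclient import client as cinder_client
from novaclient import client as nova_client

cinder = cinder_client.Client('2', 'admin', 'secret', 'demo',
                              'http://controller:5000/v2.0')
nova = nova_client.Client('2', 'admin', 'secret', 'demo',
                          'http://controller:5000/v2.0')

image_uuid = 'REPLACE-WITH-GLANCE-IMAGE-UUID'

# Step 1: bootable 2 GB volume created from an image.
vol = cinder.volumes.create(2, imageRef=image_uuid, name='boot-vol')

# Step 2: boot from that volume, pinning the instance to host 1.
server = nova.servers.create(
    'bfv-test', image=None, flavor='1',
    block_device_mapping={'vda': '%s:::0' % vol.id},
    availability_zone='nova:host1')

# Step 3: block migrate to host 2; step 4 is the libvirtError below.
nova.servers.live_migrate(server, 'host2',
                          block_migration=True, disk_over_commit=False)
}}}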

Case 2: Instance is booted from an image, then a volume is attached to it

Steps to reproduce:
1. Create a volume of size 2 GB.
2. Boot an instance from an image on host 1.
3. Attach the 2 GB volume to this instance.
4. Block migrate the instance to host 2.
5. The instance is not migrated; the migration fails with the error log message quoted below (a client sketch of these steps follows this list).
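
The same flow for case 2, as a sketch reusing the clients from the case 1 snippet; the instance name and device path are again placeholders:
{{{
# Sketch only, reusing the clients from the case 1 snippet above.
# Step 1: a plain 2 GB volume.
vol = cinder.volumes.create(2, name='data-vol')

# Step 2: boot from an image on host 1.
server = nova.servers.create('img-test', image_uuid, '1',
                             availability_zone='nova:host1')

# Step 3: attach the volume to the running instance.
nova.volumes.create_server_volume(server.id, vol.id, '/dev/vdb')

# Step 4: block migrate to host 2; fails the same way on the source node.
nova.servers.live_migrate(server, 'host2',
                          block_migration=True, disk_over_commit=False)
}}}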

Error log message on the source compute node:
{{{
2014-09-11 02:42:41.884 ERROR nova.virt.libvirt.driver [-] [instance: ca59bee5-bae5-4c61-9e01-f76a1df3d324]
Live Migration failure: operation failed: migration job: unexpectedly failed
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/poll.py", line 115, in wait
    listener.cb(fileno)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 212, in main
    result = function(*args, **kwargs)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5128, in _live_migration
    recover_method(context, instance, dest, block_migration)
  File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5122, in _live_migration
    CONF.libvirt.live_migration_bandwidth)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 183, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 141, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 122, in execute
    six.reraise(c, e, tb)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 80, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/libvirt.py", line 1582, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: operation failed: migration job: unexpectedly failed
Removing descriptor: 19
}}}
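
The traceback bottoms out in the libvirt Python binding, so one way to take nova out of the picture while debugging is to drive the same call directly. A minimal sketch, assuming qemu+tcp connectivity between the hosts; the destination URI and the flag set are typical for nova block migration of that era, not copied from this deployment:
{{{
# Sketch only: reproduce the failing call without nova. The destination
# URI is an assumption; the UUID is the instance from the log above.
import libvirt

conn = libvirt.open('qemu:///system')  # source hypervisor
dom = conn.lookupByUUIDString('ca59bee5-bae5-4c61-9e01-f76a1df3d324')

flags = (libvirt.VIR_MIGRATE_LIVE |
         libvirt.VIR_MIGRATE_PEER2PEER |
         libvirt.VIR_MIGRATE_UNDEFINE_SOURCE |
         libvirt.VIR_MIGRATE_NON_SHARED_INC)  # incremental block migration

# Same entry point the traceback fails in; raises libvirtError on failure.
dom.migrateToURI2('qemu+tcp://host2/system', None, None, flags, None, 0)
}}}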

Sean Dague (sdague)
tags: added: libvirt
Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
Changed in nova:
assignee: nobody → Timofey Durakov (tdurakov)
status: Confirmed → In Progress
Revision history for this message
Timofey Durakov (tdurakov) wrote :

@tpatil Block migration with an attached volume is not valid since https://bugs.launchpad.net/nova/+bug/1398999, so the second case is invalid. Could you provide more information about the first case: nova.conf from the compute node, libvirt logs, the instance state in nova, and the networking used? That will help to find the source of the problem. I tested on a similar environment, and instances booted from 3 GB volumes migrate fine.
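
For context, the change referenced above rejects block migration when volumes are mapped to the instance. An illustrative sketch of that kind of pre-migration guard; the names are ours, not nova's actual code:
{{{
# Illustrative only -- not nova's actual code.
def reject_block_migration_with_volumes(block_migration, bdms):
    """Refuse block migration for instances with cinder volumes mapped."""
    has_volumes = any(bdm.get('destination_type') == 'volume' for bdm in bdms)
    if block_migration and has_volumes:
        raise ValueError('Block migration is not supported for instances '
                         'with mapped volumes (see bug 1398999)')
}}}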

Revision history for this message
Timofey Durakov (tdurakov) wrote :

I checked live migration of an instance booted from a 3 GB volume; everything works. If the problem was specific to block migration of a volume-booted instance, the first case could also be marked as invalid. Marking the bug as Incomplete, since the issue could not be reproduced.

Changed in nova:
status: In Progress → Incomplete
Revision history for this message
Pawel Koniszewski (pawel-koniszewski) wrote :

I had a similar problem. In my case, during live migration of multiple VMs the network was overloaded, so QEMU/libvirt was unable to send or receive live migration heartbeats. The solution was to tweak the live migration heartbeat configuration in libvirt/qemu.conf.
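
For reference, the keepalive knobs usually tweaked for this live in libvirtd.conf; the values below are illustrative, not from the setup described above:
{{{
# /etc/libvirt/libvirtd.conf -- illustrative values, tune for your network
keepalive_interval = 5   # seconds between keepalive probes
keepalive_count = 5      # missed probes tolerated before the link is dropped
}}}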

Tushar, if you can still reproduce this issue, can you upload libvirt/libvirtd.log somewhere? That would let me confirm whether we hit the same problem, and mark this bug as invalid if so.
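
If those logs still need to be captured, the usual way to get a verbose libvirtd.log is via libvirtd.conf; a sketch (restart libvirtd afterwards):
{{{
# /etc/libvirt/libvirtd.conf -- debug logging; restart libvirtd to apply
log_level = 1
log_outputs = "1:file:/var/log/libvirt/libvirtd.log"
}}}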

tags: added: live-migrate
Paul Murray (pmurray)
tags: added: live-migration
removed: live-migrate
Revision history for this message
Markus Zoeller (markus_z) (mzoeller) wrote :

Comments #1 and #2 show that the issue could not be reproduced.
Comment #3 describes a hypervisor configuration change that resolves the
likely root cause. Feel free to reopen the bug by providing the requested
information and setting the bug status back to "New".

Changed in nova:
status: Incomplete → Invalid
assignee: Timofey Durakov (tdurakov) → nobody
importance: Medium → Undecided