Live migration doesn't work for VMs with more than 2 network interfaces

Bug #1819735 reported by Lukasz
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Description
===========
When trying to live-migrate a VM with more than 2 network interfaces (ports), the migration fails.
On the source machine I get the following errors:

2019-03-12 15:25:50.809 16855 WARNING nova.compute.resource_tracker [req-b089fd63-ba03-4f0d-ad6d-0585692810ee 5f620a8543154df1930316a7bf15f47e bd582a2363584a868a2966f1c4a1e56c - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
2019-03-12 15:25:52.312 16855 WARNING nova.compute.manager [req-e1435cec-abb5-4490-979c-9240c377b5be 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Received unexpected event network-vif-unplugged-4078a1e8-0102-4a52-bb88-c3df517a5a1b for instance with vm_state active and task_state migrating.
2019-03-12 15:25:55.001 16855 WARNING nova.compute.manager [req-6db93de5-aaa3-4838-b622-d4ceb9faed4b 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Received unexpected event network-vif-plugged-4078a1e8-0102-4a52-bb88-c3df517a5a1b for instance with vm_state active and task_state migrating.
2019-03-12 15:25:57.020 16855 WARNING nova.compute.manager [req-600c156c-ed9a-4c98-b6f9-91788cd72b78 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Received unexpected event network-vif-unplugged-ab0b7818-ae3f-4b42-9d52-6fea23dd521c for instance with vm_state active and task_state migrating.
2019-03-12 15:25:57.070 16855 WARNING nova.compute.manager [req-daa4ff0d-c18e-47bc-958d-4a07c6128ab3 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Received unexpected event network-vif-plugged-4078a1e8-0102-4a52-bb88-c3df517a5a1b for instance with vm_state active and task_state migrating.
2019-03-12 15:25:57.124 16855 WARNING nova.compute.manager [req-a5c2e84a-f828-43a4-803c-c8208fe42ab5 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Received unexpected event network-vif-plugged-ab0b7818-ae3f-4b42-9d52-6fea23dd521c for instance with vm_state active and task_state migrating.
2019-03-12 15:25:59.008 16855 WARNING nova.compute.manager [req-660a5963-bab7-44f4-9856-4a6ff1268b42 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Received unexpected event network-vif-unplugged-c16c6463-4ef6-4969-9d7f-30c3a6152933 for instance with vm_state active and task_state migrating.
2019-03-12 15:25:59.490 16855 ERROR nova.virt.libvirt.driver [req-04b22e72-a9a4-423e-a904-3396dcf7beb2 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Live Migration failure: Cannot get interface MTU on 'brqb010f6ce-84': No such device: libvirtError: Cannot get interface MTU on 'brqb010f6ce-84': No such device
2019-03-12 15:25:59.790 16855 WARNING nova.compute.manager [req-6bb36e82-7e4e-4d6c-bd83-2e9d8e9a5688 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Received unexpected event network-vif-plugged-c16c6463-4ef6-4969-9d7f-30c3a6152933 for instance with vm_state active and task_state migrating.
2019-03-12 15:26:00.424 16855 WARNING nova.compute.manager [req-953b231f-4736-486f-96c1-ee0b99e88e54 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Received unexpected event network-vif-unplugged-f29b05df-0acd-4b7b-9b3a-e985561567fc for instance with vm_state active and task_state migrating.
2019-03-12 15:26:08.790 16855 WARNING nova.compute.manager [req-6d54487a-e65b-4d46-8f21-4a3337b1fb54 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Received unexpected event network-vif-plugged-f29b05df-0acd-4b7b-9b3a-e985561567fc for instance with vm_state active and task_state migrating.
2019-03-12 15:26:50.832 16855 WARNING nova.compute.resource_tracker [req-6d54487a-e65b-4d46-8f21-4a3337b1fb54 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
2019-03-12 15:27:50.953 16855 WARNING nova.compute.resource_tracker [req-6d54487a-e65b-4d46-8f21-4a3337b1fb54 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
2019-03-12 15:28:51.788 16855 WARNING nova.compute.resource_tracker [req-6d54487a-e65b-4d46-8f21-4a3337b1fb54 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
2019-03-12 15:29:52.893 16855 WARNING nova.compute.resource_tracker [req-6d54487a-e65b-4d46-8f21-4a3337b1fb54 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
2019-03-12 15:30:52.852 16855 WARNING nova.compute.resource_tracker [req-6d54487a-e65b-4d46-8f21-4a3337b1fb54 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
2019-03-12 15:30:59.034 16855 ERROR nova.compute.manager [req-6d54487a-e65b-4d46-8f21-4a3337b1fb54 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Live migration failed.: MigrationError: Migration error: Timeout waiting for VIF plugging events, canceling migration
2019-03-12 15:30:59.034 16855 ERROR nova.compute.manager [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Traceback (most recent call last):
2019-03-12 15:30:59.034 16855 ERROR nova.compute.manager [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6070, in _do_live_migration
2019-03-12 15:30:59.034 16855 ERROR nova.compute.manager [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] block_migration, migrate_data)
2019-03-12 15:30:59.034 16855 ERROR nova.compute.manager [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 6892, in live_migration
2019-03-12 15:30:59.034 16855 ERROR nova.compute.manager [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] migrate_data)
2019-03-12 15:30:59.034 16855 ERROR nova.compute.manager [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 7394, in _live_migration
2019-03-12 15:30:59.034 16855 ERROR nova.compute.manager [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] raise exception.MigrationError(reason=msg)
2019-03-12 15:30:59.034 16855 ERROR nova.compute.manager [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] MigrationError: Migration error: Timeout waiting for VIF plugging events, canceling migration
2019-03-12 15:30:59.034 16855 ERROR nova.compute.manager [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f]
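
To make the event sequence in a log like the one above easier to read, the `network-vif-plugged` / `network-vif-unplugged` warnings can be grouped by port. The following is a minimal Python 3 sketch, not part of nova; `vif_events_by_port` is a hypothetical helper, and the two sample lines are taken from the log above with the request context shortened to `[...]`.

```python
import re
from collections import defaultdict

# Matches the timestamp, the event kind and the port UUID in nova-compute
# warnings of the form shown in the report.
EVENT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"Received unexpected event network-vif-(?P<kind>plugged|unplugged)-"
    r"(?P<port>[0-9a-f-]{36})"
)


def vif_events_by_port(log_lines):
    """Group (timestamp, kind) tuples by port UUID, preserving log order."""
    events = defaultdict(list)
    for line in log_lines:
        match = EVENT_RE.search(line)
        if match:
            events[match.group("port")].append(
                (match.group("ts"), match.group("kind"))
            )
    return dict(events)


# Two lines from the source log; "[...]" replaces the request context.
SAMPLE = [
    "2019-03-12 15:25:52.312 16855 WARNING nova.compute.manager [...] "
    "Received unexpected event network-vif-unplugged-"
    "4078a1e8-0102-4a52-bb88-c3df517a5a1b for instance with vm_state "
    "active and task_state migrating.",
    "2019-03-12 15:25:55.001 16855 WARNING nova.compute.manager [...] "
    "Received unexpected event network-vif-plugged-"
    "4078a1e8-0102-4a52-bb88-c3df517a5a1b for instance with vm_state "
    "active and task_state migrating.",
]

for port, port_events in vif_events_by_port(SAMPLE).items():
    print(port, port_events)
```

Applied to the full source log, this grouping shows four distinct ports, each with an unplugged event followed by at least one plugged event.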

And on the destination compute node:

2019-03-12 15:26:03.507 19925 WARNING nova.compute.resource_tracker [req-b089fd63-ba03-4f0d-ad6d-0585692810ee 5f620a8543154df1930316a7bf15f47e bd582a2363584a868a2966f1c4a1e56c - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
2019-03-12 15:27:05.511 19925 WARNING nova.compute.resource_tracker [req-b089fd63-ba03-4f0d-ad6d-0585692810ee 5f620a8543154df1930316a7bf15f47e bd582a2363584a868a2966f1c4a1e56c - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
2019-03-12 15:28:05.567 19925 WARNING nova.compute.resource_tracker [req-b089fd63-ba03-4f0d-ad6d-0585692810ee 5f620a8543154df1930316a7bf15f47e bd582a2363584a868a2966f1c4a1e56c - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
2019-03-12 15:29:05.680 19925 WARNING nova.compute.resource_tracker [req-b089fd63-ba03-4f0d-ad6d-0585692810ee 5f620a8543154df1930316a7bf15f47e bd582a2363584a868a2966f1c4a1e56c - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
2019-03-12 15:30:07.848 19925 WARNING nova.compute.resource_tracker [req-b089fd63-ba03-4f0d-ad6d-0585692810ee 5f620a8543154df1930316a7bf15f47e bd582a2363584a868a2966f1c4a1e56c - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Instance not resizing, skipping migration.
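
The libvirt error in the source log says that the bridge 'brqb010f6ce-84' does not exist. One way to check whether that device is present on a given compute node is to look it up in sysfs. This is a minimal Python 3 sketch, assuming a Linux host with sysfs mounted at /sys; `bridge_mtu` is a hypothetical helper, not a nova or neutron function.

```python
import os


def bridge_mtu(bridge_name, sysfs_root="/sys/class/net"):
    """Return the MTU of a network device, or None if the device is absent."""
    mtu_path = os.path.join(sysfs_root, bridge_name, "mtu")
    try:
        with open(mtu_path) as handle:
            return int(handle.read().strip())
    except FileNotFoundError:
        # No such device on this host.
        return None


if bridge_mtu("brqb010f6ce-84") is None:
    print("bridge brqb010f6ce-84 is not present on this host")
```

Running it on both the source and the destination node would show on which side the bridge is missing.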

Steps to reproduce
==================
openstack server show 981fcdcd-5121-46f8-b36c-35a20c3e651f -c addresses
+-----------+--------------------------------------------------------------------------------------------+
| Field | Value |
+-----------+--------------------------------------------------------------------------------------------+
| addresses | EXTERNAL=x.x.x.x; CROSSCONNECT=y.y.y.y; production=w.w.w.w; test=z.z.z.z |
+-----------+--------------------------------------------------------------------------------------------+

openstack server migrate --live comp-a01 981fcdcd-5121-46f8-b36c-35a20c3e651f

Expected result
===============
The VM should be running on the new compute node, comp-a01.

Actual result
=============

The VM stays on the source compute node in state ERROR. To restore the active state I need to run:
nova reset-state --active 981fcdcd-5121-46f8-b36c-35a20c3e651f

Environment
===========
OpenStack Queens

nova controller:
root@nova-01:~# dpkg -l | grep nova
ii nova-api 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - API frontend
ii nova-common 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - common files
ii nova-conductor 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - conductor service
ii nova-consoleauth 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - Console Authenticator
ii nova-novncproxy 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - NoVNC proxy
ii nova-placement-api 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - placement API frontend
ii nova-scheduler 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - virtual machine scheduler
ii python-nova 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute Python libraries
ii python-novaclient 2:9.1.1-0ubuntu1~cloud0 all client library for OpenStack Compute API - Python 2.7
root@nova-01:~#

compute nodes:
source:
[root@comp-b02 ~]# dpkg -l | grep nova
ii nova-common 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - common files
ii nova-compute 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - compute node base
ii nova-compute-kvm 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - compute node libvirt support
ii python-nova 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute Python libraries
ii python-novaclient 2:9.1.1-0ubuntu1~cloud0 all client library for OpenStack Compute API - Python 2.7

destination:
[root@comp-a01 ~]# dpkg -l | grep nova
ii nova-common 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - common files
ii nova-compute 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - compute node base
ii nova-compute-kvm 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - compute node libvirt support
ii python-nova 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute Python libraries
ii python-novaclient 2:9.1.1-0ubuntu1~cloud0 all client library for OpenStack Compute API - Python 2.7

Hypervisor:
Libvirt + KVM
[root@comp-a01 ~]# dpkg -l | grep libvirt
ii libvirt-bin 4.0.0-1ubuntu8.3~cloud0 amd64 programs for the libvirt library
ii libvirt-clients 4.0.0-1ubuntu8.3~cloud0 amd64 Programs for the libvirt library
ii libvirt-daemon 4.0.0-1ubuntu8.3~cloud0 amd64 Virtualization daemon
ii libvirt-daemon-driver-storage-rbd 4.0.0-1ubuntu8.3~cloud0 amd64 Virtualization daemon RBD storage driver
ii libvirt-daemon-system 4.0.0-1ubuntu8.3~cloud0 amd64 Libvirt daemon configuration files
ii libvirt0:amd64 4.0.0-1ubuntu8.3~cloud0 amd64 library for interfacing with different virtualization systems
ii nova-compute-libvirt 2:17.0.5-0ubuntu1~cloud0 all OpenStack Compute - compute node libvirt support
ii python-libvirt 4.0.0-1~cloud0 amd64 libvirt Python bindings

Storage: ceph, luminous 12.2.9

Networking: neutron + linux bridge

tags: added: live-migration neutron
tags: added: linuxbridge
Balazs Gibizer (balazs-gibizer) wrote :

Does it always happen when you have more than two interfaces, or is it random? If it is random, does increasing the number of interfaces make it more likely to fail?

I'm setting this bug to Invalid, please set it back to New when the questions are answered.

Changed in nova:
status: New → Incomplete
Balazs Gibizer (balazs-gibizer) wrote :

I mean Incomplete, not Invalid.

sean mooney (sean-k-mooney) wrote :

There seem to be 2 issues.

2019-03-12 15:25:59.490 16855 ERROR nova.virt.libvirt.driver [req-04b22e72-a9a4-423e-a904-3396dcf7beb2 39cc388544de4fc0b33f421181fc8683 b9e3c4e073bd4c14a1dba80e4f98636a - default default] [instance: 981fcdcd-5121-46f8-b36c-35a20c3e651f] Live Migration failure: Cannot get interface MTU on 'brqb010f6ce-84': No such device: libvirtError: Cannot get interface MTU on 'brqb010f6ce-84': No such device

So libvirt raised an error, which I guess we ignored, and then later we hit the VIF plugging timeout.

Do you see any os-vif related issue?
I noticed some failures in the gate recently for linuxbridge: https://bugs.launchpad.net/os-vif/+bug/1893144. It's probably unrelated, but we do not have any linuxbridge live migration jobs for most patches. We have one that runs on a very limited set of changes:
https://github.com/openstack/nova/blob/master/.zuul.yaml#L434-L439

But in general it is very lightly tested. We might want to put that into the periodic pipeline.

I won't get to it today, but I have a linux bridge test VM. I'll try deploying a second one and see if I can replicate this later this week, but I need to finish off a few other things first, so it might slip to next week.
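
The gap between the two ERROR lines in the source log can be worked out from their timestamps. The sketch below (Python 3, standard library only) uses the two timestamps exactly as they appear in the report. The result is close to 300 seconds, which is the default value of nova's `vif_plugging_timeout` option; that the reporter's deployment uses the default is an assumption, since nova.conf is not shown.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the source compute node log in the report.
libvirt_error = datetime.strptime("2019-03-12 15:25:59.490", FMT)
timeout_error = datetime.strptime("2019-03-12 15:30:59.034", FMT)

elapsed = (timeout_error - libvirt_error).total_seconds()
print(f"{elapsed:.3f} seconds between libvirt error and VIF timeout")
```

This is consistent with the reading that the libvirt failure happened early and the migration was only marked as failed once the VIF plugging wait expired, though the log alone does not prove when the wait started.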

Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired