vm live migration not working when port is admin down

Bug #1999582 reported by do3meli
This bug affects 1 person
Affects                    Status    Importance  Assigned to  Milestone
OpenStack Compute (nova)   New       Undecided   Unassigned
neutron                    Invalid   Undecided   Unassigned

Bug Description

A VM with a provider network port whose status is set to "admin down" cannot be live-migrated successfully. The source compute host aborts the migration with the following message:

Timed out waiting for events: [('network-vif-plugged', '8066f303-a72c-4784-9fc0-de5f7d8a993f'), ('network-vif-plugged', '6c167efe-3a55-46d1-baf1-4962b3bc387f')]. If these timeouts are a persistent issue it could mean the networking backend on host computeXXX does not support sending these events unless there are port binding host changes which does not happen at this point in the live migration process. You may need to disable the live_migration_wait_for_vif_plug option on host computeXXX.: eventlet.timeout.Timeout: 300 seconds

Setting the port admin status to up and re-running the live migration lets the VM migrate successfully. Therefore, the logic behind the live_migration_wait_for_vif_plug setting should be adjusted to take into account whether a port is expected to be up or not.
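A minimal sketch of the adjustment suggested above, not Nova's actual code: only wait for "network-vif-plugged" on ports whose admin state is up. The function name events_to_wait_for, the port dictionaries, and the port IDs "port-a" and "port-b" are hypothetical and chosen for illustration only.

```python
from typing import Iterable


def events_to_wait_for(ports: Iterable[dict]) -> list[tuple[str, str]]:
    """Return (event_name, port_id) pairs only for ports expected to come up.

    Hypothetical helper: a port with admin_state_up set to False is skipped,
    because no vif-plugged event is expected for it.
    """
    return [
        ("network-vif-plugged", port["id"])
        for port in ports
        # Assumption: a missing admin_state_up key is treated as "up".
        if port.get("admin_state_up", True)
    ]


# Illustrative data, not taken from the report.
ports = [
    {"id": "port-a", "admin_state_up": True},
    {"id": "port-b", "admin_state_up": False},
]
events = events_to_wait_for(ports)
print(events)
```

With a filter like this, the migration would wait only on "port-a" and would not time out on the admin-down port.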

Environment:

- Ubuntu 20.04.5 with Cloud Archive Repositories in version 19.4.0-0ubuntu1~cloud0 on Xena branch
- Using ML2 with OVS on physical provider network

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello do3meli:

The configuration parameter "live_migration_wait_for_vif_plug" belongs to Nova. According to the documentation, that parameter tells Nova whether or not it needs to wait (and how long) for the "vif-plugged-event" sent by Neutron.

When using ML2/OVS, if the port admin status is set to down, the agent reports it as DOWN and the Neutron server doesn't send the expected "vif-plugged-event".

This is not a Neutron bug but something that should be documented in the Nova documentation.

Regards.

Changed in neutron:
status: New → Invalid
do3meli (d-info-e) wrote :

Hi Rodolfo

Thanks for the heads up. I have verified this, and it seems correct that the neutron openvswitch agent reports the port as "down":

2022-12-13 18:52:49.053 2955524 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-149353dc-53f3-48ca-97e1-ec4f47dc56e5 - - - - -] Port 6c167efe-3a55-46d1-baf1-4962b3bc387f is being migrated to host computeXXX.

2022-12-13 18:52:49.053 2955524 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-149353dc-53f3-48ca-97e1-ec4f47dc56e5 - - - - -] Port 6c167efe-3a55-46d1-baf1-4962b3bc387f updated. Details: {'device': '6c167efe-3a55-46d1-baf1-4962b3bc387f', 'device_id': '5d270ee8-76aa-46d8-9ebc-09f51de1b5d9', 'network_id': '4ea4469f-4791-4483-9b10-6b01a0857989', 'port_id': '6c167efe-3a55-46d1-baf1-4962b3bc387f', 'mac_address': 'fa:16:3e:8c:5c:b8', 'admin_state_up': False, 'status': 'DOWN', 'network_type': 'vlan', 'segmentation_id': 102, 'physical_network': 'provider', 'fixed_ips': [{'subnet_id': 'a2831e83-8a5a-4f02-ae09-2eb674089026', 'ip_address': '185.XXX.XX.X'}], 'device_owner': 'compute:XXXXX', 'allowed_address_pairs': [], 'port_security_enabled': True, 'qos_policy_id': None, 'network_qos_policy_id': None, 'profile': {'os_vif_delegation': True, 'migrating_to': 'computeXXX'}, 'vif_type': 'ovs', 'vnic_type': 'normal', 'security_groups': ['4096f1f0-094a-4f7d-b539-8cb387f70763'], 'migrating_to': 'computeXXX'}
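For anyone who wants to run the same check on their own agent logs, here is a rough sketch that pulls the "Details" dictionary out of such a line and reads the admin state. The helper name port_details_from_log is hypothetical, and the sample line is a shortened copy of the log entry above, keeping only three of its keys.

```python
import ast


def port_details_from_log(line: str) -> dict:
    """Extract the dict that follows 'Details: ' in an OVS agent log line.

    Hypothetical helper. ast.literal_eval only parses Python literals, so it
    raises an error if the line contains no 'Details: ' payload.
    """
    _, _, raw = line.partition("Details: ")
    return ast.literal_eval(raw.strip())


# Shortened copy of the log entry above (most keys omitted for brevity).
line = (
    "Port 6c167efe-3a55-46d1-baf1-4962b3bc387f updated. Details: "
    "{'port_id': '6c167efe-3a55-46d1-baf1-4962b3bc387f', "
    "'admin_state_up': False, 'status': 'DOWN'}"
)
details = port_details_from_log(line)
print(details["admin_state_up"], details["status"])
```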

This leads me to think that there is a problem in nova, which does not correctly handle the "admin port down" case when doing live migrations. Therefore I have added nova as an affected project to this bug report. It would be great to have somebody from nova verify this.

tags: added: live-migration