Comment 5 for bug 1944619

sean mooney (sean-k-mooney) wrote : Re: Instances with SRIOV ports loose access after failed live migrations

i need to read this a few times to make sure i have not missed anything, but first i just want to level set on what is currently supported.

today the following requirements and restrictions apply to the use of sriov and/or ovs hardware offload.

1 if the sriov nic agent is used for standard sriov vnic types (direct, direct-physical, macvtap), the nics __must not__ be in __switchdev__ mode; they __must__ be in __legacy__ mode.
2 vdpa support in nova currently does not support any move operations. vdpa support in nova requires the nic to be in switchdev mode.
3 hardware offloaded ovs uses the ml2/ovs or ml2/ovn mechanism drivers and does not use the sriov nic agent.
4 we do not support using the sriov nic agent and ovs hardware offload on the same physical nic.
  when using the sriov-nic-agent the nic must be in legacy mode, and when using hardware offload it must be in switchdev mode.
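
as a rough illustration of the four rules above, here is a minimal python sketch. none of these names exist in nova or neutron: `REQUIRED_ESWITCH_MODE`, `check_nic` and the backend labels are made up for this comment, and the nic's eswitch mode is assumed to be something you have already looked up on the host.

```python
# illustrative only: these names are invented for this comment and are
# not part of nova, neutron or os-vif.
REQUIRED_ESWITCH_MODE = {
    "sriov-nic-agent": "legacy",          # rule 1
    "vdpa": "switchdev",                  # rule 2
    "ovs-hardware-offload": "switchdev",  # rules 3 and 4
}


def check_nic(backends, eswitch_mode):
    """Return a list of problems for one physical nic.

    backends: labels from REQUIRED_ESWITCH_MODE that are configured on the nic
    eswitch_mode: "legacy" or "switchdev", as reported by the host
    """
    problems = []
    backends = set(backends)

    # rule 4: the two backends cannot share a physical nic
    if {"sriov-nic-agent", "ovs-hardware-offload"} <= backends:
        problems.append(
            "sriov nic agent and ovs hardware offload cannot share a physical nic"
        )

    # rules 1-3: each backend needs one specific eswitch mode
    for backend in sorted(backends):
        required = REQUIRED_ESWITCH_MODE[backend]
        if eswitch_mode != required:
            problems.append(
                f"{backend} requires {required} mode, nic is in {eswitch_mode} mode"
            )
    return problems
```

for example, `check_nic(["sriov-nic-agent"], "switchdev")` reports a mode mismatch, while `check_nic(["ovs-hardware-offload"], "switchdev")` reports nothing.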

we do not currently test or officially support sriov live migration between different ml2 drivers. in principle it can work, but any work required to make it work would really be a new feature.
it was not in the scope of the sriov live migration spec. limited issues could be addressed as bug fixes, but live migration from a host using the sriov nic agent to a host using hardware offloaded ovs was not in scope.

with that understanding set, let's look at the bug you have reported.

Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001

this is caused by trying to move other vms that have neutron sriov ports with shelve and unshelve: https://bugs.launchpad.net/nova/+bug/1851545

this is very unlikely to be related to the sriov live migration feature.

looking at the first pastebin you provided, you are not using standard sriov; you are using hardware offloaded ovs, at least on the destination host.

 Successfully unplugged vif VIFHostDevice(active=True,address=fa:16:3e:4d:86:24,dev_address=0000:03:04.1,dev_type='ethernet',has_traffic_filtering=True,id=3a3329b5-d123-4009-a152-3e1ed33bd15f,network=Network(725fffcf-f6bf-4e6f-8430-ca2536311805),plugin='ovs',port_profile=VIFPortProfileOVSRepresentor,preserve_on_delete=True)
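
the two fields in that log line that point at hardware offloaded ovs are `plugin='ovs'` and `port_profile=VIFPortProfileOVSRepresentor`. if you want to scan the logs on both hosts for the same pattern, a minimal sketch could look like the following. `looks_like_offloaded_ovs` is a hypothetical helper written for this comment, not something nova provides, and it assumes the vif is logged in the same repr format as above.

```python
import re


def looks_like_offloaded_ovs(log_line):
    """Return True if a logged vif repr uses the ovs plugin with a
    representor port profile, as in the unplug message above.

    hypothetical helper: it only does a text match on the log line.
    """
    plugin = re.search(r"plugin='([^']*)'", log_line)
    profile = re.search(r"port_profile=(\w+)", log_line)
    return (
        plugin is not None
        and profile is not None
        and plugin.group(1) == "ovs"
        and profile.group(1) == "VIFPortProfileOVSRepresentor"
    )
```

a line that does not match only means this pattern was not found in it; it does not by itself prove the port is standard sriov.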

can you confirm that you are using hardware offloaded ovs on both the source and destination hosts,
and whether you are using ml2/ovs or ml2/ovn? the info in comment 2 also indicates this is not
standard sriov but hardware offloaded ovs, which has a very different code path in nova and neutron, so we should fix the title if this is in fact hardware offloaded ovs.