Activity log for bug #1944619

Date Who What changed Old value New value Message
2021-09-22 20:49:14 Erlon R. Cruz bug added bug
2021-09-22 20:50:16 Erlon R. Cruz description
  Old value:
    If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook. The instance will lose access to the network and leave behind duplicated port bindings on the database. The instance re-gains connectivity on the source host after a reboot (don't know if there's another way to restore connectivity).
    As a side effect of this behavior, the pre-live migration cleanup hook also fails with:
    PCI device 0000:3b:10.0 is in use by driver QEMU
    [How to reproduce]
    Create an environment with SRIOV, (our case uses switchdev[1])
    Create 1 VM
    Provoke a failure in the _pre_live_migration process (for example creating a directory /var/lib/nova/instances/<instance id>)
    Check the VM's connectivity
    Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
    Full-stack trace[2]
    [Expected]
    VM connectivity is restored even if it gets a brief disconnection
    [Observed]
    VM loses connectivity which is only is restored after the VM status is set to ERROR and the VM is power recycled
    [1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
    [2] https://paste.ubuntu.com/p/ThQmDYtdSS/
  New value:
    If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook. The instance will lose access to the network and leave behind duplicated port bindings on the database. The instance re-gains connectivity on the source host after a reboot (don't know if there's another way to restore connectivity).
    As a side effect of this behavior, the pre-live migration cleanup hook also fails with:
    PCI device 0000:3b:10.0 is in use by driver QEMU
    [How to reproduce]
    - Create an environment with SRIOV, (our case uses switchdev[1])
    - Create 1 VM
    - Provoke a failure in the _pre_live_migration process (for example creating a directory /var/lib/nova/instances/<instance id>)
    - Check the VM's connectivity
    - Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
    Full-stack trace[2]
    [Expected]
    VM connectivity is restored even if it gets a brief disconnection
    [Observed]
    VM loses connectivity which is only is restored after the VM status is set to ERROR and the VM is power recycled
    [1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
    [2] https://paste.ubuntu.com/p/ThQmDYtdSS/
2021-09-22 21:42:28 Dominique Poulain bug added subscriber Dominique Poulain
2021-09-23 14:40:25 Bernard Cafarelli neutron: status New Incomplete
2021-09-23 14:41:58 Bernard Cafarelli tags sriov-pci-pt
2021-09-23 19:00:01 Erlon R. Cruz neutron: status Incomplete New
2021-09-23 19:02:16 Erlon R. Cruz description
  Old value:
    If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook. The instance will lose access to the network and leave behind duplicated port bindings on the database. The instance re-gains connectivity on the source host after a reboot (don't know if there's another way to restore connectivity).
    As a side effect of this behavior, the pre-live migration cleanup hook also fails with:
    PCI device 0000:3b:10.0 is in use by driver QEMU
    [How to reproduce]
    - Create an environment with SRIOV, (our case uses switchdev[1])
    - Create 1 VM
    - Provoke a failure in the _pre_live_migration process (for example creating a directory /var/lib/nova/instances/<instance id>)
    - Check the VM's connectivity
    - Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
    Full-stack trace[2]
    [Expected]
    VM connectivity is restored even if it gets a brief disconnection
    [Observed]
    VM loses connectivity which is only is restored after the VM status is set to ERROR and the VM is power recycled
    [1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
    [2] https://paste.ubuntu.com/p/ThQmDYtdSS/
  New value:
    If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook. The instance will lose access to the network and leave behind duplicated port bindings on the database. The instance re-gains connectivity on the source host after a reboot (don't know if there's another way to restore connectivity).
    As a side effect of this behavior, the pre-live migration cleanup hook also fails with:
    PCI device 0000:3b:10.0 is in use by driver QEMU
    [How to reproduce]
    - Create an environment with SRIOV, (our case uses switchdev[1])
    - Create 1 VM
    - Provoke a failure in the _pre_live_migration process (for example creating a directory /var/lib/nova/instances/<instance id>)
    - Check the VM's connectivity
    - Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
    Full-stack trace[2]
    [Expected]
    VM connectivity is restored even if it gets a brief disconnection
    [Observed]
    VM loses connectivity which is only is restored after the VM status is set to ERROR and the VM is power recycled
    [Environment]
    Focal Ussuri with Melanox Connect5 cards
    [1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
    [2] https://paste.ubuntu.com/p/ThQmDYtdSS/
2021-09-23 23:00:55 Brett Milford bug added subscriber Brett Milford
2021-09-30 10:45:08 Rodolfo Alonso neutron: status New Incomplete
2021-09-30 10:45:27 Rodolfo Alonso bug task added nova
2021-10-05 14:29:30 sean mooney nova: status New Incomplete
2021-10-05 14:30:13 sean mooney tags sriov-pci-pt live-migration ovs sriov-pci-pt
2021-10-06 19:44:07 Erlon R. Cruz summary Instances with SRIOV ports loose access after failed live migrations Instances with hardware offloaded ovs ports loose access after failed live migrations
2021-10-20 19:22:37 Erlon R. Cruz description
  Old value:
    If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook. The instance will lose access to the network and leave behind duplicated port bindings on the database. The instance re-gains connectivity on the source host after a reboot (don't know if there's another way to restore connectivity).
    As a side effect of this behavior, the pre-live migration cleanup hook also fails with:
    PCI device 0000:3b:10.0 is in use by driver QEMU
    [How to reproduce]
    - Create an environment with SRIOV, (our case uses switchdev[1])
    - Create 1 VM
    - Provoke a failure in the _pre_live_migration process (for example creating a directory /var/lib/nova/instances/<instance id>)
    - Check the VM's connectivity
    - Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
    Full-stack trace[2]
    [Expected]
    VM connectivity is restored even if it gets a brief disconnection
    [Observed]
    VM loses connectivity which is only is restored after the VM status is set to ERROR and the VM is power recycled
    [Environment]
    Focal Ussuri with Melanox Connect5 cards
    [1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
    [2] https://paste.ubuntu.com/p/ThQmDYtdSS/
  New value:
    If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook. The instance will lose access to the network and leave behind duplicated port bindings on the database. The instance re-gains connectivity on the source host after a reboot (don't know if there's another way to restore connectivity).
    As a side effect of this behavior, the pre-live migration cleanup hook also fails with:
    PCI device 0000:3b:10.0 is in use by driver QEMU
    [How to reproduce]
    - Create an environment with SRIOV, (our case uses switchdev[1])
    - Create 1 VM
    - Provoke a failure in the _pre_live_migration process (for example creating a directory /var/lib/nova/instances/<instance id>)
    - Check the VM's connectivity
    - Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
    Full-stack trace[2]
    [Expected]
    VM connectivity is restored even if it gets a brief disconnection
    As happens for non-SRIOV scenarios, after a failure, no leftovers remains (port bindings and instance path files)
    [Observed]
    VM loses connectivity which is only is restored after the VM status is set to ERROR and the VM is power recycled
    Port bindings are not removed
    [Environment]
    Focal Ussuri with Mellanox Connect5 cards
    [1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
    [2] https://paste.ubuntu.com/p/ThQmDYtdSS/
2021-10-25 15:18:49 OpenStack Infra nova: status Incomplete In Progress
2021-10-25 17:55:07 Erlon R. Cruz nova: assignee Erlon R. Cruz (sombrafam)
2021-11-02 11:39:16 James Troup summary Instances with hardware offloaded ovs ports loose access after failed live migrations Instances with hardware offloaded ovs ports lose access after failed live migrations
2022-03-08 13:56:47 sean mooney tags live-migration ovs sriov-pci-pt live-migration ovs sriov-pci-pt yoga-rc-potential
2022-03-09 17:04:40 Sylvain Bauza nova: importance Undecided Medium
2022-03-21 10:04:22 Sylvain Bauza tags live-migration ovs sriov-pci-pt yoga-rc-potential live-migration ovs sriov-pci-pt
2022-03-30 00:33:43 OpenStack Infra nova: status In Progress Fix Released
2022-05-17 22:08:36 melanie witt nominated for series nova/yoga
2022-05-17 22:08:36 melanie witt bug task added nova/yoga
2022-05-17 22:08:36 melanie witt nominated for series nova/victoria
2022-05-17 22:08:36 melanie witt bug task added nova/victoria
2022-05-17 22:08:36 melanie witt nominated for series nova/xena
2022-05-17 22:08:36 melanie witt bug task added nova/xena
2022-05-17 22:08:36 melanie witt nominated for series nova/ussuri
2022-05-17 22:08:36 melanie witt bug task added nova/ussuri
2022-05-17 22:08:36 melanie witt nominated for series nova/wallaby
2022-05-17 22:08:36 melanie witt bug task added nova/wallaby
2022-05-18 00:01:48 OpenStack Infra tags live-migration ovs sriov-pci-pt in-stable-yoga live-migration ovs sriov-pci-pt
2022-05-18 00:01:59 OpenStack Infra nova/yoga: status New Fix Committed
2022-05-18 16:59:58 OpenStack Infra nova/xena: status New In Progress
2022-05-25 15:25:20 OpenStack Infra tags in-stable-yoga live-migration ovs sriov-pci-pt in-stable-xena in-stable-yoga live-migration ovs sriov-pci-pt
2022-05-25 16:14:38 OpenStack Infra nova/xena: status In Progress Fix Committed
2022-06-23 10:20:50 OpenStack Infra nova/yoga: status Fix Committed Fix Released
2022-06-23 10:54:10 OpenStack Infra nova/xena: status Fix Committed Fix Released
2022-10-21 19:14:15 OpenStack Infra nova/wallaby: status New In Progress
2024-03-05 20:00:01 Shamnad N bug added subscriber Shamnad N
2024-03-05 20:00:08 Shamnad N removed subscriber Shamnad N
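
The "Check the logs for" step in the description above could be automated roughly as follows. This is only a sketch: the pattern is modeled on the two error strings quoted in the bug description, the function name find_pci_in_use is hypothetical, and the exact wording emitted by other libvirt versions is an assumption.

```python
import re

# Pattern modeled on the error text quoted in the bug description.
# Assumption: other libvirt versions may word this message differently.
PCI_IN_USE = re.compile(
    r"PCI device (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]) "
    r"is in use by driver QEMU(?:, domain (?P<domain>\S+))?"
)


def find_pci_in_use(lines):
    """Return (pci_address, domain) pairs for every matching log line.

    The domain is None when the message does not name one.
    Hypothetical helper, not part of nova or libvirt.
    """
    hits = []
    for line in lines:
        match = PCI_IN_USE.search(line)
        if match:
            hits.append((match.group("pci"), match.group("domain")))
    return hits


# Sample lines copied from the bug description, plus one unrelated line.
sample = [
    "libvirt.libvirtError: Requested operation is not valid: "
    "PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001",
    "PCI device 0000:3b:10.0 is in use by driver QEMU",
    "unrelated log line",
]

for pci, domain in find_pci_in_use(sample):
    print(pci, domain)
```

In practice the lines would come from the nova-compute log on the host involved in the failed migration; the location of that log depends on the deployment and is not stated in this bug.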