2021-09-22 20:49:14 |
Erlon R. Cruz |
bug |
|
|
added bug |
2021-09-22 20:50:16 |
Erlon R. Cruz |
description |
If for some reason a live migration fails for an instance with an SRIOV port
during the '_pre_live_migration' hook, the instance will lose access to the
network and leave behind duplicated port bindings in the database.
The instance regains connectivity on the source host after a reboot (we don't
know whether there's another way to restore it). As a side effect of this
behavior, the pre-live-migration cleanup hook also fails with:
PCI device 0000:3b:10.0 is in use by driver QEMU
[How to reproduce]
Create an environment with SRIOV (our case uses switchdev[1])
Create 1 VM
Provoke a failure in the _pre_live_migration process (for example, by pre-creating the directory /var/lib/nova/instances/<instance id>)
Check the VM's connectivity
Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
Full stack trace[2]
[Expected]
VM connectivity is restored even if it gets a brief disconnection
[Observed]
VM loses connectivity, which is only restored after the VM status is set to ERROR and the VM is power-cycled
[1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
[2] https://paste.ubuntu.com/p/ThQmDYtdSS/ |
If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook, the instance will lose access to the network and leave behind duplicated port bindings in the database.
The instance regains connectivity on the source host after a reboot (we don't know whether there's another way to restore it). As a side effect of this behavior, the pre-live-migration cleanup hook also fails with:
PCI device 0000:3b:10.0 is in use by driver QEMU
[How to reproduce]
- Create an environment with SRIOV (our case uses switchdev[1])
- Create 1 VM
- Provoke a failure in the _pre_live_migration process (for example, by pre-creating the directory /var/lib/nova/instances/<instance id>)
- Check the VM's connectivity
- Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
Full stack trace[2]
[Expected]
VM connectivity is restored even if it gets a brief disconnection
[Observed]
VM loses connectivity, which is only restored after the VM status is set to ERROR and the VM is power-cycled
[1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
[2] https://paste.ubuntu.com/p/ThQmDYtdSS/ |
|
2021-09-22 21:42:28 |
Dominique Poulain |
bug |
|
|
added subscriber Dominique Poulain |
2021-09-23 14:40:25 |
Bernard Cafarelli |
neutron: status |
New |
Incomplete |
|
2021-09-23 14:41:58 |
Bernard Cafarelli |
tags |
|
sriov-pci-pt |
|
2021-09-23 19:00:01 |
Erlon R. Cruz |
neutron: status |
Incomplete |
New |
|
2021-09-23 19:02:16 |
Erlon R. Cruz |
description |
If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook, the instance will lose access to the network and leave behind duplicated port bindings in the database.
The instance regains connectivity on the source host after a reboot (we don't know whether there's another way to restore it). As a side effect of this behavior, the pre-live-migration cleanup hook also fails with:
PCI device 0000:3b:10.0 is in use by driver QEMU
[How to reproduce]
- Create an environment with SRIOV (our case uses switchdev[1])
- Create 1 VM
- Provoke a failure in the _pre_live_migration process (for example, by pre-creating the directory /var/lib/nova/instances/<instance id>)
- Check the VM's connectivity
- Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
Full stack trace[2]
[Expected]
VM connectivity is restored even if it gets a brief disconnection
[Observed]
VM loses connectivity, which is only restored after the VM status is set to ERROR and the VM is power-cycled
[1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
[2] https://paste.ubuntu.com/p/ThQmDYtdSS/ |
If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook, the instance will lose access to the network and leave behind duplicated port bindings in the database.
The instance regains connectivity on the source host after a reboot (we don't know whether there's another way to restore it). As a side effect of this behavior, the pre-live-migration cleanup hook also fails with:
PCI device 0000:3b:10.0 is in use by driver QEMU
[How to reproduce]
- Create an environment with SRIOV (our case uses switchdev[1])
- Create 1 VM
- Provoke a failure in the _pre_live_migration process (for example, by pre-creating the directory /var/lib/nova/instances/<instance id>)
- Check the VM's connectivity
- Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
Full stack trace[2]
[Expected]
VM connectivity is restored even if it gets a brief disconnection
[Observed]
VM loses connectivity, which is only restored after the VM status is set to ERROR and the VM is power-cycled
[Environment]
Focal Ussuri with Mellanox ConnectX-5 cards
[1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
[2] https://paste.ubuntu.com/p/ThQmDYtdSS/ |
|
2021-09-23 23:00:55 |
Brett Milford |
bug |
|
|
added subscriber Brett Milford |
2021-09-30 10:45:08 |
Rodolfo Alonso |
neutron: status |
New |
Incomplete |
|
2021-09-30 10:45:27 |
Rodolfo Alonso |
bug task added |
|
nova |
|
2021-10-05 14:29:30 |
sean mooney |
nova: status |
New |
Incomplete |
|
2021-10-05 14:30:13 |
sean mooney |
tags |
sriov-pci-pt |
live-migration ovs sriov-pci-pt |
|
2021-10-06 19:44:07 |
Erlon R. Cruz |
summary |
Instances with SRIOV ports loose access after failed live migrations |
Instances with hardware offloaded ovs ports loose access after failed live migrations |
|
2021-10-20 19:22:37 |
Erlon R. Cruz |
description |
If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook, the instance will lose access to the network and leave behind duplicated port bindings in the database.
The instance regains connectivity on the source host after a reboot (we don't know whether there's another way to restore it). As a side effect of this behavior, the pre-live-migration cleanup hook also fails with:
PCI device 0000:3b:10.0 is in use by driver QEMU
[How to reproduce]
- Create an environment with SRIOV (our case uses switchdev[1])
- Create 1 VM
- Provoke a failure in the _pre_live_migration process (for example, by pre-creating the directory /var/lib/nova/instances/<instance id>)
- Check the VM's connectivity
- Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
Full stack trace[2]
[Expected]
VM connectivity is restored even if it gets a brief disconnection
[Observed]
VM loses connectivity, which is only restored after the VM status is set to ERROR and the VM is power-cycled
[Environment]
Focal Ussuri with Mellanox ConnectX-5 cards
[1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
[2] https://paste.ubuntu.com/p/ThQmDYtdSS/ |
If for some reason a live migration fails for an instance with an SRIOV port during the '_pre_live_migration' hook, the instance will lose access to the network and leave behind duplicated port bindings in the database.
The instance regains connectivity on the source host after a reboot (we don't know whether there's another way to restore it). As a side effect of this behavior, the pre-live-migration cleanup hook also fails with:
PCI device 0000:3b:10.0 is in use by driver QEMU
[How to reproduce]
- Create an environment with SRIOV (our case uses switchdev[1])
- Create 1 VM
- Provoke a failure in the _pre_live_migration process (for example, by pre-creating the directory /var/lib/nova/instances/<instance id>)
- Check the VM's connectivity
- Check the logs for: libvirt.libvirtError: Requested operation is not valid: PCI device 0000:03:04.1 is in use by driver QEMU, domain instance-00000001
Full stack trace[2]
[Expected]
VM connectivity is restored even if it gets a brief disconnection
As in non-SRIOV scenarios, no leftovers remain after a failure (port bindings and instance path files)
[Observed]
VM loses connectivity, which is only restored after the VM status is set to ERROR and the VM is power-cycled
Port bindings are not removed
[Environment]
Focal Ussuri with Mellanox ConnectX-5 cards
[1] https://paste.ubuntu.com/p/PzBM7y6Dbr/
[2] https://paste.ubuntu.com/p/ThQmDYtdSS/ |
|
2021-10-25 15:18:49 |
OpenStack Infra |
nova: status |
Incomplete |
In Progress |
|
2021-10-25 17:55:07 |
Erlon R. Cruz |
nova: assignee |
|
Erlon R. Cruz (sombrafam) |
|
2021-11-02 11:39:16 |
James Troup |
summary |
Instances with hardware offloaded ovs ports loose access after failed live migrations |
Instances with hardware offloaded ovs ports lose access after failed live migrations |
|
2022-03-08 13:56:47 |
sean mooney |
tags |
live-migration ovs sriov-pci-pt |
live-migration ovs sriov-pci-pt yoga-rc-potential |
|
2022-03-09 17:04:40 |
Sylvain Bauza |
nova: importance |
Undecided |
Medium |
|
2022-03-21 10:04:22 |
Sylvain Bauza |
tags |
live-migration ovs sriov-pci-pt yoga-rc-potential |
live-migration ovs sriov-pci-pt |
|
2022-03-30 00:33:43 |
OpenStack Infra |
nova: status |
In Progress |
Fix Released |
|
2022-05-17 22:08:36 |
melanie witt |
nominated for series |
|
nova/yoga |
|
2022-05-17 22:08:36 |
melanie witt |
bug task added |
|
nova/yoga |
|
2022-05-17 22:08:36 |
melanie witt |
nominated for series |
|
nova/victoria |
|
2022-05-17 22:08:36 |
melanie witt |
bug task added |
|
nova/victoria |
|
2022-05-17 22:08:36 |
melanie witt |
nominated for series |
|
nova/xena |
|
2022-05-17 22:08:36 |
melanie witt |
bug task added |
|
nova/xena |
|
2022-05-17 22:08:36 |
melanie witt |
nominated for series |
|
nova/ussuri |
|
2022-05-17 22:08:36 |
melanie witt |
bug task added |
|
nova/ussuri |
|
2022-05-17 22:08:36 |
melanie witt |
nominated for series |
|
nova/wallaby |
|
2022-05-17 22:08:36 |
melanie witt |
bug task added |
|
nova/wallaby |
|
2022-05-18 00:01:48 |
OpenStack Infra |
tags |
live-migration ovs sriov-pci-pt |
in-stable-yoga live-migration ovs sriov-pci-pt |
|
2022-05-18 00:01:59 |
OpenStack Infra |
nova/yoga: status |
New |
Fix Committed |
|
2022-05-18 16:59:58 |
OpenStack Infra |
nova/xena: status |
New |
In Progress |
|
2022-05-25 15:25:20 |
OpenStack Infra |
tags |
in-stable-yoga live-migration ovs sriov-pci-pt |
in-stable-xena in-stable-yoga live-migration ovs sriov-pci-pt |
|
2022-05-25 16:14:38 |
OpenStack Infra |
nova/xena: status |
In Progress |
Fix Committed |
|
2022-06-23 10:20:50 |
OpenStack Infra |
nova/yoga: status |
Fix Committed |
Fix Released |
|
2022-06-23 10:54:10 |
OpenStack Infra |
nova/xena: status |
Fix Committed |
Fix Released |
|
2022-10-21 19:14:15 |
OpenStack Infra |
nova/wallaby: status |
New |
In Progress |
|
2024-03-05 20:00:01 |
Shamnad N |
bug |
|
|
added subscriber Shamnad N |
2024-03-05 20:00:08 |
Shamnad N |
removed subscriber Shamnad N |
|
|
|