SRIOV port binding_profile attributes for OVS hardware offload are stripped on instance deletion or port detachment
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Compute (nova) | Fix Released | Undecided | Unassigned | | |
Bug Description
Description
===========
This issue applies to systems using SRIOV with Mellanox ASAP2 SDN offloads.
An SRIOV port capable of ASAP2 SDN acceleration (OVS hardware offloads) has 'capabilities=
After a VM has been created with an SRIOV port attached, the port can no longer be used for subsequent VM builds. Attempting to reuse the port results in an error of the form "Cannot set interface MAC/vlanid to <mac>/<vlan> for ifname ens1f0 vf 7: Operation not supported".
The underlying issue appears to be that when an SRIOV port is detached from a VM, or the VM is destroyed, the capabilities=
If the port binding_profile property is restored then the port can be successfully reused.
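As a minimal sketch of this workaround, the helper below builds an `openstack port set` command line that writes the capability back into the port's binding profile. The function name is hypothetical, and it assumes that `port set` accepts a JSON value for `--binding-profile` in the same way as the `port create` call in the reproduction steps; verify this against the installed client before relying on it.

```python
import json
import shlex


def restore_profile_command(port, capabilities=("switchdev",)):
    """Return an argv list that re-adds the capabilities to the port's binding profile.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    profile = json.dumps({"capabilities": list(capabilities)})
    return ["openstack", "port", "set", "--binding-profile", profile, port]


if __name__ == "__main__":
    # Print a shell-quoted command that can be reviewed before it is executed.
    print(shlex.join(restore_profile_command("sriov-port-1")))
```

Building the argument list separately from executing it keeps the JSON quoting in one place and lets the command be inspected first.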
The property is preserved during live migration, instance resizes and rebuilds. The binding_profile property appears to be removed only on instance deletion or port detachment.
Steps to reproduce
==================
1. Create SRIOV port with ASAP2 capability:
openstack port create --project <project> --network <network> --vnic-type=direct --binding-profile '{"capabilities": ["switchdev"]}' sriov-port-1
2. Check the port binding_profile property:
openstack port show -c binding_profile sriov-port-1
3. Create an instance using the port:
openstack server create --flavor <flavor> --image <image> --key-name <key> --nic port-id=
4. Delete the instance:
openstack server delete sriov-vm-1
5. Check the port binding_profile property:
openstack port show -c binding_profile sriov-port-1
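The comparison between steps 2 and 5 can be sketched as follows. This assumes the two `port show` commands are run with `-f json` added and that the output is a JSON object with a `binding_profile` key holding an object; both helper names and the sample strings are illustrative, not captured output.

```python
import json


def profile_from_show(output):
    """Extract the binding profile from JSON-formatted `port show` output.

    Assumes the output is an object with a "binding_profile" key.
    """
    return json.loads(output).get("binding_profile") or {}


def lost_profile_keys(before, after):
    """Return the keys present in the first profile but missing from the second."""
    return sorted(set(before) - set(after))


if __name__ == "__main__":
    # Illustrative samples standing in for the output of steps 2 and 5.
    step2 = '{"binding_profile": {"capabilities": ["switchdev"]}}'
    step5 = '{"binding_profile": {}}'
    print(lost_profile_keys(profile_from_show(step2), profile_from_show(step5)))
```

A non-empty result after step 5 indicates that keys set at port creation were stripped when the instance was deleted.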
Expected Result
===============
Nova sets properties in the binding_profile while the instance is in use. Alongside those properties the capabilities=
Actual Result
=============
After the instance is deleted (or port detached), the binding_profile is empty.
Environment
===========
This has been observed with the following configuration:
- OpenStack Yoga
- OVN Neutron driver
Logs
====
From Nova Compute:
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.270 7 ERROR nova.virt.
2023-01-24 19:55:32.273 7 ERROR nova.virt.
I agree that nova should not manipulate keys in the binding_profile that were not added by nova in the first place.
I looked through the yoga neutron code path and I don't see where the binding_profile is manipulated improperly. What I see is that nova has a list of keys to manipulate, and capabilities is not one of them.
I also tried on master with a normal port: I added capabilities to the binding_profile, booted with the port and then deleted the VM. The capabilities I added remained in the port.
So I need help. Could you reproduce the lost capability with a normal port?
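The behaviour described in this comment, where only a fixed list of keys is removed on unbind, can be illustrated with a short sketch. The key list and the sample values here are assumptions chosen for illustration and are not copied from nova's code.

```python
# Hypothetical list of keys that the compute service adds while the port is bound.
MANAGED_KEYS = ("pci_vendor_info", "pci_slot", "physical_network")


def profile_after_unbind(profile, managed_keys=MANAGED_KEYS):
    """Return a copy of the binding profile with only the managed keys removed."""
    return {key: value for key, value in profile.items() if key not in managed_keys}


if __name__ == "__main__":
    # Illustrative profile of a bound port; the values are made up.
    bound = {
        "capabilities": ["switchdev"],
        "pci_vendor_info": "15b3:1018",
        "pci_slot": "0000:03:00.7",
        "physical_network": None,
    }
    print(profile_after_unbind(bound))
```

With this approach a user-supplied key such as capabilities survives unbinding, whereas resetting the profile to an empty dictionary would produce the result reported under Actual Result.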