[sriov] Modifying or removing pci_passthrough_whitelist may result in inconsistent VF availability

Bug #1653810 reported by James Denton
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

OpenStack Version: v14 (Newton)
NIC: Mellanox ConnectX-3 Pro

While testing an SR-IOV implementation, we found that the pci_passthrough_whitelist option in nova.conf drives the population of the pci_devices table in the Nova DB. Changing the device/interface in the whitelist, or commenting out the line altogether, and then restarting nova-compute can result in the existing entries being marked as 'deleted' in the database. Reconfiguring pci_passthrough_whitelist with the same device/interface then creates new entries marked as 'available'. This can cause PCI device claim failures if an existing instance is still running and using a VF and another instance is booted using a 'direct' port, because the VF that is still in use is recorded as free.

The following table shows the original entries, including one allocated VF (0000:07:01.0). During testing, we commented out the pci_passthrough_whitelist line in nova.conf and restarted nova-compute. The entries were marked as 'deleted', though the running instance was not deleted and continued to function. We then restored the pci_passthrough_whitelist line and restarted nova-compute again. New entries were created and marked as 'available':

MariaDB [nova]> select * from pci_devices;
+---------------------+---------------------+---------------------+---------+-----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-------------+------------+--------------------------------------+--------------------------------------+-----------+--------------+
| created_at | updated_at | deleted_at | deleted | id | compute_node_id | address | product_id | vendor_id | dev_type | dev_id | label | status | extra_info | instance_uuid | request_id | numa_node | parent_addr |
+---------------------+---------------------+---------------------+---------+-----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-------------+------------+--------------------------------------+--------------------------------------+-----------+--------------+
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:42:26 | 72 | 72 | 6 | 0000:07:00.0 | 1007 | 15b3 | type-PF | pci_0000_07_00_0 | label_15b3_1007 | unavailable | {} | NULL | NULL | 0 | NULL |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:43:23 | 75 | 75 | 6 | 0000:07:00.1 | 1004 | 15b3 | type-VF | pci_0000_07_00_1 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:42:26 | 78 | 78 | 6 | 0000:07:00.2 | 1004 | 15b3 | type-VF | pci_0000_07_00_2 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:44:25 | 81 | 81 | 6 | 0000:07:00.3 | 1004 | 15b3 | type-VF | pci_0000_07_00_3 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:42:26 | 84 | 84 | 6 | 0000:07:00.4 | 1004 | 15b3 | type-VF | pci_0000_07_00_4 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:43:23 | 87 | 87 | 6 | 0000:07:00.5 | 1004 | 15b3 | type-VF | pci_0000_07_00_5 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:42:26 | 90 | 90 | 6 | 0000:07:00.6 | 1004 | 15b3 | type-VF | pci_0000_07_00_6 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:44:51 | 93 | 93 | 6 | 0000:07:00.7 | 1004 | 15b3 | type-VF | pci_0000_07_00_7 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 17:40:25 | 2016-12-29 20:42:26 | 96 | 96 | 6 | 0000:07:01.0 | 1004 | 15b3 | type-VF | pci_0000_07_01_0 | label_15b3_1004 | allocated | {} | 178c733b-fb6a-4c97-b1e5-cdc14aae2e0d | b8d79a88-5918-4a38-b2fb-de97a263c70e | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 231 | 6 | 0000:07:00.0 | 1007 | 15b3 | type-PF | pci_0000_07_00_0 | label_15b3_1007 | available | {} | NULL | NULL | 0 | NULL |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 234 | 6 | 0000:07:00.1 | 1004 | 15b3 | type-VF | pci_0000_07_00_1 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 237 | 6 | 0000:07:00.2 | 1004 | 15b3 | type-VF | pci_0000_07_00_2 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 240 | 6 | 0000:07:00.3 | 1004 | 15b3 | type-VF | pci_0000_07_00_3 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 243 | 6 | 0000:07:00.4 | 1004 | 15b3 | type-VF | pci_0000_07_00_4 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 246 | 6 | 0000:07:00.5 | 1004 | 15b3 | type-VF | pci_0000_07_00_5 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 249 | 6 | 0000:07:00.6 | 1004 | 15b3 | type-VF | pci_0000_07_00_6 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:38 | NULL | NULL | 0 | 252 | 6 | 0000:07:01.0 | 1004 | 15b3 | type-VF | pci_0000_07_01_0 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
+---------------------+---------------------+---------------------+---------+-----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-------------+------------+--------------------------------------+--------------------------------------+-----------+--------------+
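
The conflict visible in the table can be spotted mechanically: an address is at risk when a soft-deleted row is still 'allocated' to an instance while the live row for the same address is 'available'. The following is a minimal Python sketch, assuming the rows have already been fetched as dictionaries keyed by the column names above. The function name find_stale_allocations is hypothetical and not part of nova, and the sketch does not check whether the instance still exists.

```python
def find_stale_allocations(rows):
    """Return (address, instance_uuid) pairs where a soft-deleted row was
    still allocated but the live row for that address is 'available'."""
    # Live rows have deleted == 0; soft-deleted rows carry a non-zero value.
    live_available = {
        (r["compute_node_id"], r["address"])
        for r in rows
        if r["deleted"] == 0 and r["status"] == "available"
    }
    conflicts = []
    for r in rows:
        key = (r["compute_node_id"], r["address"])
        if (r["deleted"] != 0 and r["status"] == "allocated"
                and r["instance_uuid"] and key in live_available):
            conflicts.append((r["address"], r["instance_uuid"]))
    return conflicts


# Two rows abridged from the table above (ids 96 and 252).
rows = [
    {"deleted": 96, "compute_node_id": 6, "address": "0000:07:01.0",
     "status": "allocated",
     "instance_uuid": "178c733b-fb6a-4c97-b1e5-cdc14aae2e0d"},
    {"deleted": 0, "compute_node_id": 6, "address": "0000:07:01.0",
     "status": "available", "instance_uuid": None},
]
print(find_stale_allocations(rows))
```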

A new instance was then booted using a new 'direct' port. The instance was marked in an ERROR state with the following error:

2017-01-03 16:10:10.513 12103 ERROR nova.compute.manager [instance: ad961a72-198f-4e3d-8ce0-c157668a44d6] libvirtError: Requested operation is not valid: PCI device 0000:07:01.0 is in use by driver QEMU, domain instance-0000007e

Instance instance-0000007e corresponds to the instance UUID in the DB, 178c733b-fb6a-4c97-b1e5-cdc14aae2e0d. The interface can be seen here:

root@compute01:# ip link show ens1d1
22: ens1d1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br-vlan portid e41d2d03005b6213 state UP mode DEFAULT group default qlen 1000
    link/ether e4:1d:2d:5b:62:13 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
    vf 1 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
    vf 2 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
    vf 3 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
    vf 4 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
    vf 5 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
    vf 6 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
    vf 7 MAC fa:16:3e:27:bd:90, vlan 50, spoof checking on, link-state enable

No attempt was made to provision a different VF, or to re-populate the entries in pci_devices based on the existing VF allocation on the host. I'm not sure what the expected action was meant to be in this circumstance, if any.
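
As an illustration of what "re-populating from the host" could start from, here is a rough Python sketch that reads `ip link show` output in the format shown above and lists the VFs that carry a non-zero MAC. Treating a non-zero MAC as "in use" is an assumption made for this example, the function name vfs_in_use is hypothetical, and the output format of `ip link` may differ between iproute2 versions. Mapping a VF index back to its PCI address is not covered.

```python
import re

# Matches lines such as "vf 7 MAC fa:16:3e:27:bd:90, vlan 50, ..."
VF_LINE = re.compile(r"^\s*vf (\d+) MAC ([0-9a-fA-F:]{17})")
UNSET_MAC = "00:00:00:00:00:00"


def vfs_in_use(ip_link_output):
    """Return {vf_index: mac} for VFs whose MAC is not all zeros."""
    in_use = {}
    for line in ip_link_output.splitlines():
        match = VF_LINE.match(line)
        if match and match.group(2) != UNSET_MAC:
            in_use[int(match.group(1))] = match.group(2)
    return in_use


# Two lines copied from the `ip link show ens1d1` output above.
sample = """\
    vf 6 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
    vf 7 MAC fa:16:3e:27:bd:90, vlan 50, spoof checking on, link-state enable
"""
print(vfs_in_use(sample))
```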

A similar bug was reported at: https://bugs.launchpad.net/nova/+bug/1633120

Please let me know if you need any additional info.

Tags: pci
Matt Riedemann (mriedem) wrote :

Will need Moshe Levi (moshele in IRC) to take a look at this.

tags: added: pci
Andrey Volkov (avolkov) wrote :

I reproduced the same behavior on master. A possible solution I see is to copy the status, instance_uuid, and request_id fields into the new pci_devices record from the deleted record with the same address.
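
The idea above might look roughly like the following Python sketch. It works on plain dictionaries rather than nova objects, and the function name carry_over_allocation is hypothetical. Restricting the copy to deleted records that were still 'allocated', and picking the most recently deleted one, are assumptions added for this example and are not stated in the proposal.

```python
CARRIED_FIELDS = ("status", "instance_uuid", "request_id")


def carry_over_allocation(new_record, deleted_records):
    """Return a copy of new_record with allocation fields taken from the
    most recently deleted record for the same compute node and address,
    if that record was still allocated. Otherwise return new_record."""
    candidates = [
        r for r in deleted_records
        if r["address"] == new_record["address"]
        and r["compute_node_id"] == new_record["compute_node_id"]
        and r["status"] == "allocated"
    ]
    if not candidates:
        return new_record
    # "YYYY-MM-DD HH:MM:SS" timestamps sort chronologically as strings.
    latest = max(candidates, key=lambda r: r["deleted_at"])
    updated = dict(new_record)
    for field in CARRIED_FIELDS:
        updated[field] = latest[field]
    return updated


# Values taken from rows 96 and 252 of the table in the bug description.
deleted = [{
    "compute_node_id": 6, "address": "0000:07:01.0",
    "deleted_at": "2016-12-29 20:42:26", "status": "allocated",
    "instance_uuid": "178c733b-fb6a-4c97-b1e5-cdc14aae2e0d",
    "request_id": "b8d79a88-5918-4a38-b2fb-de97a263c70e",
}]
new = {
    "compute_node_id": 6, "address": "0000:07:01.0",
    "status": "available", "instance_uuid": None, "request_id": None,
}
restored = carry_over_allocation(new, deleted)
print(restored["status"], restored["instance_uuid"])
```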

Changed in nova:
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/426243

Changed in nova:
assignee: nobody → Andrey Volkov (avolkov)
status: Confirmed → In Progress
Changed in nova:
assignee: Andrey Volkov (avolkov) → Maciej Kucia (maciejkucia)
Changed in nova:
assignee: Maciej Kucia (maciejkucia) → nobody
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Andrey Volkov (<email address hidden>) on branch: master
Review: https://review.openstack.org/426243
Reason: Changed my mind; it's such an edge case that it's really not worth increasing code complexity.

Andrey Volkov (avolkov)
Changed in nova:
status: In Progress → Triaged
status: Triaged → Invalid