Comment 0 for bug 1821938

Yang Liu (yliu12) wrote :

Brief Description
-----------------
A host cannot be enabled as a nova hypervisor if it has QAT devices (C62x or DH895XCC) configured, because nova-compute fails to find one of the host's PCI devices.

Severity
--------
Major

Steps to Reproduce
------------------
- Install and configure a system whose worker nodes have QAT devices configured, e.g.:
[wrsroot@controller-0 ~(keystone_admin)]$ system host-device-list compute-0
+------------------+--------------+----------+-----------+-----------+---------------------------+---------------------------------+----------------------------------------+-----------+---------+
| name             | address      | class id | vendor id | device id | class name                | vendor name                     | device name                            | numa_node | enabled |
+------------------+--------------+----------+-----------+-----------+---------------------------+---------------------------------+----------------------------------------+-----------+---------+
| pci_0000_09_00_0 | 0000:09:00.0 | 0b4000   | 8086      | 0435      | Co-processor              | Intel Corporation               | DH895XCC Series QAT                    | 0         | True    |
| pci_0000_0c_00_0 | 0000:0c:00.0 | 030000   | 102b      | 0522      | VGA compatible controller | Matrox Electronics Systems Ltd. | MGA G200e [Pilot] ServerEngines (SEP1) | 0         | True    |
+------------------+--------------+----------+-----------+-----------+---------------------------+---------------------------------+----------------------------------------+-----------+---------+

compute-0:~$ lspci | grep QAT
09:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
09:01.0 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
09:01.1 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
...

- Check the output of nova hypervisor-list.
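
To confirm that a worker node matches the reproduction conditions, the lspci output can be split into QAT physical and virtual functions. The following is a minimal sketch, not part of the original report: the function name list_qat_functions is hypothetical, and it assumes the default lspci line format shown above (optionally with a leading PCI domain, as printed by lspci -D).

```python
import re

# Matches lspci lines such as:
#   09:01.0 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
# The leading 4-digit PCI domain is optional (present with "lspci -D").
LSPCI_LINE = re.compile(
    r"^(?P<addr>(?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])"
    r"\s+[^:]+:\s+(?P<desc>.+)$"
)


def list_qat_functions(lspci_output):
    """Return (address, is_virtual_function) for every QAT line in lspci output.

    Hypothetical helper: a line counts as QAT if its description contains
    "QAT", and as a virtual function if the description ends with
    "Virtual Function". Lines that do not parse are skipped.
    """
    found = []
    for line in lspci_output.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if match and "QAT" in match.group("desc"):
            is_vf = match.group("desc").endswith("Virtual Function")
            found.append((match.group("addr"), is_vf))
    return found
```

On compute-0 this could be fed with the text produced by lspci, for example via subprocess.check_output(["lspci"]) decoded to a string.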

Expected Behavior
-----------------
- nova hypervisor-list shows the worker nodes as hypervisors.

Actual Behavior
---------------
- The hypervisor list is empty:
[wrsroot@controller-0 ~(keystone_admin)]$ nova hypervisor-list
+----+---------------------+-------+--------+
| ID | Hypervisor hostname | State | Status |
+----+---------------------+-------+--------+
+----+---------------------+-------+--------+

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Any system type with QAT devices configured on a worker node

Branch/Pull Time/Commit
-----------------------
master as of 2019-03-18

Last Pass
---------
On the f/stein branch in early February

Timestamp/Logs
--------------
# The nova-compute pods log the following error repeatedly, so they cannot register themselves as hypervisors:
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager [req-4f652d4c-da7e-4516-9baa-915265c3fdda - - - - -] Error updating resources for node compute-0.: PciDeviceNotFoundById: PCI device 0000:09:02.3 not found
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager Traceback (most recent call last):
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager File "/var/lib/openstack/lib/python2.7/site-packages/nova/compute/manager.py", line 7956, in _update_available_resource_for_node
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager startup=startup)
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager File "/var/lib/openstack/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 727, in update_available_resource
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager resources = self.driver.get_available_resource(nodename)
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager File "/var/lib/openstack/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7098, in get_available_resource
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager self._get_pci_passthrough_devices()
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager File "/var/lib/openstack/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6102, in _get_pci_passthrough_devices
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager pci_info.append(self._get_pcidev_info(name))
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager File "/var/lib/openstack/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6062, in _get_pcidev_info
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager device.update(_get_device_type(cfgdev, address))
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager File "/var/lib/openstack/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6021, in _get_device_type
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager pci_address, pf_interface=True),
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager File "/var/lib/openstack/lib/python2.7/site-packages/nova/pci/utils.py", line 159, in get_ifname_by_pci_address
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager raise exception.PciDeviceNotFoundById(id=pci_addr)
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager PciDeviceNotFoundById: PCI device 0000:09:02.3 not found
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.manager
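
The traceback ends in get_ifname_by_pci_address(..., pf_interface=True), which suggests that nova is looking for a network interface on the parent physical function of the device. A QAT co-processor is not a NIC, so such a lookup would be expected to fail. The sketch below is a triage aid under stated assumptions, not the nova implementation: it pulls the failing addresses out of a nova-compute log and checks whether sysfs exposes a net directory under the device's physfn link. The helper names and the assumption that the lookup goes through /sys/bus/pci/devices/<addr>/physfn/net are not confirmed by this report.

```python
import os
import re

# Error text as it appears in the nova-compute log above.
NOT_FOUND = re.compile(r"PciDeviceNotFoundById: PCI device (\S+) not found")


def failing_pci_addresses(log_text):
    """Return the sorted, de-duplicated PCI addresses named in the errors."""
    return sorted(set(NOT_FOUND.findall(log_text)))


def has_pf_netdev(pci_addr, sysfs_root="/sys"):
    """Return True if the device's parent PF exposes a network interface.

    Assumption: a virtual function has a "physfn" link in sysfs, and a
    network-capable PF has a non-empty "net" directory below it. sysfs_root
    is a parameter only so the check can be exercised against a fake tree.
    """
    net_dir = os.path.join(
        sysfs_root, "bus", "pci", "devices", pci_addr, "physfn", "net"
    )
    return os.path.isdir(net_dir) and bool(os.listdir(net_dir))
```

Run on the affected worker node, has_pf_netdev("0000:09:02.3") returning False would be consistent with the address reported in the log, although it does not by itself prove where the fault lies.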