No nova hypervisor can be enabled on workers with QAT devices
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | High | sean mooney |
StarlingX | Fix Released | High | Jim Gauld |
Bug Description
Brief Description
-----------------
A host cannot be enabled as a nova hypervisor if it has QAT devices (C62x or DH895XCC) configured, because a PCI device cannot be found.
Severity
--------
Major
Steps to Reproduce
------------------
- Install and configure a system where worker nodes have QAT devices configured, e.g.:
[wrsroot@
+------
| name | address | class id | vendor id | device id | class name | vendor name | device name | numa_node | enabled |
+------
| pci_0000_09_00_0 | 0000:09:00.0 | 0b4000 | 8086 | 0435 | Co-processor | Intel Corporation | DH895XCC Series QAT | 0 | True |
| pci_0000_0c_00_0 | 0000:0c:00.0 | 030000 | 102b | 0522 | VGA compatible controller | Matrox Electronics Systems Ltd. | MGA G200e [Pilot] ServerEngines (SEP1) | 0 | True |
+------
compute-0:~$ lspci | grep QAT
09:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
09:01.0 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
09:01.1 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
...
- Check the output of nova hypervisor-list
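To confirm which QAT functions a worker exposes before checking the hypervisor list, the lspci output shown above can be parsed. The following is a minimal Python sketch, assuming lspci prints lines in the short format shown in this report (no PCI domain prefix). The name parse_qat_devices is hypothetical and is not part of nova or StarlingX.

```python
import re

# Matches short-format lspci lines such as:
# "09:01.0 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function"
LSPCI_LINE = re.compile(
    r"^(?P<addr>[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s+"
    r"(?P<cls>[^:]+):\s+(?P<desc>.+)$"
)


def parse_qat_devices(lspci_output):
    """Split QAT entries from lspci text into physical and virtual functions.

    Hypothetical helper for illustration; returns two lists of PCI addresses.
    """
    physical, virtual = [], []
    for line in lspci_output.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if not match or "QAT" not in match.group("desc"):
            continue  # skip elided lines ("...") and non-QAT devices
        if match.group("desc").endswith("Virtual Function"):
            virtual.append(match.group("addr"))
        else:
            physical.append(match.group("addr"))
    return physical, virtual


# Sample taken from the lspci output quoted in this report.
SAMPLE = """\
09:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
09:01.0 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
09:01.1 Co-processor: Intel Corporation DH895XCC Series QAT Virtual Function
"""

if __name__ == "__main__":
    pf, vf = parse_qat_devices(SAMPLE)
    print("physical functions:", pf)
    print("virtual functions:", vf)
```

Separating physical from virtual functions makes it easier to see at a glance whether SR-IOV virtual functions are present on the affected host.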
Expected Behavior
------------------
- Nova hypervisors exist on system
Actual Behavior
----------------
[wrsroot@
+----+-
| ID | Hypervisor hostname | State | Status |
+----+-
+----+-
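The empty listing above can also be detected automatically, for example in a retest script. This is a minimal Python sketch, assuming the CLI prints a table where every row starts with "|" and the first such row is the header. The name count_table_rows and the populated sample row are hypothetical.

```python
def count_table_rows(cli_output):
    """Count data rows in a pipe-delimited CLI table, excluding the header row.

    Hypothetical helper for illustration; border lines starting with "+" are ignored.
    """
    rows = [ln for ln in cli_output.splitlines() if ln.strip().startswith("|")]
    return max(len(rows) - 1, 0)


# Listing as quoted in this report: a header and no hypervisors.
EMPTY_LISTING = """\
+----+-
| ID | Hypervisor hostname | State | Status |
+----+-
+----+-
"""

# Hypothetical listing with one registered hypervisor, for comparison only.
POPULATED_LISTING = """\
+----+-
| ID | Hypervisor hostname | State | Status |
+----+-
| 1 | compute-0 | up | enabled |
+----+-
"""

if __name__ == "__main__":
    print("hypervisors in failing case:", count_table_rows(EMPTY_LISTING))
    print("hypervisors in hypothetical passing case:", count_table_rows(POPULATED_LISTING))
```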
Reproducibility
---------------
Reproducible
System Configuration
-------
Any system type with QAT devices configured on a worker node
Branch/Pull Time/Commit
-------
stx master as of 2019-03-18
Last Pass
--------------
On the f/stein branch in early February
Timestamp/Logs
--------------
# nova-compute pods are spewing errors so they can't register themselves properly as hypervisors:
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
2019-03-25 18:46:49,899.899 62394 ERROR nova.compute.
Changed in nova: | |
importance: | Undecided → High |
assignee: | nobody → sean mooney (sean-k-mooney) |
status: | New → In Progress |
tags: | added: stein-rc-potential |
tags: | removed: stx.helpwanted |
Changed in starlingx: | |
assignee: | nobody → Chris Friesen (cbf123) |
assignee: | Chris Friesen (cbf123) → Jim Gauld (jgauld) |
status: | Triaged → In Progress |
tags: | added: stx.2.0 removed: stx.2019.05 |
tags: | added: stx.retestneeded |
Marking as release gating; high priority, given that this appears to impact the use of systems with QAT devices.