When creating a VM instance (say, A) with one QAT VF, creation fails with the error: “Requested operation is not valid: PCI device 0000:88:04.7 is in use by driver QEMU, domain instance-00000081”. Note that PCI device 0000:88:04.7 is already assigned to another VM (say, B). We are running the OpenStack Mitaka release on CentOS 7. The host has two Intel QAT (DH895xCC) devices, each providing 32 VFs. Of the 64 VFs, only 8 are allocated to VM instances; the rest should be available.
Yet the Nova scheduler tries to assign an already-in-use SR-IOV VF to the new instance, and the instance fails to start. It appears that the Nova database is not tracking which VFs have already been taken. If I shut down VM B, then VM A boots successfully, and vice versa; because of this issue, the two instances cannot run simultaneously.
We should always be able to create as many instances with the requested PCI devices as there are available VFs.
Can anyone suggest why Nova tries to assign a PCI device that is already assigned? Is there a way to resolve this issue? Please let me know if additional information is needed. Thank you in advance for your support and help.
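As a cross-check against what Nova believes, one can ask libvirt directly which domains claim the VF. This is only a diagnostic sketch: libvirt writes the hostdev source address in split bus/slot/function form, so the grep pattern below assumes that format for device 0000:88:04.7.

```shell
# Sketch: list libvirt domain XMLs that claim PCI VF 0000:88:04.7.
# libvirt stores the hostdev address split into bus/slot/function,
# so we grep for that form rather than the flat BDF string.
grep -l "bus='0x88' slot='0x04' function='0x7'" /etc/libvirt/qemu/*.xml
```

If exactly one domain XML matches, libvirt's view is consistent and the double allocation exists only on the Nova side.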
[root@localhost ~(keystone_admin)]# lspci -d:435
83:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
88:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
[root@localhost ~(keystone_admin)]#
[root@localhost ~(keystone_admin)]# lspci -d:443 | grep "QAT Virtual Function" | wc -l
64
[root@localhost ~(keystone_admin)]#
[root@localhost ~(keystone_admin)]# mysql -u root nova -e "SELECT hypervisor_hostname, address, instance_uuid, status FROM pci_devices JOIN compute_nodes ON compute_nodes.id = compute_node_id" | grep 0000:88:04.7
localhost         0000:88:04.7  e10a76f3-e58e-4071-a4dd-7a545e8000de  allocated
localhost         0000:88:04.7  c3dbac90-198d-4150-ba0f-a80b912d8021  allocated
localhost         0000:88:04.7  c7f6adad-83f0-4881-b68f-6d154d565ce3  allocated
nfv.benunets.com  0000:88:04.7  0c3c11a5-f9a4-4f0d-b120-40e4dde843d4  allocated
[root@localhost ~(keystone_admin)]#
[root@localhost ~(keystone_admin)]# grep -r e10a76f3-e58e-4071-a4dd-7a545e8000de /etc/libvirt/qemu
/etc/libvirt/qemu/instance-00000081.xml:  <uuid>e10a76f3-e58e-4071-a4dd-7a545e8000de</uuid>
/etc/libvirt/qemu/instance-00000081.xml:      <entry name='uuid'>e10a76f3-e58e-4071-a4dd-7a545e8000de</entry>
/etc/libvirt/qemu/instance-00000081.xml:      <source file='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/disk'/>
/etc/libvirt/qemu/instance-00000081.xml:      <source path='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/console.log'/>
/etc/libvirt/qemu/instance-00000081.xml:      <source path='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/console.log'/>
[root@localhost ~(keystone_admin)]#
[root@localhost ~(keystone_admin)]# grep -r 0c3c11a5-f9a4-4f0d-b120-40e4dde843d4 /etc/libvirt/qemu
/etc/libvirt/qemu/instance-000000ab.xml:  <uuid>0c3c11a5-f9a4-4f0d-b120-40e4dde843d4</uuid>
/etc/libvirt/qemu/instance-000000ab.xml:      <entry name='uuid'>0c3c11a5-f9a4-4f0d-b120-40e4dde843d4</entry>
/etc/libvirt/qemu/instance-000000ab.xml:      <source file='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/disk'/>
/etc/libvirt/qemu/instance-000000ab.xml:      <source path='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/console.log'/>
/etc/libvirt/qemu/instance-000000ab.xml:      <source path='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/console.log'/>
[root@localhost ~(keystone_admin)]#
On the controller, there appear to be duplicate PCI device entries in the database:
MariaDB [nova]> select hypervisor_hostname, address, count(*) from pci_devices JOIN compute_nodes on compute_nodes.id = compute_node_id group by hypervisor_hostname, address having count(*) > 1;
+---------------------+--------------+----------+
| hypervisor_hostname | address      | count(*) |
+---------------------+--------------+----------+
| localhost           | 0000:05:00.0 |        3 |
| localhost           | 0000:05:00.1 |        3 |
| localhost           | 0000:83:01.0 |        3 |
| localhost           | 0000:83:01.1 |        3 |
| localhost           | 0000:83:01.2 |        3 |
| localhost           | 0000:83:01.3 |        3 |
| localhost           | 0000:83:01.4 |        3 |
| localhost           | 0000:83:01.5 |        3 |
| localhost           | 0000:83:01.6 |        3 |
| localhost           | 0000:83:01.7 |        3 |
| localhost           | 0000:83:02.0 |        3 |
| localhost           | 0000:83:02.1 |        3 |
| localhost           | 0000:83:02.2 |        3 |
| localhost           | 0000:83:02.3 |        3 |
| localhost           | 0000:83:02.4 |        3 |
| localhost           | 0000:83:02.5 |        3 |
| localhost           | 0000:83:02.6 |        3 |
| localhost           | 0000:83:02.7 |        3 |
| localhost           | 0000:83:03.0 |        3 |
| localhost           | 0000:83:03.1 |        3 |
| localhost           | 0000:83:03.2 |        3 |
| localhost           | 0000:83:03.3 |        3 |
| localhost           | 0000:83:03.4 |        3 |
| localhost           | 0000:83:03.5 |        3 |
| localhost           | 0000:83:03.6 |        3 |
| localhost           | 0000:83:03.7 |        3 |
| localhost           | 0000:83:04.0 |        3 |
| localhost           | 0000:83:04.1 |        3 |
| localhost           | 0000:83:04.2 |        3 |
| localhost           | 0000:83:04.3 |        3 |
| localhost           | 0000:83:04.4 |        3 |
| localhost           | 0000:83:04.5 |        3 |
| localhost           | 0000:83:04.6 |        3 |
| localhost           | 0000:83:04.7 |        3 |
| localhost           | 0000:88:01.0 |        3 |
| localhost           | 0000:88:01.1 |        3 |
| localhost           | 0000:88:01.2 |        3 |
| localhost           | 0000:88:01.3 |        3 |
| localhost           | 0000:88:01.4 |        3 |
| localhost           | 0000:88:01.5 |        3 |
| localhost           | 0000:88:01.6 |        3 |
| localhost           | 0000:88:01.7 |        3 |
| localhost           | 0000:88:02.0 |        3 |
| localhost           | 0000:88:02.1 |        3 |
| localhost           | 0000:88:02.2 |        3 |
| localhost           | 0000:88:02.3 |        3 |
| localhost           | 0000:88:02.4 |        3 |
| localhost           | 0000:88:02.5 |        3 |
| localhost           | 0000:88:02.6 |        3 |
| localhost           | 0000:88:02.7 |        3 |
| localhost           | 0000:88:03.0 |        3 |
| localhost           | 0000:88:03.1 |        3 |
| localhost           | 0000:88:03.2 |        3 |
| localhost           | 0000:88:03.3 |        3 |
| localhost           | 0000:88:03.4 |        3 |
| localhost           | 0000:88:03.5 |        3 |
| localhost           | 0000:88:03.6 |        3 |
| localhost           | 0000:88:03.7 |        3 |
| localhost           | 0000:88:04.0 |        3 |
| localhost           | 0000:88:04.1 |        3 |
| localhost           | 0000:88:04.2 |        3 |
| localhost           | 0000:88:04.3 |        3 |
| localhost           | 0000:88:04.4 |        3 |
| localhost           | 0000:88:04.5 |        3 |
| localhost           | 0000:88:04.6 |        3 |
| localhost           | 0000:88:04.7 |        3 |
+---------------------+--------------+----------+
66 rows in set (0.00 sec)
MariaDB [nova]>
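Since every address appears exactly three times, one possibility (an assumption on my part, not confirmed by the output above) is that pci_devices holds rows for more than one compute_nodes record for the same host, e.g. stale records left behind by a hostname change or host re-registration. A sketch of diagnostic queries, using the same mysql client as above:

```shell
# Sketch: check whether the duplicate PCI rows map to distinct
# compute_nodes records. Diagnostic only; run against a backed-up
# nova database before changing anything.

# List all compute-node records, including soft-deleted ones
# (Nova soft-deletes rows via the `deleted` column).
mysql -u root nova -e \
  "SELECT id, hypervisor_hostname, deleted FROM compute_nodes"

# See which compute_node_id values hold the contested VF.
mysql -u root nova -e \
  "SELECT compute_node_id, instance_uuid, status, deleted
   FROM pci_devices WHERE address = '0000:88:04.7'"
```

If the three copies of each device belong to three different compute_node_id values, only one of which corresponds to the live compute_nodes record, that would explain why the scheduler's view of free VFs disagrees with reality.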