Libvirt error when using --max > 1 with vGPU
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | High | Sylvain Bauza |
Bug Description
Description
===========
Using devstack Rocky with an NVIDIA Tesla M10 + GRID driver on RHEL 7.5.
Profile used in nova: nvidia-35 (num_heads=2, frl_config=45, framebuffer=512M, max_resolution=
I can launch instances one by one without any issue.
I cannot use the --max parameter with a value greater than 1.
Expected result
===============
Be able to use the --max parameter with vGPU.
Steps to reproduce
==================
[root@host2 ~]# openstack server list
+------
| ID | Name | Status | Networks | Image | Flavor |
+------
| 56aeda96-
+------
[root@host2 ~]# openstack server create --flavor vgpu --image rhel75 --key-name myself --max 2 instance
+------
| Field | Value |
+------
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-STS:vm_state | building |
| OS-SRV-
| OS-SRV-
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| adminPass | iNiFmD6kNszw |
| config_drive | |
| created | 2018-07-
| flavor | vgpu (vgpu1) |
| hostId | |
| id | 5a8691a8-
| image | rhel75 (e63a49a8-
| key_name | myself |
| name | instance-1 |
| progress | 0 |
| project_id | fdea2c781db74ae
| properties | |
| security_groups | name='default' |
| status | BUILD |
| updated | 2018-07-
| user_id | 130a646fc362418
| volumes_attached | |
+------
[root@host2 ~]# openstack server list
+------
| ID | Name | Status | Networks | Image | Flavor |
+------
| 515f0d21-
| 5a8691a8-
| 56aeda96-
+------
[root@host2 ~]# openstack server create --flavor vgpu --image rhel75 --key-name myself --max 1 instance
+------
| Field | Value |
+------
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-STS:vm_state | building |
| OS-SRV-
| OS-SRV-
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| adminPass | MGxmntECb22S |
| config_drive | |
| created | 2018-07-
| flavor | vgpu (vgpu1) |
| hostId | |
| id | 24df940f-
| image | rhel75 (e63a49a8-
| key_name | myself |
| name | instance |
| progress | 0 |
| project_id | fdea2c781db74ae
| properties | |
| security_groups | name='default' |
| status | BUILD |
| updated | 2018-07-
| user_id | 130a646fc362418
| volumes_attached | |
+------
[root@host2 ~]# openstack server list
+------
| ID | Name | Status | Networks | Image | Flavor |
+------
| 24df940f-
| 515f0d21-
| 5a8691a8-
| 56aeda96-
+------
[root@host2 ~]# openstack server list
+------
| ID | Name | Status | Networks | Image | Flavor |
+------
| 24df940f-
| 515f0d21-
| 5a8691a8-
| 56aeda96-
+------
[root@host2 ~]# openstack server create --flavor vgpu --image rhel75 --key-name myself --max 1 instance
+------
| Field | Value |
+------
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-STS:vm_state | building |
| OS-SRV-
| OS-SRV-
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| adminPass | 69crZEFxBT9j |
| config_drive | |
| created | 2018-07-
| flavor | vgpu (vgpu1) |
| hostId | |
| id | 4a172549-
| image | rhel75 (e63a49a8-
| key_name | myself |
| name | instance |
| progress | 0 |
| project_id | fdea2c781db74ae
| properties | |
| security_groups | name='default' |
| status | BUILD |
| updated | 2018-07-
| user_id | 130a646fc362418
| volumes_attached | |
+------
[root@host2 ~]# openstack server list
+------
| ID | Name | Status | Networks | Image | Flavor |
+------
| 4a172549-
| 24df940f-
| 515f0d21-
| 5a8691a8-
| 56aeda96-
+------
[root@host2 ~]# openstack server list
+------
| ID | Name | Status | Networks | Image | Flavor |
+------
| 4a172549-
| 24df940f-
| 515f0d21-
| 5a8691a8-
| 56aeda96-
+------
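The steps above can be scripted for repeated testing. The following is a minimal sketch, not part of the original report: the helper names (`build_create_command`, `servers_in_status`, `run`) are hypothetical, the defaults mirror the flavor, image, and key names used above, and it assumes the `openstack` CLI is on the PATH with credentials already loaded so that `openstack server list -f json` returns a list of objects with `Name` and `Status` keys.

```python
import json
import subprocess


def build_create_command(name, flavor="vgpu", image="rhel75",
                         key_name="myself", max_count=2):
    """Build the 'openstack server create' command from the report.

    max_count maps to --max; a value greater than 1 is what triggers the bug.
    """
    return [
        "openstack", "server", "create",
        "--flavor", flavor,
        "--image", image,
        "--key-name", key_name,
        "--max", str(max_count),
        name,
    ]


def servers_in_status(server_list_json, status):
    """Return the names of servers whose Status matches, given the JSON
    text printed by 'openstack server list -f json'."""
    servers = json.loads(server_list_json)
    return [s["Name"] for s in servers if s.get("Status") == status]


def run(cmd):
    """Run a CLI command and return its stdout (raises on non-zero exit)."""
    return subprocess.run(cmd, check=True, capture_output=True,
                          text=True).stdout
```

A possible usage, assuming a working cloud: call `run(build_create_command("instance"))`, wait for the builds to settle, then pass `run(["openstack", "server", "list", "-f", "json"])` to `servers_in_status(..., "ERROR")` to see which of the instances failed.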
- Nova error:
{u'message': u'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance de2a5078-
- Libvirt error:
messages:Jul 5 03:32:51 host2 nova-compute: #033[00m: libvirtError: Requested operation is not valid: mediated device /sys/bus/
messages:Jul 5 03:32:51 host2 nova-compute: #033[01;31mERROR nova.virt.
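The `#033[00m` and `#033[01;31m` fragments in the lines above are terminal color codes from devstack's colored logging, with the escape character written out by syslog as `#033`. A small sketch for stripping them when reading such logs follows; the function name `strip_escaped_colors` is hypothetical and the pattern only covers the color sequences seen here.

```python
import re

# Matches a syslog-escaped ANSI color sequence such as "#033[00m" or
# "#033[01;31m": the literal text "#033", then "[", digits/semicolons, "m".
ESCAPED_COLOR = re.compile(r"#033\[[0-9;]*m")


def strip_escaped_colors(line):
    """Remove syslog-escaped ANSI color sequences from one log line."""
    return ESCAPED_COLOR.sub("", line)
```

For example, applying it to each line of a `grep nova-compute /var/log/messages` capture leaves the timestamp, host, and message text untouched while dropping the color markers.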
description: updated
Changed in nova:
assignee: nobody → Sylvain Bauza (sylvain-bauza)
importance: Undecided → High
Setting this to Confirmed as the Importance has been set and the bug has been Assigned.