If flavor has more than 32 cpus cannot spawn instance if glance image has hw_vif_multiqueue_enabled='true'

Bug #1644839 reported by Saverio Proto
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Steps to reproduce:

Create a glance image with hw_vif_multiqueue_enabled='true'.
Boot an instance with this image, using a flavor with 46 CPUs.

Bug introduced in:
https://review.openstack.org/#/c/128829/

See nova/virt/libvirt/vif.py line 163
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L163

If a flavor has more than 32 CPUs, spawning the instance fails because the virtual NIC cannot have more than 32 hardware queues.

In our case we have a flavor with 46 CPUs, and we get the following error:
Invalid number of queues (= 46), must be a postive integer less than 31.

Regardless of the number of CPUs in the flavor, the number of hardware queues for the virtual NIC should be capped at 32.
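
A minimal sketch of the capping behaviour requested above. The helper name `capped_queue_count` and the constant `MAX_VIF_QUEUES` are hypothetical and are not taken from nova; the value 32 is simply the cap proposed in this report, not a verified hypervisor limit.

```python
# Hypothetical constant: the cap proposed in this report, not a nova setting.
MAX_VIF_QUEUES = 32


def capped_queue_count(vcpus, max_queues=MAX_VIF_QUEUES):
    """Return the number of NIC queues to request for a guest.

    One queue per vCPU, but never more than max_queues and never fewer than 1.
    """
    return max(1, min(vcpus, max_queues))


# A 46-vCPU flavor would request the cap instead of 46 queues.
print(capped_queue_count(46))
print(capped_queue_count(8))
```

With a cap like this, large flavors would still boot, at the cost of having fewer queues than vCPUs.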

Here is the complete stack trace:

2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [req-ccfc9497-1d91-43f1-b7d1-0bfcf5d90bfb 5c3f2df216fa42a987f8f4600e5e0da2 bdf747f88fee4b5a9faca3da7c26754c - - -] [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] Instance failed to spawn
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] Traceback (most recent call last):
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2156, in _build_resources
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] yield resources
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2009, in _build_and_run_instance
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] block_device_info=block_device_info)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2534, in spawn
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] block_device_info=block_device_info)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 4620, in _create_domain_and_network
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] xml, pause=pause, power_on=power_on)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 4550, in _create_domain
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] guest.launch(pause=pause)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 142, in launch
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] self._encoded_xml, errors='ignore')
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 195, in __exit__
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] six.reraise(self.type_, self.value, self.tb)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 137, in launch
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] return self._domain.createWithFlags(flags)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 183, in doit
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 141, in proxy_call
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] rv = execute(f, *args, **kwargs)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 122, in execute
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] six.reraise(c, e, tb)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 80, in tworker
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] rv = meth(*args, **kwargs)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] File "/usr/lib/python2.7/dist-packages/libvirt.py", line 1059, in createWithFlags
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] libvirtError: internal error: process exited while connecting to monitor: 2016-11-25T09:44:50.496310Z qemu-system-x86_64: -device virtio-net-pci,mq=on,vectors=94,netdev=hostnet0,id=net0,mac=fa:16:3e:19:32:9f,bus=pci.0,addr=0x3: Invalid number of queues (= 46), must be a postive integer less than 31.
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c] 2016-11-25T09:44:50.496389Z qemu-system-x86_64: -device virtio-net-pci,mq=on,vectors=94,netdev=hostnet0,id=net0,mac=fa:16:3e:19:32:9f,bus=pci.0,addr=0x3: Device 'virtio-net-pci' could not be initialized
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c]
2016-11-25 10:44:52.265 4528 ERROR nova.compute.manager [instance: 8e8af25a-74ad-4a68-9f39-7a987c2a049c]

Saverio Proto (zioproto)
summary: - If flavor has more than 32 cpus cannot spawn instancee if glance image
+ If flavor has more than 32 cpus cannot spawn instance if glance image
has hw_vif_multiqueue_enabled='true'
Saverio Proto (zioproto) wrote :

I found a related bug:
https://bugs.launchpad.net/nova/+bug/1570631

However my kernel is Linux zhdk0088 4.4.0-47-generic #68~14.04.1-Ubuntu SMP Wed Oct 26 19:42:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

And it looks like the limit is not 256 queues.

Simon Leinen (simon-leinen) wrote :

Shouldn't the subject say "more than 30 cpus"? The error message was "Invalid number of queues (= 46), must be a postive integer less than 31." At any rate, this is substantially the same problem as bug #1570631, which was fixed by https://review.openstack.org/#/c/332660/ (though apparently not backported to Mitaka or Liberty?)

Simon Leinen (simon-leinen) wrote :

Correction to my previous comment: The patch in https://review.openstack.org/#/c/332660/ seems incomplete, since even on the Linux 4.4 kernel, we don't seem to get 256 queues, only 30 ("a positive integer less than 31"). But where does that limit come from?

Saverio Proto (zioproto) wrote :

I would close this bug as a duplicate of bug #1570631.

Saverio Proto (zioproto) wrote :

Nova has to check the QEMU version in addition to the kernel version and set its limit accordingly.

In the version of QEMU I am using (Ubuntu Liberty UCA), I have:

VIRTIO_PCI_QUEUE_MAX == 64

This leads to a maximum of 31 queues: (VIRTIO_PCI_QUEUE_MAX - 1) / 2 = (64 - 1) / 2 = 31 with integer division.

So the limit does not depend on the kernel version alone.
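
The arithmetic above, and the idea of combining it with a kernel-side limit, can be sketched as follows. This is only an illustration: the function names are hypothetical, the value 64 is the one quoted in this comment, and 256 is the kernel-side limit that was expected in this thread for Linux 4.4. None of this is nova's actual code.

```python
# Values quoted in this thread; treat them as assumptions for the sketch.
VIRTIO_PCI_QUEUE_MAX = 64  # reported for the QEMU build in Ubuntu Liberty UCA
KERNEL_QUEUE_LIMIT = 256   # kernel-side limit expected in this thread


def qemu_queue_limit(virtio_pci_queue_max):
    """Apply (VIRTIO_PCI_QUEUE_MAX - 1) / 2 using integer division."""
    return (virtio_pci_queue_max - 1) // 2


def effective_queue_count(vcpus,
                          kernel_limit=KERNEL_QUEUE_LIMIT,
                          virtio_pci_queue_max=VIRTIO_PCI_QUEUE_MAX):
    """Pick the queue count allowed by the vCPU count, kernel and QEMU limits."""
    limit = min(kernel_limit, qemu_queue_limit(virtio_pci_queue_max))
    return max(1, min(vcpus, limit))


print(qemu_queue_limit(VIRTIO_PCI_QUEUE_MAX))  # 31
print(effective_queue_count(46))               # 31
```

Taking the minimum of both limits means whichever component is more restrictive decides the queue count, which is the behaviour this comment asks for.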
