Comment 9 for bug 1668829

Jan Scheurich (jan-scheurich) wrote :

Hi Rafael,

Some answers to your questions:

1. You are probably right that OpenStack Mitaka in principle supports assigning one vhost queue per vCPU of an instance, but since this requires support in VNFs we cannot utilize this in general.

Some VNFs we need to support with our NFV Infrastructure are not able to deal with multiple vhost queues or, if they can, may not distribute traffic evenly over multiple TX queues. That is why vhost multiqueue is not a general solution to the problem we see with short virtio queues.

2. The run-time behavior of VNFs on Qemu 2.5 has degraded compared to Qemu 2.2 and 2.3. The increased burstiness of TX traffic is much more likely to overrun the short 256-slot virtio queues, which leads to more packet drops at lower traffic rates.

But even with Qemu 2.0, certain VNFs drop TX packets at traffic rates well below the maximum vSwitch throughput because the TX queues are too short. We have seen a 30% increase in throughput at the same packet drop level between Qemu 2.5 with 1024 queue slots and Qemu 2.2 with the original 256 queue slots, which indicates that the original queue size is underdimensioned.
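
To illustrate why burstiness matters more than the average rate, here is a toy model in Python. It is only a sketch under simplifying assumptions (periodic bursts, a consumer that drains a fixed number of slots per tick); the function name and all numbers are made up for illustration and are not measurements from our setup.

```python
def simulate_drops(queue_slots, burst_size, burst_period, drain_per_tick, ticks):
    """Count packets dropped by a bounded queue that is fed in periodic bursts.

    Toy model only: the producer stands in for the guest TX path and the
    consumer for the vSwitch. All parameters are illustrative assumptions.
    """
    occupancy = 0
    dropped = 0
    for tick in range(ticks):
        # The producer enqueues a whole burst every burst_period ticks.
        if tick % burst_period == 0:
            accepted = min(burst_size, queue_slots - occupancy)
            dropped += burst_size - accepted
            occupancy += accepted
        # The consumer drains at a constant rate, never below empty.
        occupancy -= min(occupancy, drain_per_tick)
    return dropped


# Same average offered load (8 packets per tick) in all three cases,
# and the consumer can always drain 10 packets per tick.
smooth_256 = simulate_drops(256, burst_size=128, burst_period=16, drain_per_tick=10, ticks=640)
bursty_256 = simulate_drops(256, burst_size=512, burst_period=64, drain_per_tick=10, ticks=640)
bursty_1024 = simulate_drops(1024, burst_size=512, burst_period=64, drain_per_tick=10, ticks=640)
print(smooth_256, bursty_256, bursty_1024)
```

In this model the consumer is never the bottleneck on average, yet the 256-slot queue loses packets as soon as a single burst exceeds its size, while the 1024-slot queue absorbs the same bursts.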

3. We agree that your bisection approach is the only way to find the commit between Qemu 2.3 and 2.5 that is responsible for the increased burstiness. Then we can assess whether this is a bug or an avoidable consequence of some new feature implemented in Qemu 2.4 or 2.5, and decide on the right upstreaming strategy for it. With the 1K queue length option in place, fixing this is clearly no longer as critical.

It is not guaranteed that all of the 12 intermediate commits picked by the bisect procedure will be fully working in an end-to-end NFV context, though. We might need to try out more than 12 commits to hunt down the guilty one. When we have the test channel ready for this procedure, we will find out.
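
As a rough sketch of why unusable intermediate commits increase the number of builds we have to try, the following Python toy model bisects over a linear history and falls back to the nearest usable neighbour when the midpoint cannot be tested. This is not how git bisect chooses its candidates; the function name, the history size of 4096 and the commit indices are all hypothetical.

```python
def bisect_first_bad(n_commits, is_bad, is_testable):
    """Bisect a linear history of n_commits (index 0 good, last index bad).

    Returns (lo, hi, attempts): the first bad commit lies in (lo, hi], and
    attempts counts every commit we tried to test, usable or not.
    Toy model only; real bisect tooling picks candidates differently.
    """
    lo, hi = 0, n_commits - 1
    attempts = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Try the midpoint first, then its neighbours by increasing distance.
        pick = None
        for candidate in sorted(range(lo + 1, hi), key=lambda i: abs(i - mid)):
            attempts += 1
            if is_testable(candidate):
                pick = candidate
                break
        if pick is None:
            break  # nothing usable left in the range, cannot narrow further
        if is_bad(pick):
            hi = pick
        else:
            lo = pick
    return lo, hi, attempts


FIRST_BAD = 3000  # hypothetical index of the guilty commit


def is_bad(commit):
    # Every commit from the guilty one onwards shows the regression.
    return commit >= FIRST_BAD


all_usable = bisect_first_bad(4096, is_bad, lambda c: True)
one_broken = bisect_first_bad(4096, is_bad, lambda c: c != 2047)
print(all_usable, one_broken)
```

With every commit usable, a history of this size needs about log2(4096) = 12 tests. Each commit that does not work end to end costs at least one extra attempt, and if the unusable commits cluster around the guilty one the result is only a range, not a single commit.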

4. For the above reasons we really want to make the virtio queue length configurable in both the RX and TX directions in upstream Qemu, up to the limit of 1024, as proposed earlier by Patrick Hermansson (https://patchwork.ozlabs.org/patch/549544/). This can be done per port or by a global default configuration option.

BR, Jan