libvirt: tx/rx queue length and max queues are not updated on live migration

Bug #1854844 reported by sean mooney
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: Triaged
Importance: Medium
Assigned to: sean mooney

Bug Description

While addressing https://bugs.launchpad.net/nova/+bug/1847367 ("Images with hw:vif_multiqueue_enabled can be limited to 8 queues even if more are supported") in https://review.opendev.org/#/c/695118/,
I noticed that we currently have no way of reporting per-host networking config options such
as rx_queue_size (https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size) and tx_queue_size (https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size).

This means that on live migration the source node's values are used on the destination, where they may be invalid. https://review.opendev.org/#/c/695118/ adds a new [libvirt]/max_queues option that could similarly
differ per host. At this time it is not clear whether libvirt would allow the number of queues or the queue length to be changed as part of a live migration. As such, it is not clear whether the existing behaviour is correct and nova should select a host with a matching value, or whether the value can
be updated.

Cold migration can be used as a workaround today, as can shelving the instance.
Where live migration is used today and it succeeds, a hard reboot will result in the
correct values for rx/tx_queue_size and max_queues being used. As such, I am triaging this as low,
given this is a latent issue that has not been reported for several releases since the introduction
of rx/tx queue size in Rocky: https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html

description: updated
David Comay (comay)
summary: - libvirt: tx/rx queue lenght and max queues are not updtated on live
+ libvirt: tx/rx queue lenght and max queues are not updated on live
migration
summary: - libvirt: tx/rx queue lenght and max queues are not updated on live
+ libvirt: tx/rx queue length and max queues are not updated on live
migration
sean mooney (sean-k-mooney) wrote :

Following the downstream bug report https://bugzilla.redhat.com/show_bug.cgi?id=2135392,
I can confirm that qemu does not allow the queue count to change during live migration.

The path forward for the queue count is to record the queue count currently in use and restore it when we generate the updated XML.
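
As a rough illustration of "record and restore", here is a minimal sketch that copies the queue-related attributes from the running guest's domain XML into the XML generated for the destination. It is not nova code: the helper name `preserve_vif_driver_attrs` is hypothetical, matching interfaces by MAC address is an assumption, and so is dropping an attribute from the new XML when the running guest does not set it.

```python
import xml.etree.ElementTree as ET

# Attributes on <interface>/<driver> that should stay fixed across a live migration.
PRESERVED_ATTRS = ("queues", "rx_queue_size", "tx_queue_size")


def preserve_vif_driver_attrs(source_xml: str, dest_xml: str) -> str:
    """Return dest_xml with queue settings taken from source_xml.

    Hypothetical helper: interfaces are matched by MAC address (an assumption).
    """
    source = ET.fromstring(source_xml)
    dest = ET.fromstring(dest_xml)

    # Record what the running guest uses: MAC address -> {attribute: value}.
    in_use = {}
    for iface in source.findall("./devices/interface"):
        mac = iface.find("mac")
        if mac is None:
            continue
        driver = iface.find("driver")
        attrs = {}
        if driver is not None:
            for name in PRESERVED_ATTRS:
                if driver.get(name) is not None:
                    attrs[name] = driver.get(name)
        in_use[mac.get("address", "").lower()] = attrs

    # Restore those values in the XML generated for the destination host.
    for iface in dest.findall("./devices/interface"):
        mac = iface.find("mac")
        if mac is None:
            continue
        wanted = in_use.get(mac.get("address", "").lower())
        if wanted is None:
            continue  # interface unknown on the source, leave it untouched
        driver = iface.find("driver")
        if driver is None:
            if not wanted:
                continue
            driver = ET.SubElement(iface, "driver")
        for name in PRESERVED_ATTRS:
            if name in wanted:
                driver.set(name, wanted[name])
            else:
                # Assumption: an attribute the guest does not use is dropped.
                driver.attrib.pop(name, None)

    return ET.tostring(dest, encoding="unicode")
```

Other driver attributes (for example `name='vhost'`) are left as generated for the destination; only the three queue-related attributes are carried over.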

The tx/rx queue size should be maintained the same way, as should the MTU. However, long term it would be better to either schedule on these parameters or validate compatibility in pre-live-migration.
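
A compatibility check in pre-live-migration could look roughly like the sketch below. It assumes we can obtain both the values the guest currently uses and the values the destination would generate; the `VifQueueSettings` class and `find_incompatibilities` function are hypothetical names for illustration, and the strict equality check is a simplification rather than what nova does.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VifQueueSettings:
    """Hypothetical container; None means the value is not set explicitly."""
    queues: Optional[int]
    rx_queue_size: Optional[int]
    tx_queue_size: Optional[int]


def find_incompatibilities(in_use: VifQueueSettings,
                           dest: VifQueueSettings) -> List[str]:
    """List the settings that would change if the guest moved to dest."""
    problems = []
    for name in ("queues", "rx_queue_size", "tx_queue_size"):
        current = getattr(in_use, name)
        proposed = getattr(dest, name)
        if current != proposed:
            problems.append(
                f"{name}: guest uses {current!r}, "
                f"destination would use {proposed!r}"
            )
    return problems


# Example: the destination would generate fewer queues than the guest uses.
guest = VifQueueSettings(queues=8, rx_queue_size=512, tx_queue_size=512)
target = VifQueueSettings(queues=4, rx_queue_size=512, tx_queue_size=512)
for problem in find_incompatibilities(guest, target):
    print(problem)
```

An empty list would mean the migration can proceed without changing any of the three settings; a non-empty list could be used either to reject the destination or to fall back to preserving the in-use values.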

sean mooney (sean-k-mooney) wrote :

Note that this can be encountered during an upgrade by live migrating from a non-upgraded host to an upgraded host and then migrating the same instance, without restarting it, to another host.

The first migration will work, but the second will fail because nova will attempt to update the XML.

The workaround is to set the max_queues config option or to hard reboot/cold migrate the instance.

Changed in nova:
importance: Low → Medium