Yoga: live migration bandwidth not applied to disks

Bug #2008935 reported by keerthivasan
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: none

Bug Description

Description
===========
I am trying to live migrate a VM with the --block-migration flag to move its local disks between compute nodes. As part of this I set "live_migration_bandwidth = 900" (MiB/s) and I can see the value applied by libvirt, but copying the disks takes far longer than it should and clearly does not use that bandwidth. The memory copy is much faster and the bandwidth limit is applied to it properly.
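
For reference, the limit is being set in nova.conf on the compute nodes roughly as follows (a minimal sketch based on the description above; the option is the documented [libvirt] one and 900 is the value quoted):

  # /etc/nova/nova.conf on both compute nodes
  [libvirt]
  # maximum migration bandwidth in MiB/s (0 lets the hypervisor choose)
  live_migration_bandwidth = 900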

Steps to reproduce
==================

VM spec (pinned VM) with an ephemeral disk of 600 (OS-FLV-EXT-DATA:ephemeral) and a root disk of 10.
As part of the migration, --block-migration copies the disks first, before the actual memory transfer.
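
For completeness, the block live migration is triggered roughly like this (a sketch; the UUID is the instance shown in the disk paths below, and exact client flags can vary between releases):

  # request a live migration that also copies the local disks
  openstack server migrate --live-migration --block-migration \
      84b43962-5623-42c2-9ecd-26e09753dead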

From the libvirt disk targets:

 Target Source
----------------------------------------------------------------------------------
 vda /var/lib/nova/instances/84b43962-5623-42c2-9ecd-26e09753dead/disk
 vdb /var/lib/nova/instances/84b43962-5623-42c2-9ecd-26e09753dead/disk.eph0

Block job info for vdb:

Block Copy: [ 12 %] Bandwidth limit: 943718400 bytes/s (900.000 MiB/s) ( applied bandwidth )

The actual speed observed with iftop is only 295 Kb/s to 2 Mb/s, so the configured bandwidth is clearly not being applied to the disk transfer.
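
The discrepancy can be cross-checked while the job is running, for example (a sketch; the libvirt domain name and the interface name are placeholders for this environment):

  # bandwidth cap configured on the block copy job (output shown above)
  virsh blockjob instance-0000abcd vdb --info

  # actual throughput on the interface carrying the migration traffic
  iftop -i eth1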

Expected result
===============
The disk copy should run at the configured bandwidth limit.

Actual result
=============

The configured bandwidth is not applied to the disk transfer; observed throughput is only 295 Kb/s to 2 Mb/s.

Environment
===========

OpenStack Yoga on both the source and destination compute nodes.

Logs & Configs
==============

Revision history for this message
keerthivasan (keerthivassan86) wrote:

Hello Sean,

Can you please help review this?

Revision history for this message
sean mooney (sean-k-mooney) wrote:

This sounds like an issue with your networking configuration.

Nova is not involved in any data copy for live migration (RAM or disk);
it is handled entirely by libvirt and QEMU.

The libvirt live_migration_bandwidth option is a maximum limit on how much bandwidth libvirt/QEMU may use for the data transfer;
it is not a bandwidth allocation or even a bandwidth target.
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_bandwidth

"""
Maximum bandwidth(in MiB/s) to be used during migration.

If set to 0, the hypervisor will choose a suitable default. Some hypervisors do not support this feature and will return an error if bandwidth is not 0. Please refer to the libvirt documentation for further details.
"""

I would check live_migration_inbound_addr to ensure that the migration data transfer is happening over the network interface you believe it is.

https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_inbound_addr
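
For example, pinning the migration traffic to a dedicated interface looks roughly like this (a sketch; the address is a placeholder for the IP of the intended migration NIC on each host):

  [libvirt]
  # address on this host that incoming migration traffic should target
  live_migration_inbound_addr = 192.0.2.12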

You should also inspect how you have configured libvirt to do the migration:
is it using the deprecated tunnelled mode? native TLS? SSH as the transport?
All of the above can affect your performance.
https://libvirt.org/migration.html describes how the different transports work in libvirt.
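
The transport-related knobs nova exposes are roughly these (a sketch of the relevant [libvirt] options; check the config reference for the defaults in your release):

  [libvirt]
  # deprecated tunnelled mode routes data through libvirtd and hurts throughput
  live_migration_tunnelled = False
  # QEMU-native TLS for the migration stream (requires certificates to be deployed)
  live_migration_with_native_tls = False
  # URI scheme used for the migration connection, e.g. tcp or ssh
  live_migration_scheme = tcp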

If this only affects the disk transfer, I would also look at the iowait on both systems, as slow disk I/O would slow down the transfer.
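
Checking for disk pressure on both hosts can be as simple as the standard sysstat/procps tools (shown purely as an illustration):

  # per-device utilisation and wait times, refreshed every second
  iostat -x 1

  # overall CPU iowait percentage in the 'wa' column
  vmstat 1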

In general, I do not believe this is a nova bug: other than passing a bandwidth limit and the IP address of the host to migrate to, nova has no involvement in the data transfer. We expose some flags which you can set in the config and which are passed to libvirt, but the configuration of those is left to the installer/operator.

Changed in nova:
status: New → Invalid