commit 07c7e5caf2819d96809af1b9a19c046b9fd09851
Author: Daniel P. Berrange <email address hidden>
Date: Fri Mar 6 16:12:28 2015 +0000

libvirt: support management of downtime during migration

Currently live migration runs with the default maximum downtime
setting defined by QEMU. This is often inadequate to allow
migration of large VMs to ever complete. Rather than trying to
invent a new policy for changing downtime in OpenStack, copy
the existing logic that is successfully battle tested by the
oVirt project in VDSM.

Note that setting the downtime step delay based on guest RAM size
is an inexact science, as RAM size is only one factor influencing
success of migration. Just as important is the rate of dirtying
data in the guest, but this is based on guest workload which is
not something Nova has visibility into. The bottleneck is the
network which needs to be able to keep up with the dirtying of
data in the guest. The greater the overall RAM size, the more
time is required to transfer the total guest memory. So for
larger guest sizes, we need to allow greater time for the guest
to attempt to successfully migrate before increasing the max
downtime. Scaling downtime step delay according to the overall
guest RAM size is a reasonable, albeit not foolproof, way to
tune migration to increase chances of success.

This adds three host level config parameters which admins can
use to control the base downtime value and the rate at which
downtime is allowed to be increased during migration.

Related-bug: #1429220
DocImpact: three new libvirt configuration parameters in nova.conf allow the administrator to control
the maximum permitted downtime for migration, making migration more likely to complete for large VMs.
Change-Id: I1992ffe9d3b2ff8d436cf1c419af9a238a8fecd8
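
The scaling idea in the commit message (wait longer between downtime increases for guests with more RAM) can be illustrated with a minimal sketch. This is not the code from the commit: the function name `downtime_schedule`, the parameter names, the default values and the linear ramp are all assumptions chosen for illustration, since the commit message does not name the three config parameters or specify how downtime grows from step to step.

```python
def downtime_schedule(guest_ram_gb, max_downtime_ms=500, steps=10,
                      delay_per_gb_s=75):
    """Return a list of (seconds_since_start, allowed_downtime_ms) pairs.

    Hypothetical sketch: all names and defaults are illustrative
    assumptions, not values taken from the commit.

    The wait between two downtime increases is proportional to guest RAM,
    so larger guests get more time to converge before the permitted
    downtime is raised again.
    """
    if min(guest_ram_gb, max_downtime_ms, steps, delay_per_gb_s) < 1:
        raise ValueError("all arguments must be >= 1")

    # Step delay scales with guest RAM size (seconds between increases).
    step_delay_s = delay_per_gb_s * guest_ram_gb

    schedule = []
    for i in range(steps + 1):
        when_s = step_delay_s * i
        # Assumed linear ramp: start at a fraction of the maximum and
        # reach exactly max_downtime_ms on the final step.
        allowed_ms = max_downtime_ms * (i + 1) // (steps + 1)
        schedule.append((when_s, allowed_ms))
    return schedule


# A 2 GB guest, 400 ms ceiling, 3 increases, 10 s of delay per GB:
# downtime_schedule(2, 400, 3, 10)
# returns [(0, 100), (20, 200), (40, 300), (60, 400)]
```

A migration monitor would walk this list and, each time the elapsed time passes the next entry, raise the permitted downtime to that entry's value. As the commit message notes, RAM size is only a proxy here: a guest that dirties memory faster than the network can transfer it may still fail to converge.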
Reviewed: https://review.openstack.org/162253
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=07c7e5caf2819d96809af1b9a19c046b9fd09851
Submitter: Jenkins
Branch: master