Comment 10 for bug 1429220

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/162253
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=07c7e5caf2819d96809af1b9a19c046b9fd09851
Submitter: Jenkins
Branch: master

commit 07c7e5caf2819d96809af1b9a19c046b9fd09851
Author: Daniel P. Berrange <email address hidden>
Date: Fri Mar 6 16:12:28 2015 +0000

    libvirt: support management of downtime during migration

    Currently live migration runs with the default maximum downtime
    setting defined by QEMU. This is often inadequate to allow
    migration of large VMs to ever complete. Rather than trying to
    invent a new policy for changing downtime in OpenStack, copy
    the existing logic that has been battle-tested by the oVirt
    project in VDSM.

    Note that setting the downtime step delay based on guest RAM size
    is an inexact science, as RAM size is only one factor influencing
    success of migration. Just as important is the rate of dirtying
    data in the guest, but this is based on guest workload which is
    not something Nova has visibility into. The bottleneck is the
    network, which needs to be able to keep up with the dirtying of
    data in the guest. The greater the overall RAM size, the more
    time is required to transfer the total guest memory. So for
    larger guest sizes, we need to allow greater time for the guest
    to attempt to successfully migrate before increasing the max
    downtime. Scaling downtime step delay according to the overall
    guest RAM size is a reasonable, albeit not foolproof, way to
    tune migration to increase chances of success.

    This adds three host-level config parameters which admins can
    use to control the base downtime value and the rate at which
    downtime is allowed to be increased during migration.

    Related-bug: #1429220
    DocImpact: three new libvirt configuration parameters in
               nova.conf allow the administrator to control
               the maximum permitted downtime for migration,
               making migration more likely to complete for
               large VMs.
    Change-Id: I1992ffe9d3b2ff8d436cf1c419af9a238a8fecd8
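
The downtime stepping described in the commit message above can be illustrated with a minimal sketch. It assumes an exponential ramp from a small initial downtime up to the configured maximum, with the wait between steps scaled by guest RAM size. The function name, parameter names, and example values are hypothetical and are not taken from the commit; the logic actually merged into Nova (and the VDSM logic it copies) may differ in detail.

```python
def downtime_schedule(max_downtime_ms, steps, delay_per_gb_s, guest_ram_gb):
    """Return a list of (seconds_since_start, allowed_downtime_ms) pairs.

    Hypothetical illustration only: the allowed downtime grows
    exponentially over `steps` increments until it reaches
    `max_downtime_ms`, and the wait between increments is scaled by
    the guest RAM size, so larger guests get more time per step.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if max_downtime_ms < 1:
        raise ValueError("max_downtime_ms must be at least 1")

    # Larger guests need longer to transfer memory, so wait longer
    # before permitting a higher downtime.
    step_delay_s = int(delay_per_gb_s * guest_ram_gb)

    # Start at a fraction of the maximum and ramp up exponentially.
    offset = max_downtime_ms / float(steps + 1)
    base = (max_downtime_ms - offset) ** (1 / float(steps))

    return [
        (step_delay_s * i, int(offset + base ** i))
        for i in range(steps + 1)
    ]


if __name__ == "__main__":
    # Assumed example values: 500 ms maximum downtime, 10 steps,
    # 75 s of delay per GB of guest RAM, and a 2 GB guest.
    for when_s, downtime_ms in downtime_schedule(500, 10, 75, 2):
        print("after %5d s allow up to %3d ms of downtime"
              % (when_s, downtime_ms))
```

A caller monitoring the migration would apply each downtime value once the corresponding elapsed time has passed, so the guest is first given a chance to converge with a short pause before longer pauses are permitted.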