libvirt does not ensure live migration will eventually complete (or abort)

Bug #1429220 reported by Daniel Berrange on 2015-03-06
Affects: OpenStack Compute (nova)
Importance: High
Assigned to: Michael Still

Bug Description

Currently the libvirt driver's approach to live migration is best characterized as "launch & pray". It starts the live migration operation and then just unconditionally waits for it to finish. It never makes any attempt to tune its behaviour (for example, changing max downtime), nor does it look at the data transfer statistics to check whether it is making any progress, nor does it have any overall timeout.

It is not uncommon for guests to have workloads that will preclude live migration from completing. Basically they can be dirtying guest RAM (or block devices) faster than the network is able to transfer it to the destination host. In such a case Nova will just leave the migration running, burning up host CPU cycles and trashing network bandwidth until the end of the universe.

There are many features exposed by libvirt that Nova could be using to do a better job, but the question is obviously which features, and how they should be used. Fortunately Nova is not the first project to come across this problem. The oVirt data center management project has the exact same problem. So rather than trying to invent some new logic for Nova, we should, as an immediate bug fix task, just copy the oVirt logic from VDSM:

https://github.com/oVirt/vdsm/blob/master/vdsm/virt/migration.py#L430

If we get this out to users and then get real world feedback on how it operates, we will have an idea of how/where to focus future ongoing efforts.

Changed in nova:
importance: Undecided → High
assignee: nobody → Daniel Berrange (berrange)
status: New → Confirmed

Related fix proposed to branch: master
Review: https://review.openstack.org/162253

Changed in nova:
status: Confirmed → In Progress

"copy the logic" right? Hate to bring this up as code referred seems to be GPL :(

Daniel Berrange (berrange) wrote :

I'm not literally copying the source code into OpenStack, as it has a completely different structure to what we need. As you can see from the proposed patch, I'm just applying the logic at a conceptual level. In any case, the code in question was written by Red Hat employees, so its copyright is owned by Red Hat, and as a Red Hat employee myself I'm free to relicense it.

Thanks Daniel

tags: added: live-migrate

Change abandoned by Joe Gordon (<email address hidden>) on branch: master
Review: https://review.openstack.org/162254
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Related fix proposed to branch: master
Review: https://review.openstack.org/206631

Reviewed: https://review.openstack.org/162253
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=07c7e5caf2819d96809af1b9a19c046b9fd09851
Submitter: Jenkins
Branch: master

commit 07c7e5caf2819d96809af1b9a19c046b9fd09851
Author: Daniel P. Berrange <email address hidden>
Date: Fri Mar 6 16:12:28 2015 +0000

    libvirt: support management of downtime during migration

    Currently live migration runs with the default maximum downtime
    setting defined by QEMU. This is often inadequate to allow
    migration of large VMs to ever complete. Rather than trying to
    invent a new policy for changing downtime in OpenStack, copy
    the existing logic that is successfully battle tested by the
    oVirt project in VDSM.

    Note that setting the downtime step delay based on guest RAM size
    is an inexact science, as RAM size is only one factor influencing
    success of migration. Just as important is the rate of dirtying
    data in the guest, but this is based on guest workload which is
    not something Nova has visibility into. The bottleneck is the
    network which needs to be able to keep up with the dirtying of
    data in the guest. The greater the overall RAM size, the more
    time is required to transfer the total guest memory. So for
    larger guest sizes, we need to allow greater time for the guest
    to attempt to successfully migrate before increasing the max
    downtime. Scaling downtime step delay according to the overall
    guest RAM size is a reasonable, albeit not foolproof, way to
    tune migration to increase chances of success.

    This adds three host level config parameters which admins can
    use to control the base downtime value and the rate at which
    downtime is allowed to be increased during migration.

    Related-bug: #1429220
    DocImpact: three new libvirt configuration parameters in
               nova.conf allow the administrator to control
               the maximum permitted downtime for migration
               making migration more likely to complete for
               large VMs.
    Change-Id: I1992ffe9d3b2ff8d436cf1c419af9a238a8fecd8
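
The commit message above describes a base downtime that is raised in steps, with the wait between steps scaled by guest RAM size. A minimal sketch of one way such a schedule could be computed follows. The function name, parameter names, placeholder default values and the exponential growth curve are all assumptions made for illustration; the bug report does not state them, and this is not presented as the actual Nova code.

```python
from typing import Iterator, Tuple


def downtime_steps(data_gb: int,
                   max_downtime_ms: int = 500,   # hypothetical default
                   steps: int = 10,              # hypothetical default
                   delay_per_gb_s: int = 75      # hypothetical default
                   ) -> Iterator[Tuple[int, int]]:
    """Yield (seconds_since_start, allowed_downtime_ms) pairs.

    The wait between steps grows with the amount of data to move, so
    larger guests get more time at each downtime level before the
    limit is raised again.
    """
    # Seconds to wait between successive increases, scaled by guest size.
    delay = delay_per_gb_s * data_gb
    # Start from a fraction of the maximum and grow exponentially so the
    # final step lands on (or within rounding of) the configured maximum.
    offset = max_downtime_ms / float(steps + 1)
    base = (max_downtime_ms - offset) ** (1 / float(steps))
    for i in range(steps + 1):
        yield int(delay * i), int(offset + base ** i)


if __name__ == "__main__":
    for when, downtime in downtime_steps(data_gb=3):
        print(f"after {when:5d}s allow up to {downtime}ms downtime")
```

With these placeholder values, a 3 GB guest would be allowed its first downtime increase after 225 seconds, and a 6 GB guest after 450 seconds. A monitoring loop would apply each new limit once the elapsed migration time passes the corresponding threshold.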

Reviewed: https://review.openstack.org/162254
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=da33ab4f7be6fab4b5f0e5a5d276b186c0fb0d93
Submitter: Jenkins
Branch: master

commit da33ab4f7be6fab4b5f0e5a5d276b186c0fb0d93
Author: Daniel P. Berrange <email address hidden>
Date: Fri Mar 6 17:34:16 2015 +0000

    libvirt: set caps on maximum live migration time

    Currently Nova launches live migration and then just leaves it
    to run, assuming it'll eventually finish. If the guest is
    dirtying memory quicker than the network can transfer it, it
    is entirely possible that a migration will never complete. In
    such a case it is highly undesirable to leave it running
    forever since it wastes valuable CPU and network resources.
    Rather than trying to come up with a new policy for aborting
    migration in OpenStack, copy the existing logic that is
    successfully battle tested by the oVirt project in VDSM.

    This introduces two new host level configuration parameters
    that cloud administrators can use to ensure migrations which
    don't appear likely to complete will be aborted. First is
    an overall cap on the total running time of migration. The
    configured value is scaled by the number of GB of guest RAM,
    since larger guests will obviously take proportionally longer
    to migrate. The second is a timeout that is applied when Nova
    detects that memory is being dirtied faster than it can be
    transferred. It tracks a low watermark of data remaining and
    if that low watermark doesn't decrease in the given time,
    it assumes the VM is stuck and aborts migration.

    NB with the default values for the config parameters, the code
    for detecting stuck migrations is only going to kick in for
    guests which have more than 2 GB of RAM allocated, as for
    smaller guests the overall time limit will abort migration
    before this happens. This is reasonable as small guests are
    less likely to get stuck during migration, as the network
    will generally be able to keep up with dirtying data well
    enough to get to a point where the final switchover can be
    performed.

    Related-bug: #1429220
    DocImpact: two new libvirt configuration parameters in
               nova.conf allow the administrator to control
               length of migration before aborting it
    Change-Id: I461affe6c85aaf2a6bf6e8749586bfbfe0ebc146
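
The two abort conditions described in the commit message above (an overall time cap scaled by guest size, and a stall timeout driven by a low watermark of data remaining) can be sketched as a small state machine. The class name, field names and the numeric values used below are hypothetical and chosen only for illustration; a real driver would feed it job statistics obtained from libvirt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MigrationWatchdog:
    completion_timeout_per_gb: int  # seconds per GB of data; 0 disables
    progress_timeout: int           # seconds without progress; 0 disables
    data_gb: int                    # amount of guest data to transfer
    low_watermark: Optional[int] = None
    last_progress_at: float = 0.0

    def should_abort(self, elapsed: float,
                     data_remaining: int) -> Optional[str]:
        """Return a reason string if the migration should be aborted."""
        # Progress only counts when the remaining data drops below the
        # lowest value seen so far; a rebound means the guest is
        # dirtying memory faster than it is being transferred.
        if self.low_watermark is None or data_remaining < self.low_watermark:
            self.low_watermark = data_remaining
            self.last_progress_at = elapsed

        stalled_for = elapsed - self.last_progress_at
        if self.progress_timeout and stalled_for > self.progress_timeout:
            return "no progress"

        # The overall cap is scaled by the amount of data to move.
        limit = self.completion_timeout_per_gb * self.data_gb
        if limit and elapsed > limit:
            return "completion timeout"
        return None


if __name__ == "__main__":
    dog = MigrationWatchdog(completion_timeout_per_gb=800,
                            progress_timeout=150, data_gb=4)
    for elapsed, remaining in [(0, 1000), (100, 900), (200, 950), (260, 950)]:
        print(elapsed, remaining, dog.should_abort(elapsed, remaining))
```

In the sample run, the remaining data last reached a new low at 100 seconds, so the check at 260 seconds reports a stall because more than 150 seconds have passed without a new low watermark.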

Reviewed: https://review.openstack.org/206630
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f67b8fd5ce23917221a5126b16db39afcb8f7710
Submitter: Jenkins
Branch: master

commit f67b8fd5ce23917221a5126b16db39afcb8f7710
Author: Daniel P. Berrange <email address hidden>
Date: Tue Jul 28 16:58:01 2015 +0100

    libvirt: ensure LibvirtConfigGuestDisk parses readonly/shareable flags

    The LibvirtConfigGuestDisk class did not handle parsing of the
    readonly flag and was missing support for the shareable flag
    entirely. Add support for dealing with both.

    Related-bug: #1429220
    Change-Id: I69df232951c99f6c79f2c2c0ee95327ca872624e

Changed in nova:
assignee: Daniel Berrange (berrange) → Michael Still (mikalstill)

Reviewed: https://review.openstack.org/206631
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2f6cf7cba883c594e14aea05a85bb36b9f5209b0
Submitter: Jenkins
Branch: master

commit 2f6cf7cba883c594e14aea05a85bb36b9f5209b0
Author: Daniel P. Berrange <email address hidden>
Date: Tue Jul 28 17:39:33 2015 +0100

    libvirt: add helper methods for getting guest devices/disks

    Add get_all_devices and get_all_disks methods to the libvirt
    Guest object.

    Related-bug: #1429220
    Change-Id: I97ee786c5cc603aec1695929f58aa127063db439

Changed in nova:
assignee: Michael Still (mikalstill) → Daniel Berrange (berrange)
Changed in nova:
assignee: Daniel Berrange (berrange) → Michael Still (mikalstill)

Reviewed: https://review.openstack.org/206632
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9d353e59ec4b8653738bc93b62a30c0888f6c3e5
Submitter: Jenkins
Branch: master

commit 9d353e59ec4b8653738bc93b62a30c0888f6c3e5
Author: Daniel P. Berrange <email address hidden>
Date: Tue Jul 28 13:18:25 2015 +0100

    libvirt: take account of disks in migration data size

    Currently migration is tuned based on the guest RAM size
    alone, but when doing block migration we must also look
    at the guest disk size to determine total data transfer.

    This change currently follows the (stupid) libvirt logic
    for deciding which disks are copied. In the future when
    we can use new libvirt which lets us specify a desired
    list of disks we'll update this logic.

    DocImpact: the amount of time a live migration can take
    is now configurable for the libvirt driver, and is
    based on a number of seconds per gigabyte of RAM and
    local disk.

    Closes-bug: #1429220
    Change-Id: I3d525e62c6686277c6a05f2a91edefad3230c73f
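
As a rough illustration of the change above, the data size used to scale the timeouts could be computed from RAM plus the disks that will be copied during block migration. The `Disk` type and the function name are hypothetical, and the rule that disks marked readonly or shareable are skipped is an assumption about the libvirt selection logic the commit refers to, not something the bug report spells out.

```python
import math
from dataclasses import dataclass
from typing import Iterable

GIB = 1024 ** 3


@dataclass
class Disk:
    size_bytes: int
    readonly: bool = False
    shareable: bool = False


def migration_data_gb(ram_mb: int, disks: Iterable[Disk],
                      block_migration: bool) -> int:
    """Estimate the data to transfer, in whole GB (rounded up)."""
    # Treat every guest as having at least 1 GB so scaled timeouts
    # never collapse to zero for tiny guests.
    ram_gb = max(1, math.ceil(ram_mb / 1024))
    if not block_migration:
        return ram_gb
    # Assumption: readonly and shareable disks are not copied.
    copied = sum(d.size_bytes for d in disks
                 if not (d.readonly or d.shareable))
    return ram_gb + math.ceil(copied / GIB)


if __name__ == "__main__":
    disks = [Disk(10 * GIB), Disk(5 * GIB, readonly=True)]
    print(migration_data_gb(2048, disks, block_migration=True))
    print(migration_data_gb(2048, disks, block_migration=False))
```

The result would then replace the RAM-only figure wherever a per-gigabyte timeout or delay is scaled.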

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2015-09-03
Changed in nova:
milestone: none → liberty-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2015-10-15
Changed in nova:
milestone: liberty-3 → 12.0.0