Config doesn't support setting of live migration timeout

Bug #1861986 reported by Matus Kosut
Affects: OpenStack Nova Compute Charm
Status: Fix Released
Importance: Wishlist
Assigned to: Matus Kosut
Milestone: 20.05

Bug Description

Within our OpenStack deployment for scientific computing we provide quite a few larger machines that usually time out when live migrated. We figured out that we have to change live_migration_completion_timeout, and so far it is a bit of a hassle to handle this outside of Juju.

Currently live_migration_completion_timeout is not supported in the charm config with its own key.
It has to be set in the [libvirt] section of nova.conf, so it does not belong in config-flags, which only sets the [DEFAULT] section.
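
For illustration, the desired end result in nova.conf would be something like the following (the value here is an arbitrary example, not a recommendation):

    [libvirt]
    # seconds allowed per GiB of guest RAM + disk to transfer before
    # the migration is aborted (example value only)
    live_migration_completion_timeout = 4000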

Changed in charm-nova-compute:
importance: Undecided → Wishlist
Matus Kosut (matuskosut)
Changed in charm-nova-compute:
assignee: nobody → Matus Kosut (matuskosut)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-compute (stable/20.02)

Fix proposed to branch: stable/20.02
Review: https://review.opendev.org/708316

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Note that the details for the option (at https://docs.openstack.org/nova/train/configuration/config.html) say:

    Time to wait, in seconds, for migration to successfully complete transferring data before aborting the operation.

    Value is per GiB of guest RAM + disk to be transferred, with lower bound of a minimum of 2 GiB. Should usually be larger than downtime delay * downtime steps. Set to 0 to disable timeouts.

    Related options:

    * live_migration_downtime
    * live_migration_downtime_steps
    * live_migration_downtime_delay
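
To make the scaling concrete, the following is a rough Python sketch of the effective timeout implied by that description; the max(2, ...) lower bound and the per-GiB multiplication follow the wording above, and nova's actual internals may differ in detail:

    # Rough sketch of the documented semantics, not nova's real code.
    def effective_timeout(completion_timeout_s, ram_gib, disk_gib):
        """Seconds to wait before aborting a live migration."""
        data_gib = max(2, ram_gib + disk_gib)  # lower bound of 2 GiB
        return completion_timeout_s * data_gib

    # e.g. a 120 GiB guest at the default of 800 s/GiB:
    # effective_timeout(800, 120, 0) == 96000 seconds (~26.7 hours)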

Is there something else going on in the network that might cause the default of 800 (or 500 post-Queens?) seconds per GiB to be insufficient? Although it's not clear from the above whether that 800 (500, perhaps) applies to the first 2 GiB and then to subsequent 1 GiB chunks of data.

Revision history for this message
Matus Kosut (matuskosut) wrote :

Well, the network is a bit limited compared to what you would see in a private DC, but the traffic is on separate links.

The VMs for which we had to test increasing the completion timeout have 120 GB+ of memory, and they can be pretty active in terms of CPU and memory (larger diffs). We are not expecting to migrate everything live, but having a proper way to increase those limits will help; we are mainly trying to avoid hackish workarounds in the config.

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

I think the addition of the options is useful, as it allows tweaking the configuration for various network situations.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-compute (master)

Reviewed: https://review.opendev.org/705916
Committed: https://git.openstack.org/cgit/openstack/charm-nova-compute/commit/?id=6351653c864d0d83778941e8775bd9b5056d5efc
Submitter: Zuul
Branch: master

commit 6351653c864d0d83778941e8775bd9b5056d5efc
Author: matuskosut <email address hidden>
Date: Wed Feb 5 11:40:03 2020 +0100

    Add support and tests for live_migration_ parameters in charm config, templates and hook.

    Change-Id: I9e426d7718f044f0e73231448b0a3dad17c81524
    Closes-Bug: 1861986
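
With this change merged, the timeout can be managed through the charm instead of by hand-editing nova.conf. Assuming the new option follows the charm's usual hyphenated naming (live-migration-completion-timeout; check the charm's config.yaml for the exact key), usage would look something like:

    # example only; verify the option name against the charm's config.yaml
    juju config nova-compute live-migration-completion-timeout=4000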

Changed in charm-nova-compute:
status: In Progress → Fix Committed
James Page (james-page)
Changed in charm-nova-compute:
milestone: none → 20.05
David Ames (thedac)
Changed in charm-nova-compute:
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on charm-nova-compute (stable/20.02)

Change abandoned by Alex Kavanagh (tinwood) (<email address hidden>) on branch: stable/20.02
Review: https://review.opendev.org/708316
Reason: Abandoning, as 20.02 is no longer a stable supported version. (current is 20.05; release of 20.08 is almost here).
