After nova live-migration-abort canceled "queued" live-migration, instance status remains "MIGRATING"

Bug #1949808 reported by Alexey Stupnikov
This bug affects 1 person
Affects                   Status         Importance  Assigned to       Milestone
OpenStack Compute (nova)  Fix Released   Medium      Alexey Stupnikov
Wallaby                   New            Undecided   Unassigned
Xena                      Fix Committed  Undecided   Unassigned

Bug Description

OpenStack supports cancelling a "queued" live migration with "nova live-migration-abort", but after the cancellation the instance status remains "MIGRATING" instead of returning to "ACTIVE".

Alexey Stupnikov (astupnikov) wrote :

Downstream RHOSP bug #1729366

description: updated
Changed in nova:
status: New → In Progress
Alexey Stupnikov (astupnikov) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/776250
Committed: https://opendev.org/openstack/nova/commit/3e7b9b69e68c8594eac92d88f0579aab40d7d5ae
Submitter: "Zuul (22348)"
Branch: master

commit 3e7b9b69e68c8594eac92d88f0579aab40d7d5ae
Author: Alexey Stupnikov <email address hidden>
Date: Tue Nov 9 16:05:52 2021 +0100

    Test aborting queued live migration

    This patch adds a regression test which asserts that if a live migration
    is aborted while it's 'queued', the instance's status is never reverted
    back to ACTIVE, and instance remains in MIGRATING state.

    There is simple idea behind implemented LiveMigrationQueuedAbortTest:
    we start two instances on the same compute and try to migrate
    them simultaneously when max_concurrent_live_migrations is set to 1
    and nova.tests.fixtures.libvirt.Domain.migrateToURI3 is locked.
    As a result, we get two live migrations stuck in 'migrating' and
    'queued' states and we can issue API call to abort the second one.

    Lock is removed and first instance is live migrated after second
    instance's live migration is aborted.

    Co-Authored-By: Alex Stupnikov <email address hidden>
    Partial-Bug: #1949808
    Change-Id: I67d41a8e439b1ff3c5983ee17823616b80698639
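The reproduction strategy described in the commit message above can be illustrated with a toy model. This is only a sketch using the Python standard library, not nova code: the one-worker thread pool stands in for max_concurrent_live_migrations = 1, the lock stands in for the locked migrateToURI3 fixture method, and the instance names and status strings are hypothetical. The abort path deliberately leaves the instance status untouched to mirror the behaviour reported in this bug.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for max_concurrent_live_migrations = 1 (illustrative only).
MAX_CONCURRENT_LIVE_MIGRATIONS = 1


def run_scenario():
    """Block one migration, queue a second one, then abort the queued one."""
    statuses = {"vm-a": "MIGRATING", "vm-b": "MIGRATING"}
    migrations = {"vm-a": "queued", "vm-b": "queued"}
    driver_lock = threading.Lock()  # plays the role of the locked driver call
    started = threading.Event()

    def live_migrate(name):
        migrations[name] = "migrating"
        started.set()
        with driver_lock:  # blocks until the scenario releases the lock
            migrations[name] = "completed"
            statuses[name] = "ACTIVE"

    driver_lock.acquire()
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_LIVE_MIGRATIONS) as pool:
        first = pool.submit(live_migrate, "vm-a")
        second = pool.submit(live_migrate, "vm-b")
        started.wait()  # vm-a is now stuck in the driver, vm-b is waiting
        snapshot = dict(migrations)
        aborted = second.cancel()  # only succeeds because vm-b never started
        if aborted:
            migrations["vm-b"] = "cancelled"
            # Buggy abort path: statuses["vm-b"] is not restored here.
        driver_lock.release()
        first.result()  # the first migration now runs to completion
    return snapshot, aborted, migrations, statuses
```

Calling run_scenario() shows the first instance ending up ACTIVE while the second one, whose queued migration was cancelled, still reports MIGRATING.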

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/828374

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/828570

Alexey Stupnikov (astupnikov) wrote :

Patch https://review.opendev.org/c/openstack/nova/+/828570 will be used to fix this specific bug; a separate set of patches will be developed for bug #1960412.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/830010

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by "Alexey Stupnikov <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/nova/+/828374
Reason: Abandoned in favor of Ic97eff86f580bff67b1f02c8eeb60c4cf4181e6a

Changed in nova:
importance: Undecided → Medium
tags: added: yoga-rc-potential
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/830010
Committed: https://opendev.org/openstack/nova/commit/1ad287bf9a8f65ce68c14f4634775f58abda15c2
Submitter: "Zuul (22348)"
Branch: master

commit 1ad287bf9a8f65ce68c14f4634775f58abda15c2
Author: Alexey Stupnikov <email address hidden>
Date: Sat Feb 19 21:38:44 2022 +0100

    Add functional tests to reproduce bug #1960412

    Instance would be affected by problems described in bug #1949808
    and bug #1960412 when queued live migration is aborted.

    This change adds functional test to reproduce problems with
    placement allocations (record for aborted live migration is not
    removed when queued live migration is aborted) and with Neutron port
    bindings (INACTIVE port binding records for destination host are not
    removed when queued live migration is aborted).

    It looks like there are no other modifications introduced by Nova
    control plane which should be reverted when queued live migration is
    aborted.

    This patch also changes libvirt and neutron fixtures:

    - libvirt fixture was changed to support live migrations of
      instances with regular ports: without this change
      _update_vif_xml() complains about lack of address element in VIF's
      XML.
    - neutron fixture was changed to improve active port binding's
      tracking during live migration: without this change port's
      binding:host_id is not updated when activate_port_binding() is
      called. As a result, list_ports() function returns empty list
      when constants.BINDING_HOST_ID is used in search_opts, which is
      the case for setup_networks_on_host() called with teardown=True.

    Related-bug: #1960412
    Related-bug: #1949808
    Change-Id: I152581deb6e659c551f78eed66e4b0b958b20c53
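The neutron fixture problem described in the commit message above can be shown with a small toy fixture. This is a hedged sketch, not the fixture from the nova tree: the class, its method signatures and the track_active_binding switch are hypothetical, and BINDING_HOST_ID is assumed to be the "binding:host_id" key the message refers to. With tracking disabled, activating the destination binding leaves the port's host unchanged, so a lookup filtered by the destination host finds nothing.

```python
# Assumed value of the constant mentioned in the commit message.
BINDING_HOST_ID = "binding:host_id"


class ToyNeutronFixture:
    """Hypothetical in-memory model of ports and their per-host bindings."""

    def __init__(self, track_active_binding=True):
        self.track_active_binding = track_active_binding
        self.ports = {}     # port id -> port dict
        self.bindings = {}  # (port id, host) -> "ACTIVE" or "INACTIVE"

    def create_port(self, port_id, host):
        self.ports[port_id] = {"id": port_id, BINDING_HOST_ID: host}
        self.bindings[(port_id, host)] = "ACTIVE"

    def create_port_binding(self, port_id, host):
        # A live migration adds an INACTIVE binding on the destination host.
        self.bindings[(port_id, host)] = "INACTIVE"

    def activate_port_binding(self, port_id, host):
        for pid, bound_host in list(self.bindings):
            if pid == port_id:
                active = bound_host == host
                self.bindings[(pid, bound_host)] = "ACTIVE" if active else "INACTIVE"
        if self.track_active_binding:
            # The behaviour the fixture change adds: keep the port in sync.
            self.ports[port_id][BINDING_HOST_ID] = host

    def list_ports(self, search_opts):
        return [
            port for port in self.ports.values()
            if all(port.get(key) == value for key, value in search_opts.items())
        ]
```

Constructing the fixture with track_active_binding=False reproduces the empty list_ports() result for the destination host; the default keeps binding:host_id in step with the active binding.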

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/828570
Committed: https://opendev.org/openstack/nova/commit/219520d9cec6a204e0d0f75881d75c8db48e7f56
Submitter: "Zuul (22348)"
Branch: master

commit 219520d9cec6a204e0d0f75881d75c8db48e7f56
Author: Alexey Stupnikov <email address hidden>
Date: Mon Mar 7 16:57:39 2022 +0100

    Clean up when queued live migration aborted

    This patch solves bug #1949808 and bug #1960412 by tuning
    live_migration_abort() function and adding calls to:

    - remove placement allocations for live migration;
    - remove INACTIVE port bindings against destination compute node;
    - restore instance's state.

    Related unit test was adjusted and related functional tests were
    fixed.

    Closes-bug: #1949808
    Closes-bug: #1960412

    Change-Id: Ic97eff86f580bff67b1f02c8eeb60c4cf4181e6a
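The three cleanup steps listed in the commit message above can be sketched as follows. This is an illustration under stated assumptions, not the actual live_migration_abort() code: ToyCloud, abort_queued_live_migration() and all field names are hypothetical, and the placement step is simplified to dropping one record keyed by the migration.

```python
from concurrent.futures import Future
from dataclasses import dataclass, field


@dataclass
class ToyCloud:
    """Hypothetical stand-in for placement, Neutron and instance state."""
    allocations: dict = field(default_factory=dict)      # consumer id -> resources
    port_bindings: list = field(default_factory=list)    # dicts: port, instance, host, status
    instance_status: dict = field(default_factory=dict)  # instance id -> status


def abort_queued_live_migration(cloud, future, instance, migration, dest_host):
    """Cancel a still-queued migration and undo its bookkeeping.

    Returns False, changing nothing, if the migration has already started.
    """
    if not future.cancel():
        return False
    # 1. Drop the allocation record held on behalf of the migration.
    cloud.allocations.pop(migration, None)
    # 2. Drop INACTIVE bindings of this instance's ports on the destination.
    cloud.port_bindings = [
        binding for binding in cloud.port_bindings
        if not (binding["instance"] == instance
                and binding["host"] == dest_host
                and binding["status"] == "INACTIVE")
    ]
    # 3. Restore the instance status.
    cloud.instance_status[instance] = "ACTIVE"
    return True
```

A pending Future models a queued migration here; one that is already running refuses cancel(), so the function leaves the state alone in that case.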

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 25.0.0.0rc1

This issue was fixed in the openstack/nova 25.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/835853

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/xena)

Related fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/835854

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/835855

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/836145

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/xena)

Related fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/836146

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/836147

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/xena)

Change abandoned by "Alexey Stupnikov <email address hidden>" on branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/835855
Reason: Incorrect change ID

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "Alexey Stupnikov <email address hidden>" on branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/835854
Reason: Incorrect change ID

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "Alexey Stupnikov <email address hidden>" on branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/835853
Reason: Incorrect change ID

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/836145
Committed: https://opendev.org/openstack/nova/commit/19c4f8e973bd7f88bde3c55b714439072b73aadf
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 19c4f8e973bd7f88bde3c55b714439072b73aadf
Author: Alexey Stupnikov <email address hidden>
Date: Tue Nov 9 16:05:52 2021 +0100

    Test aborting queued live migration

    This patch adds a regression test which asserts that if a live migration
    is aborted while it's 'queued', the instance's status is never reverted
    back to ACTIVE, and instance remains in MIGRATING state.

    There is simple idea behind implemented LiveMigrationQueuedAbortTest:
    we start two instances on the same compute and try to migrate
    them simultaneously when max_concurrent_live_migrations is set to 1
    and nova.tests.fixtures.libvirt.Domain.migrateToURI3 is locked.
    As a result, we get two live migrations stuck in 'migrating' and
    'queued' states and we can issue API call to abort the second one.

    Lock is removed and first instance is live migrated after second
    instance's live migration is aborted.

    Co-Authored-By: Alex Stupnikov <email address hidden>
    Partial-Bug: #1949808
    Change-Id: I67d41a8e439b1ff3c5983ee17823616b80698639
    (cherry picked from commit 3e7b9b69e68c8594eac92d88f0579aab40d7d5ae)

tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/836146
Committed: https://opendev.org/openstack/nova/commit/479b8db3ab07dd1f50c029904cca17f3a5708685
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 479b8db3ab07dd1f50c029904cca17f3a5708685
Author: Alexey Stupnikov <email address hidden>
Date: Sat Feb 19 21:38:44 2022 +0100

    Add functional tests to reproduce bug #1960412

    Instance would be affected by problems described in bug #1949808
    and bug #1960412 when queued live migration is aborted.

    This change adds functional test to reproduce problems with
    placement allocations (record for aborted live migration is not
    removed when queued live migration is aborted) and with Neutron port
    bindings (INACTIVE port binding records for destination host are not
    removed when queued live migration is aborted).

    It looks like there are no other modifications introduced by Nova
    control plane which should be reverted when queued live migration is
    aborted.

    This patch also changes libvirt and neutron fixtures:

    - libvirt fixture was changed to support live migrations of
      instances with regular ports: without this change
      _update_vif_xml() complains about lack of address element in VIF's
      XML.
    - neutron fixture was changed to improve active port binding's
      tracking during live migration: without this change port's
      binding:host_id is not updated when activate_port_binding() is
      called. As a result, list_ports() function returns empty list
      when constants.BINDING_HOST_ID is used in search_opts, which is
      the case for setup_networks_on_host() called with teardown=True.

    Related-bug: #1960412
    Related-bug: #1949808
    Change-Id: I152581deb6e659c551f78eed66e4b0b958b20c53
    (cherry picked from commit 1ad287bf9a8f65ce68c14f4634775f58abda15c2)

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/836147
Committed: https://opendev.org/openstack/nova/commit/8670ca8bb290d7b434437ebf6c65d2e396498df8
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 8670ca8bb290d7b434437ebf6c65d2e396498df8
Author: Alexey Stupnikov <email address hidden>
Date: Mon Mar 7 16:57:39 2022 +0100

    Clean up when queued live migration aborted

    This patch solves bug #1949808 and bug #1960412 by tuning
    live_migration_abort() function and adding calls to:

    - remove placement allocations for live migration;
    - remove INACTIVE port bindings against destination compute node;
    - restore instance's state.

    Related unit test was adjusted and related functional tests were
    fixed.

    Closes-bug: #1949808
    Closes-bug: #1960412
    Change-Id: Ic97eff86f580bff67b1f02c8eeb60c4cf4181e6a
    (cherry picked from commit 219520d9cec6a204e0d0f75881d75c8db48e7f56)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/841483

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/841760

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/841736
