Instance stuck in 'migrating' status due to invalid host

Bug #1643623 reported by Sivasathurappan Radhakrishnan
This bug affects 3 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Sivasathurappan Radhakrishnan

Bug Description

I tried to live-migrate an instance to an invalid destination host and got an error message saying the host was not available (<class 'nova.exception.ComputeHostNotFound'>). 'nova list' then showed the instance's status and task state stuck in 'migrating' indefinitely. The instance did not appear in 'nova migration-list', and the migration could not be aborted with 'nova live-migration-abort' because the operation failed well before a migration ID could be set.

Steps to reproduce:
1) Create an instance test_1
2) Live-migrate the instance using 'nova live-migration test_1 <invalid destination host name>'
3) Check status of the instance using 'nova show test_1' or 'nova list'.

Expected Result:
The instance should remain in Active status, because the live migration failed with an invalid host name.

Actual Result:
Instance is stuck in 'migrating' status forever.

Environment:
Multinode devstack environment with 2 compute nodes. A multinode environment is not actually required: the scenario can also be reproduced on a single node, because the host name is validated before the live migration starts.
1) Current master
2) Networking-neutron
3) Hypervisor Libvirt-KVM

tags: added: live-migration
Changed in nova:
assignee: nobody → Sivasathurappan Radhakrishnan (siva-radhakrishnan)
Matt Riedemann (mriedem) wrote :

What exact command did you use to run the abort command? And do you have a new enough python-novaclient to support that microversion?

That was added in microversion 2.24:

http://docs.openstack.org/developer/nova/api_microversion_history.html#id22

Looks like you need at least novaclient 3.3.0 for that:

https://github.com/openstack/python-novaclient/commit/77e50cc91b328b1f7681cfc6f31bc41e40ab214e

Also, do you see this error in the nova-compute logs when the abort fails?

https://review.openstack.org/#/c/277971/19/nova/virt/libvirt/driver.py@5831
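
The version requirement above can be checked by comparing microversions numerically rather than as strings, since "2.9" sorts after "2.24" in a plain string comparison. A minimal sketch, assuming the 2.24 minimum stated in this comment; the helper names are hypothetical and not part of python-novaclient:

```python
def parse_microversion(text):
    """Turn a string such as '2.24' into a comparable (major, minor) tuple."""
    major, minor = text.split(".")
    return int(major), int(minor)


# Microversion that introduced live-migration abort, per the comment above.
ABORT_MIN_MICROVERSION = parse_microversion("2.24")


def supports_live_migration_abort(requested):
    """Return True if the requested microversion is at least 2.24."""
    return parse_microversion(requested) >= ABORT_MIN_MICROVERSION
```

Comparing tuples of integers avoids the string-ordering pitfall: 2.9 is treated as older than 2.24, as intended.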

Changed in nova:
status: New → Incomplete
Matt Riedemann (mriedem) wrote :

Marking as incomplete as this requires some more debug information.

description: updated
Sivasathurappan Radhakrishnan (siva-radhakrishnan) wrote :

@mriedem: I think my bug title is confusing. In the above scenario the migration doesn't happen at all, because there is no valid destination host in the environment, but the instance's task state is stuck in migrating. For the end user this might be a little confusing, as this particular instance would not have a migration ID set, so 'nova live-migration-abort' cannot be used. The instance can be reset to the active state with 'nova reset-state', but this can be avoided by resetting the VM state to active at this place: https://github.com/openstack/nova/blob/master/nova/compute/api.py#L3695.

summary: - Not able to abort live migration
+ Instance stuck in 'migrating' status due to invalid host
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/401009

Changed in nova:
status: Incomplete → In Progress
Changed in nova:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/447355

Rajesh Tailor (ratailor) wrote :

Hi Siva,

Are you still working on it?

If you are not planning to work on it, would you please assign it to me?

Sivasathurappan Radhakrishnan (siva-radhakrishnan) wrote :

Hi Rajesh!
I am not working on it anymore. My last patch needs a unit test case before it can be considered for review. Feel free to work on it if you would like to.

Changed in nova:
assignee: Sivasathurappan Radhakrishnan (siva-radhakrishnan) → nobody
Rajesh Tailor (ratailor)
Changed in nova:
assignee: nobody → Rajesh Tailor (ratailor)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/447355
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fb68fd12e2fd6e9686ad45c9875508bd9fa0df91
Submitter: Zuul
Branch: master

commit fb68fd12e2fd6e9686ad45c9875508bd9fa0df91
Author: Sivasathurappan Radhakrishnan <email address hidden>
Date: Mon Mar 20 03:13:13 2017 +0000

    Return 400 when compute host is not found

    Previously user was getting a 500 error code for ComputeHostNotFound
    if they are using latest microversion that does live migration in
    async. This patches changes return response to 400 as 500 internal
    server error should not be returned to the user for failures due to
    user error that can be fixed by changing to request on client side.

    Change-Id: I7a9de211ecfaa7f2816fbf8bcd73ebbdd990643c
    closes-bug:1643623
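
The idea in this commit message, turning a host-not-found failure into a client error instead of an internal server error, can be sketched as follows. This is an illustration only, not the merged nova code: the exception class is a local stand-in for nova.exception.ComputeHostNotFound, and both functions and the known_hosts parameter are hypothetical.

```python
class ComputeHostNotFound(Exception):
    """Local stand-in for nova.exception.ComputeHostNotFound."""


def start_live_migration(host, known_hosts):
    """Hypothetical compute-layer call that rejects unknown hosts."""
    if host not in known_hosts:
        raise ComputeHostNotFound(f"Compute host {host} could not be found.")


def handle_live_migration_request(host, known_hosts):
    """Hypothetical API handler returning an (HTTP status, message) pair."""
    try:
        start_live_migration(host, known_hosts)
    except ComputeHostNotFound as exc:
        # A bad host name is a client error, so answer 400 rather than 500.
        return 400, str(exc)
    # 202 Accepted: the request was valid and the migration runs asynchronously.
    return 202, "accepted"
```

Without the except clause the exception would propagate out of the handler, which is the kind of unhandled failure that surfaces to the user as a 500.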

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/550661

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/550707

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/550661
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=cc3f3adfb8dada83cab381d0ec9536925a43a560
Submitter: Zuul
Branch: stable/queens

commit cc3f3adfb8dada83cab381d0ec9536925a43a560
Author: Sivasathurappan Radhakrishnan <email address hidden>
Date: Mon Mar 20 03:13:13 2017 +0000

    Return 400 when compute host is not found

    Previously user was getting a 500 error code for ComputeHostNotFound
    if they are using latest microversion that does live migration in
    async. This patches changes return response to 400 as 500 internal
    server error should not be returned to the user for failures due to
    user error that can be fixed by changing to request on client side.

    Change-Id: I7a9de211ecfaa7f2816fbf8bcd73ebbdd990643c
    closes-bug:1643623
    (cherry picked from commit fb68fd12e2fd6e9686ad45c9875508bd9fa0df91)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/550707
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8ebdd91d463fd9f5bee8598a0f3358c8ce440716
Submitter: Zuul
Branch: stable/pike

commit 8ebdd91d463fd9f5bee8598a0f3358c8ce440716
Author: Sivasathurappan Radhakrishnan <email address hidden>
Date: Mon Mar 20 03:13:13 2017 +0000

    Return 400 when compute host is not found

    Previously user was getting a 500 error code for ComputeHostNotFound
    if they are using latest microversion that does live migration in
    async. This patches changes return response to 400 as 500 internal
    server error should not be returned to the user for failures due to
    user error that can be fixed by changing to request on client side.

    Change-Id: I7a9de211ecfaa7f2816fbf8bcd73ebbdd990643c
    closes-bug:1643623
    (cherry picked from commit fb68fd12e2fd6e9686ad45c9875508bd9fa0df91)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.2

This issue was fixed in the openstack/nova 17.0.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.1.1

This issue was fixed in the openstack/nova 16.1.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.0.0b1

This issue was fixed in the openstack/nova 18.0.0.0b1 development milestone.

Matt Riedemann (mriedem)
Changed in nova:
assignee: Rajesh Tailor (ratailor) → Sivasathurappan Radhakrishnan (siva-radhakrishnan)
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/401009
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c7aed3d139909913387e4ac2e771f19c9502c5b1
Submitter: Zuul
Branch: master

commit c7aed3d139909913387e4ac2e771f19c9502c5b1
Author: Sivasathurappan Radhakrishnan <email address hidden>
Date: Wed Nov 23 00:17:47 2016 +0000

    Fix host validity check for live-migration

    When live migrating instance to invalid host, live migration fails
    with host not found and sets instance task state to migrating.

    This change handles host validity in API layer before changing instance
    task_state to 'MIGRATING' and raise proper exception on invalid host.

    Change-Id: I7c5e80298b9adf1bd53cc6c464a3744b5397b7e8
    Related-Bug: #1643623
    Closes-Bug: #1785031
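
The ordering described in this commit message, validating the destination host before changing the task state, can be sketched as follows. This is a toy model under stated assumptions, not nova's implementation: the Instance class, the live_migrate function, and the known_hosts parameter are hypothetical, and the exception is a local stand-in for nova.exception.ComputeHostNotFound.

```python
class ComputeHostNotFound(Exception):
    """Local stand-in for nova.exception.ComputeHostNotFound."""


class Instance:
    """Toy instance record; real nova instances carry far more state."""

    def __init__(self, name):
        self.name = name
        self.vm_state = "active"
        self.task_state = None


def live_migrate(instance, host_name, known_hosts):
    """Validate the destination first, then mark the instance as migrating."""
    if host_name not in known_hosts:
        # Raise before touching task_state so a failure leaves it unchanged.
        raise ComputeHostNotFound(host_name)
    instance.task_state = "migrating"
```

Because the check runs first, a request naming an unknown host fails with the instance still active and with no task state set, so nothing is left to clean up with 'nova reset-state'.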

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.openstack.org/590262

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/pike)

Related fix proposed to branch: stable/pike
Review: https://review.openstack.org/590263

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/590262
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0a60496ddbcd3480a342492125c0adcf461324bd
Submitter: Zuul
Branch: stable/queens

commit 0a60496ddbcd3480a342492125c0adcf461324bd
Author: Sivasathurappan Radhakrishnan <email address hidden>
Date: Wed Nov 23 00:17:47 2016 +0000

    Fix host validity check for live-migration

    When live migrating instance to invalid host, live migration fails
    with host not found and sets instance task state to migrating.

    This change handles host validity in API layer before changing instance
    task_state to 'MIGRATING' and raise proper exception on invalid host.

    Change-Id: I7c5e80298b9adf1bd53cc6c464a3744b5397b7e8
    Related-Bug: #1643623
    Closes-Bug: #1785031
    (cherry picked from commit c7aed3d139909913387e4ac2e771f19c9502c5b1)

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/ocata)

Related fix proposed to branch: stable/ocata
Review: https://review.openstack.org/590611

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/590649

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/590649
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f5b8a0a09baa3522f5c6f120d84db738cc236ab1
Submitter: Zuul
Branch: stable/ocata

commit f5b8a0a09baa3522f5c6f120d84db738cc236ab1
Author: Sivasathurappan Radhakrishnan <email address hidden>
Date: Mon Mar 20 03:13:13 2017 +0000

    Return 400 when compute host is not found

    Previously user was getting a 500 error code for ComputeHostNotFound
    if they are using latest microversion that does live migration in
    async. This patches changes return response to 400 as 500 internal
    server error should not be returned to the user for failures due to
    user error that can be fixed by changing to request on client side.

    Change-Id: I7a9de211ecfaa7f2816fbf8bcd73ebbdd990643c
    closes-bug:1643623
    (cherry picked from commit fb68fd12e2fd6e9686ad45c9875508bd9fa0df91)

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.1.4

This issue was fixed in the openstack/nova 15.1.4 release.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/590263
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=064daf9aeb39a7f977de4989ad096821e7830ff6
Submitter: Zuul
Branch: stable/pike

commit 064daf9aeb39a7f977de4989ad096821e7830ff6
Author: Sivasathurappan Radhakrishnan <email address hidden>
Date: Wed Nov 23 00:17:47 2016 +0000

    Fix host validity check for live-migration

    When live migrating instance to invalid host, live migration fails
    with host not found and sets instance task state to migrating.

    This change handles host validity in API layer before changing instance
    task_state to 'MIGRATING' and raise proper exception on invalid host.

    Change-Id: I7c5e80298b9adf1bd53cc6c464a3744b5397b7e8
    Related-Bug: #1643623
    Closes-Bug: #1785031
    (cherry picked from commit c7aed3d139909913387e4ac2e771f19c9502c5b1)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/590611
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fbc91183ffa32e881ca5cf15758f5f602401acc0
Submitter: Zuul
Branch: stable/ocata

commit fbc91183ffa32e881ca5cf15758f5f602401acc0
Author: Sivasathurappan Radhakrishnan <email address hidden>
Date: Wed Nov 23 00:17:47 2016 +0000

    Fix host validity check for live-migration

    When live migrating instance to invalid host, live migration fails
    with host not found and sets instance task state to migrating.

    This change handles host validity in API layer before changing instance
    task_state to 'MIGRATING' and raise proper exception on invalid host.

    Change-Id: I7c5e80298b9adf1bd53cc6c464a3744b5397b7e8
    Related-Bug: #1643623
    Closes-Bug: #1785031
    (cherry picked from commit c7aed3d139909913387e4ac2e771f19c9502c5b1)
    (cherry picked from commit 064daf9aeb39a7f977de4989ad096821e7830ff6)
