Nova doesn't call migrate_volume_completion after cinder volume migration

Bug #1803961 reported by Matthew Booth on 2018-11-19
This bug affects 3 people
Affects                   Importance  Assigned to
OpenStack Compute (nova)  Medium      Lee Yarwood
Queens                    Medium      Lee Yarwood
Rocky                     Medium      Lee Yarwood
Stein                     Medium      Lee Yarwood

Bug Description

Originally reported in Red Hat Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1648931

Create a cinder volume, attach it to a nova instance, and migrate the volume to a different storage host:

$ cinder create 1 --volume-type foo --name myvol
$ nova volume-attach myinstance myvol
$ cinder migrate myvol c-vol2

Everything seems to work correctly, but if we look at myinstance we see that it's now connected to a new volume, and the original volume is still present on the original storage host.

This is because nova didn't call cinder's migrate_volume_completion. migrate_volume_completion would have deleted the original volume, and changed the volume id of the new volume to be the same as the original. The result would be that myinstance would appear to be connected to the same volume as before.
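As a rough illustration of the completion step described above, here is a toy model in Python. It is only a sketch: the dictionary layout, the `complete_migration` helper, the volume IDs and the `c-vol1` host name are hypothetical and do not reflect cinder's real data model or API.

```python
def complete_migration(volumes, original_id, new_id):
    """Toy model of the completion step described in this report.

    Removes the original volume record and re-keys the new volume under the
    original ID, so anything that refers to the original ID keeps working.
    Hypothetical helper, not cinder code.
    """
    result = dict(volumes)
    new_volume = dict(result.pop(new_id))  # take the migrated copy
    result.pop(original_id)                # "delete" the original volume
    new_volume['id'] = original_id         # new volume takes over the old ID
    result[original_id] = new_volume
    return result


# Hypothetical state after the data copy, before completion is called.
volumes = {
    'vol-orig': {'id': 'vol-orig', 'host': 'c-vol1'},
    'vol-new': {'id': 'vol-new', 'host': 'c-vol2'},
}

after = complete_migration(volumes, 'vol-orig', 'vol-new')
print(after)
```

In this model, skipping `complete_migration` leaves both records in place, which matches the symptom above: the instance points at the new volume while the original still exists on its original host.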

Note that there are 2 ways (that I'm aware of) to initiate a cinder volume migration: retype and migrate. AFAICT retype is *not* affected. In fact, I updated the relevant tempest test to try to trip it up and it didn't fail. However, an explicit migrate *is* affected. They are different top-level entry points in cinder, and set different state, which is what triggers the Nova bug.

This appears to be a regression which was introduced by https://review.openstack.org/#/c/456971/ :

        # Yes this is a tightly-coupled state check of what's going on inside
        # cinder, but we need this while we still support old (v1/v2) and
        # new style attachments (v3.44). Once we drop support for old style
        # attachments we could think about cleaning up the cinder-initiated
        # swap volume API flows.
        is_cinder_migration = (
            True if old_volume['status'] in ('retyping',
                                             'migrating') else False)

There's a bug here because AFAICT cinder never sets status to 'migrating' during any operation: it sets migration_status to 'migrating' during both retype and migrate. During retype it sets status to 'retyping', but not during an explicit migrate.
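To make the difference concrete, the sketch below compares the quoted status-based check with a check keyed on migration_status. The volume dictionaries are hypothetical; in particular, 'in-use' as the status of an attached volume during an explicit migrate is an assumption. The second function only illustrates the idea and is not necessarily the code that was merged.

```python
def is_migration_by_status(volume):
    """The check quoted above: keyed on the volume's 'status' field."""
    return volume['status'] in ('retyping', 'migrating')


def is_migration_by_migration_status(volume):
    """Illustrative alternative: keyed on 'migration_status' instead."""
    return volume.get('migration_status') == 'migrating'


# Hypothetical volume records modelling the states described in this report.
cases = {
    'retype': {'status': 'retyping', 'migration_status': 'migrating'},
    # Assumption: an attached volume keeps status 'in-use' during migrate.
    'migrate': {'status': 'in-use', 'migration_status': 'migrating'},
    'plain swap': {'status': 'in-use', 'migration_status': None},
}

for name, volume in cases.items():
    print(name,
          is_migration_by_status(volume),
          is_migration_by_migration_status(volume))
```

Under these assumptions the status-based check misses the explicit migrate case, while the migration_status-based check catches both retype and migrate and still ignores an ordinary volume swap.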

Fix proposed to branch: master
Review: https://review.openstack.org/618717

Changed in nova:
assignee: nobody → Matthew Booth (mbooth-9)
status: New → In Progress
Matthew Booth (mbooth-9) wrote:

In case I don't get back to this soon, anybody else should feel free to update my patches. See also tempest patch https://review.openstack.org/#/c/618533/ .

tags: added: cinder volumes

Fix proposed to branch: master
Review: https://review.openstack.org/637224

Changed in nova:
assignee: Matthew Booth (mbooth-9) → Lee Yarwood (lyarwood)

Reviewed: https://review.opendev.org/637224
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=53c3cfa7a02d684ce27800e22e00a816af44c510
Submitter: Zuul
Branch: master

commit 53c3cfa7a02d684ce27800e22e00a816af44c510
Author: Lee Yarwood <email address hidden>
Date: Fri Feb 15 16:26:23 2019 +0000

    Use migration_status during volume migrating and retyping

    When swapping volumes Nova has to identify if the swap itself is related
    to an underlying migration or retype of the volume by Cinder. Nova
    would previously use the status of the volume to determine if the volume
    was retyping or migrating.

    However in the migration case where a volume is moved directly between
    hosts the volume is never given a status of migrating by Cinder leading
    to Nova never calling the os-migrate_volume_completion cinder API to
    complete the migration.

    This change switches Nova to use the migration_status of the volume to
    ensure that this API is called for both retypes and migrations.

    Depends-On: https://review.openstack.org/#/c/639331/
    Change-Id: I1bdf3431bda2da98380e0dcaa9f952e6768ca3af
    Closes-bug: #1803961

Changed in nova:
status: In Progress → Fix Released
Matt Riedemann (mriedem) on 2019-05-07
Changed in nova:
importance: Undecided → Medium

Reviewed: https://review.opendev.org/657575
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=91282f879cb453e4baa6c6decb4fff849c7a1b2a
Submitter: Zuul
Branch: stable/stein

commit 91282f879cb453e4baa6c6decb4fff849c7a1b2a
Author: Lee Yarwood <email address hidden>
Date: Fri Feb 15 16:26:23 2019 +0000

    Use migration_status during volume migrating and retyping

    Depends-On: https://review.openstack.org/#/c/639331/
    Change-Id: I1bdf3431bda2da98380e0dcaa9f952e6768ca3af
    Closes-bug: #1803961
    (cherry picked from commit 53c3cfa7a02d684ce27800e22e00a816af44c510)

Reviewed: https://review.opendev.org/657577
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8e130e2be7c2eb604d53fd6053254c6b54d47524
Submitter: Zuul
Branch: stable/rocky

commit 8e130e2be7c2eb604d53fd6053254c6b54d47524
Author: Lee Yarwood <email address hidden>
Date: Fri Feb 15 16:26:23 2019 +0000

    Use migration_status during volume migrating and retyping

    Depends-On: https://review.openstack.org/#/c/639331/
    Change-Id: I1bdf3431bda2da98380e0dcaa9f952e6768ca3af
    Closes-bug: #1803961
    (cherry picked from commit 53c3cfa7a02d684ce27800e22e00a816af44c510)
    (cherry picked from commit 91282f879cb453e4baa6c6decb4fff849c7a1b2a)

Reviewed: https://review.opendev.org/657579
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=01141392db549728e78865b6a56468d17102bee0
Submitter: Zuul
Branch: stable/queens

commit 01141392db549728e78865b6a56468d17102bee0
Author: Lee Yarwood <email address hidden>
Date: Fri Feb 15 16:26:23 2019 +0000

    Use migration_status during volume migrating and retyping

    Depends-On: https://review.openstack.org/#/c/639331/
    Change-Id: I1bdf3431bda2da98380e0dcaa9f952e6768ca3af
    Closes-bug: #1803961
    (cherry picked from commit 53c3cfa7a02d684ce27800e22e00a816af44c510)
    (cherry picked from commit 91282f879cb453e4baa6c6decb4fff849c7a1b2a)
    (cherry picked from commit 8e130e2be7c2eb604d53fd6053254c6b54d47524)

This issue was fixed in the openstack/nova 19.0.1 release.

This issue was fixed in the openstack/nova 18.2.1 release.

This issue was fixed in the openstack/nova 17.0.11 release.

This issue was fixed in the openstack/nova 20.0.0.0rc1 release candidate.
