Reverting a resize does not update the instance.availability_zone value to the source az

Bug #1819963 reported by Matt Riedemann on 2019-03-13
This bug affects 1 person
Affects                   Importance  Assigned to
OpenStack Compute (nova)  Medium      Matt Riedemann
Pike                      Medium      Matt Riedemann
Queens                    Medium      Matt Riedemann
Rocky                     Medium      Matt Riedemann
Stein                     Medium      Matt Riedemann

Bug Description

Since this change in Pike (https://review.openstack.org/#/c/446053/), resizing a server changes the instance.availability_zone value to whatever zone the destination host is in.

For a server that is created with an explicit AZ, the resize is restricted to the same AZ via the AvailabilityZoneFilter. For a server that is not created with an explicit AZ and https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.default_schedule_zone is None (the default), the server is free to "move" between zones.

The bug is that when a server is resized and moves from zone 1 to zone 2, if the resize is reverted, nothing updates the instance.availability_zone value back to the original zone even though the server is on the original source host in the initial zone.
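The sequence can be illustrated with a toy model. This is only a sketch and not nova code: the `Instance` class, the `HOST_ZONES` mapping, and the `resize` and `revert_resize_buggy` functions are hypothetical stand-ins for the instance record, the host aggregates, and the resize/revert flows.

```python
from dataclasses import dataclass

# Hypothetical host -> zone mapping standing in for host aggregates.
HOST_ZONES = {"host1": "zone1", "host2": "zone2"}


@dataclass
class Instance:
    host: str
    availability_zone: str


def resize(instance: Instance, dest_host: str) -> str:
    """Move the instance to dest_host and record the zone of that host.

    Returns the source host so that the resize can be reverted later.
    """
    source_host = instance.host
    instance.host = dest_host
    instance.availability_zone = HOST_ZONES[dest_host]
    return source_host


def revert_resize_buggy(instance: Instance, source_host: str) -> None:
    """Move the instance back without touching availability_zone."""
    instance.host = source_host


server = Instance(host="host1", availability_zone="zone1")
source = resize(server, "host2")
revert_resize_buggy(server, source)

# The server is back on host1 (zone1) but the stored zone still says zone2.
stale = server.availability_zone != HOST_ZONES[server.host]
print(server, "stale:", stale)
```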

Note that the API hides this a bit when showing the instance AZ:

https://github.com/openstack/nova/blob/482f4fed654f384e8fb277c504a14a6407ba2e7b/nova/availability_zones.py#L179-L194

If the instance.availability_zone value in the database does not match the cached zone that the instance.host is in, the API will return the zone for the host rather than the instance.availability_zone value. But the instance.availability_zone value in the database is still incorrect.
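A simplified sketch of that masking behavior follows. It is not the code at the link above: `displayed_zone` and `HOST_ZONES` are hypothetical names, and the real lookup goes through a cache and the host aggregates rather than a plain dict.

```python
# Hypothetical host -> zone mapping standing in for the cached host zones.
HOST_ZONES = {"host1": "zone1", "host2": "zone2"}


def displayed_zone(host: str, stored_zone: str) -> str:
    """Return the zone to show for a server that is on a host.

    When the stored value disagrees with the zone of the host, the zone of
    the host wins. The stored value itself is left unchanged.
    """
    host_zone = HOST_ZONES[host]
    if stored_zone != host_zone:
        return host_zone
    return stored_zone


# A server reverted to host1 whose database record still says zone2.
shown = displayed_zone("host1", "zone2")
print(shown)
```

The response looks correct here only because the stale stored value is overridden at display time.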

Matt Riedemann (mriedem) wrote:

Note that even though the API shows the correct AZ when showing the server details in isolation, filtering by the 'availability_zone' query parameter when listing servers will return wrong results, because the filter uses the stale value in the database.
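A toy illustration of the listing problem, assuming the filter compares against the stored value only. The `SERVERS` records and the `list_servers` helper are hypothetical and do not reflect how nova builds its database query.

```python
# Hypothetical server records. "reverted-vm" is back on host1 (zone1) after
# a reverted resize, but its stored availability_zone is still zone2.
SERVERS = [
    {"name": "reverted-vm", "host": "host1", "availability_zone": "zone2"},
    {"name": "other-vm", "host": "host2", "availability_zone": "zone2"},
]


def list_servers(servers, availability_zone):
    """Return the names of servers whose stored zone matches the filter."""
    return [
        s["name"] for s in servers
        if s["availability_zone"] == availability_zone
    ]


in_zone1 = list_servers(SERVERS, "zone1")
in_zone2 = list_servers(SERVERS, "zone2")
print(in_zone1, in_zone2)
```

The reverted server is missing from the zone1 listing and wrongly appears in the zone2 listing.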

Matt Riedemann (mriedem) on 2019-03-13
Changed in nova:
importance: Low → Medium

Fix proposed to branch: master
Review: https://review.openstack.org/643155

Changed in nova:
status: Triaged → In Progress

Reviewed: https://review.openstack.org/643151
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=73aaead294b9df412305abb1cb01aac95477bcc1
Submitter: Zuul
Branch: master

commit 73aaead294b9df412305abb1cb01aac95477bcc1
Author: Matt Riedemann <email address hidden>
Date: Wed Mar 13 15:53:09 2019 -0400

    Add functional recreate test for bug 1819963

    When resizing a server that was not created in an explicit
    zone, the scheduler can pick a host in another zone and
    conductor will update the instance.availability_zone value
    for the new dest host zone.

    The problem is when reverting the resize, the server goes
    back to the original source host/zone but the
    instance.availability_zone value in the database is not
    updated which can lead to incorrect results when listing
    servers and filtering by zone.

    This adds a functional recreate test for the bug.

    Change-Id: Ib107650d6a2c991c26b646a0dd10ddc7a3fb7e56
    Related-Bug: #1819963

Reviewed: https://review.openstack.org/643155
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=40f6672f53794b563f4c7e27ede7b59a1d63c14a
Submitter: Zuul
Branch: master

commit 40f6672f53794b563f4c7e27ede7b59a1d63c14a
Author: Matt Riedemann <email address hidden>
Date: Wed Mar 13 16:20:47 2019 -0400

    Update instance.availability_zone on revertResize

    When resizing a server that was not created in an explicit
    zone, the scheduler can pick a host in another zone and
    conductor will update the instance.availability_zone value
    for the new dest host zone.

    The problem is when reverting the resize, the server goes
    back to the original source host/zone but the
    instance.availability_zone value in the database is not
    updated which can lead to incorrect results when listing
    servers and filtering by zone.

    This fixes the bug by updating the instance.availability_zone
    value in the API (where we have access to the aggregates
    table in the API DB) before casting to nova-compute to
    complete the revert. As noted in the comment within, this
    is not fail-safe in case the revert fails before the
    instance.host is updated in finish_revert_resize, but we
    don't have a lot of great backportable options here that
    don't involve "up-calls" from the compute to the API DB.

    Change-Id: I8dc862b90d398b693b259abd3583616d07d8d206
    Closes-Bug: #1819963
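The ordering described in this commit message, and the caveat about it not being fail-safe, can be sketched roughly as follows. This is a simplified model and not the actual change: `revert_resize`, `finish_revert`, `failing_finish`, and the `zones` dict are hypothetical, and the hand-off to compute is modeled as a direct function call rather than an asynchronous cast.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Instance:
    host: str
    availability_zone: str


@dataclass
class Migration:
    source_compute: str
    dest_compute: str


def revert_resize(
    instance: Instance,
    migration: Migration,
    host_zones: Dict[str, str],
    finish_on_compute: Callable[[Instance, Migration], None],
) -> None:
    # API side: set the stored zone to the zone of the source host first,
    # while the host -> zone data is still reachable.
    instance.availability_zone = host_zones[migration.source_compute]
    # Compute side: complete the revert (a direct call in this sketch).
    finish_on_compute(instance, migration)


def finish_revert(instance: Instance, migration: Migration) -> None:
    instance.host = migration.source_compute


def failing_finish(instance: Instance, migration: Migration) -> None:
    raise RuntimeError("revert failed before instance.host was updated")


zones = {"host1": "zone1", "host2": "zone2"}
migration = Migration(source_compute="host1", dest_compute="host2")

# Normal case: host and stored zone agree again after the revert.
ok = Instance(host="host2", availability_zone="zone2")
revert_resize(ok, migration, zones, finish_revert)

# Failure case: the zone was already switched back, but the host was not.
broken = Instance(host="host2", availability_zone="zone2")
try:
    revert_resize(broken, migration, zones, failing_finish)
except RuntimeError:
    pass

print(ok, broken)
```

In the failure case the record ends up with the source zone while the server is still on the destination host, which is the trade-off the commit message accepts in order to avoid up-calls from the compute to the API DB.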

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.openstack.org/648401
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0b2345442dfc99586c1e2b5ca13c54fe8045e5c8
Submitter: Zuul
Branch: stable/stein

commit 0b2345442dfc99586c1e2b5ca13c54fe8045e5c8
Author: Matt Riedemann <email address hidden>
Date: Wed Mar 13 15:53:09 2019 -0400

    Add functional recreate test for bug 1819963

    When resizing a server that was not created in an explicit
    zone, the scheduler can pick a host in another zone and
    conductor will update the instance.availability_zone value
    for the new dest host zone.

    The problem is when reverting the resize, the server goes
    back to the original source host/zone but the
    instance.availability_zone value in the database is not
    updated which can lead to incorrect results when listing
    servers and filtering by zone.

    This adds a functional recreate test for the bug.

    Change-Id: Ib107650d6a2c991c26b646a0dd10ddc7a3fb7e56
    Related-Bug: #1819963
    (cherry picked from commit 73aaead294b9df412305abb1cb01aac95477bcc1)

tags: added: in-stable-stein

Reviewed: https://review.openstack.org/648402
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=26e59912838b145b627aa247c45b0b19393466c0
Submitter: Zuul
Branch: stable/stein

commit 26e59912838b145b627aa247c45b0b19393466c0
Author: Matt Riedemann <email address hidden>
Date: Wed Mar 13 16:20:47 2019 -0400

    Update instance.availability_zone on revertResize

    When resizing a server that was not created in an explicit
    zone, the scheduler can pick a host in another zone and
    conductor will update the instance.availability_zone value
    for the new dest host zone.

    The problem is when reverting the resize, the server goes
    back to the original source host/zone but the
    instance.availability_zone value in the database is not
    updated which can lead to incorrect results when listing
    servers and filtering by zone.

    This fixes the bug by updating the instance.availability_zone
    value in the API (where we have access to the aggregates
    table in the API DB) before casting to nova-compute to
    complete the revert. As noted in the comment within, this
    is not fail-safe in case the revert fails before the
    instance.host is updated in finish_revert_resize, but we
    don't have a lot of great backportable options here that
    don't involve "up-calls" from the compute to the API DB.

    Change-Id: I8dc862b90d398b693b259abd3583616d07d8d206
    Closes-Bug: #1819963
    (cherry picked from commit 40f6672f53794b563f4c7e27ede7b59a1d63c14a)

Reviewed: https://review.openstack.org/648409
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=440c0fb0ad7b127dfdcfb7302e0e31fa41d5255a
Submitter: Zuul
Branch: stable/rocky

commit 440c0fb0ad7b127dfdcfb7302e0e31fa41d5255a
Author: Matt Riedemann <email address hidden>
Date: Wed Mar 13 15:53:09 2019 -0400

    Add functional recreate test for bug 1819963

    When resizing a server that was not created in an explicit
    zone, the scheduler can pick a host in another zone and
    conductor will update the instance.availability_zone value
    for the new dest host zone.

    The problem is when reverting the resize, the server goes
    back to the original source host/zone but the
    instance.availability_zone value in the database is not
    updated which can lead to incorrect results when listing
    servers and filtering by zone.

    This adds a functional recreate test for the bug.

    Change-Id: Ib107650d6a2c991c26b646a0dd10ddc7a3fb7e56
    Related-Bug: #1819963
    (cherry picked from commit 73aaead294b9df412305abb1cb01aac95477bcc1)
    (cherry picked from commit 0b2345442dfc99586c1e2b5ca13c54fe8045e5c8)

tags: added: in-stable-rocky

Reviewed: https://review.openstack.org/648410
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1342cd75e9a091d28bec857f44f833ab5a7d1b96
Submitter: Zuul
Branch: stable/rocky

commit 1342cd75e9a091d28bec857f44f833ab5a7d1b96
Author: Matt Riedemann <email address hidden>
Date: Wed Mar 13 16:20:47 2019 -0400

    Update instance.availability_zone on revertResize

    When resizing a server that was not created in an explicit
    zone, the scheduler can pick a host in another zone and
    conductor will update the instance.availability_zone value
    for the new dest host zone.

    The problem is when reverting the resize, the server goes
    back to the original source host/zone but the
    instance.availability_zone value in the database is not
    updated which can lead to incorrect results when listing
    servers and filtering by zone.

    This fixes the bug by updating the instance.availability_zone
    value in the API (where we have access to the aggregates
    table in the API DB) before casting to nova-compute to
    complete the revert. As noted in the comment within, this
    is not fail-safe in case the revert fails before the
    instance.host is updated in finish_revert_resize, but we
    don't have a lot of great backportable options here that
    don't involve "up-calls" from the compute to the API DB.

    Conflicts:
          nova/compute/api.py
          nova/tests/unit/compute/test_compute_api.py

    NOTE(mriedem): The conflict is due to not having change
    I34ffaf285718059b55f90e812b57f1e11d566c6f in Rocky.

    Change-Id: I8dc862b90d398b693b259abd3583616d07d8d206
    Closes-Bug: #1819963
    (cherry picked from commit 40f6672f53794b563f4c7e27ede7b59a1d63c14a)
    (cherry picked from commit 26e59912838b145b627aa247c45b0b19393466c0)

Reviewed: https://review.openstack.org/648414
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1b15a342e3e174af745274ca62185e7e8f7f6cf9
Submitter: Zuul
Branch: stable/queens

commit 1b15a342e3e174af745274ca62185e7e8f7f6cf9
Author: Matt Riedemann <email address hidden>
Date: Wed Mar 13 15:53:09 2019 -0400

    Add functional recreate test for bug 1819963

    When resizing a server that was not created in an explicit
    zone, the scheduler can pick a host in another zone and
    conductor will update the instance.availability_zone value
    for the new dest host zone.

    The problem is when reverting the resize, the server goes
    back to the original source host/zone but the
    instance.availability_zone value in the database is not
    updated which can lead to incorrect results when listing
    servers and filtering by zone.

    This adds a functional recreate test for the bug.

    Change-Id: Ib107650d6a2c991c26b646a0dd10ddc7a3fb7e56
    Related-Bug: #1819963
    (cherry picked from commit 73aaead294b9df412305abb1cb01aac95477bcc1)
    (cherry picked from commit 0b2345442dfc99586c1e2b5ca13c54fe8045e5c8)
    (cherry picked from commit 440c0fb0ad7b127dfdcfb7302e0e31fa41d5255a)

tags: added: in-stable-queens

Reviewed: https://review.openstack.org/648415
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=99cbbcfd61c3569b9c299c99e4a2eaf38d335d19
Submitter: Zuul
Branch: stable/queens

commit 99cbbcfd61c3569b9c299c99e4a2eaf38d335d19
Author: Matt Riedemann <email address hidden>
Date: Wed Mar 13 16:20:47 2019 -0400

    Update instance.availability_zone on revertResize

    When resizing a server that was not created in an explicit
    zone, the scheduler can pick a host in another zone and
    conductor will update the instance.availability_zone value
    for the new dest host zone.

    The problem is when reverting the resize, the server goes
    back to the original source host/zone but the
    instance.availability_zone value in the database is not
    updated which can lead to incorrect results when listing
    servers and filtering by zone.

    This fixes the bug by updating the instance.availability_zone
    value in the API (where we have access to the aggregates
    table in the API DB) before casting to nova-compute to
    complete the revert. As noted in the comment within, this
    is not fail-safe in case the revert fails before the
    instance.host is updated in finish_revert_resize, but we
    don't have a lot of great backportable options here that
    don't involve "up-calls" from the compute to the API DB.

    Change-Id: I8dc862b90d398b693b259abd3583616d07d8d206
    Closes-Bug: #1819963
    (cherry picked from commit 40f6672f53794b563f4c7e27ede7b59a1d63c14a)
    (cherry picked from commit 26e59912838b145b627aa247c45b0b19393466c0)
    (cherry picked from commit 1342cd75e9a091d28bec857f44f833ab5a7d1b96)

Reviewed: https://review.opendev.org/648421
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=59c65a15e69da459f3124aacff8e5fabee1fdc9c
Submitter: Zuul
Branch: stable/pike

commit 59c65a15e69da459f3124aacff8e5fabee1fdc9c
Author: Matt Riedemann <email address hidden>
Date: Wed Mar 13 15:53:09 2019 -0400

    Add functional recreate test for bug 1819963

    When resizing a server that was not created in an explicit
    zone, the scheduler can pick a host in another zone and
    conductor will update the instance.availability_zone value
    for the new dest host zone.

    The problem is when reverting the resize, the server goes
    back to the original source host/zone but the
    instance.availability_zone value in the database is not
    updated which can lead to incorrect results when listing
    servers and filtering by zone.

    This adds a functional recreate test for the bug.

    Change-Id: Ib107650d6a2c991c26b646a0dd10ddc7a3fb7e56
    Related-Bug: #1819963
    (cherry picked from commit 73aaead294b9df412305abb1cb01aac95477bcc1)
    (cherry picked from commit 0b2345442dfc99586c1e2b5ca13c54fe8045e5c8)
    (cherry picked from commit 440c0fb0ad7b127dfdcfb7302e0e31fa41d5255a)
    (cherry picked from commit 1b15a342e3e174af745274ca62185e7e8f7f6cf9)

tags: added: in-stable-pike

Reviewed: https://review.opendev.org/648422
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bf7112ee4d8e90c79f99526e9071e6e9e94031aa
Submitter: Zuul
Branch: stable/pike

commit bf7112ee4d8e90c79f99526e9071e6e9e94031aa
Author: Matt Riedemann <email address hidden>
Date: Wed Mar 13 16:20:47 2019 -0400

    Update instance.availability_zone on revertResize

    When resizing a server that was not created in an explicit
    zone, the scheduler can pick a host in another zone and
    conductor will update the instance.availability_zone value
    for the new dest host zone.

    The problem is when reverting the resize, the server goes
    back to the original source host/zone but the
    instance.availability_zone value in the database is not
    updated which can lead to incorrect results when listing
    servers and filtering by zone.

    This fixes the bug by updating the instance.availability_zone
    value in the API (where we have access to the aggregates
    table in the API DB) before casting to nova-compute to
    complete the revert. As noted in the comment within, this
    is not fail-safe in case the revert fails before the
    instance.host is updated in finish_revert_resize, but we
    don't have a lot of great backportable options here that
    don't involve "up-calls" from the compute to the API DB.

    Change-Id: I8dc862b90d398b693b259abd3583616d07d8d206
    Closes-Bug: #1819963
    (cherry picked from commit 40f6672f53794b563f4c7e27ede7b59a1d63c14a)
    (cherry picked from commit 26e59912838b145b627aa247c45b0b19393466c0)
    (cherry picked from commit 1342cd75e9a091d28bec857f44f833ab5a7d1b96)
    (cherry picked from commit 99cbbcfd61c3569b9c299c99e4a2eaf38d335d19)

This issue was fixed in the openstack/nova 16.1.8 release.

Change abandoned by Eric Fried (<email address hidden>) on branch: master
Review: https://review.opendev.org/653504
Reason: Purpose served

Change abandoned by Eric Fried (<email address hidden>) on branch: master
Review: https://review.opendev.org/653505
Reason: Purpose served

This issue was fixed in the openstack/nova 19.0.1 release.

This issue was fixed in the openstack/nova 18.2.1 release.

This issue was fixed in the openstack/nova 17.0.11 release.

This issue was fixed in the openstack/nova 20.0.0.0rc1 release candidate.
