VM rebuild fails after Zed->2023.1 upgrade

Bug #2040264 reported by Dmitriy Rabotyagov
This bug affects 2 people
Affects                   Status         Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Committed  Undecided   Unassigned
2024.1                    Fix Committed  Undecided   Unassigned
Antelope                  New            Undecided   Unassigned
Bobcat                    New            Undecided   Unassigned
Zed                       New            Undecided   Unassigned

Bug Description

Description
===========

After upgrading nova, including the compute and conductor nodes, VM rebuild fails. All computes whose service state is UP, and all conductors, report service version 66. However, one compute was DOWN during the upgrade and still reports version 64.

Because of this, the conductor negotiates down to the minimum service version, 64. That still maps to an acceptable minimum RPC version, but it means a newly required argument is not passed.
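The negotiation described above can be sketched as follows. This is a minimal illustration, not nova's code: the `Service` class, the host names and the `RPC_VERSION_FOR_SERVICE` table are hypothetical. Only the pairing of service version 64 with RPC version 6.1 comes from the conductor log in this report; the 66 to 6.2 pairing is an assumption.

```python
from dataclasses import dataclass

# Hypothetical lookup table. Only 64 -> "6.1" is taken from the conductor log
# quoted in this report; 66 -> "6.2" is an assumption for illustration.
RPC_VERSION_FOR_SERVICE = {64: "6.1", 66: "6.2"}


@dataclass
class Service:
    host: str
    version: int
    up: bool


def pinned_rpc_version(services):
    """Pick the RPC version from the lowest service version on record.

    Services that are DOWN still count, which is how one stale compute
    can hold the whole deployment at an older RPC version.
    """
    minimum = min(s.version for s in services)
    return RPC_VERSION_FOR_SERVICE[minimum]


def holding_back(services):
    """Return the hosts whose service version is below the newest one seen."""
    newest = max(s.version for s in services)
    return [s.host for s in services if s.version < newest]


services = [
    Service("cmp1", 66, up=True),
    Service("cmp2", 66, up=True),
    Service("cmp3", 64, up=False),  # the compute that was DOWN during the upgrade
]
print(pinned_rpc_version(services))
print(holding_back(services))
```

In this sketch the single DOWN compute is enough to pin every rebuild call to the older RPC version, even though every reachable service is already upgraded.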

Steps to reproduce
==================

* Set up an environment with Nova version 26.2.0
* Upgrade to 27.1.0 while one compute is down or not upgraded (and thus can't update its RPC version to the latest, 66)
* Try to rebuild the VM: openstack server rebuild <server_uuid> --image <image_uuid>

Expected result
===============

The VM is rebuilt.

Actual result
=============

The VM is stuck in the rebuilding state, and nova-compute logs the stack trace linked under Logs & Configs.

Logs & Configs
==============
Stack trace from nova-compute:
https://paste.openstack.org/show/biUIcOzMCx0YlsFob2KK/

Nova-conductor negotiates the RPC version from the minimum service version:
INFO nova.compute.rpcapi [None req-2670be51-8233-4269-ac6a-f49486e8893d - - - - - -] Automatically selected compute RPC version 6.1 from minimum service version 64

Potentially, there's another issue upgrading from Yoga to 2023.1 related to this:
https://github.com/openstack/nova/commit/30aab9c234035b49c7e2cdc940f624a63eeffc1b#diff-47eb12598e353b9e0689707d7b477353200d0aa3ed13045ffd3d017ee7d9e753R3709

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/899234

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/899235

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/899375

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/899234
Committed: https://opendev.org/openstack/nova/commit/21fd0c430c714d21c52e0a0c996351c374a3e3d6
Submitter: "Zuul (22348)"
Branch: master

commit 21fd0c430c714d21c52e0a0c996351c374a3e3d6
Author: Sylvain Bauza <email address hidden>
Date: Wed Oct 25 10:51:21 2023 +0200

    add a regression test for all compute RPCAPI 6.x pinnings for rebuild

    We forgot that we automatically pin our RPC calls to the RPC version
    that the older compute supports, so when rolling-upgrading computes, we
    continue to use either Yoga or Zed versions for example when upgrading
    to 2023.1.

    Since the new parameters aren't optional, we broke the
    rebuild_instance() method then for Yoga to Zed and Zed to 2023.1.

    Change-Id: Icf340f3d4c5ce0a4b7388003f168e7c479e58eee
    Related-Bug: #2040264
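A regression test of the kind this commit describes might look roughly like the sketch below. It is not the test that was merged: `send_rebuild`, `rebuild_required`, the `ARGS_ADDED_IN` table and the argument names `reimage_boot_volume` and `target_state` are assumptions made for illustration. The idea is that a pinned sender drops arguments the older RPC version does not carry, while the upgraded receiver still requires them.

```python
import unittest

# Assumption for illustration: the RPC version in which each extra
# rebuild argument is taken to have been introduced.
ARGS_ADDED_IN = {"reimage_boot_volume": (6, 1), "target_state": (6, 2)}


def send_rebuild(handler, pinned, **msg_args):
    """Sender side: drop arguments that the pinned version does not carry."""
    for name, added in ARGS_ADDED_IN.items():
        if pinned < added:
            msg_args.pop(name, None)
    return handler(**msg_args)


def rebuild_required(instance, reimage_boot_volume, target_state):
    """Receiver side before the fix: both new parameters are required."""
    return instance


class RebuildPinningTest(unittest.TestCase):
    def test_all_6x_pins(self):
        args = dict(instance="vm-1", reimage_boot_volume=False,
                    target_state=None)
        for pinned in [(6, 0), (6, 1), (6, 2)]:
            with self.subTest(pinned=pinned):
                if pinned < (6, 2):
                    # An older pin drops at least one required argument.
                    with self.assertRaises(TypeError):
                        send_rebuild(rebuild_required, pinned, **args)
                else:
                    self.assertEqual(
                        send_rebuild(rebuild_required, pinned, **args),
                        "vm-1")
```

Saved in a module, this can be run with `python -m unittest`. Iterating over every 6.x pin is the point of the exercise: the failure only shows up when the sender is pinned below the newest version.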

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/899235
Committed: https://opendev.org/openstack/nova/commit/ee9ed0f7c6abf7c4847e6dc31f6d3d79b25b9d99
Submitter: "Zuul (22348)"
Branch: master

commit ee9ed0f7c6abf7c4847e6dc31f6d3d79b25b9d99
Author: Sylvain Bauza <email address hidden>
Date: Wed Oct 25 10:58:36 2023 +0200

    Fix rebuild compute RPC API exception for rolling-upgrades

    By I0d889691de1af6875603a9f0f174590229e7be18 we broke rebuild for Yoga
    or older computes.
    By I9660d42937ad62d647afc6be965f166cc5631392 we broke rebuild for Zed
    computes.

    Fixing this by making the parameters optional.

    Change-Id: I0ca04045f8ac742e2b50490cbe5efccaee45c5c0
    Closed-Bug: #2040264
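The shape of the fix described in this commit message, making the new parameters optional, can be sketched as below. This is a simplified stand-in rather than the actual change to nova/compute/manager.py: the function names, the reduced argument list and the parameter names `reimage_boot_volume` and `target_state` are assumptions for illustration.

```python
def send_rebuild(handler, pinned, **msg_args):
    """Sender side: drop arguments that the pinned RPC version lacks."""
    if pinned < (6, 2):
        msg_args.pop("target_state", None)
    if pinned < (6, 1):
        msg_args.pop("reimage_boot_volume", None)
    return handler(**msg_args)


def rebuild_instance(instance, image_ref,
                     reimage_boot_volume=None, target_state=None):
    """Receiver side after the fix: the newer parameters default to None,
    so a call pinned to an older version no longer fails on a missing
    argument."""
    return {
        "instance": instance,
        "image_ref": image_ref,
        "reimage_boot_volume": reimage_boot_volume,
        "target_state": target_state,
    }


# Every 6.x pin now reaches the handler; dropped arguments arrive as None.
for pinned in [(6, 0), (6, 1), (6, 2)]:
    result = send_rebuild(rebuild_instance, pinned, instance="vm-1",
                          image_ref="img-1", reimage_boot_volume=False,
                          target_state="stopped")
    print(pinned, result["reimage_boot_volume"], result["target_state"])
```

With defaults in place, the receiver has to treat `None` as "the caller could not send this", which is the usual trade-off when a handler must accept calls from senders pinned to older versions.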

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/899375
Committed: https://opendev.org/openstack/nova/commit/b64ecb0cc776bd3eced674b0f879bb23c8a4b486
Submitter: "Zuul (22348)"
Branch: master

commit b64ecb0cc776bd3eced674b0f879bb23c8a4b486
Author: Sylvain Bauza <email address hidden>
Date: Thu Oct 26 11:00:03 2023 +0200

    Adding server actions tests to grenade-multinode

    We recently found a rolling-upgrade bug on rebuild so we need to make
    sure that grenade-multinode can verify all our instance actions.

    Given we pin the compute RPC API version to the N-1 compute one, we are
    sure that all RPC calls continue to behave the previous release.

    NOTE : Given the previous cycle was already supporting 6.2 RPC version,
    we can't test here the previous problems hence why this is the last
    patch from the series.

    Change-Id: I1d8deb139922494dd74ff32965fd7dd74d1d768b
    Related-Bug: #2040264

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/2023.1)

Related fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/nova/+/900306

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/zed)

Related fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/nova/+/900307

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/nova/+/900336

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/2023.1)

Related fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/nova/+/900337

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/2023.2)

Related fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/nova/+/900309

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/nova/+/900338

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/2023.2)

Related fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/nova/+/900339

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/nova/+/900341

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/zed)

Related fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/nova/+/900342

Elod Illes (elod-illes)
Changed in nova:
status: New → Fix Committed

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/nova/+/900309
Committed: https://opendev.org/openstack/nova/commit/eb310f3bd2f21efe0dd2bc6b133694a687e8f5ff
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit eb310f3bd2f21efe0dd2bc6b133694a687e8f5ff
Author: Sylvain Bauza <email address hidden>
Date: Wed Oct 25 10:51:21 2023 +0200

    add a regression test for all compute RPCAPI 6.x pinnings for rebuild

    We forgot that we automatically pin our RPC calls to the RPC version
    that the older compute supports, so when rolling-upgrading computes, we
    continue to use either Yoga or Zed versions for example when upgrading
    to 2023.1.

    Since the new parameters aren't optional, we broke the
    rebuild_instance() method then for Yoga to Zed and Zed to 2023.1.

    Change-Id: Icf340f3d4c5ce0a4b7388003f168e7c479e58eee
    Related-Bug: #2040264
    (cherry picked from commit 21fd0c430c714d21c52e0a0c996351c374a3e3d6)

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/nova/+/900338
Committed: https://opendev.org/openstack/nova/commit/6b870ab90afe400ec82715e908afecbb00f0ed65
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit 6b870ab90afe400ec82715e908afecbb00f0ed65
Author: Sylvain Bauza <email address hidden>
Date: Wed Oct 25 10:58:36 2023 +0200

    Fix rebuild compute RPC API exception for rolling-upgrades

    By I0d889691de1af6875603a9f0f174590229e7be18 we broke rebuild for Yoga
    or older computes.
    By I9660d42937ad62d647afc6be965f166cc5631392 we broke rebuild for Zed
    computes.

    Fixing this by making the parameters optional.

    Change-Id: I0ca04045f8ac742e2b50490cbe5efccaee45c5c0
    Closed-Bug: #2040264
    (cherry picked from commit ee9ed0f7c6abf7c4847e6dc31f6d3d79b25b9d99)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/nova/+/900306
Committed: https://opendev.org/openstack/nova/commit/a861b575081b31090ff9f89120b2247a7586acf8
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit a861b575081b31090ff9f89120b2247a7586acf8
Author: Sylvain Bauza <email address hidden>
Date: Wed Oct 25 10:51:21 2023 +0200

    add a regression test for all compute RPCAPI 6.x pinnings for rebuild

    We forgot that we automatically pin our RPC calls to the RPC version
    that the older compute supports, so when rolling-upgrading computes, we
    continue to use either Yoga or Zed versions for example when upgrading
    to 2023.1.

    Since the new parameters aren't optional, we broke the
    rebuild_instance() method then for Yoga to Zed and Zed to 2023.1.

    Change-Id: Icf340f3d4c5ce0a4b7388003f168e7c479e58eee
    Related-Bug: #2040264
    (cherry picked from commit 21fd0c430c714d21c52e0a0c996351c374a3e3d6)
    (cherry picked from commit eb310f3bd2f21efe0dd2bc6b133694a687e8f5ff)

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/nova/+/900336
Committed: https://opendev.org/openstack/nova/commit/edfb3975807b3eda4fae0ea07a3d99871ca87cae
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit edfb3975807b3eda4fae0ea07a3d99871ca87cae
Author: Sylvain Bauza <email address hidden>
Date: Wed Oct 25 10:58:36 2023 +0200

    Fix rebuild compute RPC API exception for rolling-upgrades

    By I0d889691de1af6875603a9f0f174590229e7be18 we broke rebuild for Yoga
    or older computes.
    By I9660d42937ad62d647afc6be965f166cc5631392 we broke rebuild for Zed
    computes.

    Fixing this by making the parameters optional.

    Change-Id: I0ca04045f8ac742e2b50490cbe5efccaee45c5c0
    Closed-Bug: #2040264
    (cherry picked from commit ee9ed0f7c6abf7c4847e6dc31f6d3d79b25b9d99)
    (cherry picked from commit 6b870ab90afe400ec82715e908afecbb00f0ed65)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/nova/+/900307
Committed: https://opendev.org/openstack/nova/commit/e13c86b4f320a0a040b558f2e207a911fc9f6127
Submitter: "Zuul (22348)"
Branch: stable/zed

commit e13c86b4f320a0a040b558f2e207a911fc9f6127
Author: Sylvain Bauza <email address hidden>
Date: Wed Oct 25 10:51:21 2023 +0200

    add a regression test for all compute RPCAPI 6.x pinnings for rebuild

    We forgot that we automatically pin our RPC calls to the RPC version
    that the older compute supports, so when rolling-upgrading computes, we
    continue to use either Yoga or Zed versions for example when upgrading
    to 2023.1.

    Since the new parameters aren't optional, we broke the
    rebuild_instance() method then for Yoga to Zed and Zed to 2023.1.

    NOTE(elod.illes): test_rebuild_instance_6_1 test needed an update, as
    now we are on Zed branch, so zed to zed upgrade does not raise any
    Error, as we have the same parameters in the RPC call.

    Change-Id: Icf340f3d4c5ce0a4b7388003f168e7c479e58eee
    Related-Bug: #2040264
    (cherry picked from commit 21fd0c430c714d21c52e0a0c996351c374a3e3d6)
    (cherry picked from commit eb310f3bd2f21efe0dd2bc6b133694a687e8f5ff)
    (cherry picked from commit a861b575081b31090ff9f89120b2247a7586acf8)

tags: added: in-stable-zed

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/nova/+/900341
Committed: https://opendev.org/openstack/nova/commit/1b9c4c7e64425196b5776154a0618c9e2a763be8
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 1b9c4c7e64425196b5776154a0618c9e2a763be8
Author: Sylvain Bauza <email address hidden>
Date: Wed Oct 25 10:58:36 2023 +0200

    Fix rebuild compute RPC API exception for rolling-upgrades

    By I0d889691de1af6875603a9f0f174590229e7be18 we broke rebuild for Yoga
    or older computes.
    By I9660d42937ad62d647afc6be965f166cc5631392 we broke rebuild for Zed
    computes.

    Fixing this by making the parameters optional.

    Conflicts:
      nova/compute/manager.py

    NOTE(elod.illes): conflict is due to feature 'allowing target state for
    evacuate' I9660d42937ad62d647afc6be965f166cc5631392 was added in 2023.1
    Antelope cycle.

    Change-Id: I0ca04045f8ac742e2b50490cbe5efccaee45c5c0
    Closed-Bug: #2040264
    (cherry picked from commit ee9ed0f7c6abf7c4847e6dc31f6d3d79b25b9d99)
    (cherry picked from commit 6b870ab90afe400ec82715e908afecbb00f0ed65)
    (cherry picked from commit edfb3975807b3eda4fae0ea07a3d99871ca87cae)

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/zed)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/zed
Review: https://review.opendev.org/c/openstack/nova/+/900342
Reason: stable/zed branch of openstack/nova is about to be deleted. To be able to do that, all open patches need to be abandoned. Please cherry pick the patch to unmaintained/zed if you want to further work on this patch.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/nova/+/900339
Committed: https://opendev.org/openstack/nova/commit/611bd95aab3800419bdc38e37da7d7e145fef558
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit 611bd95aab3800419bdc38e37da7d7e145fef558
Author: Sylvain Bauza <email address hidden>
Date: Thu Oct 26 11:00:03 2023 +0200

    Adding server actions tests to grenade-multinode

    We recently found a rolling-upgrade bug on rebuild so we need to make
    sure that grenade-multinode can verify all our instance actions.

    Given we pin the compute RPC API version to the N-1 compute one, we are
    sure that all RPC calls continue to behave the previous release.

    NOTE : Given the previous cycle was already supporting 6.2 RPC version,
    we can't test here the previous problems hence why this is the last
    patch from the series.

    Conflicts:
      .zuul.yaml

    NOTE(elod.illes): conflict is due to test_live_migration_with_trunk
    test exlusion (I0a8dd6e6e30526aa2841b4db67ed9affed166fd8) was not
    backported to stable branches.

    Change-Id: I1d8deb139922494dd74ff32965fd7dd74d1d768b
    Related-Bug: #2040264
    (cherry picked from commit b64ecb0cc776bd3eced674b0f879bb23c8a4b486)
