Upgrades to compute RPC API 5.12 are broken

Bug #1902925 reported by Sylvain Bauza
This bug affects 1 person
Affects                   Status        Importance  Assigned to    Milestone
OpenStack Compute (nova)  Fix Released  Critical    Sylvain Bauza
Victoria                  Fix Released  Critical    Sylvain Bauza

Bug Description

In change https://review.opendev.org/#/c/715326/ we added a new argument named 'accel_uuids' to the rebuild_instance() RPC method.

In the same change, in order to handle computes running different versions, we made the client omit this argument when the destination RPC service cannot speak 5.12.
However, we forgot to make the accel_uuids argument optional (nullable) on the compute manager, so the client casts a call without this argument while the manager still expects it, which leads to a TypeError on the server side.

FWIW, this can happen with any RPC pin, even with compute='auto', as this value automatically pins to a version that both the source and destination can support.
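
The failure mode described above can be reproduced in isolation. The following is a minimal sketch in plain Python, not nova code: the function names, the dispatch helper, and the fake context and instance values are all hypothetical, and only the 'accel_uuids' argument and the resulting TypeError come from this report. It assumes the client drops the argument whenever the pin is below 5.12.

```python
# Hypothetical stand-ins for the compute manager method; not nova code.
def rebuild_instance_broken(context, instance, accel_uuids):
    # accel_uuids is required here, as in the change that introduced the bug.
    return accel_uuids


def rebuild_instance_fixed(context, instance, accel_uuids=None):
    # A default of None lets callers pinned below 5.12 omit the argument.
    return accel_uuids


def build_call_kwargs(can_send_5_12, accel_uuids):
    # Client side: only pass accel_uuids when the pinned version allows 5.12.
    kwargs = {"instance": "fake-instance"}
    if can_send_5_12:
        kwargs["accel_uuids"] = accel_uuids
    return kwargs


def dispatch(method, kwargs):
    # Server side: the RPC dispatcher calls the manager with the received kwargs.
    return method("fake-context", **kwargs)


if __name__ == "__main__":
    pinned_kwargs = build_call_kwargs(False, ["fake-uuid"])
    try:
        dispatch(rebuild_instance_broken, pinned_kwargs)
    except TypeError as exc:
        print("broken manager:", exc)
    print("fixed manager:", dispatch(rebuild_instance_fixed, pinned_kwargs))
```

With the required parameter, any call built under a pin below 5.12 fails as soon as the server dispatches it; with the default value, the same call goes through and the manager simply sees None.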

Changed in nova:
status: New → Confirmed
importance: Undecided → High
importance: High → Critical
assignee: nobody → Sylvain Bauza (sylvain-bauza)
tags: added: compute upgrade
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/761457

Changed in nova:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/761458

Sylvain Bauza (sylvain-bauza) wrote :

To be clear, this issue impacts all rolling upgrades with mixed versions of compute services where operators set the RPC pin to 'auto' (or to any value other than the default, which is none), as mentioned in our docs: https://docs.openstack.org/nova/latest/user/upgrade.html#rolling-upgrade-process

See https://docs.openstack.org/nova/latest/configuration/config.html#upgrade_levels.compute

In the case of an upgrade from Ussuri to Victoria, operators who still want to run Ussuri computes would set the pin to 'auto', which selects RPC version 5.11 for all compute services and therefore breaks the Victoria ones.
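
As a rough illustration of why 'auto' ends up at 5.11 in a mixed Ussuri and Victoria deployment, the sketch below picks the lowest version among the running services and checks whether 5.12 can be sent under that pin. This is an assumption-laden simplification: the helper names are hypothetical and nova's actual version negotiation is not reproduced here. Only the 5.11 and 5.12 version numbers come from this report.

```python
def parse_version(version):
    # Turn "5.11" into (5, 11) so versions compare numerically, not as strings.
    major, minor = version.split(".")
    return int(major), int(minor)


def pick_auto_pin(supported_versions):
    # Hypothetical 'auto' behaviour: pin to the lowest version any service speaks.
    return min(supported_versions, key=parse_version)


def can_send_version(pin, wanted):
    # A message at version `wanted` is allowed only if the pin is at least that high
    # within the same major version.
    pin_major, pin_minor = parse_version(pin)
    wanted_major, wanted_minor = parse_version(wanted)
    return pin_major == wanted_major and pin_minor >= wanted_minor


if __name__ == "__main__":
    # One Ussuri compute (5.11) among Victoria computes (5.12).
    pin = pick_auto_pin(["5.12", "5.11", "5.12"])
    print("pinned to:", pin)
    print("can send 5.12:", can_send_version(pin, "5.12"))
```

Under this pin the client falls back to the pre-5.12 call shape for every compute, including the Victoria ones whose manager expects the new argument.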

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/761638

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/761639

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/761457
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8f79afd448f0234eab82eb1d3e3d48e0f657bcc7
Submitter: Zuul
Branch: master

commit 8f79afd448f0234eab82eb1d3e3d48e0f657bcc7
Author: Sylvain Bauza <email address hidden>
Date: Wed Nov 4 20:16:59 2020 +0100

    Add a regression test for 5.12 compute API issue

    In I147bf4d95e6d86ff1f967a8ce37260730f21d236 we wrote a breaking RPC change
    for the 5.12 version as the accel_uuids parameter is not optional.

    Adding a regression test to check the issue.

    Change-Id: I1f3914e16294c99a625b3984ca0098d835cd9b92
    Related-Bug: #1902925

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/761458
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8d9f298f4c48e32bc625c7e2f47d47e4277ec064
Submitter: Zuul
Branch: master

commit 8d9f298f4c48e32bc625c7e2f47d47e4277ec064
Author: Sylvain Bauza <email address hidden>
Date: Wed Nov 4 20:20:52 2020 +0100

    Fix the compute RPC 5.12 issue

    In I147bf4d95e6d86ff1f967a8ce37260730f21d236 we added a new argument for
    the rebuild_instance() RPC method. Unfortunately, we missed to that it
    needs to be optional for older versions.

    Adding a default none value for it so rolling upgrades would work.

    Change-Id: I59c5e56b00114fea5ec19fa63ec73f032dc3bd5c
    Closes-Bug: #1902925

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/victoria)

Reviewed: https://review.opendev.org/761638
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=239ffff2abdeeea08b27c9108216c73c34b41345
Submitter: Zuul
Branch: stable/victoria

commit 239ffff2abdeeea08b27c9108216c73c34b41345
Author: Sylvain Bauza <email address hidden>
Date: Wed Nov 4 20:16:59 2020 +0100

    Add a regression test for 5.12 compute API issue

    In I147bf4d95e6d86ff1f967a8ce37260730f21d236 we wrote a breaking RPC change
    for the 5.12 version as the accel_uuids parameter is not optional.

    Adding a regression test to check the issue.

    Change-Id: I1f3914e16294c99a625b3984ca0098d835cd9b92
    Related-Bug: #1902925
    (cherry picked from commit 8f79afd448f0234eab82eb1d3e3d48e0f657bcc7)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/victoria)

Reviewed: https://review.opendev.org/761639
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a06e27592f5819b41e45a9e69f677d059205b87d
Submitter: Zuul
Branch: stable/victoria

commit a06e27592f5819b41e45a9e69f677d059205b87d
Author: Sylvain Bauza <email address hidden>
Date: Wed Nov 4 20:20:52 2020 +0100

    Fix the compute RPC 5.12 issue

    In I147bf4d95e6d86ff1f967a8ce37260730f21d236 we added a new argument for
    the rebuild_instance() RPC method. Unfortunately, we missed to that it
    needs to be optional for older versions.

    Adding a default none value for it so rolling upgrades would work.

    Change-Id: I59c5e56b00114fea5ec19fa63ec73f032dc3bd5c
    Closes-Bug: #1902925
    (cherry picked from commit 8d9f298f4c48e32bc625c7e2f47d47e4277ec064)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 23.0.0.0rc1

This issue was fixed in the openstack/nova 23.0.0.0rc1 release candidate.
