nova-status upgrade check fails on Object ID linkage

Bug #2039597 reported by Dmitriy Rabotyagov
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Committed
Importance: High
Assigned to: Unassigned

Bug Description

Description
===========

After upgrading from 2023.1 to 2023.2, running nova-status upgrade check fails with exit code 2.

As the documentation [1] instructs, the command was run with the new codebase (2023.2) but before any service (api/conductor/scheduler/compute) was restarted, so the services were still running the 2023.1 codebase.

At the same time, all compute services are up and healthy:

# openstack compute service list
+--------------------------------------+----------------+------+----------+---------+-------+----------------------------+
| ID                                   | Binary         | Host | Zone     | Status  | State | Updated At                 |
+--------------------------------------+----------------+------+----------+---------+-------+----------------------------+
| 001ea1ce-363f-41d1-9ce3-59ff966452a7 | nova-conductor | aio1 | internal | enabled | up    | 2023-10-17T18:14:38.000000 |
| 8df25103-65c9-4892-be05-ebed7f3c1ad4 | nova-scheduler | aio1 | internal | enabled | up    | 2023-10-17T18:14:40.000000 |
| d85b115a-cd8a-4ac9-82bc-f7a5f457cedc | nova-compute   | aio1 | nova     | enabled | up    | 2023-10-17T18:14:39.000000 |
+--------------------------------------+----------------+------+----------+---------+-------+----------------------------+

Steps to reproduce
==================

* Run cluster on 2023.1
* Perform upgrade to 2023.2 but do not restart nova services (as assumed by the documentation)
* Run nova-status upgrade check

Expected result
===============

Upgrade check passes

Actual result
=============

+---------------------------------------------------------------------+
| Check: Object ID linkage                                            |
| Result: Failure                                                     |
| Details: Compute node objects without service_id linkage were found |
|   in the database. Ensure all non-deleted compute services          |
|   have started with upgraded code.                                  |
+---------------------------------------------------------------------+
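For readers scripting around this check, a minimal sketch of how the exit code could be interpreted. The meanings below follow the return codes listed in the nova-status documentation as I understand them (only code 2 is confirmed by this report); the helper names are hypothetical and not part of nova.

```python
import subprocess

# Return codes of `nova-status upgrade check`, per the nova-status docs
# (assumption for 0, 1 and 255; this report only confirms 2 = failure).
EXIT_MEANINGS = {
    0: "success: all checks passed",
    1: "warning: at least one check needs investigation, upgrade may be OK",
    2: "failure: at least one check failed, upgrade should stop",
    255: "error: unexpected error while running the checks",
}


def describe_exit_code(code: int) -> str:
    """Map an exit code to a human-readable meaning."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def upgrade_may_proceed(code: int) -> bool:
    """Treat success and warning as non-blocking; everything else blocks."""
    return code in (0, 1)


def run_upgrade_check(cmd=("nova-status", "upgrade", "check")) -> int:
    """Run the check and return its exit code (needs nova-status on PATH)."""
    return subprocess.run(list(cmd), check=False).returncode
```

With the failure shown above, `describe_exit_code(2)` reports a blocking failure, which is why an upgrade pipeline that gates on this command stops here.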

[1] https://docs.openstack.org/nova/latest/cli/nova-status.html#upgrade

sean mooney (sean-k-mooney) wrote :

just leaving some context before i finish for tonight.
when i originally asked for this check, the intent was to detect the case where we were going to rely on this linkage in the next release and to notify the operator that one of the compute nodes was not upgraded.

chatting about this on irc that logic was flawed.
in the current release we don't depend on this being set.

before the compute nodes are upgraded and restarted it will never be set, and the help text for this
command says it should be run before the services are restarted to execute the new code.

the check as written will also not support clouds that have ironic deployed, as they will not have a compute service id set in the compute node record (and can't until we remove the hash ring code)

for those reasons i think we should likely revert this status check.

alternatively we can modify it to filter out ironic compute nodes and add a min compute service version check at the start, i.e. only run it if all compute services are at least upgraded to bobcat.

if they are 2023.2+ and we filter out ironic, that means you have db corruption, as something removed the compute service id from the compute node record.
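The alternative described above could look roughly like the following sketch. It is not nova code: the dataclasses, the function name, the "ironic" hypervisor type string, and the minimum version number are all hypothetical placeholders, since the real Bobcat service version is not given in this report.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Placeholder only: the real 2023.2 (Bobcat) service version is not stated here.
HYPOTHETICAL_MIN_VERSION = 100


@dataclass
class ComputeService:
    host: str
    version: int


@dataclass
class ComputeNode:
    host: str
    hypervisor_type: str
    service_id: Optional[int]
    deleted: bool = False


def check_linkage(
    services: Sequence[ComputeService],
    nodes: Sequence[ComputeNode],
    min_version: int = HYPOTHETICAL_MIN_VERSION,
) -> str:
    """Return 'skipped', 'success' or 'failure' for the linkage check."""
    # Gate: run only once every compute service reports the new version.
    if any(s.version < min_version for s in services):
        return "skipped"
    # Ignore deleted nodes and ironic nodes, which cannot have a service id yet.
    unlinked = [
        n for n in nodes
        if not n.deleted
        and n.hypervisor_type != "ironic"
        and n.service_id is None
    ]
    return "failure" if unlinked else "success"
```

Under these assumptions the scenario in this bug (services still on 2023.1) yields "skipped" rather than a failure, and a "failure" only appears when an upgraded, non-ironic node has lost its service id.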

Changed in nova:
status: New → Triaged
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/898741

Changed in nova:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/898741
Committed: https://opendev.org/openstack/nova/commit/e1b84a398766bdbccf2d834364fc9e9a7547bb4a
Submitter: "Zuul (22348)"
Branch: master

commit e1b84a398766bdbccf2d834364fc9e9a7547bb4a
Author: Dan Smith <email address hidden>
Date: Wed Oct 18 07:23:29 2023 -0700

    Revert "Add upgrade check for compute-object-ids linkage"

    This is being reverted because it's overly strict and complaining
    that upgrade-related work has not been done before it should have or
    needs to have been done. This may be re-added later when we start
    depending on these linkages.

    Closes-Bug: #2039597
    This reverts commit 27f384b7ac4f19ffaf884d77484814a220b2d51d.

    Change-Id: Ifa5b82ca3b83d0ba481aa7a062827bd8e838989a

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/nova/+/898953

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/nova/+/898953
Committed: https://opendev.org/openstack/nova/commit/a5e26bf6cab336955c6d5a1c261f9ee25604884c
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit a5e26bf6cab336955c6d5a1c261f9ee25604884c
Author: Dan Smith <email address hidden>
Date: Wed Oct 18 07:23:29 2023 -0700

    Revert "Add upgrade check for compute-object-ids linkage"

    This is being reverted because it's overly strict and complaining
    that upgrade-related work has not been done before it should have or
    needs to have been done. This may be re-added later when we start
    depending on these linkages.

    Closes-Bug: #2039597
    This reverts commit 27f384b7ac4f19ffaf884d77484814a220b2d51d.

    Change-Id: Ifa5b82ca3b83d0ba481aa7a062827bd8e838989a
    (cherry picked from commit e1b84a398766bdbccf2d834364fc9e9a7547bb4a)

Elod Illes (elod-illes)
Changed in nova:
status: Fix Released → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 28.0.1

This issue was fixed in the openstack/nova 28.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 29.0.0.0rc1

This issue was fixed in the openstack/nova 29.0.0.0rc1 release candidate.
