No mechanism to wait for computes to update service version before restarting to remove RPC version cap after upgrade
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Triaged | Wishlist | Unassigned |
Bug Description
Description
===========
When performing an upgrade, services cap their RPC version when communicating with nova-compute to that of the compute service with the lowest version. Once all computes are running the new version, we can restart the services to remove this cap.
When deployment tools try to automate this procedure there is no interface available to check or wait for all computes to be running the new version.
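To make the capping behaviour concrete, here is a toy model of it. This is only a sketch: the record layout (`host`, `binary`, `version`) and the function name `compute_rpc_cap` are assumptions for illustration, and it treats the service version itself as the cap, whereas nova derives an RPC version from it.

```python
def compute_rpc_cap(service_records):
    """Return the lowest version reported by any nova-compute record.

    Simplified model: other services cap to the oldest compute.
    Returns None when no compute records are present.
    """
    compute_versions = [
        rec["version"]
        for rec in service_records
        if rec["binary"] == "nova-compute"
    ]
    if not compute_versions:
        return None
    return min(compute_versions)


# Hypothetical snapshot taken mid-upgrade: cmp1 has not been upgraded yet.
services = [
    {"host": "cmp1", "binary": "nova-compute", "version": 30},
    {"host": "cmp2", "binary": "nova-compute", "version": 35},
    {"host": "ctl1", "binary": "nova-conductor", "version": 35},
]
cap = compute_rpc_cap(services)
```

In this model the cap stays at the old version until `cmp1` reports the new one. That is the moment a deployment tool currently has no way to detect.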
Steps to reproduce
==================
Perform a rolling upgrade of nova, following https:/
Expected results
================
After starting up nova services with the new code, there is some way to check which compute services are running the latest RPC version (from the 'services' DB table). Ideally it would not be necessary for the caller to know the actual minimum RPC version.
Actual results
==============
We need to insert a manual step to check the service versions in the database, or a 'sleep' for long enough that we can be sure that all services are up and running the new version.
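As an illustration of what could replace the fixed 'sleep', a deployment tool might poll with a timeout. This is a minimal sketch, not an existing nova interface: `wait_for_computes` and `fetch_versions` are hypothetical names, and `fetch_versions` stands in for whatever source of service versions is available (today, a query against the 'services' table), returning a mapping of host to service version.

```python
import time


def wait_for_computes(fetch_versions, target_version, timeout=600.0,
                      interval=5.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until every compute reports at least target_version.

    fetch_versions is a caller-supplied callable (hypothetical) that
    returns {host: service_version}. Returns True once no compute is
    behind, or False if the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        versions = fetch_versions()
        if all(v >= target_version for v in versions.values()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Usage with canned data: the first poll sees one old compute,
# the second poll sees all computes upgraded.
polls = iter([{"cmp1": 30, "cmp2": 35}, {"cmp1": 35, "cmp2": 35}])
done = wait_for_computes(lambda: next(polls), 35, timeout=10,
                         interval=0, sleep=lambda s: None)
```

The caller still has to pass the target version here, which is the part the report would ideally like to avoid.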
tags: added: upgrade
I have thought about adding a new microversion to the os-services API to include the service version for something like this in the past but never went through with it. I'm not sure you'd want something to be relying on the REST API during an upgrade anyway. I'm guessing we'd want something in the "nova-status upgrade" command area, either in the "check" subcommand (though that's more for pre and post validation - to which this could apply) or maybe some new subcommand.
I'm not sure what you mean by "Ideally it would not be necessary for the caller to know the actual minimum RPC version." Aren't you just looking for a command to say whether or not all services (does it need to be just nova-compute binaries or also things like nova-conductor, nova-scheduler, etc?) are running the current latest version for that release, i.e. this:
https://github.com/openstack/nova/blob/7ecdee01ed0efcecc4b448118a412d2c6554a27d/nova/objects/service.py#L34
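Roughly, a check along those lines could look like the sketch below. The function name `services_behind` and the record layout are made up for illustration. `latest_version` is passed in explicitly here. In nova it would presumably come from the constant linked above, so the caller would not need to know it. The `binaries` filter covers the compute-only versus all-services question.

```python
from collections import defaultdict


def services_behind(service_records, latest_version, binaries=None):
    """Group hosts running older than latest_version by binary.

    binaries optionally restricts the check, e.g. {"nova-compute"}.
    An empty result means every checked service is up to date.
    """
    behind = defaultdict(list)
    for rec in service_records:
        if binaries is not None and rec["binary"] not in binaries:
            continue
        if rec["version"] < latest_version:
            behind[rec["binary"]].append(rec["host"])
    return {binary: sorted(hosts) for binary, hosts in behind.items()}


# Hypothetical records; 35 stands in for the current release's version.
records = [
    {"host": "cmp2", "binary": "nova-compute", "version": 30},
    {"host": "cmp1", "binary": "nova-compute", "version": 30},
    {"host": "ctl1", "binary": "nova-scheduler", "version": 30},
    {"host": "ctl1", "binary": "nova-conductor", "version": 35},
]
all_behind = services_behind(records, 35)
computes_behind = services_behind(records, 35, binaries={"nova-compute"})
```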
This sounds like more of a specless blueprint than a bug so I'm going to mark this as wishlist but it sounds simple enough that we could add something to the nova-status upgrade command.