Upgrade from Queens to Rocky causes volume, scheduler service failures until upgrade is complete across all units

Bug #1922924 reported by Michael Skalka
This bug affects 1 person
Affects: OpenStack Cinder Charm
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Using the latest stable cinder charm in an HA configuration with hacluster, one unit went into a blocked state following an OpenStack release upgrade from Queens to Rocky:

ubuntu@playground-cpe-9f305f1c-537b-477e-afca-1ac364a24afd:~/project$ juju status cinder
Model      Controller        Cloud/Region  Version  SLA          Timestamp
openstack  foundations-maas  maas_cloud    2.8.10   unsupported  13:58:05Z

App                    Version  Status   Scale  Charm             Store       Rev  OS      Notes
cinder                 13.0.9   blocked      3  cinder            jujucharms  308  ubuntu
cinder-ceph            13.0.9   active       3  cinder-ceph       jujucharms  260  ubuntu
hacluster-cinder                active       3  hacluster         jujucharms   74  ubuntu
nrpe-container                  active       3  nrpe              jujucharms   70  ubuntu
public-policy-routing           active       3  advanced-routing  jujucharms    4  ubuntu

Unit                        Workload  Agent  Machine  Public address  Ports          Message
cinder/0                    active    idle   0/lxd/0  10.244.49.42    8776/tcp       Unit is ready
  cinder-ceph/1             active    idle            10.244.49.42                   Unit is ready
  hacluster-cinder/1        active    idle            10.244.49.42                   Unit is ready and clustered
  nrpe-container/12         active    idle            10.244.49.42    icmp,5666/tcp  ready
  public-policy-routing/12  active    idle            10.244.49.42                   Unit is ready
cinder/1                    blocked   idle   1/lxd/0  10.244.49.35    8776/tcp       Services not running that should be: cinder-scheduler, cinder-volume
  cinder-ceph/2             active    idle            10.244.49.35                   Unit is ready
  hacluster-cinder/2        active    idle            10.244.49.35                   Unit is ready and clustered
  nrpe-container/14         active    idle            10.244.49.35    icmp,5666/tcp  ready
  public-policy-routing/15  active    idle            10.244.49.35                   Unit is ready
cinder/2*                   active    idle   2/lxd/0  10.244.49.29    8776/tcp       Unit is ready
  cinder-ceph/0*            active    idle            10.244.49.29                   Unit is ready
  hacluster-cinder/0*       active    idle            10.244.49.29                   Unit is ready and clustered
  nrpe-container/13         active    idle            10.244.49.29    icmp,5666/tcp  ready
  public-policy-routing/10  active    idle            10.244.49.29                   Unit is ready

The volume and scheduler logs on the blocked unit show an issue with versioned objects being capped:

2021-04-07 13:46:36.744 826223 INFO cinder.rpc [req-022c4cb1-95d4-4881-a8de-ad99cdfc5d33 - - - - -] Automatically selected cinder-scheduler objects version 1.38 as minimum service version.
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume [req-022c4cb1-95d4-4881-a8de-ad99cdfc5d33 - - - - -] Volume service cinder@cinder-ceph failed to start.: CappedVersionUnknown: Unrecoverable Error: Versioned Objects in DB are capped to unknown version 1.38. Most likely your environment contains only new services and you're trying to start an older one. Use `cinder-manage service list` to check that and upgrade this service.
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume Traceback (most recent call last):
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume File "/usr/lib/python2.7/dist-packages/cinder/cmd/volume.py", line 105, in _launch_service
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume cluster=cluster)
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 403, in create
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume cluster=cluster)
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 156, in __init__
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume *args, **kwargs)
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume File "/usr/lib/python2.7/dist-packages/cinder/volume/manager.py", line 218, in __init__
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume *args, **kwargs)
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume File "/usr/lib/python2.7/dist-packages/cinder/manager.py", line 183, in __init__
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume self.scheduler_rpcapi = scheduler_rpcapi.SchedulerAPI()
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume File "/usr/lib/python2.7/dist-packages/cinder/rpc.py", line 208, in __init__
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume serializer = base.CinderObjectSerializer(obj_version_cap)
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume File "/usr/lib/python2.7/dist-packages/cinder/objects/base.py", line 543, in __init__
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume raise exception.CappedVersionUnknown(version=version_cap)
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume CappedVersionUnknown: Unrecoverable Error: Versioned Objects in DB are capped to unknown version 1.38. Most likely your environment contains only new services and you're trying to start an older one. Use `cinder-manage service list` to check that and upgrade this service.
2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume
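
To check other units for the same failure, the capped version can be pulled out of the service logs. The following is a minimal sketch, assuming the message text matches the log excerpt above; `find_capped_versions` is a hypothetical helper, not part of cinder or the charm.

```python
import re

# Matches the CappedVersionUnknown message text as it appears in this report.
CAPPED_RE = re.compile(
    r"CappedVersionUnknown: .*?capped to unknown version (?P<version>\d+\.\d+)"
)


def find_capped_versions(log_text):
    """Return the distinct object versions named in CappedVersionUnknown errors."""
    versions = []
    for line in log_text.splitlines():
        match = CAPPED_RE.search(line)
        if match and match.group("version") not in versions:
            versions.append(match.group("version"))
    return versions


sample = (
    "2021-04-07 13:46:36.745 826223 ERROR cinder.cmd.volume "
    "CappedVersionUnknown: Unrecoverable Error: Versioned Objects in DB "
    "are capped to unknown version 1.38. Most likely your environment "
    "contains only new services and you're trying to start an older one."
)
print(find_capped_versions(sample))
```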

Processing the upgrade on this unit resolves the issue. This doesn't seem critical, but it will disrupt any users attempting a live upgrade with no downtime to control plane services.
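
The failure mode in the traceback can be modelled roughly as follows. This is a simplified sketch, not cinder's actual code: `check_version_cap` and the version lists are hypothetical, and only the cap value 1.38 comes from the log above.

```python
class CappedVersionUnknown(Exception):
    """Stand-in for the cinder exception named in the traceback."""


def check_version_cap(known_versions, version_cap):
    """Return the cap if this service knows it, otherwise refuse to start."""
    if version_cap not in known_versions:
        raise CappedVersionUnknown(
            "Versioned Objects in DB are capped to unknown version %s" % version_cap
        )
    return version_cap


# Illustrative histories only: 1.38 is the cap reported in the log above,
# the other version strings are placeholders.
not_yet_upgraded = ["1.36", "1.37"]
upgraded = not_yet_upgraded + ["1.38"]

print(check_version_cap(upgraded, "1.38"))
try:
    check_version_cap(not_yet_upgraded, "1.38")
except CappedVersionUnknown as exc:
    print("service would fail to start:", exc)
```

Under this model, a unit still running the older packages cannot start its services once the cap in the database names a version it does not know, which is consistent with the upgrade of that unit clearing the error.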

tags: added: openstack-upgrade
Billy Olsen (billy-olsen) wrote:

I suspect that to fix this we would need to look at version pinning of the cinder (and other) services.
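
One possible reading of version pinning is to hold the object version at the lowest level any unit supports until every unit has been upgraded. The sketch below only illustrates that idea; `pinned_version` and the per-unit versions are hypothetical and do not describe how the charm or cinder currently behave.

```python
def parse_version(version):
    """Turn a version string such as '1.38' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def pinned_version(unit_versions):
    """Pick the lowest object version among all units (hypothetical helper)."""
    return min(unit_versions.values(), key=parse_version)


# Hypothetical mid-upgrade state: two units upgraded, one not yet.
units = {"cinder/0": "1.38", "cinder/1": "1.37", "cinder/2": "1.38"}
print(pinned_version(units))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating "1.9" as newer than "1.10".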

Changed in charm-cinder:
status: New → Triaged
importance: Undecided → Medium