cinder-plugin-ceph-tempest job failing consistently

Bug #2034933 reported by Rajat Dhasmana
This bug affects 1 person
Affects: tempest
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

The cinder-plugin-ceph-tempest job started failing after change 2c2484ca6e1835105b4e322a65f6e7f588736e61 merged.

The reason is that we started passing a value to the "container" parameter[1] (previously None), which the Ceph backup driver treats as a pool name[2], so we get the following error:

Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server rados.ObjectNotFound: [errno 2] RADOS object not found (error opening pool 'tempest-VolumesBackupsAdminTest-backup-container-209081170')

If no value is passed for the container parameter (None), the driver falls back to the value of the backup_ceph_pool config option in cinder.conf[3][4], so the pool name is "backups" (or whatever is defined in cinder.conf) instead of "tempest-VolumesBackupsAdminTest-backup-container-209081170".

[1] https://opendev.org/openstack/tempest/src/commit/d37b68ed889a5fee1c15b7a396c0502e6c5ce579/tempest/api/volume/base.py#L192-L195

[2] https://opendev.org/openstack/cinder/src/commit/f79048d2828a058fca7386f40022259ad434f823/cinder/backup/drivers/ceph.py#L878-L879

[3] https://opendev.org/openstack/cinder/src/commit/f79048d2828a058fca7386f40022259ad434f823/cinder/backup/drivers/ceph.py#L1030-L1032

[4] https://opendev.org/openstack/cinder/src/commit/f79048d2828a058fca7386f40022259ad434f823/cinder/backup/drivers/ceph.py#L197
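The fallback described above can be pictured with a minimal sketch. This is not the driver's code: resolve_backup_pool and DEFAULT_BACKUP_POOL are hypothetical names, and the constant only stands in for the backup_ceph_pool option from cinder.conf.

```python
from typing import Optional

# Hypothetical stand-in for the backup_ceph_pool option in cinder.conf.
DEFAULT_BACKUP_POOL = "backups"


def resolve_backup_pool(container: Optional[str],
                        configured_pool: str = DEFAULT_BACKUP_POOL) -> str:
    """Return the pool name an RBD-style backup would try to open."""
    # A missing (None) or empty container falls back to the configured pool;
    # any other value is used verbatim as the pool name.
    return container if container else configured_pool


# No container in the request: the configured pool is used.
print(resolve_backup_pool(None))
# Container supplied by the test: it is taken as a pool name, which the
# Ceph cluster in the job does not have.
print(resolve_backup_pool(
    "tempest-VolumesBackupsAdminTest-backup-container-209081170"))
```

Under this reading, the second call yields a pool name that nobody created, which matches the "error opening pool" message in the log.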

Container parameter passed in request body:

Tempest

"container": "tempest-VolumesBackupsAdminTest-backup-container-209081170"

Body: {"backup": {"volume_id": "52a893e4-3a80-44a3-ab0c-14f5fb8dbcf2", "name": "tempest-VolumesBackupsAdminTest-Backup-2053578820", "container": "tempest-VolumesBackupsAdminTest-backup-container-209081170"}}

Cinder-backup

container='tempest-VolumesBackupsAdminTest-backup-container-209081170'

Sep 08 04:17:14.435102 np0035195633 cinder-backup[109903]: INFO cinder.backup.manager [None req-3306dae1-0d8e-48af-bcb7-3556047f33a2 tempest-VolumesBackupsAdminTest-1897192954 None] Call Volume Manager to get_backup_device for Backup(availability_zone=None,container='tempest-VolumesBackupsAdminTest-backup-container-209081170',created_at=2023-09-08T04:17:14Z,data_timestamp=2023-09-08T04:17:14Z,deleted=False,deleted_at=None,display_description=None,display_name='tempest-VolumesBackupsAdminTest-Backup-2053578820',encryption_key_id=None,fail_reason=None,host='np0035195633',id=10f56865-3978-42ff-8b95-a3d7826cdbf9,metadata={},num_dependent_backups=0,object_count=0,parent=None,parent_id=None,project_id='7dd556749e984e43b27c5c92536b8d1e',restore_volume_id=None,service='cinder.backup.drivers.ceph.CephBackupDriver',service_metadata=None,size=1,snapshot_id=None,status='creating',temp_snapshot_id=None,temp_volume_id=None,updated_at=None,user_id='c6e96b8834474c7f949aa2a06d0c95d0',volume_id=52a893e4-3a80-44a3-ab0c-14f5fb8dbcf2)

ERROR Trace

Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server [None req-3306dae1-0d8e-48af-bcb7-3556047f33a2 tempest-VolumesBackupsAdminTest-1897192954 None] Exception during message handling: rados.ObjectNotFound: [errno 2] RADOS object not found (error opening pool 'tempest-VolumesBackupsAdminTest-backup-container-209081170')
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server Traceback (most recent call last):
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/manager.py", line 522, in continue_backup
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server with excutils.save_and_reraise_exception():
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/oslo_utils/excutils.py", line 227, in __exit__
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server self.force_reraise()
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server raise self.value
Sep 08 04:17:17.956899 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/manager.py", line 500, in continue_backup
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server with excutils.save_and_reraise_exception():
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/oslo_utils/excutils.py", line 227, in __exit__
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server self.force_reraise()
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server raise self.value
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/manager.py", line 497, in continue_backup
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server updates = backup_service.backup(backup,
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/drivers/ceph.py", line 1047, in backup
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server updates = self._backup_rbd(backup, volume_file, volume.name,
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/drivers/ceph.py", line 799, in _backup_rbd
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server from_snap, image_created = self._full_rbd_backup(backup.container,
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/drivers/ceph.py", line 736, in _full_rbd_backup
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server with eventlet.tpool.Proxy(rbd_driver.RADOSClient(self,
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 266, in __init__
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server self.cluster, self.ioctx = driver._connect_to_rados(pool)
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/drivers/ceph.py", line 325, in _connect_to_rados
Sep 08 04:17:17.958559 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server ioctx = client.open_ioctx(pool_to_open)
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/tpool.py", line 193, in doit
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server result = proxy_call(self._autowrap, f, *args, **kwargs)
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/tpool.py", line 151, in proxy_call
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server rv = execute(f, *args, **kwargs)
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/tpool.py", line 132, in execute
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server six.reraise(c, e, tb)
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/six.py", line 719, in reraise
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server raise value
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "/opt/stack/data/venv/lib/python3.10/site-packages/eventlet/tpool.py", line 86, in tworker
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server rv = meth(*args, **kwargs)
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server File "rados.pyx", line 988, in rados.Rados.open_ioctx
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server rados.ObjectNotFound: [errno 2] RADOS object not found (error opening pool 'tempest-VolumesBackupsAdminTest-backup-container-209081170')
Sep 08 04:17:17.959914 np0035195633 cinder-backup[109903]: ERROR oslo_messaging.rpc.server

Brian Rosmaita (brian-rosmaita) wrote :

Some more info: change If94facd5a926f7eadd092dfc8f0368d8e4b8d630 assumes that the 'container' parameter functions the same way for all backup drivers, but that is not the case. For Ceph (RBD) in particular, the 'container' is actually a pool, and pools can only be created by the cloud operator. Thus, when the RBD backup driver is used, the backup-create request is accepted with a 202, but when the backup service tries to use the 'container', it discovers that the pool does not exist, and the backup correctly goes to error status. This makes the test fail, which has broken the cinder gate.
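To make the difference on the request side concrete, here is a rough sketch of the two body shapes involved. build_backup_body is a hypothetical helper written for illustration only; it is not Tempest's code, and it only mirrors the JSON body shown in the bug description.

```python
import json
from typing import Any, Dict, Optional


def build_backup_body(volume_id: str, name: str,
                      container: Optional[str] = None) -> Dict[str, Any]:
    """Build a backup-create request body (hypothetical helper)."""
    backup: Dict[str, Any] = {"volume_id": volume_id, "name": name}
    # Only include "container" when a value was given. With it omitted,
    # the backup object carries container=None and a driver is free to
    # apply its own default.
    if container is not None:
        backup["container"] = container
    return {"backup": backup}


# Shape of the request before the change: no container key.
print(json.dumps(build_backup_body(
    "52a893e4-3a80-44a3-ab0c-14f5fb8dbcf2",
    "tempest-VolumesBackupsAdminTest-Backup-2053578820")))

# Shape of the request after the change: container is always sent.
print(json.dumps(build_backup_body(
    "52a893e4-3a80-44a3-ab0c-14f5fb8dbcf2",
    "tempest-VolumesBackupsAdminTest-Backup-2053578820",
    container="tempest-VolumesBackupsAdminTest-backup-container-209081170")))
```

Both bodies would be accepted by the API with a 202; only the second leads the RBD backup driver to look for a pool that does not exist.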
