openstack volume migrate with Ceph RBD backend fails and deletes the volume in Ceph

Bug #1864058 reported by Anastasios
Affects: Cinder
Status: New
Importance: Undecided
Assigned to: Jon Bernard

Bug Description

Hello team,
In the OpenStack Train release, when you run "openstack volume migrate" on a volume that is not attached to any server, using Ceph RBD as the backend, the action fails but also deletes the volume from Ceph.

Example volume with id:

openstack volume show d0fbafc7-ebf0-4aa1-89ac-425bf20d136e
+--------------------------------+----------------------------------------------------------+
| Field | Value |
+--------------------------------+----------------------------------------------------------+
| attachments | [] |
| availability_zone | nova |
| bootable | false |
| consistencygroup_id | None |
| created_at | 2020-02-20T15:05:19.000000 |
| description | |
| encrypted | False |
| id | d0fbafc7-ebf0-4aa1-89ac-425bf20d136e |
| migration_status | None |
| multiattach | False |
| name | test-migrate2 |
| os-vol-host-attr:host | xxhost1-cinder-volumes-container-c58f584e@RBD#rbd-be |
| os-vol-mig-status-attr:migstat | None |
| os-vol-mig-status-attr:name_id | None |
| os-vol-tenant-attr:tenant_id | 8e94d1cd437d487aa2eee57ac43faa4d |
| properties | |
| replication_status | None |
| size | 1 |
| snapshot_id | None |
| source_volid | None |
| status | available |
| type | StandardDisk |
| updated_at | 2020-02-20T15:05:19.000000 |
| user_id | aedbd9fe0c104632813bacc6c72264c3 |
+--------------------------------+----------------------------------------------------------+

From Ceph:

rbd info cinder-volumes_d09/volume-d0fbafc7-ebf0-4aa1-89ac-425bf20d136e
rbd image 'volume-d0fbafc7-ebf0-4aa1-89ac-425bf20d136e':
 size 1 GiB in 128 objects
 order 23 (8 MiB objects)
 id: 351ce56b8b4567
 block_name_prefix: rbd_data.351ce56b8b4567
 format: 2
 features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
 op_features:
 flags:
 create_timestamp: Thu Feb 20 15:05:19 2020

openstack volume migrate --host xxhost2-cinder-volumes-container-85165474@RBD#rbd-be d0fbafc7-ebf0-4aa1-89ac-425bf20d136e

Running "openstack volume show" again after the migrate shows that the request failed:

openstack volume show d0fbafc7-ebf0-4aa1-89ac-425bf20d136e
+--------------------------------+----------------------------------------------------------+
| Field | Value |
+--------------------------------+----------------------------------------------------------+
| attachments | [] |
| availability_zone | nova |
| bootable | false |
| consistencygroup_id | None |
| created_at | 2020-02-20T15:05:19.000000 |
| description | |
| encrypted | False |
| id | d0fbafc7-ebf0-4aa1-89ac-425bf20d136e |
| migration_status | error |
| multiattach | False |
| name | test-migrate2 |
| os-vol-host-attr:host | xxhost1-cinder-volumes-container-c58f584e@RBD#rbd-be |
| os-vol-mig-status-attr:migstat | error |
| os-vol-mig-status-attr:name_id | None |
| os-vol-tenant-attr:tenant_id | 8e94d1cd437d487aa2eee57ac43faa4d |
| properties | |
| replication_status | None |
| size | 1 |
| snapshot_id | None |
| source_volid | None |
| status | available |
| type | StandardDisk |
| updated_at | 2020-02-20T15:06:25.000000 |
| user_id | aedbd9fe0c104632813bacc6c72264c3 |
+--------------------------------+----------------------------------------------------------+

Checking from Ceph again:
rbd info cinder-volumes_d09/volume-d0fbafc7-ebf0-4aa1-89ac-425bf20d136e
rbd: error opening image volume-d0fbafc7-ebf0-4aa1-89ac-425bf20d136e: (2) No such file or directory

The error I get on the destination cinder host:

Feb 20 15:04:13 xxhost2-cinder-volumes-container-85165474 cinder-volume[187]: 2020-02-20 15:04:13.089 187 ERROR cinder.volume.drivers.rbd [req-afa06493-ae42-48ca-94ad-e99edd1966bd f1b43e50a8cd466a86175306283b5528 df0020d1f75346f2a0a
Feb 20 15:04:13 xxhost2-cinder-volumes-container-85165474 cinder-volume[187]: 2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server [req-afa06493-ae42-48ca-94ad-e99edd1966bd f1b43e50a8cd466a86175306283b5528 df0020d1f75346f2a0a
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/oslo_messaging/rpc/ser
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/oslo_messaging/rpc/dis
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/oslo_messaging/rpc/dis
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/cinder/volume/manager.
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server volume.save()
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/oslo_utils/excutils.py
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server self.force_reraise()
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/oslo_utils/excutils.py
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/six.py", line 693, in
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server raise value
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/cinder/volume/manager.
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server host)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/cinder/volume/drivers/
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server self.RBDProxy().remove(target.ioctx, volume.name)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/oslo_utils/excutils.py
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server self.force_reraise()
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/oslo_utils/excutils.py
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/six.py", line 693, in
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server raise value
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/cinder/volume/drivers/
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server source.copy(target.ioctx, volume.name)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/eventlet/tpool.py", li
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server result = proxy_call(self._autowrap, f, *args, **kwargs)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/eventlet/tpool.py", li
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server rv = execute(f, *args, **kwargs)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/eventlet/tpool.py", li
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server six.reraise(c, e, tb)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/six.py", line 693, in
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server raise value
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "/openstack/venvs/cinder-20.0.1.dev1/lib/python3.6/site-packages/eventlet/tpool.py", li
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server rv = meth(*args, **kwargs)
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server File "rbd.pyx", line 2265, in rbd.Image.copy
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server rbd.ImageExists: [errno 17] error copying image b'volume-fd120f27-b4cd-4564-9882-476e33606a8a'
2020-02-20 15:04:13.245 187 ERROR oslo_messaging.rpc.server

Has anyone seen anything similar? I know cinder migrate is not supported for the Ceph backend, but this is a serious bug.

tags: added: drivers migrate rbd
Jon Bernard (jbernard)
Changed in cinder:
assignee: nobody → Jon Bernard (jbernard)
Gorka Eguileor (gorka) wrote:

Hi Anastasios,

Could you tell me how you have deployed those 2 cinder-volume services?
Are they running as Active-Active for the same backend?
Are they just 2 different volume services configured to use the same RBD pool?

I'm asking because this looks to me like you forcefully migrated the volume from one cinder-volume host to another while the RBD pool was actually the same. When the driver tries to create the volume on the destination, it fails because the image already exists (it is the origin volume), and it then deletes what it thinks is the "destination volume" (also the origin) as part of the failure cleanup code.
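
To illustrate, here is a rough sketch of that sequence against the plain rados/rbd Python bindings (this is not the actual driver code; the pool name is taken from the report above and everything else is simplified):

import rados
import rbd

POOL = 'cinder-volumes_d09'  # the assumed shared pool for both backends
NAME = 'volume-d0fbafc7-ebf0-4aa1-89ac-425bf20d136e'

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    src_ioctx = cluster.open_ioctx(POOL)  # pool of the "source" backend
    dst_ioctx = cluster.open_ioctx(POOL)  # "destination" resolves to the same pool
    try:
        with rbd.Image(src_ioctx, NAME) as source:
            # Copying the image to the destination under the same name raises
            # ImageExists (errno 17), because the target *is* the origin.
            source.copy(dst_ioctx, NAME)
    except rbd.ImageExists:
        # The failure cleanup then removes the "destination" image, which in
        # a shared pool is the only copy of the volume.
        rbd.RBD().remove(dst_ioctx, NAME)
        raise
    finally:
        src_ioctx.close()
        dst_ioctx.close()
finally:
    cluster.shutdown()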

Could you confirm, please?

Cheers,
Gorka.

Anastasios (adados) wrote:

Hello,
Yes, we have two different cinder-volume services that use the same RBD backend.
We are testing what to do when a cinder-volume service becomes unavailable.
We thought that migrating the volume would simply assign it to a different cinder-volume instance, not to a different pool, unless we are wrong about this.

In any case, the fact that the original volume is deleted is bad.
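
Purely as an illustration (this is not the actual driver code, and the function name is made up), the kind of guard that would avoid this is to check whether the image already exists on the destination before copying, and never delete anything when it does:

import rados
import rbd

def copy_volume_safely(conf, src_pool, dst_pool, name):
    cluster = rados.Rados(conffile=conf)
    cluster.connect()
    try:
        src_ioctx = cluster.open_ioctx(src_pool)
        dst_ioctx = cluster.open_ioctx(dst_pool)
        try:
            # If the target name is already present, it is most likely the
            # origin volume itself (shared pool): refuse to copy or clean up.
            if name in rbd.RBD().list(dst_ioctx):
                raise RuntimeError('%s already exists in pool %s; refusing to '
                                   'copy or clean up' % (name, dst_pool))
            with rbd.Image(src_ioctx, name) as source:
                source.copy(dst_ioctx, name)
        finally:
            src_ioctx.close()
            dst_ioctx.close()
    finally:
        cluster.shutdown()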

You can always run a test and let us know.

Thanks,
Anastasios
