Comment 0 for bug 1472412

Albert White (albertw) wrote :

After running some Rally scenarios on my main OpenStack deployment (Solaris, Juno, mix of x86 and SPARC, ZFSSA for storage), I noticed in the Horizon admin panel that several volumes created by the tests were still present, but in an ERROR_DELETING state.

Setting these to AVAILABLE and trying to delete them manually failed; they changed back to the ERROR_DELETING state.

The volumes had been created on an attached ZFSSA (7320); however, they were no longer present on that device.

Looking in the cinder-volume log on the host running the zfssa Cinder driver, we find a trace of the error:

2015-07-07 22:35:48.599 19035 INFO cinder.volume.manager [req-f992a7a9-cced-434c-b8f6-8ef06fc55ad5 03227578c38345cea638f864a6491fdb fe45765544ad46b2ac28d1ef0db6e541 - - -] volume a6dd4dd8-4ab6-41b8-9286-88a58797c5f9: deleting
2015-07-07 22:35:50.231 19035 ERROR cinder.volume.drivers.zfssa.zfssarest [req-f992a7a9-cced-434c-b8f6-8ef06fc55ad5 03227578c38345cea638f864a6491fdb fe45765544ad46b2ac28d1ef0db6e541 - - -] Error Setting Volume: volume-a6dd4dd8-4ab6-41b8-9286-88a58797c5f9 to InitiatorGroup: com.sun.ms.vss.hg.maskAll Pool: swimming Project: openstack Return code: 404 Message: Not Found.
2015-07-07 22:35:50.262 19035 ERROR cinder.volume.drivers.zfssa.zfssarest [req-f992a7a9-cced-434c-b8f6-8ef06fc55ad5 03227578c38345cea638f864a6491fdb fe45765544ad46b2ac28d1ef0db6e541 - - -] Error Getting Volume: volume-a6dd4dd8-4ab6-41b8-9286-88a58797c5f9 on Pool: swimming Project: openstack Return code: 404 Message: Not Found.
2015-07-07 22:35:50.315 19035 ERROR oslo.messaging.rpc.dispatcher [req-f992a7a9-cced-434c-b8f6-8ef06fc55ad5 03227578c38345cea638f864a6491fdb fe45765544ad46b2ac28d1ef0db6e541 - - -] Exception during message handling: Bad or unexpected response from the storage volume backend API: Error Getting Volume: volume-a6dd4dd8-4ab6-41b8-9286-88a58797c5f9 on Pool: swimming Project: openstack Return code: 404 Message: Not Found.
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher incoming.message))
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args)
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/osprofiler/profiler.py", line 105, in wrapper
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher return f(*args, **kwargs)
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/cinder/volume/manager.py", line 127, in lvo_inner1
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher return lvo_inner2(inst, context, volume_id, **kwargs)
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/cinder/openstack/common/lockutils.py", line 233, in inner
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher retval = f(*args, **kwargs)
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/cinder/volume/manager.py", line 126, in lvo_inner2
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher return f(*_args, **_kwargs)
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/cinder/volume/manager.py", line 455, in delete_volume
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher {'status': 'error_deleting'})
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/cinder/openstack/common/excutils.py", line 82, in __exit__
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/cinder/volume/manager.py", line 444, in delete_volume
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher self.driver.delete_volume(volume_ref)
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/osprofiler/profiler.py", line 105, in wrapper
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher return f(*args, **kwargs)
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/cinder/volume/drivers/zfssa/zfssaiscsi.py", line 212, in delete_volume
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher volume['name'])
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.6/vendor-packages/cinder/volume/drivers/zfssa/zfssarest.py", line 446, in get_lun
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher raise exception.VolumeBackendAPIException(data=exception_msg)
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: Error Getting Volume: volume-a6dd4dd8-4ab6-41b8-9286-88a58797c5f9 on Pool: swimming Project: openstack Return code: 404 Message: Not Found.
2015-07-07 22:35:50.315 19035 TRACE oslo.messaging.rpc.dispatcher
2015-07-07 22:35:55.551 19035 INFO cinder.volume.manager [req-4af9bdec-7494-4c34-875c-d8d55b8fc3d7 - - - - -] Updating volume status

So there may be a race somewhere: the volume got deleted on the appliance, but Cinder still has a record of it.

Anyway, it seems we should handle this a bit better. In this particular case, where we have established that the connection to the ZFSSA is working correctly, I'd suggest that zfssarest detect that the volume is not present (assume it was already deleted) and allow Cinder to remove it from its database of current volumes, rather than throwing an exception that causes a stack trace, leaves the volume in ERROR_DELETING, and forces the user to hunt through logs.
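A minimal sketch of the suggested behaviour (not the actual driver code; names like Backend404Error, get_lun and delete_lun are hypothetical stand-ins for the zfssarest API): delete_volume treats a 404 from the appliance as "already deleted" and returns normally, so the manager can drop the volume from Cinder's database instead of flipping it to ERROR_DELETING.

```python
import logging

LOG = logging.getLogger(__name__)


class Backend404Error(Exception):
    """Hypothetical error raised when the appliance returns 404 Not Found."""


class SketchDriver(object):
    """Illustrative stand-in for the zfssa driver's delete path."""

    def __init__(self, rest_client):
        self.rest = rest_client

    def delete_volume(self, volume_name):
        """Delete a LUN; a LUN missing on the appliance counts as success."""
        try:
            self.rest.get_lun(volume_name)
        except Backend404Error:
            # The volume is already gone on the appliance (deleted out of
            # band, or by a racing request). Log it and return cleanly so
            # the manager removes the DB record rather than raising.
            LOG.warning('LUN %s not found on appliance; '
                        'treating as already deleted', volume_name)
            return
        self.rest.delete_lun(volume_name)
```

With this shape, the 404 path never reaches oslo.messaging as an exception, so the volume row is cleaned up and the user never sees ERROR_DELETING for a volume that is already gone.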