volume delete fails due to ToozConnectionError in the gate

Bug #1810526 reported by Matt Riedemann on 2019-01-04
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Cinder
Undecided
Unassigned
tooz
Undecided
Unassigned

Bug Description

Seen here:

http://logs.openstack.org/02/628302/2/check/nova-next/b6c5278/logs/screen-c-api.txt.gz?level=TRACE#_Jan_04_02_03_17_173949

Jan 04 02:03:17.173949 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault [req-631661d8-fb9d-497f-9ae8-a6f56a52fb73 req-76a82746-5bfb-4031-b5af-4a9fe776c75e tempest-TestVolumeBootPattern-1042085253 tempest-TestVolumeBootPattern-1042085253] Caught error: <class 'oslo_messaging.rpc.client.RemoteError'> Remote error: ToozConnectionError Service Unavailable
Jan 04 02:03:17.174298 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: [u'Traceback (most recent call last):\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 166, in _process_incoming\n res = self.dispatcher.dispatch(message)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch\n return self._do_dispatch(endpoint, method, ctxt, args)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch\n result = func(ctxt, **new_args)\n', u' File "<decorator-gen-683>", line 2, in detach_volume\n', u' File "/opt/stack/new/cinder/cinder/coordination.py", line 144, in _synchronized\n with lock(blocking):\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/locking.py", line 31, in __enter__\n return self.lock.__enter__(*self.args, **self.kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/locking.py", line 52, in __enter__\n acquired = self.acquire(*args, **kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/drivers/etcd3gw.py", line 47, in wrapper\n cause=e)\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/utils.py", line 225, in raise_with_cause\n excutils.raise_with_cause(exc_cls, message, *args, **kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 143, in raise_with_cause\n six.raise_from(exc_cls(message, *args, **kwargs), kwargs.get(\'cause\'))\n', u' File "/usr/local/lib/python2.7/dist-packages/six.py", line 737, in raise_from\n raise value\n', u'ToozConnectionError: Service Unavailable\n'].: RemoteError: Remote error: ToozConnectionError Service Unavailable
Jan 04 02:03:17.175116 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: [u'Traceback (most recent call last):\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 166, in _process_incoming\n res = self.dispatcher.dispatch(message)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch\n return self._do_dispatch(endpoint, method, ctxt, args)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch\n result = func(ctxt, **new_args)\n', u' File "<decorator-gen-683>", line 2, in detach_volume\n', u' File "/opt/stack/new/cinder/cinder/coordination.py", line 144, in _synchronized\n with lock(blocking):\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/locking.py", line 31, in __enter__\n return self.lock.__enter__(*self.args, **self.kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/locking.py", line 52, in __enter__\n acquired = self.acquire(*args, **kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/drivers/etcd3gw.py", line 47, in wrapper\n cause=e)\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/utils.py", line 225, in raise_with_cause\n excutils.raise_with_cause(exc_cls, message, *args, **kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 143, in raise_with_cause\n six.raise_from(exc_cls(message, *args, **kwargs), kwargs.get(\'cause\'))\n', u' File "/usr/local/lib/python2.7/dist-packages/six.py", line 737, in raise_from\n raise value\n', u'ToozConnectionError: Service Unavailable\n'].
Jan 04 02:03:17.175696 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault Traceback (most recent call last):
Jan 04 02:03:17.175899 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/opt/stack/new/cinder/cinder/api/middleware/fault.py", line 85, in __call__
Jan 04 02:03:17.176087 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return req.get_response(self.application)
Jan 04 02:03:17.176309 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1314, in send
Jan 04 02:03:17.176543 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault application, catch_exc_info=False)
Jan 04 02:03:17.176738 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1278, in call_application
Jan 04 02:03:17.176907 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault app_iter = application(self.environ, start_response)
Jan 04 02:03:17.177084 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 143, in __call__
Jan 04 02:03:17.177287 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return resp(environ, start_response)
Jan 04 02:03:17.177441 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 129, in __call__
Jan 04 02:03:17.177560 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault resp = self.call_func(req, *args, **kw)
Jan 04 02:03:17.177710 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 193, in call_func
Jan 04 02:03:17.177824 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return self.func(req, *args, **kwargs)
Jan 04 02:03:17.177937 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/osprofiler/web.py", line 112, in __call__
Jan 04 02:03:17.178061 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return request.get_response(self.application)
Jan 04 02:03:17.178201 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1314, in send
Jan 04 02:03:17.178319 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault application, catch_exc_info=False)
Jan 04 02:03:17.178437 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1278, in call_application
Jan 04 02:03:17.178563 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault app_iter = application(self.environ, start_response)
Jan 04 02:03:17.178669 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 129, in __call__
Jan 04 02:03:17.178774 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault resp = self.call_func(req, *args, **kw)
Jan 04 02:03:17.178888 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 193, in call_func
Jan 04 02:03:17.178993 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return self.func(req, *args, **kwargs)
Jan 04 02:03:17.179138 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/keystonemiddleware/auth_token/__init__.py", line 335, in __call__
Jan 04 02:03:17.179288 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault response = req.get_response(self._app)
Jan 04 02:03:17.179440 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1314, in send
Jan 04 02:03:17.303926 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault application, catch_exc_info=False)
Jan 04 02:03:17.304110 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1278, in call_application
Jan 04 02:03:17.304249 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault app_iter = application(self.environ, start_response)
Jan 04 02:03:17.304383 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 143, in __call__
Jan 04 02:03:17.304515 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return resp(environ, start_response)
Jan 04 02:03:17.304643 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 143, in __call__
Jan 04 02:03:17.304772 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return resp(environ, start_response)
Jan 04 02:03:17.304897 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/routes/middleware.py", line 141, in __call__
Jan 04 02:03:17.305047 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault response = self.app(environ, start_response)
Jan 04 02:03:17.305186 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 143, in __call__
Jan 04 02:03:17.305355 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return resp(environ, start_response)
Jan 04 02:03:17.305486 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 129, in __call__
Jan 04 02:03:17.305642 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault resp = self.call_func(req, *args, **kw)
Jan 04 02:03:17.305768 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 193, in call_func
Jan 04 02:03:17.305897 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return self.func(req, *args, **kwargs)
Jan 04 02:03:17.306033 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/opt/stack/new/cinder/cinder/api/openstack/wsgi.py", line 844, in __call__
Jan 04 02:03:17.306165 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault content_type, body, accept)
Jan 04 02:03:17.306295 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/opt/stack/new/cinder/cinder/api/openstack/wsgi.py", line 904, in _process_stack
Jan 04 02:03:17.306445 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault action_result = self.dispatch(meth, request, action_args)
Jan 04 02:03:17.306605 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/opt/stack/new/cinder/cinder/api/openstack/wsgi.py", line 999, in dispatch
Jan 04 02:03:17.306758 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return method(req=request, **action_args)
Jan 04 02:03:17.306894 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/opt/stack/new/cinder/cinder/api/validation/__init__.py", line 61, in wrapper
Jan 04 02:03:17.307038 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return func(*args, **kwargs)
Jan 04 02:03:17.307218 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/opt/stack/new/cinder/cinder/api/contrib/volume_actions.py", line 101, in _detach
Jan 04 02:03:17.307355 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault self.volume_api.detach(context, volume, attachment_id)
Jan 04 02:03:17.307505 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/opt/stack/new/cinder/cinder/volume/api.py", line 786, in detach
Jan 04 02:03:17.307651 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault attachment_id)
Jan 04 02:03:17.307790 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/opt/stack/new/cinder/cinder/volume/rpcapi.py", line 217, in detach_volume
Jan 04 02:03:17.307926 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault return cctxt.call(ctxt, 'detach_volume', **msg_args)
Jan 04 02:03:17.308058 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 179, in call
Jan 04 02:03:17.309212 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault retry=self.retry)
Jan 04 02:03:17.309375 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 128, in _send
Jan 04 02:03:17.309524 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault retry=retry)
Jan 04 02:03:17.309661 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 645, in send
Jan 04 02:03:17.309819 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault call_monitor_timeout, retry=retry)
Jan 04 02:03:17.309947 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 636, in _send
Jan 04 02:03:17.310084 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault raise result
Jan 04 02:03:17.310227 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault RemoteError: Remote error: ToozConnectionError Service Unavailable
Jan 04 02:03:17.310762 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault [u'Traceback (most recent call last):\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 166, in _process_incoming\n res = self.dispatcher.dispatch(message)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch\n return self._do_dispatch(endpoint, method, ctxt, args)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch\n result = func(ctxt, **new_args)\n', u' File "<decorator-gen-683>", line 2, in detach_volume\n', u' File "/opt/stack/new/cinder/cinder/coordination.py", line 144, in _synchronized\n with lock(blocking):\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/locking.py", line 31, in __enter__\n return self.lock.__enter__(*self.args, **self.kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/locking.py", line 52, in __enter__\n acquired = self.acquire(*args, **kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/drivers/etcd3gw.py", line 47, in wrapper\n cause=e)\n', u' File "/usr/local/lib/python2.7/dist-packages/tooz/utils.py", line 225, in raise_with_cause\n excutils.raise_with_cause(exc_cls, message, *args, **kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 143, in raise_with_cause\n six.raise_from(exc_cls(message, *args, **kwargs), kwargs.get(\'cause\'))\n', u' File "/usr/local/lib/python2.7/dist-packages/six.py", line 737, in raise_from\n raise value\n', u'ToozConnectionError: Service Unavailable\n'].
Jan 04 02:03:17.311224 ubuntu-xenial-ovh-bhs1-0001495691 <email address hidden>[2195]: ERROR cinder.api.middleware.fault

Looking at the etcd logs at the same time:

Jan 04 02:02:31.558686 ubuntu-xenial-ovh-bhs1-0001495691 etcd[4341]: sync duration of 1.584795989s, expected less than 1s
Jan 04 02:03:17.652401 ubuntu-xenial-ovh-bhs1-0001495691 etcd[4341]: sync duration of 13.450922593s, expected less than 1s

The cinder-api 500 error caused a failure in nova-compute:

Jan 04 02:03:17.308274 ubuntu-xenial-ovh-bhs1-0001495691 nova-compute[1405]: WARNING nova.compute.manager [None req-631661d8-fb9d-497f-9ae8-a6f56a52fb73 tempest-TestVolumeBootPattern-1042085253 tempest-TestVolumeBootPattern-1042085253] [instance: 4c4687ad-3198-40ba-847b-1e2bc6a54914] Ignoring unknown cinder exception for volume d823b22f-eaf3-4f37-aab8-f0c6e3b89e9d: The server has either erred or is incapable of performing the requested operation. (HTTP 500) (Request-ID: req-76a82746-5bfb-4031-b5af-4a9fe776c75e): ClientException: The server has either erred or is incapable of performing the requested operation. (HTTP 500) (Request-ID: req-76a82746-5bfb-4031-b5af-4a9fe776c75e)
Jan 04 02:03:17.310381 ubuntu-xenial-ovh-bhs1-0001495691 nova-compute[1405]: WARNING nova.compute.manager [None req-631661d8-fb9d-497f-9ae8-a6f56a52fb73 tempest-TestVolumeBootPattern-1042085253 tempest-TestVolumeBootPattern-1042085253] Failed to delete volume: d823b22f-eaf3-4f37-aab8-f0c6e3b89e9d due to Invalid input received: Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating, attached, belong to a group, have snapshots or be disassociated from snapshots after volume transfer. (HTTP 400) (Request-ID: req-bc4cdf34-0fff-472a-8a32-e9c546481381): InvalidInput: Invalid input received: Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating, attached, belong to a group, have snapshots or be disassociated from snapshots after volume transfer. (HTTP 400) (Request-ID: req-bc4cdf34-0fff-472a-8a32-e9c546481381)

Which causes the test to fail:

http://logs.openstack.org/02/628302/2/check/nova-next/b6c5278/job-output.txt.gz#_2019-01-04_02_09_47_594012

2019-01-04 02:09:47.593483 | primary | Captured traceback:
2019-01-04 02:09:47.593503 | primary | ~~~~~~~~~~~~~~~~~~~
2019-01-04 02:09:47.593541 | primary | Traceback (most recent call last):
2019-01-04 02:09:47.593590 | primary | File "tempest/common/utils/__init__.py", line 89, in wrapper
2019-01-04 02:09:47.593624 | primary | return f(*func_args, **func_kwargs)
2019-01-04 02:09:47.593692 | primary | File "tempest/scenario/test_volume_boot_pattern.py", line 257, in test_image_defined_boot_from_volume
2019-01-04 02:09:47.593751 | primary | self.volumes_client.wait_for_resource_deletion(created_volume['id'])
2019-01-04 02:09:47.593814 | primary | File "tempest/lib/common/rest_client.py", line 888, in wait_for_resource_deletion
2019-01-04 02:09:47.593858 | primary | raise exceptions.TimeoutException(message)
2019-01-04 02:09:47.593903 | primary | tempest.lib.exceptions.TimeoutException: Request timed out
2019-01-04 02:09:47.594012 | primary | Details: (TestVolumeBootPattern:test_image_defined_boot_from_volume) Failed to delete volume d823b22f-eaf3-4f37-aab8-f0c6e3b89e9d within the required time (196 s).

Looks like this isn't very frequent:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Caught%20error%3A%20%3Cclass%20'oslo_messaging.rpc.client.RemoteError'%3E%20Remote%20error%3A%20ToozConnectionError%20Service%20Unavailable%5C%22%20AND%20tags%3A%5C%22screen-c-api.txt%5C%22&from=7d

I wonder if some kind of retry could be added for connection errors like this though since it doesn't look like etcd crashed, it's just backed up.

Matt Riedemann (mriedem) wrote :

Also seen in the cinder-volume service logs:

http://logs.openstack.org/19/619319/2/gate/grenade-py3/c4e51a8/logs/screen-c-vol.txt.gz?level=TRACE#_Jan_15_04_51_51_420369

During a volume delete operation in a grenade job.

Sean McGinnis (sean-mcginnis) wrote :

The idea was brought up in IRC that tooz should recognize these service unavailable states and perform retries. I think that makes more sense than trying to so something in the consuming services.

Adding tooz to see if someone there has any thoughts on this approach and if someone can work on adding a retry mechanism there that Cinder (and others) can pick up in a newer release.

Matt Riedemann (mriedem) wrote :

One other piece of information is this seems to only happen on OVH nodes (for the vast majority of failures). I wonder if there is something about those nodes that indicate problems with etcd? Like maybe I/O issues?

Matt Riedemann (mriedem) wrote :

frickler pointed this out: https://review.openstack.org/#/c/626885/

We might want to consider setting ETCD_USE_RAMDISK=True as the default.

Clark Boylan (cboylan) wrote :

Re IO issues on OVH nodes I do think there are some known problems with IO there, but unfortunately no known fixes. Maybe we can get dpawlik and amorin reengaged on this?

In the interim running etcd on a ramdisk seems like a great idea.

Ben Nemec (bnemec) wrote :

Given that this seems to be exposing a legitimate issue with the underlying service, do we still want to try to work around it with retries in tooz?

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers