Fuel for OpenStack

Downtime of one memcache instance leads to failures after recovery

Bug #1406547 reported by Sergey Yudin on 2014-12-30

This bug report is a duplicate of: Bug #1432242: Keystone with memcached backend may fail in get tokens after the memcached restart.

This bug affects 3 people

	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	New	High	Fuel Library (Deprecated)	Fuel for OpenStack 6.1
5.0.x	New	Undecided	Fuel Library (Deprecated)	Fuel for OpenStack 5.0-updates
5.1.x	New	High	Fuel Library (Deprecated)	Fuel for OpenStack 5.1.1-updates
6.0.x	New	Undecided	Fuel Library (Deprecated)	Fuel for OpenStack 6.0-updates
6.1.x	New	High	Fuel Library (Deprecated)	Fuel for OpenStack 6.1

Bug Description

Way to reproduce:

1) restart an instance of memcache
2) spawn a dozen VMs
3) investigate the failed instances and find that keystone failed to authenticate a few tokens for a few services

like this:

<188>Dec 30 14:57:23 node-1 keystone-all 2014-12-30 14:57:23.402 3910 WARNING keystone.common.wsgi [-] Could not find token, b6749f3eb5ea4d24886e2c2cf074db36

ERROR cinder.scheduler.filter_scheduler [req-8db91d3c-2f3f-4115-ab9f-c419f2074651 95a31653ffc342e2b275629104b75eb5 69b89f2718db433e9f647b6fd5bd69b9 - - -] Error scheduling None from last vol-service: cinder : [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/dist-packages/taskflow/engines/action_engine/executor.py", line 36, in _execute_task\n result = task.execute(**arguments)\n', u' File "/usr/lib/python2.7/dist-packages/cinder/volume/flows/manager/create_volume.py", line 265, in execute\n image_id),\n', u' File "/usr/lib/python2.7/dist-packages/cinder/image/glance.py", line 243, in get_location\n _reraise_translated_image_exception(image_id)\n', u' File "/usr/lib/python2.7/dist-packages/cinder/image/glance.py", line 241, in get_location\n image_meta = client.call(context, \'get\', image_id)\n', u' File "/usr/lib/python2.7/dist-packages/cinder/image/glance.py", line 158, in call\n return getattr(client.images, method)(*args, **kwargs)\n', u' File "/usr/lib/python2.7/dist-packages/glanceclient/v2/images.py", line 79, in get\n resp, body = self.http_client.json_request(\'GET\', url)\n', u' File "/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line 266, in json_request\n resp, body_iter = self._http_request(url, method, **kwargs)\n', u' File "/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line 249, in _http_request\n raise exc.from_response(resp, body_str)\n', u'ImageNotAuthorized: Not authorized for image fed30bc9-91cc-4a18-b76c-e2ff209d219e.\n']
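
For step 3, the missing token IDs can be pulled out of the keystone log with a few lines of Python. This is only a sketch: the function name `missing_tokens` is made up for illustration, and the 32-hex-digit ID format is assumed from the sample warning above.

```python
import re

# Matches the keystone warning shown above; the 32-hex-digit ID format is
# an assumption based on that single sample line.
TOKEN_MISS = re.compile(r"Could not find token, ([0-9a-f]{32})")


def missing_tokens(lines):
    """Return the token IDs keystone reported as missing, in the order seen."""
    found = []
    for line in lines:
        match = TOKEN_MISS.search(line)
        if match:
            found.append(match.group(1))
    return found


sample = [
    "<188>Dec 30 14:57:23 node-1 keystone-all 2014-12-30 14:57:23.402 3910 "
    "WARNING keystone.common.wsgi [-] Could not find token, "
    "b6749f3eb5ea4d24886e2c2cf074db36",
]
print(missing_tokens(sample))
```

Feeding it the lines of the keystone log from each controller gives the list of tokens to chase in the other services' logs.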

Sergey Yudin (tsipa740) wrote on 2014-12-31:

An easier way to reproduce:

1) on a 1st screen on one of the controllers, run:
/etc/init.d/keystone stop ; /etc/init.d/memcached stop ; sleep 1m ; /etc/init.d/memcached start ; /etc/init.d/keystone start

2) on a 2nd screen on one of the controllers, run in parallel:
while true ; do date ; for f in `seq 1 20` ; do ( out=$(glance --debug image-list 2>&1) || echo $out ) & done ; wait ; done

3) enjoy random failures almost infinitely
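
To put a number on the failures, the loop from step 2 can be rewritten roughly as below. This is a sketch, not a tested tool: `run_batch` is a hypothetical helper, and it assumes the same `glance` CLI and credentials as the shell loop above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_batch(command, parallel=20):
    """Run `command` `parallel` times at once; return the output of failed runs."""

    def run_once(_):
        # Merge stderr into stdout, like `2>&1` in the shell loop.
        proc = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        return proc.returncode, proc.stdout

    with ThreadPoolExecutor(max_workers=parallel) as pool:
        results = list(pool.map(run_once, range(parallel)))
    # Keep only the runs that exited non-zero, like `|| echo $out`.
    return [output for code, output in results if code != 0]


# Example use on a controller (same command as step 2):
#   failures = run_batch(["glance", "--debug", "image-list"])
#   print("%d of 20 calls failed" % len(failures))
```

Calling it in a loop and printing `len(failures)` per batch shows whether the failure rate drops over time or stays flat.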

Tomasz 'Zen' Napierala (tzn) on 2015-01-05

summary:

- Downtime of on memcache instance leads to failures after recovery
+ Downtime of one memcache instance leads to failures after recovery

Bogdan Dobrelya (bogdando) wrote on 2015-01-08:

Please provide the version of Fuel used, the keystone config and, if possible, a logs snapshot.

Changed in fuel:
status:	New → Incomplete
importance:	Undecided → High
milestone:	none → 6.1
assignee:	nobody → Fuel Library Team (fuel-library)

Stanislaw Bogatkin (sbogatkin) wrote on 2015-02-03:

Any updates?

Stanislaw Bogatkin (sbogatkin) wrote on 2015-02-06:

This bug has been Incomplete for more than 4 weeks. We cannot investigate it further, so we are setting the status to Invalid. If you think this is not correct, please feel free to provide the requested information and reopen the bug, and we will look into it further.

Michael Polenchuk (mpolenchuk) wrote on 2015-04-08:

To avoid spending excessive effort validating tokens, some services cache previously seen tokens in-process (or in memcached) for N seconds (the default is 300s).
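
The caching behaviour described here can be pictured with a toy in-process cache. This is a sketch for illustration only, not the actual middleware code: the class name `TokenCache` and its methods are hypothetical, and only the 300s default comes from the comment above.

```python
import time


class TokenCache:
    """Toy in-process cache: remembers a validated token for `ttl` seconds."""

    def __init__(self, ttl=300, clock=time.monotonic):
        self.ttl = ttl        # 300s mirrors the default mentioned above
        self.clock = clock    # injectable so expiry can be exercised in tests
        self._entries = {}    # token -> (expiry time, cached data)

    def put(self, token, data):
        """Store the validation result for a token."""
        self._entries[token] = (self.clock() + self.ttl, data)

    def get(self, token):
        """Return cached data, or None if the token is unknown or expired."""
        entry = self._entries.get(token)
        if entry is None:
            return None       # never seen: the caller must ask keystone
        expires_at, data = entry
        if self.clock() >= expires_at:
            del self._entries[token]
            return None       # expired: the caller must revalidate
        return data
```

With a cache like this, a service keeps accepting a token it has already seen for up to `ttl` seconds without asking keystone again, so what the service believes and what the token backend holds can differ during that window.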

Sergey Yudin (tsipa740) wrote on 2015-04-08: