OpenStack Identity (keystone)

Race condition when rapidly deleting and creating tokens

Bug #1099966 reported by Jay Pipes on 2013-01-15

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	OpenStack Identity (keystone)	Won't Fix	Medium	Unassigned

Bug Description

The token backend is SQL, with PKI enabled. This is a multi-node setup with the database on a separate node from the keystone server.

The symptom looks like this:

http://paste.openstack.org/show/29472/

The failure hits random tests in Tempest's identity admin tests, but it shows up consistently when the tests run in a PKI environment with the database server on a separate node. It apparently does not occur when using devstack, which has a local MySQL instance (and may have some caching enabled?).

The race condition happens like so:

Thread 1:

POST /tokens with auth data, passing the token matching the PKI CMS record for a user

Hits this block of code:

https://github.com/openstack/keystone/blob/master/keystone/token/controllers.py#L124 [1]

The call to token_api.create_token() fails with an IntegrityError from SQLAlchemy. This is a planned-for event, apparently, as the code on line 132 [2] catches Exception, with the following in-line code comment:

# an identical token may have been created already.
# if so, return the token_data as it is also identical

Now in Thread 2:

A call to DELETE /tokens (or possibly some token expiration code?) proceeds to delete the same token for the user that just resulted in the IntegrityError raised in thread 1.

Back in Thread 1:

The call to token_api.get_token() now fails with a NotFound exception, which causes the original exception (IntegrityError) to be re-raised and sent back across the wire to the end-user.
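The interleaving above can be reproduced deterministically with a small in-memory sketch. Everything here is a stand-in rather than keystone code: `InMemoryTokenStore`, `authenticate`, and the `between_calls` hook are hypothetical names, and the two exception classes only mimic SQLAlchemy's IntegrityError and keystone's NotFound.

```python
class IntegrityError(Exception):
    """Stand-in for the duplicate-key error raised by the SQL backend."""


class TokenNotFound(Exception):
    """Stand-in for the NotFound exception raised by get_token()."""


class InMemoryTokenStore:
    """Hypothetical token backend with a unique constraint on the token ID."""

    def __init__(self):
        self._tokens = {}

    def create_token(self, token_id, data):
        if token_id in self._tokens:
            raise IntegrityError(token_id)
        self._tokens[token_id] = data
        return data

    def get_token(self, token_id):
        try:
            return self._tokens[token_id]
        except KeyError:
            raise TokenNotFound(token_id)

    def delete_token(self, token_id):
        self._tokens.pop(token_id, None)


def authenticate(store, token_id, data, between_calls=None):
    """Create the token; on a duplicate, fall back to reading the existing one."""
    try:
        return store.create_token(token_id, data)
    except IntegrityError as original:
        if between_calls is not None:
            between_calls()  # simulates Thread 2 running inside the window
        try:
            return store.get_token(token_id)
        except TokenNotFound:
            # The token vanished in between, so the original error escapes.
            raise original


def run_race(store):
    """Return which outcome the caller of POST /tokens would observe."""
    try:
        authenticate(store, "abc", {"user": "demo"},
                     between_calls=lambda: store.delete_token("abc"))
    except IntegrityError:
        return "IntegrityError"
    return "ok"


store = InMemoryTokenStore()
store.create_token("abc", {"user": "demo"})  # an identical token already exists
outcome = run_race(store)
```

Without the `between_calls` hook the fallback read succeeds, which is why the failure only appears when the delete lands between the two calls.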

Proposed Solution:

Rather than re-raising the original exception on line 139 [3], drop into a simple loop with a randomized timeout that calls create_token() again with the token ID and token data from line 125.
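A minimal sketch of what such a retry loop might look like, assuming the backend calls are passed in as plain callables. The helper name `create_or_get_token`, the `max_attempts` and `max_sleep` parameters, and the stand-in exception classes are all illustrative and not part of keystone.

```python
import random
import time


class IntegrityError(Exception):
    """Stand-in for the duplicate-key error raised by the SQL backend."""


class TokenNotFound(Exception):
    """Stand-in for the NotFound exception raised by get_token()."""


def create_or_get_token(create_token, get_token, token_id, data,
                        max_attempts=5, max_sleep=0.01):
    """Retry create_token() when the duplicate disappears before it can be read."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return create_token(token_id, data)
        except IntegrityError as exc:
            last_error = exc
        try:
            # An identical token exists, so return it as-is.
            return get_token(token_id)
        except TokenNotFound:
            # The token was deleted in between; back off briefly and retry.
            time.sleep(random.uniform(0, max_sleep))
    raise last_error


# Demo with a dict-backed store in which the first read loses the race.
tokens = {"abc": {"user": "demo"}}


def create_token(token_id, data):
    if token_id in tokens:
        raise IntegrityError(token_id)
    tokens[token_id] = data
    return data


def racy_get_token(token_id):
    # Simulates a concurrent DELETE removing the row just before the read.
    tokens.pop(token_id, None)
    raise TokenNotFound(token_id)


result = create_or_get_token(create_token, racy_get_token,
                             "abc", {"user": "demo"})
```

Bounding the number of attempts keeps a persistent conflict from looping forever; the last IntegrityError is raised once the attempts run out.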

[1] Same block in Folsom: https://github.com/openstack/keystone/blob/stable/folsom/keystone/service.py#L437
[2] Line 445 in Folsom code.
[3] Line 452 in Folsom code.

Dolph Mathews (dolph) on 2013-03-06

Changed in keystone:
status:	New → Triaged
importance:	Undecided → Medium

Ante Karamatić (ivoks) wrote on 2013-04-24:

How about checking if the token exists before creating the same one?

try:
    self.token_api.get_token(context=context,
                             token_id=token_id)
except exception.TokenNotFound:
    self.token_api.create_token(...

In that case, if token exists, everything is fine (it might even get deleted just after we fetch it).
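A self-contained version of this check-first idea might look like the sketch below. `DictTokenApi` and `get_or_create_token` are hypothetical stand-ins, and the real token_api calls take more arguments than shown here. Note that create_token() can still hit a duplicate if another request inserts the same token between the two calls.

```python
class TokenNotFound(Exception):
    """Stand-in for exception.TokenNotFound."""


class DictTokenApi:
    """Hypothetical dict-backed token API that counts create calls."""

    def __init__(self):
        self.tokens = {}
        self.create_calls = 0

    def get_token(self, token_id):
        if token_id not in self.tokens:
            raise TokenNotFound(token_id)
        return self.tokens[token_id]

    def create_token(self, token_id, data):
        self.create_calls += 1
        self.tokens[token_id] = data
        return data


def get_or_create_token(token_api, token_id, data):
    """Look the token up first and create it only when it is missing."""
    try:
        return token_api.get_token(token_id)
    except TokenNotFound:
        return token_api.create_token(token_id, data)


api = DictTokenApi()
first = get_or_create_token(api, "abc", {"user": "demo"})
second = get_or_create_token(api, "abc", {"user": "demo"})  # served by get_token
```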

Morgan Fainberg (mdrnstm) wrote on 2014-06-01:

We have included microsecond data (should be unique per token except in some fairly narrow scenarios). Further improvements are likely to require a lot of work for not a lot of benefit.

At this point I don't think we're seeing much of this error occurring either in test or real deployments, so I'm marking this as "Won't Fix". We can revisit this later if it turns out to resurface.
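To illustrate why microsecond data makes identical tokens unlikely, here is a rough sketch in which a hash of the token body stands in for the token ID. This is an assumption made for illustration: the report does not show how keystone actually derives PKI token IDs, and `token_id_for` and the payload fields are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timedelta


def token_id_for(user, issued_at, with_microseconds):
    """Derive an illustrative token ID from the token body."""
    if not with_microseconds:
        issued_at = issued_at.replace(microsecond=0)
    payload = json.dumps(
        {"user": user, "issued_at": issued_at.isoformat()},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two requests for the same user, 500 microseconds apart.
first = datetime(2014, 6, 1, 12, 0, 0, 250)
second = first + timedelta(microseconds=500)

# With second-level precision both requests produce the same ID.
same_second_ids_collide = (
    token_id_for("demo", first, False) == token_id_for("demo", second, False)
)

# With microsecond precision the IDs differ.
microsecond_ids_differ = (
    token_id_for("demo", first, True) != token_id_for("demo", second, True)
)
```

Two requests landing on the same microsecond with otherwise identical data would still collide, which matches the "fairly narrow scenarios" caveat above.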

Changed in keystone:
status:	Triaged → Won't Fix
