When a user is deleted (e.g. when heat deletes a temporary user), every token in the active token list has to be scanned for a matching user_id. Similarly, matching tokens by trust_id requires a scan of all active tokens.
This is a problem because we select all the columns, including the extra column, which holds a significant amount of data. At 15000 active tokens, the scan can end up loading 15000 x 10k of token data (on the order of 150 MB, taking 10k to mean roughly 10 KB per token). Under some circumstances this does a lot of work and fills up buffers, only to return 0 active tokens. Indexes on the user_id and trust_id columns in the token table solve this issue.
The issue is mostly seen when it locks up either an eventlet worker or a mod_wsgi process: with MySQLdb the DB queries do not yield to other coroutines in eventlet, and under mod_wsgi the query ties up the worker's resources until the results are returned.
The query looks like:
SELECT token.id AS token_id, token.expires AS token_expires, token.extra AS token_extra, token.valid AS token_valid, token.user_id AS token_user_id, token.trust_id AS token_trust_id FROM token WHERE token.valid = 1 AND token.expires > '2014-06-19 23:18:48.196884' AND token.user_id = 'f6d9db238d084998aaef92ce425edff0';
Most of the time this query uses the index "idx_token_expires", which matches too many rows. Depending on the load, using this index sometimes matches more than 50000 rows in our performance run, which is as good as a full table scan.
Since the queries in question filter on "user_id" in the WHERE clause, the query above can be optimized by adding an index on user_id. The same performance run with the index on user_id in place no longer shows any degradation.
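For anyone who wants to see the planner effect locally, here is a minimal sketch. It uses Python's built-in sqlite3 rather than MySQL, so the plans will not match MySQL's EXPLAIN output exactly. The table is a cut-down stand-in for the real token table, and the index names ix_token_user_id and ix_token_trust_id are made up for illustration, not the names used by any actual migration.

```python
import sqlite3
from datetime import datetime, timedelta

# Same shape as the query above, with bind parameters instead of literals.
QUERY = (
    "SELECT id, expires, extra, valid, user_id, trust_id FROM token "
    "WHERE valid = 1 AND expires > ? AND user_id = ?"
)


def query_plan(conn, params):
    """Return SQLite's plan for QUERY as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


def build_demo_db(num_users=50, tokens_per_user=20):
    """Create an in-memory stand-in token table with only the expires index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token (id TEXT PRIMARY KEY, expires TEXT, extra TEXT, "
        "valid INTEGER NOT NULL, user_id TEXT, trust_id TEXT)"
    )
    conn.execute("CREATE INDEX idx_token_expires ON token (expires)")
    # Every token expires after the timestamp used in the query, so all
    # of them count as active.
    expires = (datetime(2014, 6, 19) + timedelta(days=1)).isoformat(" ")
    rows = [
        (f"tok-{u}-{t}", expires, "x" * 100, 1, f"user-{u}", None)
        for u in range(num_users)
        for t in range(tokens_per_user)
    ]
    conn.executemany("INSERT INTO token VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


conn = build_demo_db()
params = ("2014-06-19 23:18:48.196884", "user-7")

plan_before = query_plan(conn, params)
rows_before = conn.execute(QUERY, params).fetchall()

# Hypothetical index names; the point is only that user_id and trust_id
# each get their own index.
conn.execute("CREATE INDEX ix_token_user_id ON token (user_id)")
conn.execute("CREATE INDEX ix_token_trust_id ON token (trust_id)")

plan_after = query_plan(conn, params)
rows_after = conn.execute(QUERY, params).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
```

The result set is identical before and after. The only thing that changes is which index the planner picks, and that is what matters once the table holds tens of thousands of active tokens.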
As an aside, the poll frequency for the revocation list is configurable. In addition, I hope you're purging expired tokens from your backend (e.g. keystone-manage token_flush) to reduce the number of persisted tokens.
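To make the purging point concrete, here is a rough sketch of what flushing expired tokens amounts to. This is not how keystone-manage token_flush is implemented. It is just the idea, run against a throwaway SQLite table, and flush_expired_tokens is a hypothetical helper.

```python
import sqlite3


def flush_expired_tokens(conn, now):
    """Delete tokens whose expiry is before `now`; return how many were removed."""
    cur = conn.execute("DELETE FROM token WHERE expires < ?", (now,))
    conn.commit()
    return cur.rowcount


flush_conn = sqlite3.connect(":memory:")
flush_conn.execute(
    "CREATE TABLE token (id TEXT PRIMARY KEY, expires TEXT, user_id TEXT)"
)
flush_conn.executemany(
    "INSERT INTO token VALUES (?, ?, ?)",
    [
        ("expired-1", "2014-06-18 10:00:00", "user-a"),
        ("expired-2", "2014-06-19 09:00:00", "user-b"),
        ("active-1", "2014-06-20 10:00:00", "user-a"),
    ],
)

removed = flush_expired_tokens(flush_conn, "2014-06-19 23:18:48.196884")
remaining = [row[0] for row in flush_conn.execute("SELECT id FROM token")]
print(removed, remaining)
```

Fewer persisted rows means less for any scan to walk through, with or without the new indexes.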
I believe such an index was proposed in the past, and rejected... although I don't recall why (I'm sure the answer is buried in gerrit somewhere). I can't think of any reason why it wouldn't be an improvement?