Memcache token backend eventually stops working

Bug #1012381 reported by Rafael Durán Castañeda on 2012-06-12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Identity (keystone)
Rafael Durán Castañeda
Alan Pevec
keystone (Ubuntu)

Bug Description


At BVOX, my company, we've hit a weird issue while using the memcache token backend. Eventually, token validation stops working. After some debugging, I've found that the failure is always triggered after a token validation request that returns information about the token used for validation (the x-auth-token) instead of the token being validated, e.g.:

GET /v2.0/tokens/123
x-auth-token: 456

Returns information about 456 instead of 123.
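
One plausible reading of this symptom, which the report itself does not confirm, is that several green threads running in the same OS thread end up sharing one piece of "per-thread" connection state, so a reply is read by the wrong request. The sketch below uses only the standard library: plain generators stand in for green threads, and FakeConnection, lookup and run_interleaved are hypothetical names invented for illustration, not Keystone or python-memcache code.

```python
import threading


class FakeConnection(threading.local):
    """Hypothetical stand-in for a client that keeps one socket per thread."""

    def __init__(self):
        self.pending = None  # key of the last request written to the "socket"

    def send(self, key):
        self.pending = key

    def recv(self):
        # The reply corresponds to whatever was written last.
        return "data-for-%s" % self.pending


def lookup(conn, token_id, results):
    """Cooperative task: send, give up control, then read the reply."""
    conn.send(token_id)
    yield  # switch point, like a blocking socket read under a green-thread hub
    results[token_id] = conn.recv()


def run_interleaved(token_ids):
    """Run one lookup per token in a single OS thread, interleaving them."""
    conn = FakeConnection()
    results = {}
    tasks = [lookup(conn, t, results) for t in token_ids]
    for task in tasks:
        next(task)        # every task sends its request
    for task in tasks:
        next(task, None)  # every task then reads a reply
    return conn, results


conn, results = run_interleaved(["123", "456"])
print(results)

# A real OS thread gets its own copy of the thread-local state.
seen = []
worker = threading.Thread(target=lambda: seen.append(conn.pending))
worker.start()
worker.join()
print(seen)
```

In this toy model the lookup for 123 comes back with the data for 456, while a separate OS thread sees a fresh, empty state. Thread-local storage only isolates real threads, so cooperative tasks inside one thread can still collide.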

Working on the issue, under heavy load, we've also got this error:

RuntimeError: Second simultaneous read on fileno XX detected.
Unless you really know what you're doing, make sure that only one greenthread can read any particular socket.
Consider using a pools.Pool.

This error can be solved simply by monkey patching the threading module (as Nova does, and I think Glance too).
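
As a rough illustration of what such monkey patching looks like with eventlet, here is a minimal sketch. The function name patch_for_greenthreads is hypothetical, the choice of socket and time alongside thread is an assumption, and this is not the patch that was committed for this bug.

```python
def patch_for_greenthreads():
    """Apply eventlet monkey patching, including the thread module.

    Returns True if patching was applied, False if eventlet is unavailable.
    """
    try:
        import eventlet
    except ImportError:
        return False
    # thread=True is the part this report is about; socket and time are
    # included here only as an assumption about what a WSGI service needs.
    eventlet.monkey_patch(socket=True, time=True, thread=True)
    return True
```

A function like this would normally be called once, at the very top of the service entry point, before other modules that use threads or sockets are imported.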

I've attached a fairly simple Python script using multiprocessing that triggers the error quite quickly. It drops some requests, but token validation still works (3 of 4 processes dropped in my tests). This does not match what happens in a real deployment, where token validation never works again until Memcached is restarted (I think just because that resets the python-memcache connection).

I can send a patch, unless someone has a good reason for Keystone not to monkey patch the threading module.

P.S.: This bug can be triggered both under stable/essex and master branches.

Joseph Heck (heckj) on 2012-06-12
Changed in keystone:
status: New → Triaged
importance: Undecided → High
Joseph Heck (heckj) on 2012-06-19
tags: added: essex
Changed in keystone:
assignee: nobody → Rafael Durán Castañeda (rafadurancastaneda)

Fix proposed to branch: master

Changed in keystone:
status: Triaged → In Progress

Submitter: Jenkins
Branch: master

commit 3f9f77af19c748658629a460bc447fe7f2d0a410
Author: Rafael Durán Castañeda <email address hidden>
Date: Tue Jun 19 20:35:43 2012 +0200

    Monkey patching 'thread'.

    Fixes bug 1012381.

    Change-Id: Icb7b2372df96d647fc6dcd4c4ebe72c8aa607f9d

Changed in keystone:
status: In Progress → Fix Committed
Alan Pevec (apevec) on 2012-06-26
tags: added: essex-backport
removed: essex

Submitter: Jenkins
Branch: stable/essex

commit d8dbdbced061fa4a4e42ec33c4b7e7752b0ebc04
Author: Rafael Durán Castañeda <email address hidden>
Date: Tue Jun 19 20:35:43 2012 +0200

    Monkey patching 'thread'.

    Fixes bug 1012381.

    Change-Id: Icb7b2372df96d647fc6dcd4c4ebe72c8aa607f9d

Thierry Carrez (ttx) on 2012-07-04
Changed in keystone:
milestone: none → folsom-2
status: Fix Committed → Fix Released
Dave Walker (davewalker) on 2012-08-24
Changed in keystone (Ubuntu):
status: New → Fix Released
Changed in keystone (Ubuntu Precise):
status: New → Confirmed

Please find the attached test log from the Ubuntu Server Team's CI infrastructure. As part of the verification process for this bug, Keystone has been deployed and configured across multiple nodes using precise-proposed as an installation source. After successful bring-up and configuration of the cluster, a number of exercises and smoke tests have been invoked to ensure the updated package did not introduce any regressions. A number of test iterations were carried out to catch any possible transient errors.

Please note the list of installed packages at the top and bottom of the report.

For records of upstream test coverage of this update, please see the Jenkins links in the comments of the relevant upstream code-review(s):

Trunk review:
Stable review:

As per the provisional Micro Release Exception granted to this package by the Technical Board, we hope this contributes toward verification of this update.

Adam Gandelman (gandelman-a) wrote :

Test coverage log.

tags: added: verification-done

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package keystone - 2012.1+stable~20120824-a16a0ab9-0ubuntu2

keystone (2012.1+stable~20120824-a16a0ab9-0ubuntu2) precise-proposed; urgency=low

  * New upstream release (LP: #1041120):
    - debian/patches/0013-Flush-tenant-membership-deletion-before-user.patch:
  * Resynchronize with stable/essex:
    - authenticate in ldap backend doesn't return a list of roles
      (LP: #1035428)
    - LDAP should not check username on "sn" field (LP: #997700)
    - Admin API doesn't valid token. (LP: #1006815, #1006822)
    - Memcache token backend eventually stops working. (LP: #1012381)
    - EC2 credentials not migrated from legacy (diablo) database. (LP: #1016056)
    - Deleting tenants or users does not cleanup metadata. (LP: #973243)
    - Deleting tenants does not cleanup its user associations. (LP: #974199)
    - TokenNotFound not raised in testsuite because of timezone issues. (LP: #983800)
    - Token authentication for a user in a disabled tenant does not raise
      Unauthorized error. (LP: #988920)
    - export_legacy_catalog doesn't convert url names correctly. (LP: #994936)
    - Following a password compromise and subsequent password change,
      tokens remain valid. (LP: #996595)
    - Tokens remain valid after a user account is disabled. (LP: #997194)
 -- Adam Gandelman <email address hidden> Fri, 24 Aug 2012 03:34:59 -0400

Changed in keystone (Ubuntu Precise):
status: Confirmed → Fix Released
Thierry Carrez (ttx) on 2012-09-27
Changed in keystone:
milestone: folsom-2 → 2012.2