BaseIdentityPlugin.get_access hang

Bug #1508424 reported by Mehdi Abaakouk on 2015-10-21
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Object Storage (swift)
Undecided
Mehdi Abaakouk
python-keystoneclient
High
Jamie Lennox

Bug Description

Hi,

In Gnocchi, our gate have a very high failure rates since last keystoneclient release. After further investigation it appears the code hang on a lock in BaseIdentityPlugin.get_access.

Cheers,

Mehdi Abaakouk (sileht) on 2015-10-21
Changed in python-keystoneclient:
assignee: nobody → Mehdi Abaakouk (sileht)

Fix proposed to branch: master
Review: https://review.openstack.org/238001

Changed in python-keystoneclient:
status: New → In Progress
Dolph Mathews (dolph) wrote :

If others are running into this with different usage patterns, let's upgrade this to critical. Either way, this deserves a minor release immediately.

Changed in python-keystoneclient:
importance: Undecided → High
Brant Knudson (blk-u) wrote :

Can you post a backtrace?

David Stanek (dstanek) wrote :

I've tried writing a test to show the bad behavior, but ran into a wall. We seem to always be passing authenticated=False when calling session.post(). This would prevent reentry into the protected block. Are you using a custom auth plugin?

1. http://git.openstack.org/cgit/openstack/python-keystoneclient/tree/keystoneclient/auth/identity/v3/base.py#n190
2. http://git.openstack.org/cgit/openstack/python-keystoneclient/tree/keystoneclient/auth/identity/v2.py#n88

Changed in python-keystoneclient:
status: In Progress → Incomplete
Jamie Lennox (jamielennox) wrote :

Without the lock if you don't pass authenticated=False when making an auth call you will end up in an infinite loop of the session trying to authenticate the request from the plugin with the plugin, so the lock is preventing a stacktrace there but shouldn't be affecting regular traffic.

There is really no reason that the same thread should trigger that lock twice (infinite loop) it's foreseeable if the token fetch is taking a long time then other threads will be blocked waiting on it - but this should take the same amount of time as if they all did their own fetch.

Have you got a backtrace or something of what's happening?

Change abandoned by Mehdi Abaakouk (sileht) (<email address hidden>) on branch: master
Review: https://review.openstack.org/238001

Mehdi Abaakouk (sileht) wrote :

I have checked the backtrace and everything looks good authenticated=False and the loop doesn't occurs.

I have found something new, the issue occurs with swift, when at least two concurrents requests are done.

It seems Swift doesn't monkeypatch the threading module. Only socket is monkey patched.
https://github.com/openstack/swift/blob/master/swift/common/wsgi.py#L410

If that true:
* First request, get the lock and do socket IO to retrieve a token -> greenlet switch occurs
* Second request, try to get the lock and hang instead of generate a greenlet switch, because this is not a greenlock.

I wonder how we are supposed to fix this ? Swift should patch threading I guess.

Mehdi Abaakouk (sileht) on 2015-10-22
Changed in python-keystoneclient:
assignee: Mehdi Abaakouk (sileht) → nobody
Mehdi Abaakouk (sileht) on 2015-10-22
Changed in swift:
status: New → Incomplete
status: Incomplete → Confirmed
Mehdi Abaakouk (sileht) wrote :

Swift patch for monkeypatching thread modules https://review.openstack.org/#/c/238580/

Changed in swift:
assignee: nobody → Mehdi Abaakouk (sileht)
Changed in swift:
status: Confirmed → In Progress
Jamie Lennox (jamielennox) wrote :

Based on the linked patches it appears this requires monkey-patching thread when using eventlet - which is good policy.

This is a problem in swift because they mix real threads and green threads in the same process. This is insanity and absolutely not something that we can fix/support from keystoneclient

Changed in python-keystoneclient:
status: Incomplete → Invalid
assignee: nobody → Jamie Lennox (jamielennox)

Reviewed: https://review.openstack.org/238580
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=bf8689474a9b3930cceed4c5f855a73ce89b1d4e
Submitter: Jenkins
Branch: master

commit bf8689474a9b3930cceed4c5f855a73ce89b1d4e
Author: Mehdi Abaakouk <email address hidden>
Date: Thu Oct 22 17:33:09 2015 +0200

    monkeypatch thread for keystoneclient

    keystoneclient uses threading.Lock(), but swift doesn't
    monkeypatch threading, this result in lockup when two
    greenthreads try to acquire a non green lock.

    This change fixes that.

    Change-Id: I9b44284a5eb598a6978364819f253e031f4eaeef
    Closes-bug: #1508424

Changed in swift:
status: In Progress → Fix Committed
Download full text (11.9 KiB)

Reviewed: https://review.openstack.org/249975
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=458d5ebb9ac8f2ea4662de86f3a2c5551b4961f6
Submitter: Jenkins
Branch: feature/crypto

commit 41897d96a7dc3ddceccf4eed71345439b83eb243
Author: Tim Burke <email address hidden>
Date: Mon Nov 23 13:53:27 2015 -0800

    Reverse-listings follow-up

     * With the end_prefix changes in the original commit, we no longer need
       the `or not name.startswith(prefix)` check.
     * Improve test coverage of reverse path listings.

    Change-Id: Iaa7d4b83647c3c150be95f88cb3cc9e4f0e33979

commit 7c1e6cd583debe43aca266643cd65f5103159dbf
Author: Matthew Oliver <email address hidden>
Date: Thu Sep 11 16:51:51 2014 +1000

    Add container and account reverse listings

    This change adds the ability to tell the container or account server to
    reverse their listings. This is done by sending a reverse=TRUE_VALUE,

    Where TRUE_VALUE is one of the values true can be in common/utils:

      TRUE_VALUES = set(('true', '1', 'yes', 'on', 't', 'y'))

    For example:

      curl -i -X GET -H "X-Auth-Token: $TOKEN" $STORAGE_URL/c/?reverse=on

    I borrowed the swapping of the markers code from Kevin's old change,
    thanks Kevin. And Tim Burke added some real nuggets of awesomeness.

    DocImpact
    Co-Authored-By: Kevin McDonald <email address hidden>
    Co-Authored-By: Tim Burke <email address hidden>
    Implements: blueprint reverse-object-listing

    Change-Id: I5eb655360ac95042877da26d18707aebc11c02f6

commit 898223398189f96628db7d9928f146c5bb11ba88
Author: Ganesh Maharaj Mahalingam <email address hidden>
Date: Fri Oct 30 21:30:51 2015 +0000

    Unit tests for account/backend.py

    Add test for database create request without account
    Partial test for migrate call on database with storage_policy_index

    Change-Id: I7cfbd6bc7e2b341f433d88f600b19e54826a0e22
    Signed-off-by: Ganesh Maharaj Mahalingam <email address hidden>

commit f80c97a6e964123e766c1e53d797a3b80433ac77
Author: Alistair Coles <email address hidden>
Date: Fri Nov 20 17:51:33 2015 +0000

    Update .mailmap entry

    Fix/add entries for authors that are known to now have
    @hpe.com email addresses.

    Change-Id: I62c83b64e2378cb3d6ad33ac9e6ab111b8a8c86f

commit 848d7299a26988ea1bab3e576ba10762624dfff6
Author: Tim Burke <email address hidden>
Date: Fri Nov 20 09:04:28 2015 -0800

    Ignore Content-Type from client on multipart-manifest=delete

    Otherwise, if a client includes something like:

        Content-Type: application/x-www-form-urlencoded

    ...we won't delete anything, and instead send back:

        Number Deleted: 0
        Number Not Found: 0
        Response Body:
        Response Status: 406 Not Acceptable
        Errors:

    ...despite the fact that we're managing the iterator, so we know it's
    acceptable.

    Change-Id: I791c7bda1d9df830d8dacd3783c2393db5a9ac09

commit c4eeea7820b7bbf6f5994c9878c8838c43e61b9b
Author: Sivasathurappan Radhakrishnan <email address hidden>
Date: Thu Oct 29 23:04:42 2015 +0000

    A...

tags: added: in-feature-crypto
Download full text (54.1 KiB)

Reviewed: https://review.openstack.org/264517
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=93ddaffaebb620bc712d52d0267194562dce02be
Submitter: Jenkins
Branch: feature/hummingbird

commit f53cf1043d078451c4b9957027bf3af378aa0166
Author: Ondřej Nový <email address hidden>
Date: Tue Jan 5 20:20:15 2016 +0100

    Fixed few misspellings in comments

    Change-Id: I8479c85cb8821c48b5da197cac37c80e5c1c7f05

commit 79222e327f9df6335b58e17a6c8dd0dc44b86c17
Author: ChangBo Guo(gcb) <email address hidden>
Date: Sat Dec 26 13:13:37 2015 +0800

    Fix AttributeError for LogAdapter

    LogAdapter object has no attribute 'warn' but has attribute
    'warning'.

    Closes-Bug: #1529321
    Change-Id: I0e0bd0a3dbc4bb5c1f0b343a8809e53491a1da5f

commit 684c4c04592278a280032002b5313b171ee7a4c0
Author: janonymous <email address hidden>
Date: Sun Aug 2 22:47:42 2015 +0530

    Python 3 deprecated the logger.warn method in favor of warning

    DeprecationWarning: The 'warn' method is deprecated, use 'warning'
    instead

    Change-Id: I35df44374c4521b1f06be7a96c0b873e8c3674d8

commit d0a026fcb8e8a9f5475699cc56e1998bdc4cd5ca
Author: Hisashi Osanai <email address hidden>
Date: Wed Dec 16 18:50:37 2015 +0900

    Fix duplication for headers in Access-Control-Expose-Headers

    There are following problems with Access-Control-Expose-Headers.

    * If headers in X-Container-Meta-Access-Control-Expose-Headers are
      configured, the headers are kept with case-sensitive string.
      Then a CORS request comes, the headers are merged into
      Access-Control-Expose-Headers as case-sensitive string even if
      there is a same header which is not case-sensitive string.

    * Access-Control-Expose-Headers is handled by a list.
      If X-Container/Object-Meta-XXX is configured in container/object
      and X-Container-Meta-Access-Control-Expose-Headers, same header
      is listed in Access-Control-Expose-Headers.

    This patch provides a fix for the problems.

    Change-Id: Ifc1c14eb3833ec6a851631cfc23008648463bd81

commit 0bcd7fd50ec0763dcb366dbf43a9696ca3806f15
Author: Bill Huber <email address hidden>
Date: Fri Nov 20 12:09:26 2015 -0600

    Update Erasure Coding Overview doc to remove Beta version

    The major functionality of EC has been released for Liberty and
    the beta version of the code has been removed since it is now
    in production.

    Change-Id: If60712045fb1af803093d6753fcd60434e637772

commit 84ba24a75640be4212e0f984c284faf4c894e7c6
Author: Alistair Coles <email address hidden>
Date: Fri Dec 18 11:24:34 2015 +0000

    Fix rst errors so that html docs are complete

    rst table format errors don't break the gate job
    but do cause sections of the documents to go missing
    from the html output.

    Change-Id: Ic8c9953c93d03dcdafd8f47b271d276c7b356dc3

commit 87f7e907ee412f5847f1f9ffca7a566fb148c6b1
Author: Matthew Oliver <email address hidden>
Date: Wed Dec 16 17:19:24 2015 +1100

    Pass HTTP_REFERER down to subrequests

    Currently a HTTP_REFERER (Referer) header isn't passed down to
    subreq...

tags: added: in-feature-hummingbird
Changed in swift:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers