[OSSA 2014-007] Keystone middleware may confuse contexts (CVE-2014-0105)

Bug #1282865 reported by Kieran Spear on 2014-02-21
Affects (Importance / Assigned to):
* OpenStack Security Advisory: Critical, assigned to Tristan Cacqueray
* python-keystoneclient: Critical, assigned to Dolph Mathews

Bug Description

tl;dr: Occasionally a request from a regular user appears to Glance-registry as an admin user:

We have a server with 12 glance-registry processes running, with auth_token middleware handling authentication.

If I run a python script that sets up Glanceclient:

glance = glanceclient.Client('1', token=xxx)

and makes the following call over and over:

glance.images.get(<private_image_id_from_another_tenant>)

Most calls 404 because the calling tenant doesn't have access to the image. Eventually though, an image get request will succeed.

When this happens, glance-registry logs something like:

INFO glance.registry.api.v1.images [9b88678e-f606-4016-a39a-bb1ac150c029 2d474f5d54e24a56978c0332df1c36d6 e183df4e2bd045f7a9cbdb37a99929a5] Successfully retrieved image 32101b45-461e-452c-b065-4e07391c806e

User 2d474f5d54e24a56978c0332df1c36d6 and tenant e183df4e2bd045f7a9cbdb37a99929a5 are completely unrelated to the user's credentials in the python script I'm running. They're actually the user/tenant of our ceilometer admin user. Ceilometer is doing a few image GET requests per second.

The reverse of this also happens. Occasionally a request from ceilometer will assume the identity of the user in my script. Glance-registry logs the request as if it had come from my user, and denies the image GET because my user does not have the appropriate permission for the image ceilometer was trying to query.

If I comment out the "memcached_servers" line from our config file, the problem goes away. This makes me suspect that something is going on in auth_token middleware. When I watch the memcache entry for the token used by my script, the token data never appears to change.

What on earth is going on here? :)

CVE References

CVE-2014-0105

Kieran Spear (kspear) wrote :

auth_token middleware thinks it is retrieving the user's token from memcache, but ends up returning the token data for the admin user:

2014-02-21 16:41:56.868 28193 DEBUG keystoneclient.middleware.auth_token [-] Returning cached token 46b4e15e0bf1bef6655dbee733867972. Token data:
{u'access': {u'token': {u'issued_at': u'2014-02-21T05:41:51.816330', u'expires': u'2014-02-21T11:41:51Z', u'id': u'placeholder', u'tenant': {u'id': u'e4eee8dbc16a49dcbc76edac96674e96', u'enabled': True, u'description': None, u'name': u'admin'}}

46b4e15e0bf1bef6655dbee733867972 is the hash of the user's PKI token. But the token data is from another user's token!

affects: glance → python-keystoneclient
Kieran Spear (kspear) wrote :

Looks like eventlet doesn't play nicely with python-memcache. Monkey patching 'thread' in cmd/registry.py fixes this.

Looks like it's not Glance-specific at all.
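The reason patching 'thread' matters: python-memcached keeps its server connection in thread-local storage. With real OS threads (or greenthreads under a patched 'thread' module), each thread gets its own connection; without the patch, every eventlet greenthread runs in the same OS thread, so they all share one socket and responses can be attributed to the wrong request. The toy model below (my own sketch, not python-memcached's actual code) illustrates the thread-local mechanism the fix relies on:

```python
import threading

class FakeMemcacheClient:
    """Toy model of a client that keeps its connection in thread-local
    storage, as python-memcached does. Not the library's real code."""
    def __init__(self):
        self._local = threading.local()

    def connection(self):
        # Each OS thread lazily gets its own private connection object.
        if not hasattr(self._local, "conn"):
            self._local.conn = object()  # stand-in for a socket
        return self._local.conn

client = FakeMemcacheClient()
conns = []

def worker():
    conns.append(client.connection())

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Real OS threads (or greenthreads with 'thread' monkey patched) each see
# a distinct connection, so requests and responses cannot interleave:
assert conns[0] is not conns[1]
# Within one thread the connection is stable; unpatched greenthreads all
# live in one OS thread, so they would share this single connection.
assert client.connection() is client.connection()
```

Under unpatched eventlet, `threading.local` is ordinary thread-local storage and all greenthreads resolve to one OS thread, which is exactly the precondition for the mixed-up token data reported above.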

summary: - Glance-registry context leak
+ Keystone middleware may confuse contexts
Changed in glance:
status: New → Invalid
Thierry Carrez (ttx) wrote :

It's a bit random as an attack vector, but I suspect you can still trigger it using load, so it probably needs an OSSA

Changed in ossa:
status: New → Incomplete
Dolph Mathews (dolph) on 2014-02-21
Changed in python-keystoneclient:
importance: Undecided → Critical
Dolph Mathews (dolph) wrote :

I'm able to reproduce this consistently, and it does appear to be entirely dependent on eventlet monkey patching "thread". In my test of 100,000 requests through auth_token, split between two users with different authorization, their authorization gets swapped about 0.2% of the time.

With sufficient load, python-memcached enabled in auth_token, and an unpatched thread module, eventlet produces this failure:

Traceback (most recent call last):
  File ".../lib/python2.7/site-packages/eventlet/wsgi.py", line 414, in handle_one_response
    write('')
  File ".../lib/python2.7/site-packages/eventlet/wsgi.py", line 354, in write
    _writelines(towrite)
  File ".../python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 334, in writelines
    self.flush()
  File ".../python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 303, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
  File ".../lib/python2.7/site-packages/eventlet/greenio.py", line 309, in sendall
    tail = self.send(data, flags)
  File ".../lib/python2.7/site-packages/eventlet/greenio.py", line 295, in send
    total_sent += fd.send(data[total_sent:], flags)
error: [Errno 32] Broken pipe

And python-memcached warns about unexpected responses from memcached:

  MemCached: while expecting 'STORED', got unexpected response 'VALUE tokens/acfaaff23f1576cb8485a797beb60ff4 0 1282'
  MemCached: while expecting 'STORED', got unexpected response 'END'

Given that auth_token does not depend on eventlet itself, but instead can be run by eventlet in another service, all services that depend on eventlet and rely on auth_token should be the ones to monkey patch thread. That said, monkey patching thread in keystoneclient/openstack/common/memorycache.py appears to resolve the problem as well.

Glance stable/havana appears to be affected:

  https://github.com/openstack/glance/blob/stable/havana/glance/cmd/registry.py#L30

Glance master does not appear to be affected:

  https://github.com/openstack/glance/blob/master/glance/cmd/registry.py#L29

Changed in python-keystoneclient:
status: New → Confirmed
Dolph Mathews (dolph) wrote :

To reproduce the issue in isolation, I wrote a simple echo service to run behind auth_token which can inform the client if the service's perceived tenancy does not reflect the user's expectation. https://review.openstack.org/#/c/75529/

Patchset 1 of the above review depends on wsgiref from stdlib instead of eventlet. `keystoneclient-repro-bug-1282865.diff` modifies that patchset to use eventlet, and monkeypatches with eventlet similarly to how glance stable/havana does (without also monkeypatching thread).

Dolph Mathews (dolph) wrote :

This attachment demonstrates the issue by acting as an API client to bootstrap some data in keystone, generate valid tokens, and produce load against a service protected by auth_token.

Dolph Mathews (dolph) wrote :

This avoids the issue in keystoneclient by patching keystoneclient.openstack.common.memorycache (oslo)

Dolph Mathews (dolph) on 2014-02-22
Changed in python-keystoneclient:
assignee: nobody → Dolph Mathews (dolph)
Thierry Carrez (ttx) wrote :

@Dolph: nice work. I was wondering if that could not explain the two reports we had in the past of context confusion ("Tenant A is seeing the VNC Consoles of Tenant B!")

Thierry Carrez (ttx) wrote :

I am a bit confused as to the extent of the issue.

IIUC fixing it in python-keystoneclient is enough to cover all cases -- and then *some* middleware consumers that already patched eventlet.thread might be safe already (like Glance master).

Kieran: are you running Glance master or havana ?

Dolph Mathews (dolph) wrote :

A single fix in python-keystoneclient would be sufficient to protect any deployments carrying the patch, but I'm not familiar enough with the vulnerability management & packaging process for clients to make a recommendation against patching any affected services as well.

Thierry Carrez (ttx) on 2014-02-24
Changed in ossa:
status: Incomplete → Confirmed

I assumed any authenticated request could result in a context swap.
memcached_servers appeared in 0.2.3 (git grep memcached_servers 0.2.2 0.2.3).

@Kieran: Please correct me if I'm wrong about the University of Melbourne credit.

Draft impact description #1 -

Title: Privilege escalation in auth_token middleware
Reporter: Kieran Spear (University of Melbourne)
Products: python-keystoneclient
Versions: 0.2.3 up to 0.6.0

Description:
Kieran Spear from the University of Melbourne reported a vulnerability in python-keystoneclient auth_token middleware. By doing repeated authenticated requests, with sufficient load on the target system, an authenticated user can inherit another authenticated user's role resulting in a privilege escalation. Note that it is related to a bad interaction between auth_token and eventlet that is fixed if the process used eventlet thread monkey patching. Only setups using auth_token with memcache are vulnerable.

Kieran Spear (kspear) wrote :

@Dolph: Thanks for the great analysis. I suspect you'd find a higher failure rate if you change the ratio of requests. When I tested using the clients I used 1 endlessly looping user client and 5 admin clients, and saw the bug a few times per second.

@Thierry: I'm running Havana. I always forget to mention that...

@Tristan
I think this has been an issue since 0.2.0 (and before that the code was in Keystone itself):
https://github.com/openstack/python-keystoneclient/commit/7920899af119d1697c333d202ca3272f167c19b0

I would list the affected OpenStack services/versions too. Most of the other projects, except Ceilometer [1], seem to be patching thread already (at least on master). We don't have memcache enabled for Ceilometer, so I can't check whether it is vulnerable in practice.

[1] https://github.com/openstack/ceilometer/search?q=monkey_patch&type=Code

Dolph Mathews (dolph) wrote :

Suggested revisions to impact description:

"Note that it is related to a bad interaction between auth_token and eventlet that is fixed if the process used eventlet thread monkey patching." -> "Note that it is related to a bad interaction between eventlet and python-memcached that is fixed if the process uses eventlet to monkey patch 'thread'."

- it's really an issue between the python-memcached and eventlet; auth_token just happens to optionally consume python-memcached while being (typically) served by eventlet (**this may affect any other service using memcached + eventlet**)
- thread is worth putting in quotes IMO, because it actually results in 3 modules being patched (thread, threading and Queue) [1]

"inherit another authenticated user's role resulting in a privilege escalation" -> "assume another authenticated user's complete identity and multi-tenant authorization, potentially resulting in a privilege escalation"

- "role" is a little too narrow if I'm being pedantic -- it's not just role confusion, or authorization confusion, but completely picking up another user's authentication + authorization
- "inherit" seems to imply that it's added on to the existing (valid) authn + authz, when in fact it just replaces it
- "resulting in privilege escalation" is just a potential / likelihood, not a guaranteed outcome, I suppose

[1] http://eventlet.net/doc/patching.html#monkeypatching-the-standard-library

Thierry Carrez (ttx) wrote :

I would use "Potential context confusion in Keystone middleware" as the title.

Also : "...that is fixed if the process used eventlet..." -> "...that is avoided if the calling process already used eventlet..."

-> the issue is not "fixed" if the process already monkey-patched 'thread'; such a process is not vulnerable in the first place
-> "calling" process makes it IMHO clearer that we are talking about the server project using the middleware.

If we precisely analyze the grizzly/havana affected server software, we could end with something like: "In Grizzly and Havana, only Glance and Ceilometer were found to be affected. Also note that only keystone middleware setups using auth_token with memcache are vulnerable."

Many thanks for your suggestions guys!

This is the revised draft. I did not include the comprehensive affected-services list because it might be overkill here and a bit risky if we miss something...
For example, Nova has three different strategies to cope with eventlet monkey patching:
* In Grizzly, some services are not monkey patched at all (like the novncproxy that Thierry mentioned)
* In Havana, all services are monkey patched by default
* In Icehouse, if the "--remote_debug" switches are used, then "thread" is not monkey patched...

What do you think about just saying "Only keystone middleware setups using auth_token with memcache are vulnerable"?

@Dolph: do you know if the fix (comment #8) needs to be backported for Grizzly and Havana, or will it work as-is?

Draft impact description #2 -

Title: Potential context confusion in Keystone middleware
Reporter: Kieran Spear (University of Melbourne)
Products: python-keystoneclient
Versions: 0.2.0 up to 0.6.0

Description:
Kieran Spear from the University of Melbourne reported a vulnerability in python-keystoneclient auth_token middleware. By doing repeated authenticated requests, with sufficient load on the target system, an authenticated user can assume another authenticated user's complete identity and multi-tenant authorizations, potentially resulting in a privilege escalation. Note that it is related to a bad interaction between eventlet and python-memcached that may be avoided if the calling process already uses eventlet to monkey patch "thread". Only keystone middleware setups using auth_token with memcache are vulnerable.

Thierry Carrez (ttx) wrote :

A few suggestions:

'vulnerability in python-keystoneclient auth_token middleware' -> 'vulnerability in Keystone auth_token middleware (shipped in python-keystoneclient)'

'can' -> 'may in certain situations'

'already uses eventlet to monkey patch "thread"' -> 'already monkey-patches "thread" to use eventlet'

Versions: I would say "all versions up to 0.6.0" -- the middleware was shipped within Keystone before, so the issue didn't start with 0.2.0.

Note that there is no need to backport the patch at all, since there is only one branch in python-keystoneclient. We may have to play tricks if we pinned python-keystoneclient to a certain version in the grizzly gate, though (a bit like the recent python-swiftclient debacle).

Changed in ossa:
importance: Undecided → Critical

With the latest suggestions:

Draft impact description #3 -

Title: Potential context confusion in Keystone middleware
Reporter: Kieran Spear (University of Melbourne)
Products: python-keystoneclient
Versions: all version up to 0.6.0

Description:
Kieran Spear from the University of Melbourne reported a vulnerability in Keystone auth_token middleware (shipped in python-keystoneclient). By doing repeated authenticated requests, with sufficient load on the target system, an authenticated user may in certain situations assume another authenticated user's complete identity and multi-tenant authorizations, potentially resulting in a privilege escalation. Note that it is related to a bad interaction between eventlet and python-memcached that may be avoided if the calling process already monkey-patches "thread" to use eventlet. Only keystone middleware setups using auth_token with memcache are vulnerable.

Thierry Carrez (ttx) wrote :

Last-minute nitpicking (but we need to get this one right, as it's pretty massive):

all version -> All versions

By doing repeated authenticated requests -> By doing repeated requests
(we already mention that the user is authenticated later in the same sentence)

may be avoided -> should be avoided
(conveys slightly more certainty that thread-monkey-patching consumers are safe)

Agreed, this bug deserves much attention!

Draft impact description #4 -

Title: Potential context confusion in Keystone middleware
Reporter: Kieran Spear (University of Melbourne)
Products: python-keystoneclient
Versions: All versions up to 0.6.0

Description:
Kieran Spear from the University of Melbourne reported a vulnerability in Keystone auth_token middleware (shipped in python-keystoneclient). By doing repeated requests, with sufficient load on the target system, an authenticated user may in certain situations assume another authenticated user's complete identity and multi-tenant authorizations, potentially resulting in a privilege escalation. Note that it is related to a bad interaction between eventlet and python-memcached that should be avoided if the calling process already monkey-patches "thread" to use eventlet. Only keystone middleware setups using auth_token with memcache are vulnerable.

Changed in ossa:
assignee: nobody → Tristan Cacqueray (tristan-cacqueray)
Thierry Carrez (ttx) wrote :

+1 on version 4 of impact description, great work!

Dolph Mathews (dolph) wrote :

+1 for impact description

Thierry Carrez (ttx) wrote :

@Dolph: could you round up some keystone-core reviews of your proposed patch ?

Brant Knudson (blk-u) wrote :

@Dolph: looking at the patch in comment #8 -- it looks like the code assumes that auth_token is running in eventlet if it can import eventlet.patcher... but won't that trigger even if the server isn't running in eventlet, as long as eventlet is installed somewhere it can be found?

Brant Knudson (blk-u) wrote :

Another concern is that this is going to monkeypatch threads in the middle of handling a request... isn't that supposed to happen early on in the application, like before any threads are started?

Brant Knudson (blk-u) wrote :

I would rather that we said that if an application is using auth_token middleware in eventlet then the application needs to monkeypatch threads.

Brant Knudson (blk-u) wrote :

Here's from the eventlet documentation[1]: "It is important to call monkey_patch() as early in the lifetime of the application as possible. Try to do it as one of the first lines in the main module." So I don't think it's safe to wait until an incoming request is received to do the monkey-patching.

[1] http://eventlet.net/doc/patching.html#monkeypatching-the-standard-library

Dolph Mathews (dolph) wrote :

Brant: I agree with all your concerns, and my patch is making the assumption that nothing needs to load thread / threading / Queue before python-memcached does (in which case, the monkey patch will work correctly, as happens in my test).

The only other alternative I can think of is for auth_token to completely bail on touching memcached if the required monkey patching hasn't already taken place (and emit a warning)?

Brant Knudson (blk-u) wrote :

Dolph - A patch for keystoneclient that verifies that thread was monkey patched as required in eventlet would be a good one to have. Maybe keystoneclient could use is_monkey_patched to see if anything else was monkeypatched and, if so, require that thread is monkeypatched too? I don't know if you can tell that the application is running under eventlet.

The other projects that use eventlet would have to change to do the thread monkeypatching.

Brant Knudson (blk-u) wrote :

Looks like auth_token could check if any eventlet modules have been imported already, by looking at sys.modules...

 import sys
 any(mod_name for mod_name in sys.modules.keys() if mod_name.startswith('eventlet.'))
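Brant's one-liner can be wrapped in a small helper; the function name here is mine, and the check is only a heuristic (it detects that eventlet was imported, not that the server is actually being run by it, which is exactly the caveat raised above):

```python
import sys

def eventlet_is_imported():
    """Heuristic sketch (helper name is mine): has any eventlet module
    been imported into this process? Catches both 'import eventlet' and
    'from eventlet import patcher' style imports."""
    return any(name == "eventlet" or name.startswith("eventlet.")
               for name in sys.modules)

# In a process that never imported eventlet, the check comes back False:
assert not eventlet_is_imported()
```

A positive result still doesn't prove the WSGI server is eventlet-driven, only that eventlet code is present in the process, so it could at most justify a warning rather than a hard failure.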

Jamie Lennox (jamielennox) wrote :

Sorry, this is the first I've seen of this bug. I agree with Brant here that I don't want to assume that auth_token is always running under eventlet. It's definitely not always the case.

I'd prefer to put the change in at top of openstack.common.memorycache that says:

if eventlet.already_patched and not patcher.is_monkey_patched('thread'):
    raise ConfigError("You have incorrectly configured eventlet")

This should inform all servers of the problems of mixing eventlet and memcache - I'm getting kind of sick of eventlet's magic and would just prefer to raise the error. Failing that, the same auto-patch that you proposed above for keystoneclient, but in the memorycache module, should be OK.

The main problem I see with that is: is it possible for a server to load the auth_token middleware prior to having called the monkey patch function? In Keystone and others that have a bin/keystone-all that manages the paste file internally it should be fine, but anyone who wanted a more pure paste deployment would load auth_token first - does that situation occur in any servers? Does it need to be a check on first memcache get that thread is patched, which is then flagged as done?

Jamie Lennox (jamielennox) wrote :

Ah, skimmed through that too quickly and didn't realize that keystoneclient-fix-bug-1282865.diff was against memorycache.

I think I prefer the ConfigError, but I can see that it may break existing servers, so I'm happy with the auto-patch approach as well.

summary: - Keystone middleware may confuse contexts
+ Keystone middleware may confuse contexts (CVE-2014-0105)

I think we need to come to an agreement on the approach to take, and then we can implement it. I think the approach is to do both:

a) change auth_token to fail if it's configured for memcached, is running in eventlet, and 'thread' isn't monkey-patched.
b) change the eventlet servers that use auth_token to monkey-patch 'thread' in addition to their other monkey-patching.
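Approach (a) could look roughly like the guard below, run when auth_token sets up its memcache client. This is a sketch under assumptions: the ConfigError name and the exact conditions are illustrative, though eventlet.patcher.is_monkey_patched is eventlet's real API. It is written so that a process without eventlet passes through untouched:

```python
class ConfigError(Exception):
    """Raised when the memcache setup is unsafe (exception name assumed)."""

def assert_memcache_safe():
    # If eventlet is not even importable, plain OS threads apply and
    # python-memcached's thread-local connections are safe as-is.
    try:
        from eventlet import patcher
    except ImportError:
        return
    # If the server monkey patched the socket layer (i.e. is plausibly
    # serving via eventlet) but left 'thread' alone, greenthreads would
    # share one memcached connection: refuse to proceed.
    if patcher.is_monkey_patched("socket") and not patcher.is_monkey_patched("thread"):
        raise ConfigError(
            "eventlet detected with 'thread' unpatched; auth_token's "
            "memcache caching could confuse request contexts")

assert_memcache_safe()
```

Using the socket patch as the "running under eventlet" signal is itself a judgment call, which is the load-order question Jamie raises below: a pure paste deployment may construct the middleware before any monkey patching has happened.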

Thierry Carrez (ttx) wrote :

@Brant: we should first estimate what services are actually impacted, see how easy it will be to fix server-side as a security patch.

My tests[1] found those services using auth_token without "thread" patched:
* Havana:
  glance-registry ({'socket': True, 'time': True})

* Icehouse
  ceilometer-api ({'socket': True})
  swift

Nova in master is also vulnerable when "--remote_debug" switches are used: "# turn off thread patching to enable the remote debugger"

[1]: http://paste.openstack.org/show/72103/

While I'm not sure my tests are comprehensive (ie shouldn't the nova vncproxy also use the auth_token middleware ?), here is an attempt at fixing glance-registry in Havana.

Thierry Carrez (ttx) on 2014-03-04
Changed in glance:
status: Invalid → Confirmed
Dolph Mathews (dolph) wrote :

The same underlying issue was independently and publicly reported against python-keystoneclient in bug 1289074.

I think a security-minded read of bug 1289074 (ie: garbage in tokens' cache) would lead to this issue.

So if you agree, I propose to open this bug and mark 1289074 as a duplicate of it. This should also help us land a fix more quickly.

gordon chung (chungg) wrote :

I haven't verified this bug in Ceilometer yet, but regarding monkey patching eventlet... that solution won't work in Ceilometer, as it causes issues (see Julien's comment in https://bugs.launchpad.net/ceilometer/+bug/1291054).

If monkey patching eventlet is the solution, we'll need to do something different in Ceilometer.

Thierry Carrez (ttx) wrote :

OK, if the patch proposed in bug 1289074 fixes this issue, we should push it in ASAP and issue an OSSA about it. We can keep this bug private (since it gives a lot more detail on exploitability) until the ossa is out.

Thierry Carrez (ttx) wrote :

Cleanroom implementation of patch at https://review.openstack.org/#/c/81078/

Changed in ossa:
status: Confirmed → Triaged
Changed in python-keystoneclient:
status: Confirmed → In Progress
Thierry Carrez (ttx) on 2014-03-19
no longer affects: ceilometer
no longer affects: glance
no longer affects: swift
Changed in python-keystoneclient:
status: In Progress → Triaged
status: Triaged → In Progress
Changed in ossa:
status: Triaged → In Progress
Dolph Mathews (dolph) wrote :

Reviewed: https://review.openstack.org/81078
Committed: https://git.openstack.org/cgit/openstack/python-keystoneclient/commit/?id=d11553a72ee8febc26e4a76ba900984d1b778f59
Submitter: Jenkins
Branch: master

commit d11553a72ee8febc26e4a76ba900984d1b778f59
Author: Dolph Mathews <email address hidden>
Date: Mon Mar 17 15:56:28 2014 -0500

    add pooling for cache references

    Change-Id: Iffb1d1bff5dc4437544a5aefef3bca0e5b17cc81
    Closes-Bug: 1289074
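The committed fix ("add pooling for cache references") replaces the single shared cache client with a pool checked out per request. The sketch below is an illustrative stdlib reimplementation of that idea, not the committed code; the class name and sizing are my own:

```python
import queue

class MemcacheClientPool:
    """Illustrative sketch of the pooling idea behind the fix: each
    request checks a client out of a FIFO queue and puts it back when
    done, so two concurrent greenthreads never read each other's
    responses off one shared connection."""
    def __init__(self, make_client, size=4):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(make_client())

    def get(self):
        # Blocks when all clients are checked out, bounding concurrency.
        return self._pool.get()

    def put(self, client):
        self._pool.put(client)

pool = MemcacheClientPool(make_client=object, size=2)
a = pool.get()
b = pool.get()
assert a is not b       # concurrent requests use distinct clients
pool.put(a)
assert pool.get() is a  # returned clients are reused, not recreated
```

Unlike the monkey-patch guard discussed earlier, pooling works whether or not 'thread' is patched, which is why it could ship as a single fix in python-keystoneclient without coordinating changes across every consuming service.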

Changed in python-keystoneclient:
status: In Progress → Fix Committed
milestone: none → 0.7.0

Thank you dolph for such a quick fix!
We are still issuing a pre-OSSA before opening this bug.

Proposed public disclosure date/time:
2014-03-27 15:00 UTC

Changed in ossa:
status: In Progress → Fix Committed
Dolph Mathews (dolph) on 2014-03-26
Changed in python-keystoneclient:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2014-03-27
information type: Private Security → Public Security
Thierry Carrez (ttx) wrote :

[OSSA 2014-007]

summary: - Keystone middleware may confuse contexts (CVE-2014-0105)
+ [OSSA 2014-007] Keystone middleware may confuse contexts (CVE-2014-0105)
Changed in ossa:
status: Fix Committed → Fix Released
Matthew Thode (prometheanfire) wrote :

Please release a patchset that matches the requirements that keystone imposes on keystoneclient

https://github.com/openstack/keystone/blob/2013.1.5/tools/pip-requires

python-keystoneclient>=0.2.1,<0.3

At this time I see no patch that fixes this issue for keystoneclient 0.2.5

Thierry Carrez (ttx) wrote :

We only provide one release channel for client libraries (no stable branches). Furthermore grizzly is now unsupported.

That said, that dep capping is a bit weird, since it prevents you from actually following that release channel. I'll see whether (1) that cap can be lifted and (2) the patch can be backported to the 0.2.x series

Dolph Mathews (dolph) wrote :

I believe the pin was just a cautionary convention, and not strictly necessary.

python-keystoneclient 0.3.0 didn't even exist (released Sun Jun 23 16:20:39 2013) when the pin was created (Sat Nov 17 14:45:18 2012) in pip requires:

  https://github.com/openstack/keystone/commit/e59360da677c4cd3f6a6391cfebb973c11e2ee47

Backporting to 0.2.5 looks to be fairly ugly, but I'm playing around with it now.

Dolph Mathews (dolph) wrote :

Even better news: I'm unable to reproduce against 0.2.5! 0.3.0 appears to be the first release that is vulnerable.

Dolph Mathews (dolph) wrote :

CORRECTION: I'm able to reproduce in 0.2.5

As it turns out, I was backporting my configuration for auth_token, which did not have the desired effect of enabling memcached in 0.2.5, due to memcache_servers being renamed to memcached_servers in 0.3.0 (I was setting memcached_servers for testing).

This patch appears to fix 0.2.5, but I haven't done any integration testing, so I can't recommend its use over 0.7.x
