Authentication is not checked before sending potentially large request bodies

Bug #1202785 reported by Alex Meade
This bug affects 3 people
Affects                      Status     Importance  Assigned to  Milestone
Glance                       Won't Fix  Undecided   Unassigned
OpenStack Security Advisory  Won't Fix  Undecided   Unassigned
keystonemiddleware           Invalid    Medium      Unassigned
python-keystoneclient        Invalid    Medium      Unassigned

Bug Description

When making an HTTP request with a body to an API that uses the keystone auth_token middleware and has no request size limiting, an unauthorized user can send a very large request that will not fail with a 401 until after all of the data has been sent. This means that anyone who can reach an API could make many requests with large bodies and not be denied until all of that data has been transferred, wasting much or all of the resources on the API node and essentially bringing it down.

This issue can be mitigated for APIs like nova by using middleware or the webserver to limit the maximum size of a request. In the case of the glance-api, however, large requests such as image uploads need to be allowed. Perhaps the auth_token middleware should look at the request headers and perform authN and authZ before accepting the request body. It is also very inefficient and time-consuming for the client to wait until all the data is sent before receiving a 401.
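
As a sketch of that idea (not the actual auth_token implementation; validate_token is a hypothetical stand-in for a token check against keystone), a WSGI filter can reject a request from its headers alone, because the headers are available in environ before anything reads wsgi.input:

# Hypothetical sketch: deny unauthenticated requests from the headers alone,
# before anything downstream reads the (potentially huge) request body.
class EarlyAuthFilter(object):
    def __init__(self, app, validate_token):
        self.app = app
        self.validate_token = validate_token  # hypothetical callable: token -> bool

    def __call__(self, environ, start_response):
        token = environ.get('HTTP_X_AUTH_TOKEN')
        if not token or not self.validate_token(token):
            # wsgi.input has not been touched, so this filter never
            # waits on the request body before answering.
            body = b'401 Unauthorized'
            start_response('401 Unauthorized',
                           [('Content-Type', 'text/plain'),
                            ('Content-Length', str(len(body)))])
            return [body]
        return self.app(environ, start_response)

Note that this only stops the application from blocking on the body; whether the connection is then closed or the remaining body is drained is still up to the WSGI server and the client.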

I am not sure of the level of impact this could have for most deployers and the different APIs.

Here is an example of requests to glance on devstack with a bad token, and their times to complete. The nova-api on devstack also accepted large bodies before returning a 401.

1 Meg Image

[ameade@ameade-dev:~]
[17:30:16] $ time glance --debug --os-auth-token 'gah' image-create --name test <1meg.img
curl -i -X POST -H 'Transfer-Encoding: chunked' -H 'User-Agent: python-glanceclient' -H 'x-image-meta-size: 1048576' -H 'x-image-meta-is_public: False' -H 'X-Auth-Token: gah' -H 'Content-Type: application/octet-stream' -H 'x-image-meta-name: test' -d '<open file '<stdin>', mode 'r' at 0x7f8d762bd150>' http://50.56.173.46:9292/v1/images

HTTP/1.1 401 Unauthorized
date: Thu, 18 Jul 2013 17:30:30 GMT
content-length: 253
content-type: text/plain; charset=UTF-8

401 Unauthorized

This server could not verify that you are authorized to access the document you requested. Either you supplied the wrong credentials (e.g., bad password), or your browser does not understand how to supply the credentials required.

Request returned failure status.
Invalid OpenStack Identity credentials.

real 0m0.766s
user 0m0.312s
sys 0m0.164s

100 Meg Image

[ameade@ameade-dev:~]
[17:31:35] $ time glance --debug --os-auth-token 'gah' image-create --name test <100meg.img
curl -i -X POST -H 'Transfer-Encoding: chunked' -H 'User-Agent: python-glanceclient' -H 'x-image-meta-size: 104857600' -H 'x-image-meta-is_public: False' -H 'X-Auth-Token: gah' -H 'Content-Type: application/octet-stream' -H 'x-image-meta-name: test' -d '<open file '<stdin>', mode 'r' at 0x7f6af9768150>' http://50.56.173.46:9292/v1/images

HTTP/1.1 401 Unauthorized
date: Thu, 18 Jul 2013 17:31:40 GMT
content-length: 253
content-type: text/plain; charset=UTF-8

401 Unauthorized

This server could not verify that you are authorized to access the document you requested. Either you supplied the wrong credentials (e.g., bad password), or your browser does not understand how to supply the credentials required.

Request returned failure status.
Invalid OpenStack Identity credentials.

real 0m1.441s
user 0m0.420s
sys 0m0.344s

10 Gig Image

[ameade@ameade-dev:~]
[17:16:23] 1 $ time glance --debug --os-auth-token 'gah' image-create --name test <10g.img
curl -i -X POST -H 'Transfer-Encoding: chunked' -H 'User-Agent: python-glanceclient' -H 'x-image-meta-size: 10000000000' -H 'x-image-meta-is_public: False' -H 'X-Auth-Token: gah' -H 'Content-Type: application/octet-stream' -H 'x-image-meta-name: test' -d '<open file '<stdin>', mode 'r' at 0x7f768c151150>' http://50.56.173.46:9292/v1/images

HTTP/1.1 401 Unauthorized
date: Thu, 18 Jul 2013 17:16:28 GMT
content-length: 253
content-type: text/plain; charset=UTF-8

401 Unauthorized

This server could not verify that you are authorized to access the document you requested. Either you supplied the wrong credentials (e.g., bad password), or your browser does not understand how to supply the credentials required.

Request returned failure status.
Invalid OpenStack Identity credentials.

real 0m56.082s
user 0m6.308s
sys 0m17.669s

Revision history for this message
Thierry Carrez (ttx) wrote :

I agree that public glance-api servers present a unique challenge here. Not totally convinced we can address that as a security fix though -- it might be OSSN territory...

We should discuss on the private bug how we would fix that first, and then decide in which bucket it falls.

Changed in ossa:
status: New → Incomplete
Revision history for this message
Thierry Carrez (ttx) wrote :

Adding Keystone and Glance PTLs to discuss this further

Revision history for this message
Dolph Mathews (dolph) wrote :

I'm not clear on the specifics of WSGI here, but if it's possible to read the request headers before the entire chunked request is sent (I'd bet someone who works on glance would know better than I do), then a fix should be possible in keystoneclient.middleware.auth_token. That would also be dependent on auth_token appearing first in the pipeline (or at least on nothing attempting to read the request body first, etc.).

Changed in python-keystoneclient:
importance: Undecided → Medium
status: New → Triaged
Revision history for this message
Thierry Carrez (ttx) wrote :

markwash, any comment ?

Revision history for this message
Mark Washenberger (markwash) wrote :

Fantastic detective work, Alex!

It looks like there are few if any barriers to making this work in python-keystoneclient and in glance. The only middleware ahead of authentication in the chain in glance is version negotiation, which seems to only care about path info and headers.

I'm actually somewhat surprised that this behavior occurs at all, however. So I think some more research is going to be necessary to see how far we get in the code before we start waiting on the whole request body.

Changed in glance:
status: New → Confirmed
status: Confirmed → Triaged
Revision history for this message
Thierry Carrez (ttx) wrote :

This is very much like "HTTP POST limiting advised to avoid Essex/Folsom Keystone DoS" OSSN -- we need to accept data so we are vulnerable to classic, basic DoS techniques.

Let me CC Rob Clark from the OSSG for more input on the "flawed by design vs. security bug" debate.

Revision history for this message
Dolph Mathews (dolph) wrote :

We're probably also susceptible to slowloris attacks for the same reason -- waiting on complete request bodies.

  http://en.wikipedia.org/wiki/Slowloris

Revision history for this message
Thierry Carrez (ttx) wrote :

Issue independently reported by Harri Hämäläinen in bug 1217857

Revision history for this message
Robert Clark (robert-clark) wrote :

So the major issue here that I think (iirc) differentiates this from similar issues in the past is that the requests are unauthenticated and there's no elegant way to rate/size limit these unauthed requests. To that end it's somewhat different to the previous Keystone DoS issue.

Regarding slowloris, I can't see how common controls like deploying Varnish would be of any use... No obvious solution springs to mind. I suppose the severity of the Glance API going down is less than that of, say, Keystone, but some deployers may have SLAs around image upload etc., and this issue certainly appears to have the capability to DoS the Glance API without too much effort and in an unauthenticated way.

I don't think the fix for Keystone would work here unless you had a way to verify tokens from Apache/Nginx (or whatever your FE proxy is), because, as already pointed out in the bug report, Glance needs to accept big uploads.

This feels like a real security bug to me.

Revision history for this message
Thierry Carrez (ttx) wrote :

Agreed... I just don't see an easy way out, so this might require a specific feature to fix (a bit like the console log DoS in Nova, which has been around since forever).

Revision history for this message
Harri Hämäläinen (hhamalai) wrote :

Any news on this issue? I'm still seeing this as a real and critical problem, as an unauthorized user can, for example, temporarily reserve all the disk space on the glance host. With some system setups this is definitely going to cause issues for system services that are trying to write to a full disk.

Revision history for this message
Thierry Carrez (ttx) wrote :

@markwash: did you make progress on investigating the options we have to fix this issue?

Revision history for this message
Mark Washenberger (markwash) wrote :

No progress yet.

Revision history for this message
Thierry Carrez (ttx) wrote :

@markwash: if this is fixable (and backportable) it would be good to have an answer for the icehouse release. If it's more destructive and needs a design overhaul, it would be great to have it on the table for the Juno summit. In both cases we need to make progress on researching potential solutions for this issue now...

Revision history for this message
Stuart McLaren (stuart-mclaren) wrote :

In case it's useful for comparison, Swift doesn't seem to exhibit this behaviour.

It returns a 401 without reading all the input data.

Revision history for this message
Stuart McLaren (stuart-mclaren) wrote :

Swift seems to send an "HTTP/1.1 100 Continue" after authentication, indicating that the client should start to upload data.
When auth fails it sends a 401 instead.

Revision history for this message
Stuart McLaren (stuart-mclaren) wrote :

If the glance client included this header:

Expect: 100-continue

then you get a 401 straight away.

While not a real fix, it would be very easy to put this in the glance client to at least reduce the chances of this happening by accident.
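
For illustration, here is a rough sketch of that handshake using raw sockets (hypothetical endpoint, bogus token, and assuming the server honours Expect: 100-continue the way Swift appears to):

import socket

# Hypothetical endpoint and bogus token, for illustration only.
sock = socket.create_connection(('glance.example.com', 9292))
sock.sendall(
    b'POST /v1/images HTTP/1.1\r\n'
    b'Host: glance.example.com\r\n'
    b'X-Auth-Token: gah\r\n'
    b'Content-Type: application/octet-stream\r\n'
    b'Transfer-Encoding: chunked\r\n'
    b'Expect: 100-continue\r\n'
    b'\r\n'
)
# No body bytes have been sent yet; wait for the interim response.
reply = sock.recv(4096).decode('ascii', 'replace')
if reply.startswith('HTTP/1.1 100'):
    pass  # auth passed: start streaming the chunked body here
else:
    # e.g. 'HTTP/1.1 401 Unauthorized', received before sending any body
    print(reply.splitlines()[0] if reply else 'connection closed without a response')
sock.close()

With curl this corresponds to leaving its default Expect: 100-continue behaviour enabled rather than suppressing it with -H 'Expect:', as in the Swift example quoted later in this thread.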

Revision history for this message
Thierry Carrez (ttx) wrote :

@Stuart: yes, this bug is slightly different from the Swift one, and a few details make it fall on the "vulnerable" side of the fuzzy line I mentioned there: you can trigger this without being authenticated, and it's actually hard to mitigate the issue at the proxy level (due to the nature of Glance, uploading massive images is a valid use case for /some/ users).

Revision history for this message
Stuart McLaren (stuart-mclaren) wrote :

Also, Swift does seem to be vulnerable to this as well if you remove the 'Expect: 100-continue' header.
(I entered https://bugs.launchpad.net/swift/+bug/1284669)

So we now have four bugs: 401/403 cases for both Glance and Swift.

I've kept the 401/403 cases separate, as a fix for one may not fix the other case (depending on how it's implemented). If that's too much bug proliferation, we can close a couple with the caveat that any fix should be tested for both the 401 and 403 cases.

Revision history for this message
Thierry Carrez (ttx) wrote :

Stuart McLaren reported Swift as also affected by this:

"""
The trick here is to add the -H 'Expect:' header:

$ cat /mnt/ubuntu/dd.7000 | curl -i -X PUT -T - -H 'x-auth-token: bogus-token' https://swift.example.com:443/v1/AUTH_XXX/tmp/x -v -H 'Expect:' >/dev/null

[sends 7GB of data then returns 401]
"""

Revision history for this message
Thierry Carrez (ttx) wrote :

So in both cases it's an unauthenticated DoS vector, without amplification.

The main difference between Swift and Glance is that proxies may filter on request size for Swift (to match the 4Gb max request), while it's more difficult to do on Glance servers.

Given the lack of amplification and the possibility to proxy-filter on request size for Swift, I'd be inclined to consider the attack vector on Swift as shallow and not consider this a vulnerability there... but I could be convinced otherwise.

Revision history for this message
John Dickinson (notmyname) wrote :

On the Swift side, there are two separate claims:

- Without using the Expect semantics, a client will not get a 4xx response until after the body has been transferred to the server. This makes sense and is known and expected behavior. Without using the "early response" semantics provided in the RFC (i.e. Expect), the client will continue to send data before looking for a response. This isn't a security issue; it's just a matter of handling inefficient clients.

- When the client has uploaded more data than is allowed in a single object, what should Swift do? Immediately terminate the connection or politely allow the client to finish? Of course, there are edge cases (e.g. chunked vs. not chunked). This is a separate issue from when to return a response code, and it needs to be tracked and discussed independently. There is a bug at https://bugs.launchpad.net/swift/+bug/1284254 for this.

As such, I'm removing Swift from this particular bug.

no longer affects: swift
Revision history for this message
Nikhil Komawar (nikhil-komawar) wrote :

I think we found a similar issue when uploading large chunks of data to Glance (possibly to a non-queued image): the check [1] failed only after a huge portion of the data had already been uploaded from the client to the g-api.

P.S. Added John and Erno for any more possible insights, recommendations and reviews for the fix.

[1] https://github.com/openstack/glance/blob/master/glance/api/v1/images.py#L917

Revision history for this message
Tristan Cacqueray (tristan-cacqueray) wrote :

I couldn't reproduce this against 2014.1.3; the server immediately closes the connection with a 401 without consuming many resources...

Is this bug still present?

Revision history for this message
Thierry Carrez (ttx) wrote :

@glance-coresec: could you confirm that the issue has disappeared?

Revision history for this message
Morgan Fainberg (mdrnstm) wrote :

At this point, please confirm this is still the case with keystonemiddleware. The middleware provided in keystoneclient has not been the focus of development (and in fact it is hardly tested at this point, since everything has moved over to the modules provided in the keystonemiddleware package).

I'm marking this as invalid against keystoneclient and as incomplete against keystonemiddleware, with the expectation that if it is still occurring, the glance-coresec or VMT team will update the bug for keystonemiddleware.

Revision history for this message
Morgan Fainberg (mdrnstm) wrote :

In addition to my last comment: glance should be using keystonemiddleware as of Juno.

Changed in python-keystoneclient:
status: Triaged → Invalid
Changed in keystonemiddleware:
importance: Undecided → Medium
status: New → Incomplete
Revision history for this message
Tristan Cacqueray (tristan-cacqueray) wrote :

This bug does not seem to affect any supported stable release (>= Icehouse).

Does anyone object if we open this bug and mark it as Won't Fix?

Revision history for this message
Nikhil Komawar (nikhil-komawar) wrote :

Tristan, I think that would be acceptable.

I can double check on my environment today and get back to you. Please feel free to open this tomorrow otherwise.

Revision history for this message
Nikhil Komawar (nikhil-komawar) wrote :

Confirmed, this is not an issue. Please feel free to open it. Thanks again.

Revision history for this message
Tristan Cacqueray (tristan-cacqueray) wrote :

Alright, it's time to close this old bug. Nikhil, you might want to close the Glance task too.

Changed in ossa:
status: Incomplete → Won't Fix
information type: Private Security → Public
Changed in glance:
status: Triaged → Confirmed
status: Confirmed → Won't Fix
importance: High → Undecided
Jeremy Stanley (fungi)
Changed in keystonemiddleware:
status: Incomplete → Invalid