OpenStack Identity (Keystone)

keystone has no limitation for requests and headers size which may cause DB or process crash

Reported by Yaguang Tang on 2013-01-10
This bug affects 1 person
Affects: Keystone
Importance: Undecided
Assigned to: Yaguang Tang

Bug Description

Concurrent requests with large POST bodies can crash the keystone process.

This can be exploited by malicious users and lead to a DoS against the cloud service provider.

CVE References

Yaguang Tang (heut2008) on 2013-01-10
Changed in keystone:
assignee: nobody → Yaguang Tang (heut2008)
Russell Bryant (russellb) wrote :

There is very little detail here. Can you provide some more information? Do you have something to reproduce the problem you see? Can you provide more information about what happens in the "crash" ?

Changed in keystone:
status: New → Incomplete
Yaguang Tang (heut2008) wrote :

Concurrent requests with large POST data can lead to the keystone process being killed by the kernel due to running out of memory.

Thierry Carrez (ttx) wrote :

We still need more precision. Are all types of requests affected ? Or only authenticated requests ? Any particular example we should use to reproduce and confirm the fix ? thanks in advance...

Adding PTL for confirmation and Dan Prince who works on a similar issue.

Adam Young (ayoung) wrote :
Yaguang Tang (heut2008) wrote :

This isn't exactly the same as https://bugs.launchpad.net/keystone/+bug/1098307. The whole HTTP POST body is stored in memory before it is validated, so concurrent requests with large POST data can consume all of the physical server's memory and result in the process being killed by the kernel.

A simple way to test this with a cgroup:

1. mkdir /sys/fs/cgroup/memory/keystone
   echo 100M > /sys/fs/cgroup/memory/keystone/memory.limit_in_bytes (max memory is 100M for the keystone process)

2. echo $$ > /sys/fs/cgroup/memory/keystone/tasks && keystone-all

3. Send an HTTP POST request with data larger than 100M.

4. Check the dmesg output: the keystone process is killed due to running out of memory.

Yaguang Tang (heut2008) on 2013-02-22
description: updated
Thierry Carrez (ttx) wrote :

Adding all keystone-core for analysis whether the current patches are enough or not.

Yaguang Tang (heut2008) wrote :

This has been fixed in trunk by adding a sizelimit middleware, but it has not been backported to stable/folsom. I have submitted a patch to stable/folsom: https://review.openstack.org/#/c/22661/

Also, I think this affects quantum as well, but that needs testing. I will update later.
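
For readers unfamiliar with the approach, the following is a minimal sketch of what a request-body size limit in WSGI middleware can look like. It is not Keystone's actual sizelimit middleware: the names BodySizeLimitMiddleware, LimitedReader, BodyTooLarge, demo_app and call, as well as the 114688-byte default, are assumptions made for illustration, and only the Python standard library is used.

```python
import io

# Hypothetical default limit in bytes; a real deployment would make this configurable.
MAX_REQUEST_BODY_SIZE = 114688


class BodyTooLarge(Exception):
    """Raised when more than the allowed number of body bytes is read."""


class LimitedReader:
    """Wrap a body stream and refuse to return more than `limit` bytes in total."""

    def __init__(self, stream, limit):
        self._stream = stream
        self._limit = limit
        self._bytes_read = 0

    def read(self, size=-1):
        remaining = self._limit - self._bytes_read
        # Never request more than one byte past the limit, so an oversized
        # body is detected without buffering all of it.
        if size is None or size < 0 or size > remaining + 1:
            size = remaining + 1
        data = self._stream.read(size)
        self._bytes_read += len(data)
        if self._bytes_read > self._limit:
            raise BodyTooLarge()
        return data


class BodySizeLimitMiddleware:
    """WSGI middleware that answers 413 for oversized request bodies."""

    def __init__(self, app, max_size=MAX_REQUEST_BODY_SIZE):
        self.app = app
        self.max_size = max_size

    def _reject(self, start_response):
        start_response("413 Request Entity Too Large",
                       [("Content-Type", "text/plain")])
        return [b"Request body too large\n"]

    def __call__(self, environ, start_response):
        # Cheap check first: use Content-Length only to reject early.
        try:
            declared = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            declared = 0
        if declared > self.max_size:
            return self._reject(start_response)
        # Content-Length may be missing or wrong, so also cap the actual read.
        environ["wsgi.input"] = LimitedReader(environ["wsgi.input"], self.max_size)
        try:
            return self.app(environ, start_response)
        except BodyTooLarge:
            return self._reject(start_response)


def demo_app(environ, start_response):
    """Toy application that reads the whole body before responding."""
    body = environ["wsgi.input"].read()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def call(app, body, declared_length=None):
    """Invoke a WSGI app with an in-memory body and return (status, response bytes)."""
    length = len(body) if declared_length is None else declared_length
    environ = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(length),
               "wsgi.input": io.BytesIO(body)}
    seen = []
    chunks = app(environ, lambda status, headers: seen.append(status))
    return seen[0], b"".join(chunks)
```

Wrapping an application as BodySizeLimitMiddleware(demo_app, max_size=10) and passing it to call() with an 11-byte body yields a 413 status instead of buffering the body. The sketch only converts BodyTooLarge into a 413 when the application reads the body before returning; an application that reads lazily while streaming its response would need its own handling.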

Thierry Carrez (ttx) wrote :

Backporting the sizelimit middleware is not an option for a security patch on a stable release. We need to address that with a specific patch addressing the specific issue.

keystone-core: can you confirm that we're still vulnerable, based on comment 5 ?

Dolph Mathews (dolph) wrote :

dprince's patch should have addressed this in folsom for the public API. I'd acknowledge that the admin API remains vulnerable.

Backporting the middleware is inconsequential without corresponding changes to keystone.conf to deploy the new middleware; we could alternatively "overload" the xml_body middleware (already deployed in every pipeline by default, and probably the first middleware to be affected by an oversized request) to also validate the size of the request in stable/folsom and stable/essex -- regardless of whether it's an XML request or not.

Dolph Mathews (dolph) wrote :

Correction- public API is vulnerable if non-contract attributes are provided in the request.

Thierry Carrez (ttx) wrote :

I kinda like the idea of overloading the xml_body middleware, as it's a bit less intrusive than asking everyone to update their config files.

Maybe we should wrap up the sizelimit middleware for grizzly into that too, for consistency ?
@Dan, opinions ?

Dan Prince (dan-prince) wrote :

The issue we are talking about isn't necessarily XML only, correct? The initial bug description seems to describe large requests in general as being a potential problem.

I'm also a bit confused here as to what exactly we are trying to solve. Isn't this the same issue we fixed in:

 https://review.openstack.org/#/c/19567/

Like ttx points out we sort of decided not to backport that to Folsom since it fell under the "new feature" umbrella. That said there is nothing preventing a distro from picking up that patch for extra protection.

-----

Also the "keystone has no limitation for requests and headers size which may cause DB or process crash" description on this bug could be a bit misleading I think.

Eventlet's WSGI defaults should cover us in terms of checking max header size. Eventlet's wsgi.py shows:

 MAX_HEADER_LINE = 8192
 MAX_TOTAL_HEADER_SIZE = 65536
 url_length_limit=MAX_REQUEST_LINE (which is set to 8192 by default...)
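
To make these limits concrete, here is a rough sketch that applies the two header values quoted above to a list of raw header lines. It is an illustration only, not Eventlet's code: the function name headers_within_limits is hypothetical, and Eventlet's exact boundary behavior may differ. Note that neither limit looks at the request body.

```python
# Values copied from the Eventlet defaults quoted above.
MAX_HEADER_LINE = 8192
MAX_TOTAL_HEADER_SIZE = 65536


def headers_within_limits(header_lines,
                          max_line=MAX_HEADER_LINE,
                          max_total=MAX_TOTAL_HEADER_SIZE):
    """Return True if every header line and the header block as a whole fit.

    `header_lines` is an iterable of raw header lines as bytes.
    The request body is deliberately not inspected here.
    """
    total = 0
    for line in header_lines:
        # Per-line cap: a single oversized header is rejected on its own.
        if len(line) > max_line:
            return False
        # Cumulative cap: many medium-sized headers are rejected together.
        total += len(line)
        if total > max_total:
            return False
    return True
```

A request with nine 8000-byte header lines passes the per-line check but fails the total check, while a request with small headers and a very large POST body passes both, which is the gap a body size limit has to close.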

----

My main question here: is there some new ground or request type that upstream grizzly isn't protected from? Or is this just something we need to backport to Folsom?

Dolph Mathews (dolph) wrote :

dan: I think this is just an issue of backporting; I believe grizzly is sufficiently protected.

As for xml_body, I'm not suggesting that XML is somehow special here, just that it's the first place in the folsom pipeline that could be affected by an overly large request, and happens to (arguably) be a logical place to backport the functionality of RequestBodySizeLimiter without requiring pipeline changes.

Dan Prince (dan-prince) wrote :

dolph: I've got an idea. I think we might be able to inject the grizzly sizelimit middleware into the keystone pipeline without making config changes to the keystone.conf file. A bit hackish but if I can get it working (and you guys buy it) then I think we might have a path forward here.

I sort of feel like the cat is out of the bag on this issue (so why handle this privately) but whatever.

Thierry Carrez (ttx) wrote :

I'm fine with considering that this should be addressed by external protection layers in Essex/Folsom (either a backport of the sizelimit middleware or some early proxy) and be done with it. It's a bit unlikely that any serious setup wouldn't have some loadbalancing in place that would reject RAM-eating requests anyway... Not sure it's a lot worse than some hack to insert it in a stable branch update.

We could even ask the OSSG to do a security note about it if they feel it's worth it.
Russell, Mikal: your take ?

About WSGI default limits: I think Yaguang was mentioning POST requests, which are not covered by limits ? If they are, most of this bug is moot.

Russell Bryant (russellb) wrote :

I suppose there's also the option of backporting the middleware and leaving it up to the deployment to enable it if they would like to. That could be done along with noting that we expect larger deployments are already protected by other means.

Michael Still (mikalstill) wrote :

@ttx -- I think it's reasonable to realize that we are occasionally going to encounter security problems where there isn't a good fix for stable options. We're offering users a few options here: upgrade; middleware; or some sort of protecting proxy. I think that's a perfectly reasonable response.

Russell Bryant (russellb) wrote :

If backporting the middleware is not acceptable for stable, then I agree that we should just punt on this for stable as Thierry described in comment #15.

Thierry Carrez (ttx) wrote :

My favorite option now is to get the OSSG to issue a security note basically saying: "you should filter size as upstream as you can, there are smart LB/proxies, there is the sizelimit middleware in Grizzly (if you want it for Folsom you can find it here)."

Dolph Mathews (dolph) wrote :

+1 for comment #19; issue is obviously not unique to keystone and it's a one line fix in nginx & apache, for example.

Thierry Carrez (ttx) wrote :

OK, unless someone complains, I'll open up the issue and attract the attention of the OSSG on it.

Robert Clark (robert-clark) wrote :

We're happy to help out with a Security note explaining the problem and detailing a couple of 'best-practice' ways to fix the problem.

Draft started: https://bugs.launchpad.net/osn/+bug/1155566

Thierry Carrez (ttx) on 2013-03-18
information type: Private Security → Public
tags: added: security
Kurt Seifried (kseifried) wrote :

Basically an attacker can use this to cause remote crashes, so DoS. Please use CVE-2013-2014 for this issue.
