Timing oracle in core auth plugin simplifies brute-forcing usernames

Bug #1795800 reported by Andy Ngo on 2018-10-03
This bug affects 1 person
Affects                          Importance   Assigned to
OpenStack Identity (keystone)    Wishlist     Gage Hugo
OpenStack Security Advisory      Undecided    Unassigned

Bug Description

The response times for POST /v3/auth/tokens are significantly higher for valid usernames compared to those of invalid ones, making it possible to enumerate users on the system.

Examples:

# For invalid username
+ Request
POST /v3/auth/tokens HTTP/1.1
Host: hostname:5000
Connection: close
Content-Length: 141
Content-Type: application/json

{
   "auth":{
      "identity":{
         "methods":[
            "password"
         ],
         "password":{
            "user":{
               "name":"nonexisting",
               "domain":{
                  "name":"Default"
               },
               "password":"devstacker"
            }
         }
      }
   }
}

+ Response time: <150ms

# For valid username ('admin' in this case)
+ Request
POST /v3/auth/tokens HTTP/1.1
Host: hostname:5000
Connection: close
Content-Length: 139
Content-Type: application/json

{
   "auth":{
      "identity":{
         "methods":[
            "password"
         ],
         "password":{
            "user":{
               "name":"admin",
               "domain":{
                  "name":"Default"
               },
               "password":"devstacker"
            }
         }
      }
   }
}

+ Response time: >600ms

# Tested version
v3.8
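
The measurement above can be reproduced with a short script. What follows is a minimal sketch, not the reporter's tooling: the helper names `build_auth_body` and `time_auth_request` and the `timeout` default are assumptions made for illustration. Only the payload shape and the POST /v3/auth/tokens path come from the report.

```python
import json
import time
import urllib.error
import urllib.request


def build_auth_body(username, password, domain="Default"):
    """Return the password-authentication body used in the report."""
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,
                        "domain": {"name": domain},
                        "password": password,
                    }
                },
            }
        }
    }


def time_auth_request(base_url, username, password, domain="Default", timeout=10):
    """Send one POST /v3/auth/tokens and return the elapsed time in seconds."""
    data = json.dumps(build_auth_body(username, password, domain)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v3/auth/tokens",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            response.read()
    except urllib.error.HTTPError as error:
        # A 401 is the expected result for a wrong guess; it is still timed.
        error.read()
    return time.perf_counter() - start
```

For example, `time_auth_request("http://hostname:5000", "admin", "devstacker")` times a single attempt. One sample is noisy, so repeated measurements per username are needed before drawing conclusions, and this should only be run against a deployment you are authorized to test.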

[UPDATE 3 Oct 2018 5:01 AEST]
Looks like it's also possible to enumerate valid "domain" values. There are 2 ways that I can see:
* With valid username: use the above user enum bug to guess the valid username, then brute-force the "domain" parameter. Response times are significantly higher for valid compared to invalid domains.
* Without valid username: get a baseline response time using an invalid username AND an invalid domain name. Brute-force the "domain" param until the average response time rises above that baseline. For me invalid domains fall in the 90-100ms range whereas valid ones show 100+ms. This one is a bit more obscure, i.e. the timing difference is not as distinguishable, but it should still be recognizable with a good sample size.
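
The baseline comparison described in the second bullet could be sketched as follows. This is an illustration under assumptions, not the reporter's method: the function name `exceeds_baseline` and the 5 ms default gap are hypothetical, and the median is used here only because it tolerates the occasional slow outlier better than the mean.

```python
import statistics


def exceeds_baseline(candidate_ms, baseline_ms, min_gap_ms=5.0):
    """Return True if the candidate's median latency is clearly above baseline.

    candidate_ms: response times (ms) for the domain being tested.
    baseline_ms:  response times (ms) for a known-invalid username and domain.
    min_gap_ms:   assumed threshold; it would need tuning per deployment.
    """
    gap = statistics.median(candidate_ms) - statistics.median(baseline_ms)
    return gap >= min_gap_ms
```

With a small gap such as the 90-100ms versus 100+ms ranges quoted above, a larger number of samples per candidate is what makes the medians stable enough to compare.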

Andy Ngo (andyngo) on 2018-10-03
description: updated
Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete security advisory task has been added while the core security reviewers for the affected project or projects confirm the bug and discuss the scope of any vulnerability along with potential solutions.

description: updated
Changed in ossa:
status: New → Incomplete
Gage Hugo (gagehugo) wrote :

I can attempt to recreate this. What environment did you use to test it? Looks like devstack?

Andy Ngo (andyngo) wrote :

I tested this in a test env we have here in our company on Identity 3.8. The POST payload was from the API https://developer.openstack.org/api-ref/identity/v3/#password-authentication-with-unscoped-authorization

Morgan Fainberg (mdrnstm) wrote :

There are a number of things here that make it difficult to solve:

1) These types of issues are much harder to solve in high-level languages (e.g. Python)

2) There is high variance on response times based upon hardware, caching, etc.

3) We already get massive complaints when auth times go up by even a minute amount per request (a 250ms increase generates these)

While I agree that there is some level of "it would be nice to mitigate guessing of usernames", I think this is going to be a Class D[0] bug and we should be able to open this up to the public and generate discussion on acceptable solutions (fixed minimum auth times, random sleeps with fixed minimums, samples of real auth responses, a tuneable to set a minimum + sleep, etc).

[0] https://security.openstack.org/vmt-process.html#incident-report-taxonomy
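
Two of the options listed above, a fixed minimum auth time and a random sleep on top of a fixed minimum, could look roughly like the sketch below. It is an assumption-laden illustration, not Keystone code: the decorator name `minimum_duration` and its parameters are hypothetical.

```python
import functools
import random
import time


def minimum_duration(min_seconds, max_jitter=0.0):
    """Pad every call so it takes at least min_seconds plus random jitter."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            # Choose the target up front so the jitter does not depend on outcome.
            target = min_seconds + random.uniform(0.0, max_jitter)
            try:
                return func(*args, **kwargs)
            finally:
                # Pad both successful returns and raised exceptions.
                remaining = target - (time.monotonic() - start)
                if remaining > 0:
                    time.sleep(remaining)

        return wrapper

    return decorator
```

Padding only hides differences that fit under the minimum: any code path slower than `min_seconds` still stands out, and every faster request pays the added latency, which is the trade-off raised in point 3.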

Jeremy Stanley (fungi) wrote :

My opinions mirror those expressed by Damien Miller in his oss-security ML followup[*] about CVE-2018-15473 for a similar report in OpenSSH. To summarize, it isn't actually a user enumeration bug, it's a timing oracle which can lead to user enumeration via brute-force mechanisms, and there's a wide gulf of criticality between the two. The usual mitigations against brute-force attacks apply here.

As he noted and Morgan also stated above, it's really impractical to eliminate these sorts of oracles and most ways we might attempt to accomplish that are also likely to introduce noticeable performance regressions. I know it's been a modern shift to start considering usernames sensitive data, but strong passwords/keys should be the focus for protecting authentication so any system design which is weakened by username disclosure is already severely flawed.

If there are ways to improve or mitigate this particular situation in software then I'm not against discussing them, but I agree it's not necessary to do so under embargo.

[*] https://www.openwall.com/lists/oss-security/2018/08/24/1

Gage Hugo (gagehugo) wrote :

Tested this with a containerized OpenStack deployment that runs pretty fast: authenticating successfully versus providing a non-existent username produced a difference of ~0.020 seconds.

I agree with both Morgan and Jeremy: there likely isn't a very straightforward method for fixing this, as it greatly depends on the overall setup. Deployment (VM/container), caching mechanisms, and hardware will all factor into this. Making this public and having a discussion on it is likely the preferred method for tackling this issue.

Jeremy Stanley (fungi) wrote :

Thanks, I've gone ahead and triaged this as a hardening opportunity for now.

information type: Private Security → Public
Changed in ossa:
status: Incomplete → Won't Fix
tags: added: security
Morgan Fainberg (mdrnstm) wrote :

I don't know how we'll address this. Realistically, I think this is going to have to be marked as invalid/won't fix/opinion. I'm going to mark it as won't fix; we can circle back on it if there is more discussion to be had.

Changed in keystone:
status: New → Won't Fix
Colleen Murphy (krinkle) wrote :

I disagree that this is too hard to fix. In fact I'm fairly sure I found it already:

http://git.openstack.org/cgit/openstack/keystone/tree/keystone/auth/plugins/core.py?h=stable/pike#n174

We do a user lookup long before bothering to try to validate the password. The fix is to continue to go through the motions of trying to validate the password while keeping track of the fact that the user is already unauthorized. This is independent of hardware and caching mechanisms.

The performance slowdown would only occur for invalid authentication, not for valid ones, so I think it's an acceptable hit.
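
The idea of going through the motions for an unknown user can be sketched as below. This is not the Keystone implementation: the `authenticate` function, the in-memory `users` mapping, and the use of PBKDF2 from the standard library (standing in for whatever hashing a deployment actually uses) are all assumptions for illustration.

```python
import hashlib
import hmac
import os

_ITERATIONS = 50_000  # assumed work factor, for illustration only


def hash_password(password, salt):
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, _ITERATIONS)


# Throwaway credentials that are verified against when the user is unknown,
# so the expensive hashing step runs on every request.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = hash_password("dummy-password", _DUMMY_SALT)


def authenticate(users, username, password):
    """Return True only for a known user with a matching password.

    users maps a username to a (salt, password_hash) tuple.
    """
    record = users.get(username)
    user_found = record is not None
    salt, expected = record if user_found else (_DUMMY_SALT, _DUMMY_HASH)
    candidate = hash_password(password, salt)
    matches = hmac.compare_digest(candidate, expected)
    # Remember that the user was missing, but decide only after hashing.
    return user_found and matches
```

As the comment notes, the extra hashing cost falls on failed lookups only. Work done after a successful check, such as issuing a token, is outside this sketch and could still leave a measurable difference.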

Changed in keystone:
status: Won't Fix → Triaged
importance: Undecided → Wishlist
Andy Ngo (andyngo) wrote :

Regardless of the "to fix or not to fix" question, can we please start the process of filing this bug with MITRE and get a CVE assigned for tracking?

Perhaps we should consider disclosing this issue to the public via an official channel e.g. OpenStack maintainers?

I think we have all agreed that this is indeed an information disclosure issue. The question of how easy it is to fix should not prevent us from carrying out our duty of care i.e. properly disclosing to keystone users.

After all, security issues mean different things to different organisations and, along with that, carry different severity.

Jeremy Stanley (fungi) wrote :

The OpenStack Vulnerability Management Team only requests CVE assignments to track vulnerabilities corresponding to fixes they're announcing via OpenStack Security Advisory publications, but anyone can request a CVE assignment from MITRE for any issue they'd like to track (or assign one themselves directly if they're a CNA). Please don't let the ongoing conversation stop you from obtaining a CVE for this particular bug if you want one, but please note within the bug report if you do so in order that we might avoid future duplication.

Technically this concern is already disclosed to the public by nature of being a public bug report (I just now noticed I failed to clean up the embargo preamble in the description when making it public in October, but have gone ahead and taken care of it just now). Further, as it's tagged "security" all comments are also being copied to the public openstack-security mailing list (I have no idea what "OpenStack Maintainers" is in your last comment): http://lists.openstack.org/pipermail/openstack-security/2018-December/

description: updated
Andy Ngo (andyngo) wrote :

CVE-2018-20170 assigned

Jeremy Stanley (fungi) on 2018-12-17
summary: - Username enumeration via response timing difference
+ Timing oracle in core auth plugin simplifies brute-forcing usernames
Jeremy Stanley (fungi) wrote :

I've updated the bug title to more accurately indicate this is a timing oracle in Keystone's core auth plugin, and so is still mitigated by the usual account brute-forcing defenses (e.g., enforcing strong authentication secrets, temporarily rejecting failing authentication attempts per source IP address, throttling calls to relevant API methods, et cetera).
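
One of the defenses named above, temporarily rejecting failing authentication attempts per source IP address, could be sketched like this. The class name `FailureThrottle`, its default limits, and the injectable `clock` are hypothetical choices for illustration; a real deployment would more likely enforce this in a proxy or middleware in front of Keystone.

```python
import time
from collections import defaultdict, deque


class FailureThrottle:
    """Block a source IP after too many failures inside a sliding window."""

    def __init__(self, max_failures=5, window_seconds=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._clock = clock
        self._failures = defaultdict(deque)

    def _recent(self, source_ip):
        """Drop failures older than the window and return the remaining ones."""
        cutoff = self._clock() - self.window_seconds
        attempts = self._failures[source_ip]
        while attempts and attempts[0] <= cutoff:
            attempts.popleft()
        return attempts

    def record_failure(self, source_ip):
        self._recent(source_ip).append(self._clock())

    def is_blocked(self, source_ip):
        return len(self._recent(source_ip)) >= self.max_failures
```

A caller would check `is_blocked(ip)` before attempting authentication and call `record_failure(ip)` on each rejected attempt. Because a timing oracle needs many samples per guess, limits like this raise the cost of enumeration even though they do not remove the timing difference itself.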

Jeremy Stanley (fungi) wrote :

Note that, unlike what the CVE dispute info suggests, the Keystone team has not concluded performance degradation would result from possible methods of reducing the efficacy of this particular oracle. They in fact indicated that fixes for this hardening opportunity would be welcome ("wishlist" importance, please see Colleen's comment #9).

Fix proposed to branch: master
Review: https://review.openstack.org/625699

Changed in keystone:
assignee: nobody → Gage Hugo (gagehugo)
status: Triaged → In Progress

Fix proposed to branch: master
Review: https://review.openstack.org/625700

description: updated

Change abandoned by Gage Hugo (<email address hidden>) on branch: master
Review: https://review.openstack.org/625700

Andy Ngo (andyngo) wrote :

Have requested MITRE to review this thread again and updated the CVE dispute info.

New one is:

"** DISPUTED ** OpenStack Keystone through 14.0.1 has a user enumeration vulnerability because invalid usernames have much faster responses than valid ones for a POST /v3/auth/tokens request. NOTE: the vendor's position is that this is a hardening opportunity, and not necessarily an issue that should have an OpenStack Security Advisory."

Change abandoned by Gage Hugo (<email address hidden>) on branch: master
Review: https://review.openstack.org/625699
Reason: Replaced by https://review.openstack.org/#/c/634826/

Gage Hugo (gagehugo) wrote :

Still working on this. I've set up a Flask hook to catch any Unauthorized exception, allowing us to delay raising it until the very end; however, I'm still not seeing identical times. Local testing shows 0.480 seconds for an existing user, while invalid users have gone from ~0.022 to ~0.033 seconds with the exception delayed.

I'm wondering if this is due to the delay with generating a token for successful authentication, instead of simply continuing on when failing to authenticate.
