Use prefixes in Swift hashing to reduce exposure to collision attacks

Bug #1157454 reported by David Hadas
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Object Storage (swift)
Fix Released
Wishlist
David Hadas

Bug Description

Swift uses a suffix to avoid collision attacks. See https://answers.launchpad.net/swift/+question/156307 to understand this design choice.

Yet, an attacker does not need to know the secret suffix in order to mount a collision attack. Further, the secret suffix adds nothing to protect Swift against the attack. Once two prefixes (ending with a common suffix) collide, any additional suffix that will be added by Swift would still collide… hence the attacker need not worry about the secret ;)

Further such an attack can be made with just 64 characters.

To date, MD5 is known to be vulnerable to an attack if no prefix is applied or if the prefix of the two aliases is known to the attacker. There is no known attack on an MD5 using a secret prefix (TBD reconfirm), hence Swift needs to be enhanced with an MD5 hash prefix allowing new installations to be protected against collision attacks.

I would create a patch to add a secret prefix in the hope to get this fixed for new installations using 1.8
Later, we need to consider additional measures.

Tags: security
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/24858

Changed in swift:
assignee: nobody → David Hadas (david-hadas)
status: New → In Progress
Revision history for this message
Chuck Thier (cthier) wrote : Re: Swift exposed to hash collision attacks

Do you have a proof of concept to demonstrate this vulnerability?

Revision history for this message
clayg (clay-gerrard) wrote :

Hrmmm... well I'll admit, at least for me, this is not what I expected:

clayg:~$ cat salt.py
from hashlib import md5

a = '\xd11\xdd\x02\xc5\xe6\xee\xc4i=\x9a\x06\x98\xaf\xf9\\/\xca\xb5\x87' \
        '\x12F~\xab@\x04X>\xb8\xfb\x7f\x89U\xad4\x06\t\xf4\xb3\x02\x83' \
        '\xe4\x88\x83%qAZ\x08Q%\xe8\xf7\xcd\xc9\x9f\xd9\x1d\xbd\xf2\x80' \
        '7<[\xd8\x82>1V4\x8f[\xaem\xac\xd46\xc9\x19\xc6\xddS\xe2\xb4\x87' \
        '\xda\x03\xfd\x029c\x06\xd2H\xcd\xa0\xe9\x9f3B\x0fW~\xe8\xceT\xb6' \
        'p\x80\xa8\r\x1e\xc6\x98!\xbc\xb6\xa8\x83\x93\x96\xf9e+o\xf7*p'

b = '\xd11\xdd\x02\xc5\xe6\xee\xc4i=\x9a\x06\x98\xaf\xf9\\/\xca\xb5\x07' \
        '\x12F~\xab@\x04X>\xb8\xfb\x7f\x89U\xad4\x06\t\xf4\xb3\x02\x83' \
        '\xe4\x88\x83%\xf1AZ\x08Q%\xe8\xf7\xcd\xc9\x9f\xd9\x1d\xbdr\x80' \
        '7<[\xd8\x82>1V4\x8f[\xaem\xac\xd46\xc9\x19\xc6\xddS\xe24\x87' \
        '\xda\x03\xfd\x029c\x06\xd2H\xcd\xa0\xe9\x9f3B\x0fW~\xe8\xceT\xb6p' \
        '\x80(\r\x1e\xc6\x98!\xbc\xb6\xa8\x83\x93\x96\xf9e\xabo\xf7*p'

print 'suffix still matches', \
        md5(a + 'salt').hexdigest() == md5(b + 'salt').hexdigest()

print 'prefix still matches', \
        md5('salt' + a).hexdigest() == md5('salt' + b).hexdigest()
clayg:~$ python salt.py
suffix still matches True
prefix still matches False

It would be more interesting if there was an example of a collision that actually works all the way through the proxy against an account name which the "attacker" doesn't control (i.e. a *prefix*) and url-encoding.

None-the-less, quite interesting...

Revision history for this message
gholt (gholt) wrote :

Well, the cat's out of the bag so to speak on this one since there's now a public patch for this private bug. Luckily, this doesn't seem easily exploitable at this point.

Revision history for this message
Chuck Thier (cthier) wrote :

So thinking this through a bit, I just found:

http://stackoverflow.com/questions/1999824/whats-the-shortest-pair-of-strings-that-causes-an-md5-collision

Which references a single block (64byte) collision. Will be interesting to see if there were any others that were found when the report is released. If the user had complete control of what was being hashed, then it would be trivial to create a collision.

What is currently hashed is the string "/account/container/object". The user can set container and object to their heart's content, but they can't set account, and the account has to be their account, otherwise they will get unauthorized before it even gets to generating the hash. On top of that, the path must be a valid UTF-8. All of these add a lot of complexity to the attack, but *may* not make it impossible. I need to reflect on this more.

So the next question would be, how would a person exploit this attack vector? randomly generating two objects in the same account that hash to the same object isn't going to get the attacker anywhere, but could the user access another object in the system (either retreive or poison by writing over). The problem is that then you have to find a path that hashes the same path as that other user's object. This would require a pre-image attack, which is incredibly more difficult, and has yet to be demonstrated (other than in theory).

Like I said these are just some initial thoughts, and will have to meditate on it a bit more, but I'm curious to others' thoughts and other possible attack vectors.

Revision history for this message
Chuck Thier (cthier) wrote :

and btw, the hash suffix was added to not prevent hash collisions, but to prevent people from filling up a drive, by creating a bunch of random paths that would hash to the same partition.

Revision history for this message
David Hadas (david-hadas) wrote :

The attack vectors I found, much with the help of the discussion in 2011, are:

1. A user allowed access to a container/account/tenant may create a planned collision and sue the provider for loss of data

2. A user with access to the system may produce large amount of objects/containers/accounts that will be placed on a specific device creating a DoS attack.

3. A user with access to a container may create an object that is an alias of a different object of the same tenant thus overriding the ACLs of the aliased object and gaining full control over the object (read/change/delete).

4. A user with access to an account (account owner) may create a container that is an alias of a different container thus overriding the ACLs of the aliased container and gaining full control over the container (e.g. change its ACLs).

 In attacks 1,2 - the user has control over both aliases that need to collide
There are known and efficient MD5 attacks allowing to create aliases either if no prefix is used or when the prefixes of both aliases are known.
This attack can be mitigated by a secret prefix.
This attack is not affected by a secret suffix as used in Swift today.
Attack 1 also requires the user to come forward and account for his actions (i.e. show that this was not a deliberate attack) - this seems like an attack that can be contained reasonably well.
Attack 2 is a DoS attack and not a threat to user data. Swift requires some level of disk monitoring anyhow and when a given disk fills up beyond a reasonable level (compared to others), the admin need to be alarmed and investigate - again, this seems like an attack that can be contained reasonably well.

In attacks 3 and 4 - the user has control over one alias and may try to collide with a specific object or with any object in the system.
There are no known MD5 preimage attacks (need to reconfirm this) that allow colliding with a given fully qualified name efficiently (without shire brute force).
Yet, as the number of objects increase the birthday paradox may yield an increased chance of collision.
There is no immediate risk here even when a prefix is not used.

Revision history for this message
Mike Barton (redbo) wrote :

The birthday paradox numbers aren't very scary; at a trillion objects, the chances of a single collision are still like 10^-18. Of course that's assuming MD5 is perfectly distributed, which we know it isn't.

In some future perfect world, we should probably just let people choose other algorithms like sha256, just so this stops being an issue for discussion. I'll really miss how fast md5 is, though.

Revision history for this message
Thierry Carrez (ttx) wrote :

Speaking with my Vulnerability Management Team hat on:

* like gholt mentioned, the review being public I see no point in keeping this bug private. Will open in a few hours if nobody beats me to it or complains

* This is definitely a nice (and welcome) hardening option, however I would not consider this an exploitable vulnerability as-is, so the VMT probably won't go through embargo and security advisory process for this. Swift uses MD5 for object distribution rather than signature/integrity, which makes most collision attack scenarios significantly less scary.

Revision history for this message
Chuck Thier (cthier) wrote :

I just want to clarify, that scenario 2 is not feasible here. In order for a an attacker to fill up a drive, they would have to generate object paths that map to the same partition. The partition is derived from the first n bytes of the hash, where n is the partition power of the cluster. An attacker may try to generate a number of paths that hash to the same first n bytes, but adding the hash suffix will cause those hash to change, and thus the objects will not fall on the same partition.

Revision history for this message
Thierry Carrez (ttx) wrote :

Renamed and opened publicly.

summary: - Swift exposed to hash collision attacks
+ Use prefixes in Swift hashing to reduce exposure to collision attacks
Changed in swift:
importance: Undecided → Wishlist
tags: added: security
information type: Private Security → Public
Revision history for this message
David Hadas (david-hadas) wrote :

We found another solution that seem to solve any MD5 collision attack on Swift and will send a patch.
This patch solves attacks 1,3 above. The solution is to test that the object metadata 'name' attribute in authentic before doing any work on the object and refusing with 403 otherwise.
We will later also send a corresponding patch to solve attack 4 in the same way by testing against the information in the container metadata.
This fix was originated by Michael Factor, DE in the HRL IBM Research Lab.

Btw, as Chuck mentioned attack 2 above is not feasible since:
a. If a full hash collision is used, there will be only one object created
b. If only the partition bits are the same, than the suffix (if kept secret) will make the attack useless.
Still, we should have a solution that does not rely on a system wide secret not leaking out. Simply because if it ever gets out, there is no way to mitigate this. I will open a separate bug for this.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/25110

Revision history for this message
Chuck Thier (cthier) wrote :

Ahh good... I was going to propose that idea as well. I think that checking the user path with the path in the metadata is a reasonable way to mitigate this if the issue were to arise.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/24858
Committed: http://github.com/openstack/swift/commit/a979c8007bf7376541c932c3e91d5ba1a95f4481
Submitter: Jenkins
Branch: master

commit a979c8007bf7376541c932c3e91d5ba1a95f4481
Author: David Hadas <email address hidden>
Date: Wed Mar 20 01:35:41 2013 +0200

    Add support for Hash Prefix

    A new configuration parameter is added to /etc/swift/swift.conf
    [swift-hash]
    swift_hash_path_prefix = 'random unique string'

    New installations are advised to set this parameter to a random secret,
    which would not be disclosed ouside the organization.
    The same secret needs to be used by all swift servers of the same cluster.

    Existing installations should set this parameter to an empty string
    (the default)

    DocImpact

    Fixes: Bug #1157454

    Change-Id: I63b10d0b7d6dd3f74e0f10bb41b5f240fa03578a

Changed in swift:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (milestone-proposed)

Fix proposed to branch: milestone-proposed
Review: https://review.openstack.org/25683

Changed in swift:
milestone: none → 1.8.0-rc2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (milestone-proposed)

Reviewed: https://review.openstack.org/25683
Committed: http://github.com/openstack/swift/commit/03587a5c1839d7be82a8ae45dd09d70d10991150
Submitter: Jenkins
Branch: milestone-proposed

commit 03587a5c1839d7be82a8ae45dd09d70d10991150
Author: David Hadas <email address hidden>
Date: Wed Mar 20 01:35:41 2013 +0200

    Add support for Hash Prefix

    A new configuration parameter is added to /etc/swift/swift.conf
    [swift-hash]
    swift_hash_path_prefix = 'random unique string'

    New installations are advised to set this parameter to a random secret,
    which would not be disclosed ouside the organization.
    The same secret needs to be used by all swift servers of the same cluster.

    Existing installations should set this parameter to an empty string
    (the default)

    DocImpact

    Fixes: Bug #1157454

    Change-Id: I63b10d0b7d6dd3f74e0f10bb41b5f240fa03578a
    (cherry picked from commit a979c8007bf7376541c932c3e91d5ba1a95f4481)

Changed in swift:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in swift:
milestone: 1.8.0-rc2 → 1.8.0
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/25110
Committed: http://github.com/openstack/swift/commit/caa01cd81e7dae6c41c1ffb8a69aed6ff70777cf
Submitter: Jenkins
Branch: master

commit caa01cd81e7dae6c41c1ffb8a69aed6ff70777cf
Author: David Hadas <email address hidden>
Date: Fri Mar 22 09:00:40 2013 +0200

    objects md5-collisions

    This patch identifies md5 collisions on objects and sends a 403
    from the object server.

    Credits for originating this fix are to Michael Factor.

    Change-Id: I4f1b32183e2be6bbea56eaff86b9a4c7f440804a
    Fix: Bug #1157454

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.