EC: ECPutter doesn't close backend HTTPConnection when exception raised

Bug #1594739 reported by Kota Tsuyuzaki
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Object Storage (swift)
Fix Released
Undecided
Unassigned

Bug Description

I'm not sure if this has *really* security risk yet but for safety, wrote down as private one.

For Erasure Code (EC) case, current Swift uses ECPutter to handle PUT object requests and it works like as wrapper of HTTPConnection. ECPutter keeps a HTTPConnection instance to communicate with an object-server and it will be closed at the end of request.

However, ECPutter doesn't close the HTTPConnection instance when an error (e.g. 422 Unprocessable Entity) raised during PUT sequence. This makes ECPutter to leave non-closed HTTPConnection inside and object-server will wait the next chunk because the connection is still open there. After that, object-server will give up to read the next chunk due to ChunkReadTimeout(default 60 sec) and close it from object-server side. I'm not sure for proxy-side connectin at this time right now (maybe half-closed?).

The reproduce procedure is too simple:

- create a container with ec-policy
- create a local file (e.g. "touch empty_object")
- upload object into the container with invalid etag like as 'swift upload ec_container empty_object --header "Etag: invalid"'

After that, you can get UnprocessableEntity immediately. *BUT AFTER 60 seconds*, you can see a log output line at object-server like as:

Jun 21 01:13:39 ubuntu object-server: 127.0.0.1 - - [21/Jun/2016:08:13:39 +0000] "PUT /sdb1/49/AUTH_swift/ec/RE
ADME.rst" 408 - "PUT http://localhost/v1/AUTH_swift/ec/README.rst" "tx6472aa99231343d1af08f-005768f6f6" "proxy-
server 4760" 60.1765 "-" 4833 1

This is an evidence that object-server was waiting in 60 secs *AFTER* proxy-server return 422 immediately.

The attachment file is the code possibly fixes this issue but it needs probably further work, reviews because it has only one unit test yet.

Revision history for this message
Kota Tsuyuzaki (tsuyuzaki-kota) wrote :
Changed in swift:
status: New → Confirmed
Revision history for this message
Alistair Coles (alistair-coles) wrote :

Confirmed using the method Kota describes to reproduce.

The problem can also be observed by running this functional test with an EC policy:

nosetests test/functional/tests.py:TestFile.testFileSizeLimit

which does PUTs with content-length >0 but then sends no data and the client disconnects - the object servers will report 408 after 60 second timeout.

If fallocate is enabled then I believe that the bytes fallocate'd for the tmp file that the object server never writes into will be reserved until the connection is closed i.e. 60 secs by default, which is unfortunate if the declared content-length is large.

Consequently that func test may fail on an SAIO where the available filesystem space is less than that required to fallocate space for 4 max file size objects.

Revision history for this message
Alistair Coles (alistair-coles) wrote :

I ran into a similar issue on the feature/crypto-review branch with review https://review.openstack.org/#/c/328204, but in that case the backend connections were not being closed for a non-MIME Putter. In the end I tracked that down to there being a reference held to the response object, so even though httplib.HTTPConnection.close() was being called by the proxy object controller, the object server was still waiting to read for 60 secs.

The patch attached is for master (diff against commit 84a8465) but uses similar solution that I intend to propose for https://review.openstack.org/#/c/328204. It will need tests adding but first we should decide on the right approach vs Kota's patch or another.

If I understand Kota's patch at #1, the sender (proxy) is deliberately provoking a MIME read error to cause the object server to stop waiting on a read. It may be that the underlying issue is the same as I have seen on feature/crypto-review, and actually the connection could just be closed without sending any bytes if the response reference was not being held.

Revision history for this message
Alistair Coles (alistair-coles) wrote :

Second version of my patch that better encapsulates the conn and resp inside the ECPutter

Revision history for this message
John Dickinson (notmyname) wrote :

I do not think this is a security bug. It's definitely a bug that needs to be fixed, but it doesn't describe anything not otherwise possible, nor does fixing this prevent any client behaviors.

As an example, a client can open a lot of concurrent connections (either to uniquely named objects or all to the same object) and sending a large content length header and then only trickle out 1 byte per second. This is Swift working as designed. If this pattern causes issues for the deployed cluster, then the operators can use standard DOS mitigation strategies for limiting this behavior.

Revision history for this message
Alistair Coles (alistair-coles) wrote :

I agree with John's assessment: not a security bug, but definitely a bug. Having discussed this we concluded that the same set of conditions (i.e. long-lived connections to backend servers) could be created by valid client behaviour.

information type: Private → Public
Revision history for this message
Kota Tsuyuzaki (tsuyuzaki-kota) wrote :

Awesome Alistair! I didn't notice the remaining reference causes to prevent conn.close() propagation. I'll try your patch and crypto-review branch.

Small opinion from me, that's ok this has been public already but note that a client can leave the long-lived backend connections any time easily (e.g. using invalid etag, client disconnect). Therefore client can make a situation like:

(connections between client and proxy) << (connections between proxy and object-server)

Though it also can be mitigated via access restriction, it may require another countermeasure rather than that for a lot of concurrent (slow) client connection.

Revision history for this message
Alistair Coles (alistair-coles) wrote :

Fix proposed on feature/crypto-review https://review.openstack.org/#/c/328204

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/crypto-review)
Download full text (3.7 KiB)

Reviewed: https://review.openstack.org/328204
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=03b762e80a9b3d33ce13b8222f4cd2b549171c51
Submitter: Jenkins
Branch: feature/crypto-review

commit 03b762e80a9b3d33ce13b8222f4cd2b549171c51
Author: Janie Richling <email address hidden>
Date: Mon Jun 6 17:19:48 2016 +0100

    Support for http footers - Replication and EC

    Before this patch, the proxy ObjectController supported sending
    metadata from the proxy server to object servers in "footers" that
    trail the body of HTTP PUT requests, but this support was for EC
    policies only. The encryption feature requires that footers are sent
    with both EC and replicated policy requests in order to persist
    encryption specific sysmeta, and to override container update headers
    with an encrypted Etag value.

    This patch:

    - Moves most of the functionality of ECPutter into a generic Putter
      class that is used for replicated object PUTs without footers.

    - Creates a MIMEPutter subclass to support multipart and multiphase
      behaviour required for any replicated object PUT with footers and
      all EC PUTs.

    - Modifies ReplicatedObjectController to use Putter objects in place
      of raw connection objects.

    - Refactors the _get_put_connections method and _put_connect_node methods
      so that more code is in the BaseObjectController class and therefore
      shared by [EC|Replicated]ObjectController classes.

    - Adds support to call a callback that middleware may have placed
      in the environ, so the callback can set footers. The
      x-object-sysmeta-ec- namespace is reserved and any footer values
      set by middleware in that namespace will not be forwarded to
      object servers.

    In addition this patch enables more than one value to be added to the
    X-Backend-Etag-Is-At header. This header is used to point to an
    (optional) alternative sysmeta header whose value should be used when
    evaluating conditional requests with If-[None-]Match headers. This is
    already used with EC policies when the ECObjectController has
    calculated the actual body Etag and sent it using a footer
    (X-Object-Sysmeta-EC-Etag). X-Backend-Etag-Is-At is in that case set
    to X-Object-Sysmeta-Ec-Etag so as to point to the actual body Etag
    value rather than the EC fragment Etag.

    Encryption will also need to add a pointer to an encrypted Etag value.
    However, the referenced sysmeta may not exist, for example if the
    object was created before encryption was enabled. The
    X-Backend-Etag-Is-At value is therefore changed to support a list of
    possible locations for alternate Etag values. Encryption will place
    its expected alternative Etag location on this list, as will the
    ECObjectController, and the object server will look for the first
    object metadata to match an entry on the list when matching
    conditional requests. That way, if the object was not encrypted then
    the object server will fall through to using the EC Etag value, or in
    the case of a replicated policy will fall through to using the normal
 ...

Read more...

tags: added: in-feature-crypto-review
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)
Download full text (10.5 KiB)

Reviewed: https://review.openstack.org/336407
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=9045f338693474303f209a4909bca64031afbe9d
Submitter: Jenkins
Branch: master

commit f36bc513c5e0029b90207d7a2dec81965eed8300
Author: Alistair Coles <email address hidden>
Date: Tue Jun 7 15:08:54 2016 +0100

    Add encryption overview doc

    Include a note in container-sync docs pointing to specific
    configuration needed to be compatible with encryption.

    Also remove the sample encryption root secret from
    proxy-server.conf-sample and in-process test setup. Remove encryption
    middleware from the default proxy pipeline.

    Change-Id: Ibceac485813f3ac819a53e644995749735592a55

commit 96a0e077532c3227b9290af7d74a0b42ee08e8de
Author: Janie Richling <email address hidden>
Date: Tue Jun 7 15:01:32 2016 +0100

    Enable object body and metadata encryption

    Adds encryption middlewares.

    All object servers and proxy servers should be upgraded before
    introducing encryption middleware.

    Encryption middleware should be first introduced with the
    encryption middleware disable_encryption option set to True.
    Once all proxies have encryption middleware installed this
    option may be set to False (the default).

    Increases constraints.py:MAX_HEADER_COUNT by 4 to allow for
    headers generated by encryption-related middleware.

    Co-Authored-By: Tim Burke <email address hidden>
    Co-Authored-By: Christian Cachin <email address hidden>
    Co-Authored-By: Mahati Chamarthy <email address hidden>
    Co-Authored-By: Peter Chng <email address hidden>
    Co-Authored-By: Alistair Coles <email address hidden>
    Co-Authored-By: Jonathan Hinson <email address hidden>
    Co-Authored-By: Hamdi Roumani <email address hidden>

    UpgradeImpact

    Change-Id: Ie6db22697ceb1021baaa6bddcf8e41ae3acb5376

commit 3ad003cf51151f8ce6dfc6c2c529206eda5f7b60
Author: Alistair Coles <email address hidden>
Date: Mon Jun 6 18:38:50 2016 +0100

    Enable middleware to set metadata on object POST

    Adds a new form of system metadata for objects.

    Sysmeta cannot be updated by an object POST because
    that would cause all existing sysmeta to be deleted.
    Crypto middleware will want to add 'system' metadata
    to object metadata on PUTs and POSTs, but it is ok
    for this metadata to be replaced en-masse on every
    POST.

    This patch introduces x-object-transient-sysmeta-*
    that is persisted by object servers and returned
    in GET and HEAD responses, just like user metadata,
    without polluting the x-object-meta-* namespace.
    All headers in this namespace will be filtered
    inbound and outbound by the gatekeeper, so cannot
    be set or read by clients.

    Co-Authored-By: Clay Gerrard <email address hidden>
    Co-Authored-By: Janie Richling <email address hidden>

    Change-Id: I5075493329935ba6790543fc82ea6e039704811d

commit fa7d80029b53391a7877aeb6438c98a45bab42a7
Author: Alistair Coles <email address hidden>
Date: Mon Jun 6 18:16:11 2016 +0100

    Make container update override headers persistent

    Whate...

Changed in swift:
status: Confirmed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/337431

Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/swift 2.9.0

This issue was fixed in the openstack/swift 2.9.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/hummingbird)

Fix proposed to branch: feature/hummingbird
Review: https://review.openstack.org/363111

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/hummingbird)
Download full text (84.1 KiB)

Reviewed: https://review.openstack.org/363111
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=1ab2a296f58ae76aeffef9f3f0fb90e15358be27
Submitter: Jenkins
Branch: feature/hummingbird

commit 3b5850836c59c46f2507a7f62aceccf4c37e5d41
Author: gecong1973 <email address hidden>
Date: Tue Aug 30 15:08:49 2016 +0800

    Remove white space between print and ()

    There is a white space between print and ()
    in /tempauth.py, This patch fix it

    Change-Id: Id3493bdef12223aa3a2bffc200db8710f5949101

commit f88e7fc0da2ed6a63e0ea3c3459d80772b3068cd
Author: zheng yin <email address hidden>
Date: Mon Aug 29 20:21:44 2016 +0800

    Clarify test case in common/ring/test_builder

    They use a bare assertRaises(ValueError, ring.RingBuilder, *,*,*), but
    it's not clear which one raises which ValueError(), so I extend them to
    validate the error strings as well.

    Change-Id: I63280a9fc47ff678fe143e635046a0b402fd4506

commit d68b1bd6ddf44c5088e9d02dcb2f1b802c71411b
Author: zhufl <email address hidden>
Date: Mon Aug 29 14:31:27 2016 +0800

    Remove unnecessary tearDown

    This is to remove unnecessary tearDown to keep code clean.

    Change-Id: Ie70e40d6b55f379b0cc9bc372a35705462cade8b

commit d2fc2614575b04fd9cab5ae589880b92eee9b186
Author: Matthew Oliver <email address hidden>
Date: Fri Aug 19 16:17:31 2016 +1000

    Authorise versioned write PUTs before copy

    Currently a versioned write PUT uses a pre-authed request to move
    it into the versioned container before checking whether the
    user is authorised. This can lead to some interesting behaviour
    whereby a user can select a versioned object path that it does not
    have access to, request a put on that versioned object, and this
    request will execute the copy part of the request before it fails
    due to lack of permissions.

    This patch changes the behaviour to be the same as versioned DELETE
    where the request is authorised before anything is moved.

    Change-Id: Ia8b92251718d10b1eb44a456f28d3d2569a30003
    Closes-Bug: #1562175

commit c1ef6539b6ba9d2e4354c9cd2eec8a0195cdb19f
Author: Clay Gerrard <email address hidden>
Date: Thu Aug 25 11:00:49 2016 -0700

    add test for expirer processes == process

    This is a follow up from a change that improved the error message.

    Related-Change: I3d12b79470d122b2114f9ee486b15d381f290f95

    Change-Id: I093801f3516a60b298c13e2aa026c11c68a63792

commit 01477c78c1163822de41484e914a0736e622085b
Author: zheng yin <email address hidden>
Date: Thu Aug 25 15:37:42 2016 +0800

    Fix ValueError information in obj/expirer

    I fix error information in raise ValueError(...)
    For example:
        if a>=b:
            # It should be under below and not 'a must be less than or equal to b'
            raise ValueError('a must be less than b')

    Change-Id: I3d12b79470d122b2114f9ee486b15d381f290f95

commit b81f53b964fdb8f3b50dd369ce2e194ee4dbb0b7
Author: zheng yin <email address hidden>
Date: Tue Aug 23 14:26:47 2016 +0800

    Improve readability in the obj server's unit tests

    This change improves the reada...

tags: added: in-feature-hummingbird
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.