crash on content-encoding:gzip http responses

Bug #1282861 reported by groqez
This bug affects 3 people
Affects                      Status        Importance  Assigned to  Milestone
python-swiftclient           Fix Released  Undecided   Unassigned
python-swiftclient (Ubuntu)  Fix Released  Undecided   Unassigned
  Trusty                     Confirmed     Undecided   Unassigned
  Utopic                     Fix Released  Undecided   Unassigned

Bug Description

Versions:
python-swiftclient==2.0.2
requests==2.2.1

python-swiftclient does not seem to handle gzipped HTTP responses: the compressed body is passed straight to the JSON decoder, which fails.

DEBUG:swiftclient:RESP HEADERS: [('x-container-object-count', '352'), ('content-encoding', 'gzip'), ('transfer-encoding', 'chunked'), ('accept-ranges', 'bytes'), ('date', 'Fri, 21 Feb 2014 02:51:24 GMT'), ('x-timestamp', '1363448617.07881'), ('x-trans-id', 'tx7adcd463316f4393a81ec046dcc7e585'), ('x-container-bytes-used', '4023589531'), ('content-type', 'application/json; charset=utf-8')]
DEBUG:swiftclient:RESP BODY:[A LOT OF BINARY (GZIPPED) DATA]
Traceback (most recent call last):
  File "./bin/swift", line 1488, in <module>
    globals()['st_%s' % args[0]](parser, argv[1:], thread_manager)
  File "./bin/swift", line 592, in st_list
    prefix=options.prefix, delimiter=options.delimiter)[1]
  File "/dev/python/env/lib/python2.7/site-packages/swiftclient/client.py", line 1263, in get_container
    full_listing=full_listing)
  File "/dev/python/env/lib/python2.7/site-packages/swiftclient/client.py", line 1192, in _retry
    rv = func(self.url, self.token, *args, **kwargs)
  File "/dev/python/env/lib/python2.7/site-packages/swiftclient/client.py", line 592, in get_container
    return resp_headers, json_loads(body)
  File "/dev/python/env/lib/python2.7/site-packages/simplejson/__init__.py", line 488, in loads
    return _default_decoder.decode(s)
  File "/dev/python/env/lib/python2.7/site-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/dev/python/lib/python2.7/site-packages/simplejson/decoder.py", line 389, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

HTTP requests are made through the HTTPConnection class, which appears to be a wrapper around python-requests that mimics the httplib API.

One way to fix this would be to let urllib3 (which requests uses underneath) do the decoding, by passing decode_content=True to the raw read() call set up in HTTPConnection.getresponse():

         self.resp.getheaders = getheaders
         self.resp.getheader = getheader
-        self.resp.read = self.resp.raw.read
+        self.resp.read = functools.partial(self.resp.raw.read, decode_content=True)
         return self.resp
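
To illustrate what the one-line patch changes, here is a minimal sketch, assuming Python 3. FakeRawResponse is a hypothetical stand-in for the raw response object, not swiftclient or urllib3 code; only the decode_content keyword is taken from the patch above.

```python
import functools
import gzip
import json


class FakeRawResponse:
    """Hypothetical stand-in for the raw response object (illustration only)."""

    def __init__(self, wire_bytes):
        self._wire_bytes = wire_bytes

    def read(self, amt=None, decode_content=False):
        # amt is accepted only to resemble the real signature; it is ignored.
        if decode_content:
            return gzip.decompress(self._wire_bytes)
        return self._wire_bytes


listing = [{"name": "obj1", "bytes": 12}]
raw = FakeRawResponse(gzip.compress(json.dumps(listing).encode("utf-8")))

# Before the patch: read is the bare method, so callers see gzipped bytes.
plain_read = raw.read
# After the patch: the keyword is pre-bound, so a bare read() call decodes.
decoding_read = functools.partial(raw.read, decode_content=True)

wire_body = plain_read()
decoded_body = decoding_read()
parsed = json.loads(decoded_body.decode("utf-8"))
```

Callers written against the httplib-style API keep calling read() with no arguments; the partial is what makes that call return decoded content.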

The other way would be to let every function using HTTPConnection.getresponse() handle gzip decoding manually...
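
A rough sketch of what that manual handling could look like, assuming Python 3 and a header list shaped like the one in the debug log above. decode_body is a hypothetical helper, not part of swiftclient, and it only covers gzip.

```python
import gzip
import json


def decode_body(headers, body):
    """Return body with a gzip Content-Encoding undone, if one is declared.

    headers is a list of (name, value) pairs, as in the debug log.
    Hypothetical helper for illustration; not swiftclient API.
    """
    encoding = ""
    for name, value in headers:
        if name.lower() == "content-encoding":
            encoding = value.strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    return body


listing = [{"name": "obj1", "bytes": 12}]
headers = [
    ("content-encoding", "gzip"),
    ("content-type", "application/json; charset=utf-8"),
]
body = gzip.compress(json.dumps(listing).encode("utf-8"))

# Parsing the compressed bytes directly fails, as in the traceback.
try:
    json.loads(body.decode("latin-1"))
    parsed_without_decoding = True
except ValueError:
    parsed_without_decoding = False

# Decoding first gives the JSON decoder something it can read.
parsed = json.loads(decode_body(headers, body).decode("utf-8"))
```

The drawback is the one noted above: every caller that parses a response body would need to remember to call such a helper.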

Tags: gzip
Jean-Alexis Lauricella (ja-lauricella) wrote :

I'm using duplicity against a Swift container, and duplicity uses python-swiftclient.
It fails only on gzipped/encrypted data in containers.

Ian Cordasco (icordasc) wrote :

It looks like this has been fixed on master (https://github.com/openstack/python-swiftclient/blob/3d0de79e26e2aa6285742c60aca3c164e9c2fbb9/swiftclient/client.py#L238) and python-swiftclient 2.1.0 should include these changes.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in python-swiftclient (Ubuntu):
status: New → Confirmed
Stuart Bishop (stub) wrote :

I'm dealing with text/plain, where no crash occurs (there is no JSON to parse). Instead I get complaints that the md5 of the content does not match the etag, plus a compressed blob of data where I expected something readable. A text/plain file is not round-tripping.

2.1.0 handles this, probably the fix icordasc cites. It does, however, introduce Bug #1338464 so there are still kinks in the system.

Ian Cordasco (icordasc) wrote :

@stub, the content-type and the content-encoding are not related. You are receiving gzip-encoded data according to the headers in Bug #1338464. The bug you're seeing is a consequence of the fact that the fix for this bug did not take into account that swiftclient checks the integrity of the response in multiple ways.

It seems that a proper fix for this bug would be to tee off the response. One copy will be left gzip'd to check the length and md5 checksum, the other will actually be decoded so the user gets what they expect. I can work on this if no one else wants to. I'll probably be available at the earliest Friday to work on this.
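
The tee idea might be sketched as follows, assuming Python 3 and a gzip-encoded body arriving in chunks. tee_gzip_chunks is a hypothetical name used for illustration; this is not the fix that was eventually merged.

```python
import gzip
import hashlib
import zlib


def tee_gzip_chunks(chunks):
    """Return (md5 hex of wire bytes, wire length, decoded bytes).

    Hypothetical helper: the checksum and length are computed over the
    bytes as received, while a second path decodes them for the caller.
    """
    md5 = hashlib.md5()
    wire_length = 0
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    decoded_parts = []
    for chunk in chunks:
        md5.update(chunk)
        wire_length += len(chunk)
        decoded_parts.append(decoder.decompress(chunk))
    decoded_parts.append(decoder.flush())
    return md5.hexdigest(), wire_length, b"".join(decoded_parts)


original_text = b"hello swift\n" * 100
stored = gzip.compress(original_text)
# Assumption: the etag is the md5 of the bytes as stored and sent.
etag = hashlib.md5(stored).hexdigest()

# Split the wire bytes into small chunks to mimic a streamed response.
chunks = [stored[i:i + 16] for i in range(0, len(stored), 16)]
wire_md5, wire_length, decoded = tee_gzip_chunks(chunks)
decoded_md5 = hashlib.md5(decoded).hexdigest()
```

Here wire_md5 matches the etag while decoded_md5 does not, which is the mismatch described in the comment above when only the decoded copy is checked.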

Stuart Bishop (stub)
Changed in python-swiftclient (Ubuntu Utopic):
status: Confirmed → Fix Released
Changed in python-swiftclient (Ubuntu Trusty):
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-swiftclient (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/184659

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/184956

OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-swiftclient (master)

Reviewed: https://review.openstack.org/184659
Committed: https://git.openstack.org/cgit/openstack/python-swiftclient/commit/?id=7d5c85ad1013185bd3aa7f0d384f2e0e68e3b484
Submitter: Jenkins
Branch: master

commit 7d5c85ad1013185bd3aa7f0d384f2e0e68e3b484
Author: Tim Burke <email address hidden>
Date: Wed May 20 16:07:04 2015 -0700

    Stop decoding object content

    Previously, we had urllib3 (via requests) automatically decode all
    responses with a Content-Encoding of deflate or gzip. This included
    object downloads, which would in turn cause etag or content-length
    mismatch errors. (See bug 1338464)

    This was apparently added in response to a third-party proxy sitting
    between the client and server which, having observed that the client
    would accept gzip-encoded content while the server sent an unencoded
    response, would perform the compression. (See bug 1282861)

    Now, we'll no longer let requests send any default headers, nor do any
    decoding.

    Change-Id: I6cc30a5c12e37de06d7322533a3c36ad15397cc8
    Closes-Bug: 1338464
    Related-Bug: 1282861

Tim Burke (1-tim-z)
Changed in python-swiftclient:
status: New → Fix Released
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/184956
Committed: https://git.openstack.org/cgit/openstack/python-swiftclient/commit/?id=f728027bed59d08a6491ae9c14d2f5968f8d6fa3
Submitter: Jenkins
Branch: master

commit f728027bed59d08a6491ae9c14d2f5968f8d6fa3
Author: Tim Burke <email address hidden>
Date: Thu May 21 22:44:36 2015 -0700

    Accept gzip-encoded API responses

    Previously, we would accept gzip-encoded responses, but only because we
    were letting requests decode *all* responses (even object data). This
    restores the previous capability, but with tighter controls about which
    requests will accept gzipped responses and where the decoding happens.

    Change-Id: I4fd8b97207b9ab01b1bcf825cc16efd8ad46344a
    Related-Bug: 1282861
    Related-Bug: 1338464
