Object PUT into Walrus has NO end-to-end integrity check

Bug #885230 reported by dino.korah
This bug affects 1 person
Affects: Eucalyptus
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Walrus does not appear to validate the Content-MD5 header on PUT. Without that check, the claim that the RESTful API is Amazon S3 compatible falls short.
An additional HEAD request after a PUT is the workaround, but that costs an extra round trip and hence makes for a poor user experience with large objects.

Extract from Amazon AWS S3 documentation

Content-MD5
The base64 encoded 128-bit MD5 digest of the message (without the headers) according to RFC 1864. This header can be used as a message integrity check to verify that the data is the same data that was originally sent. Although it is optional, we recommend using the Content-MD5 mechanism as an end-to-end integrity check. For more information about REST request authentication, go to REST Authentication in the Amazon Simple Storage Service Developer Guide.
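To make the header format concrete, here is a minimal sketch of how a client might compute the Content-MD5 value described above. The helper name `content_md5` is hypothetical; only the Python standard library is used.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the RFC 1864 Content-MD5 value for a request body.

    This is the base64 encoding of the raw 16-byte (128-bit) MD5 digest,
    not the base64 encoding of the hex string.
    """
    digest = hashlib.md5(body).digest()  # raw 16 bytes
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # The value printed here is what would go into the Content-MD5 header.
    print(content_md5(b"hello world"))
```

A client would send this value in the `Content-MD5` header of the PUT; a server that honours the header recomputes the digest over the bytes it received and rejects the request on mismatch.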

Mitch Garnaat (mitch-garnaat) wrote :

I just tried a simple PUT to the ECC service using boto and I found an etag header in the response. In fact, boto would generate an exception if no etag header was found in the response. Could you be more specific about what you feel is missing in Walrus?

dino.korah (dckorah) wrote :

I was using an S3 client library called libs3, which I drive from a C++ wrapper.
In the library, when you make a PUT request, you can specify the MD5 value of the object that you are about to PUT.

What I noticed was that when I altered the MD5 so that it did not match the data, Walrus did not come back with an error at all.

PS: libs3 for me is a trusted partner in crime; I use it in various projects to interact with S3. Never had any problems with it.

Mitch Garnaat (mitch-garnaat) wrote :

Ah, okay. I understand what you are saying. I'll do some testing and report back.

Mitch Garnaat (mitch-garnaat) wrote :

What version of Eucalyptus are you using? I just tried a test against the latest development version and I couldn't reproduce this. I used boto to store a key in an existing bucket but I made sure that the MD5 checksum sent in the request did not match the actual checksum of the file being uploaded. Walrus responded with:

>>> k.set_contents_from_file(s2, md5=cs)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/boto/s3/key.py", line 790, in set_contents_from_file
    self.send_file(fp, headers, cb, num_cb, query_args)
  File "/usr/lib/python2.5/site-packages/boto/s3/key.py", line 583, in send_file
    query_args=query_args)
  File "/usr/lib/python2.5/site-packages/boto/s3/connection.py", line 429, in make_request
    override_num_retries=override_num_retries)
  File "/usr/lib/python2.5/site-packages/boto/connection.py", line 796, in make_request
    return self._mexe(http_request, sender, override_num_retries)
  File "/usr/lib/python2.5/site-packages/boto/connection.py", line 717, in _mexe
    request.body, request.headers)
  File "/usr/lib/python2.5/site-packages/boto/s3/key.py", line 550, in sender
    'ETag from S3 did not match computed MD5')
boto.exception.S3DataError: BotoClientError: ETag from S3 did not match computed MD5
>>>

Mitch Garnaat (mitch-garnaat) wrote :

I also just tried against the Eucalyptus Community Cloud which is running 2.0.3 and got the same result as shown above.

Mitch Garnaat (mitch-garnaat) wrote :

But if I do the same test against S3, I get this server-side error response:

S3ResponseError: S3ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>BadDigest</Code><Message>The Content-MD5 you specified did not match what we received.</Message><ExpectedDigest>vbKjyVqDxcRSPnj3H095Rw==</ExpectedDigest><CalculatedDigest>jvwsjFtBf/mePM+Da9ZnQQ==</CalculatedDigest><RequestId>2662640FEFBC9B81</RequestId><HostId>GjvxD/SnNFd5Zjtg9hQudRiD3ilyv3+Nkb9T5AgSAyZd0rXHgUuQvAG2TUpmYqXA</HostId></Error>

which is really the point you have been trying to make all along 8^)

This does look like a bug in Walrus.
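A server-side check of the kind S3 performs here could look roughly like the sketch below. It is not Walrus or S3 code: the function name `check_content_md5` and the `(status, code)` return shape are made up for illustration, and using `InvalidDigest` for a malformed header is an assumption based on S3's documented error codes.

```python
import base64
import binascii
import hashlib
from typing import Optional, Tuple


def check_content_md5(body: bytes, header: Optional[str]) -> Tuple[int, str]:
    """Hypothetical server-side validation of a PUT body.

    Returns (HTTP status, S3-style error code or "OK").
    """
    if header is None:
        # Content-MD5 is optional, so a missing header is not an error.
        return 200, "OK"
    try:
        supplied = base64.b64decode(header, validate=True)
    except (binascii.Error, ValueError):
        return 400, "InvalidDigest"
    if len(supplied) != 16:
        # An MD5 digest is always exactly 128 bits.
        return 400, "InvalidDigest"
    calculated = hashlib.md5(body).digest()
    if supplied != calculated:
        # The case shown in the S3 response above.
        return 400, "BadDigest"
    return 200, "OK"
```

The point of doing this on the server is that the PUT fails before the object is stored, so clients that never inspect the returned ETag are still protected.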

dino.korah (dckorah) wrote :

Thanks Mitch! You saved my day!

I am working with 2.0.3.

As I understand from the boto trace, it was boto that was bailing out, whereas in S3, the server itself recognized the error and responded accordingly.

I would much prefer the latter behavior; that sounds more SaaSy. Not just because my library likes it that way, but for the sake of anyone else who doesn't bother verifying the ETag returned by Walrus.
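For clients that do want to verify the ETag themselves in the meantime, a minimal sketch follows. It assumes a plain single-part PUT where the returned ETag is the hex MD5 of the object body, which is what the boto check in the trace above relies on; `etag_matches` is a hypothetical helper name.

```python
import hashlib


def etag_matches(body: bytes, etag_header: str) -> bool:
    """Compare the ETag from a PUT response with the local MD5 of the body.

    Assumes the ETag is the hex MD5 digest, usually wrapped in double quotes.
    """
    expected = hashlib.md5(body).hexdigest()
    received = etag_header.strip().strip('"').lower()
    return received == expected
```

This only detects the mismatch after the object has already been stored, so the client would still need to delete or re-upload the object on failure.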

dino.korah (dckorah) wrote :

Moving to Confirmed, as per the investigation done by Mitch Garnaat.

Changed in eucalyptus:
status: New → Confirmed
Andy Grimm (agrimm) wrote :

This issue is now being tracked upstream at http://eucalyptus.atlassian.net/browse/EUCA-2783

Please watch that issue for further updates.
