md5sum mismatch
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
s3cmd (Ubuntu) | Incomplete | Undecided | Unassigned |
Bug Description
Affected version: s3cmd version 1.1.0-beta3
An MD5 mismatch occurs when performing a PUT, possibly because s3cmd is performing a multipart PUT.
# s3cmd --config=
15728640 of 15728640 100% in 2s 6.02 MB/s done
WIRELINE-
15728640 of 15728640 100% in 2s 6.03 MB/s done
WIRELINE-
15728640 of 15728640 100% in 2s 5.68 MB/s done
WIRELINE-
7586259 of 7586259 100% in 1s 3.97 MB/s done
# s3cmd --config=
s3://tcn-
File size: 54772179
Last mod: Thu, 30 Jul 2015 19:49:19 GMT
MIME type: text/plain; charset=us-ascii
MD5 sum: b043e8ba574568f
ACL: admin: FULL_CONTROL
# md5sum /opt/compliance
b9ca30c4fdfdc7a
When a GET is performed, s3cmd also reports the discrepancy:
# s3cmd --config=
s3://tcn-
54772179 of 54772179 100% in 3s 13.21 MB/s done
WARNING: MD5 signatures do not match: computed=
The original file and the version placed in and retrieved from the S3 bucket are identical, but the md5sum mismatches are affecting automated GET/PUT processes.
This does not occur with version 1.0.0-1 (Precise) or with the current s3cmd stable release, 1.0.0-4.
I believe the ETag is expected to hold the md5sum of the payload, but for multipart uploads Amazon does not store the MD5 of the whole object. As far as I can tell, it stores a value derived from the MD5 digests of the individual parts (commonly observed to be the MD5 of the concatenated part digests), with "-" and the number of parts appended. You saw that your upload was in four pieces, and "-4" was appended to what was supposed to be the MD5 checksum.
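For anyone who wants to check this against their own files, here is a minimal Python sketch of the commonly observed multipart ETag scheme: MD5 each part, MD5 the concatenation of those binary digests, then append "-" and the part count. This reflects observed behaviour rather than a documented guarantee from Amazon, and it assumes an unencrypted upload with a fixed part size. The names `multipart_etag` and `part_count` are made up for illustration and are not part of s3cmd. The 15728640-byte part size is taken from the PUT output in the description.

```python
import hashlib
import io

# Part size seen in the PUT output above (15 MB). This is an assumption about
# the uploader's configuration, not something S3 records on the object.
CHUNK_SIZE = 15728640


def part_count(size, chunk_size=CHUNK_SIZE):
    """Number of parts needed to upload `size` bytes in `chunk_size` pieces."""
    return -(-size // chunk_size)  # ceiling division


def multipart_etag(stream, chunk_size=CHUNK_SIZE):
    """Return the ETag commonly observed for a multipart upload of `stream`.

    Hypothetical helper: hashes each part, then hashes the concatenated
    binary digests and appends "-<number of parts>".
    """
    digests = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digests.append(hashlib.md5(chunk).digest())
    if not digests:
        raise ValueError("empty input: no parts to hash")
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return "{}-{}".format(combined, len(digests))


if __name__ == "__main__":
    # The 54772179-byte file from the report splits into four parts.
    print(part_count(54772179))
    # Small in-memory demonstration with a 10-byte part size.
    print(multipart_etag(io.BytesIO(b"a" * 25), chunk_size=10))
```

To compare against a real object, open the local file in binary mode, pass it to `multipart_etag` with the same part size the uploader used, and compare the result with the "MD5 sum" that `s3cmd info` reports. A different part size will give a different value even for identical data.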
The problem is still present in s3cmd 1.5.3, and I suspect it is a mistake made by Amazon, not by the s3cmd developers.
One suggested workaround was to use 's3cmd mv realname baloney ; s3cmd mv baloney realname' to force Amazon S3 to recompute the md5sum. But IIRC this doesn't really move (or re-key) the file so much as download it, re-upload it, then delete the original, so if the file is larger than multipart-chunk-size it will be re-uploaded in pieces, recreating the problem.