Include original object md5sum in sysmeta for better record keeping with SLOs

Bug #1744375 reported by Martin Lanner on 2018-01-19
This bug affects 1 person
Affects: OpenStack Object Storage (swift)
Importance: Wishlist
Assigned to: Unassigned

Bug Description

Operators and application developers frequently split large objects using Static Large Objects (SLOs). SLOs do the sane thing in checking the md5sum of each segment and the rolled-up md5sum of all the segments, but that doesn't help an operator or app maintain a record of the md5sum of the original object before it was segmented.

For example, a 100GB movie object, treated as the "golden" image, is uploaded and split into 100 1GB segments. Swift will do the right thing and verify all the segments to ensure the object was written correctly in its entirety. However, for historical reasons (file systems) and for human sanity and trust in the system (Swift), operators and app devs frequently ask for an md5sum record of the source/original object. This record has to be immutable so that it remains a trustworthy reference. Also, storing the md5sum as sysmeta would allow an operator or app to trigger a subsequent "audit" of the object in some fashion.
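To illustrate why the rolled-up md5sum cannot stand in for the md5sum of the original object, here is a minimal sketch. It assumes the rolled-up value is the MD5 of the concatenated hex MD5s of the segments, which is my understanding of how the SLO manifest ETag is derived. The helper names `slo_etag` and `whole_object_md5` are hypothetical and not part of Swift.

```python
import hashlib


def slo_etag(segments):
    """Assumed SLO-style rollup: MD5 over the concatenated hex MD5s of the segments."""
    joined = "".join(hashlib.md5(seg).hexdigest() for seg in segments)
    return hashlib.md5(joined.encode("ascii")).hexdigest()


def whole_object_md5(segments):
    """MD5 of the original object, computed by hashing the segment bytes in order."""
    hasher = hashlib.md5()
    for seg in segments:
        hasher.update(seg)
    return hasher.hexdigest()


# Stand-in for the "golden" object; a real one would be far larger.
original = b"golden image bytes " * 1000
segment_size = 4096
segments = [
    original[i:i + segment_size]
    for i in range(0, len(original), segment_size)
]

print("whole object:", whole_object_md5(segments))
print("rolled up:   ", slo_etag(segments))
```

The two digests differ, so a client holding only the md5sum of the source file has nothing to compare against unless that value is recorded separately.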

Tim Burke (1-tim-z) on 2018-01-19
Changed in swift:
status: New → Confirmed
importance: Undecided → Wishlist
Tim Burke (1-tim-z) wrote:

Might be able to enhance slo to accept some content-md5 query param and have it:

- automatically trigger the heartbeat=on behavior,
- move that value to sysmeta, so it won't get overwritten on POST,
- perform a GET immediately after the PUT,
- stream the bytes through a hasher (all the while continuing to
  dribble out bytes to the client), and
- report the result back in the final response.
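The "stream the bytes through a hasher" step above could look roughly like the following. This is a sketch of the general technique only, not Swift middleware code: `hash_while_streaming` and the `result` dict are hypothetical names, and the real SLO middleware would need to fit this into its own request handling and heartbeat response.

```python
import hashlib


def hash_while_streaming(chunks, expected_md5):
    """Wrap an iterable of byte chunks, hashing each chunk as it passes through.

    Returns (body, result). `result` is filled in only once `body` has been
    fully consumed, so the caller can keep sending bytes while hashing.
    """
    hasher = hashlib.md5()
    result = {"md5": None, "ok": None}

    def body():
        for chunk in chunks:
            hasher.update(chunk)
            yield chunk  # keep dribbling bytes out to the consumer
        result["md5"] = hasher.hexdigest()
        result["ok"] = result["md5"] == expected_md5.lower()

    return body(), result


# Usage: the chunks stand in for the bytes read back by the GET after the PUT.
chunks = [b"abc", b"def"]
expected = hashlib.md5(b"abcdef").hexdigest()
body, result = hash_while_streaming(chunks, expected)
received = b"".join(body)
print(result["ok"])
```

Because the digest is known only after the last chunk, the verdict has to go in the final part of the response, which is presumably why the proposal ties this to the heartbeat behavior.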
