Comment 15 for bug 1503161

Matthew Oliver (matt-0) wrote :

Ahh the overrides do tend to get a bit confusing for people.

So this is interesting, and I talked briefly to Bruno in Boston and feel his pain, so we should totally fix this. So far, the best I've seen is Clay's suggestion of supporting handoff affinity for DELETEs, which should help somewhat. But there also might be merit in what John was saying.

One option could be:
If DELETEs _could_ follow handoffs in Clay's approach, or maybe even if we don't do that, we could drop a tombstone (.ts) on handoffs or primaries so long as at least 1 node returns a 204. We did delete one copy, and even if something else happens in the cluster the <timestamp>.ts will sort itself out in the long run.
This way we did issue a delete, so we can, I think, safely return 204:

   t1 DELETE [204, 404, 404] --> [204, 204 (ts), 204 (ts)]
   t2 PUT [ 404, 200, 200 ] --> 200

In this case the tombstones have no effect, as they will be cleared out because of the PUT at t2, except for 2 zero-byte files that may exist on handoffs for a while. However, in the system the DELETE is an event and it did happen, so we did the right thing.
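To make that rule concrete, here is a rough Python sketch of the decision. It is not Swift's proxy code: resolve_delete is a hypothetical helper, and the 503 fallback for mixed or failed responses is my own assumption just to make the function total.

```python
def resolve_delete(statuses):
    """Hypothetical helper: decide the client status for one DELETE.

    statuses: backend HTTP statuses, one per node contacted.
    Returns (client_status, indexes of nodes that should get a tombstone).
    """
    if 204 in statuses:
        # At least one node really deleted the object, so the DELETE happened.
        # Nodes that answered 404 get a <timestamp>.ts so they agree later.
        needs_ts = [i for i, s in enumerate(statuses) if s == 404]
        return 204, needs_ts
    if statuses and all(s == 404 for s in statuses):
        # Nobody had it: could be already deleted, or living on handoffs.
        return 404, []
    # Assumption: anything else (errors, timeouts) is reported as 503.
    return 503, []
```

With the t1 responses above, resolve_delete([204, 404, 404]) gives 204 to the client and marks the second and third nodes for a tombstone.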

Though I guess the downside to this is there is a chance we can delete without a quorum, which is problematic. But on the other hand, if we get a 404 we have no idea if the object lives on handoffs or has already been deleted.

Damn, why do people want to delete on an object store anyway :P

Another option would be to do exactly as Clay says, and now after thinking about it some more, I tend to agree more. Have the ability to use write affinity in DELETEs such that the node list orders itself like this (handoff_deletes = 2):

  [pri, pri, pri, affinity_handoff, affinity_handoff, handoff]

And only read 2x replicas or something.
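The ordering could look something like the sketch below. Again this is only an illustration: order_delete_nodes and the is_local predicate are made-up names, handoff_deletes is the knob from the example above, and the "only read 2x replicas" cap is not modelled here.

```python
def order_delete_nodes(primaries, handoffs, is_local, handoff_deletes=2):
    """Hypothetical helper: node order for a DELETE with handoff affinity.

    primaries: primary nodes, in ring order.
    handoffs: handoff nodes, in ring order.
    is_local: predicate, True if a handoff is in the write-affinity region.
    """
    # Pull the first `handoff_deletes` local handoffs to the front.
    preferred = [n for n in handoffs if is_local(n)][:handoff_deletes]
    # Everything else keeps its original handoff order.
    rest = [n for n in handoffs if n not in preferred]
    return list(primaries) + preferred + rest
```

For example, with primaries p1..p3 and handoffs [h1, a1, a2, h2] where a1 and a2 are local, the DELETE would try p1, p2, p3, a1, a2, then h1 and h2.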

My 2 cents.
Matt