x-if-delete-at is bad
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Confirmed | Undecided | Unassigned |
Bug Description
I've been meaning to record and fix this bug for quite some time; at least it's recorded now.
When I implemented expiring objects I screwed up and added the x-if-delete-at conditional for object deletes. This is wrong due to eventual consistency possibilities. Though the possibility of an issue is small, it does exist.
Imagine I put an object with the timestamp 1369160000.0000000 and I set its x-delete-at value to 1369160010.0000000. This put succeeds on all three replicas.
Just afterwards I overwrite the object again with the timestamp 1369160001.0000000 and no x-delete-at value. This put succeeds, but only on two nodes.
Then the expirer comes along later (before replication has brought the out-of-date copy up to date) and issues the delete. The delete fails on the two replicas with newer copies (they have no x-delete-at value, so nothing matches) but succeeds on the out-of-date replica, since that copy does still match. On the one replica where it succeeded it would make a new tombstone, say with the timestamp of 1369160011.0000000.
When replication kicks in, it will replicate the most recent action, the tombstone, over the two replicas that hold the overwritten object, deleting an object that was no longer supposed to expire.
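The sequence above can be replayed with a toy model. This is only a sketch: the replica dicts and the `put`, `conditional_delete`, and `replicate` helpers are made up for illustration and are not Swift's code; the one assumption carried over is that replication lets the record with the newest timestamp win.

```python
# Toy model of three replicas; NOT Swift code, all names are hypothetical.
# Each replica remembers only the newest record it has accepted.

def put(replica, ts, delete_at=None):
    """Store an object if the request is newer than what the replica holds."""
    if ts > replica["ts"]:
        replica.update(ts=ts, delete_at=delete_at, tombstone=False)

def conditional_delete(replica, ts, if_delete_at):
    """Write a tombstone only if the stored x-delete-at matches (x-if-delete-at)."""
    if replica["tombstone"] or replica["delete_at"] != if_delete_at:
        return False  # precondition failed, nothing changes
    replica.update(ts=ts, delete_at=None, tombstone=True)
    return True

def replicate(replicas):
    """Assumed replication rule: the record with the newest timestamp wins."""
    newest = dict(max(replicas, key=lambda r: r["ts"]))
    for r in replicas:
        r.update(newest)

replicas = [{"ts": 0.0, "delete_at": None, "tombstone": False} for _ in range(3)]

# First put, with x-delete-at, lands on all three replicas.
for r in replicas:
    put(r, 1369160000.0, delete_at=1369160010)

# Overwrite without x-delete-at lands on only two replicas.
for r in replicas[:2]:
    put(r, 1369160001.0)

# The expirer issues the conditional delete before replication has run.
results = [conditional_delete(r, 1369160011.0, if_delete_at=1369160010)
           for r in replicas]

replicate(replicas)
print(results)                             # which replicas accepted the delete
print([r["tombstone"] for r in replicas])  # state after replication
```

In this model only the stale replica accepts the delete, yet after replication all three replicas hold the tombstone.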
The best fix I can imagine is dropping the x-if-delete-at conditional delete, because it's wrong, and instead making the expirer do a head on all replicas, check the x-delete-at headers itself, and issue the delete only if all replicas responded and matched. There are still race conditions that can arise, but they are quite a bit smaller.
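A rough sketch of the check the expirer would make under that proposal. The `should_expire` function, its arguments, and the shape of the head responses (one dict of lowercased headers per replica that answered) are assumptions for illustration only, not existing expirer code.

```python
# Hypothetical helper; not part of Swift. It only encodes the rule
# "delete only if all replicas responded and all x-delete-at values match".

def should_expire(head_headers, expected_delete_at, replica_count=3):
    """Return True only when every replica answered the head and agrees.

    head_headers: list of header dicts, one per replica that responded.
    expected_delete_at: the x-delete-at value the expirer is working from.
    """
    if len(head_headers) < replica_count:
        return False  # a replica did not respond, so do not risk the delete
    wanted = str(expected_delete_at)
    return all(h.get("x-delete-at") == wanted for h in head_headers)

# Scenario from this report: two replicas hold the overwrite (no x-delete-at),
# one stale replica still carries the old value.
stale_case = [{}, {}, {"x-delete-at": "1369160010"}]
agreed_case = [{"x-delete-at": "1369160010"}] * 3

print(should_expire(stale_case, 1369160010))       # replicas disagree
print(should_expire(agreed_case, 1369160010))      # all three match
print(should_expire(agreed_case[:2], 1369160010))  # one replica missing
```

The window between the heads and the delete is still a race, as noted above; the check only narrows it.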
Of course, I'm open to an even better solution, and I'm not sure when I'll actually get around to implementing this.
I must be confused, I'll double check myself. But before you get too far into it... what about just issuing the DELETE with an X-Timestamp set to the expiry time?