2017-06-08 08:57:31
Alistair Coles
description
With replication method = ssync:
When the ssync sender on node A syncs an expired object t0.data which has an *expired* delete-at header value of t_expire, it sends a DELETE subrequest, which generates a tombstone t0.ts on the receiver node B at t0.
So after sync we have t0.data on sender node A and t0.ts on receiver node B. That's not good.
When the expirer runs and tries to delete the expired object, the expirer's DELETE to node A succeeds and node A gets t_expire.ts. The expirer's DELETE to node B fails with 412 because the tombstone t0.ts on node B does not have an X-Delete-At value that matches the x-if-delete-at header sent by the expirer. So the result is t_expire.ts on node A and t0.ts on node B.
The next time the replicator runs, this anomaly will be corrected and both nodes will end up with t_expire.ts. However, apart from the fact that a replication process should never generate inconsistent state in the first place, the anomaly has undesirable side-effects:
1. The expirer DELETE to node B fails and is therefore retried (by default 3 times)
2. Because the expirer DELETE to node B fails, some container db listings are not updated with the delete, so container listings remain inconsistent after expiration.
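The reason node B answers 412 can be sketched as follows. This is an illustrative simplification, not Swift's actual object-server code: `check_if_delete_at` and the dict arguments are invented names, standing in for the server-side check that compares the request's x-if-delete-at header against the on-disk object's X-Delete-At metadata.

```python
# Simplified sketch (illustrative names, not Swift's real code) of the
# object server's conditional-delete check for x-if-delete-at.

def check_if_delete_at(on_disk_metadata, request_headers):
    """Return an HTTP status for a DELETE, honoring x-if-delete-at."""
    if 'x-if-delete-at' not in request_headers:
        return 204  # unconditional delete proceeds
    # A tombstone (or any file without X-Delete-At metadata) cannot
    # match, so the conditional delete is rejected.
    if 'x-delete-at' not in on_disk_metadata:
        return 412
    if int(on_disk_metadata['x-delete-at']) != \
            int(request_headers['x-if-delete-at']):
        return 412
    return 204

# Node A still holds t0.data with X-Delete-At = t_expire: delete succeeds.
assert check_if_delete_at({'x-delete-at': '1496900000'},
                          {'x-if-delete-at': '1496900000'}) == 204
# Node B holds the tombstone t0.ts, which carries no X-Delete-At
# metadata, so the expirer's conditional DELETE gets 412.
assert check_if_delete_at({}, {'x-if-delete-at': '1496900000'}) == 412
```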
(Until https://review.openstack.org/416384 was merged, this could all be seen to play out with test/probe/test_object_expirer.py:TestObjectExpirer.test_expirer_delete_returns_outdated_412, which failed if the replication method was ssync but passed with rsync)
The solution is likely to be for the ssync sender to be more discriminating when it opens a diskfile and gets a DiskFileDeleted exception, here:
https://github.com/openstack/swift/blob/3218f8b064e462d901466b04a4813e15ec96da85/swift/obj/ssync_sender.py#L349-L351
When the exception is DiskFileExpired, the sender should probably attempt send_put/post rather than send_delete.
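One possible shape of that change is sketched below, with stub exception classes so the control flow can be run standalone; `choose_sync_action`, `raiser`, and the stubs are illustrative, not Swift's actual code. In Swift, DiskFileExpired is a subclass of DiskFileDeleted, which is why the existing `except DiskFileDeleted` clause at the linked lines also swallows expired files; catching the more specific exception first would let the sender replicate the expired .data file instead of synthesizing a tombstone.

```python
# Illustrative sketch of the suggested ssync_sender change, not Swift's
# real code. Stub exceptions mirror the real class hierarchy:
# DiskFileExpired subclasses DiskFileDeleted.

class DiskFileDeleted(Exception):
    """Stub: raised when opening a diskfile finds a tombstone."""

class DiskFileExpired(DiskFileDeleted):
    """Stub: raised when a .data file exists but its X-Delete-At has passed."""

def choose_sync_action(open_diskfile):
    """Decide which ssync subrequest to send after trying to open a diskfile."""
    try:
        open_diskfile()
    except DiskFileExpired:
        # Expired object: send the .data file as-is, so the receiver
        # ends up with t0.data rather than a t0.ts tombstone.
        return 'send_put'
    except DiskFileDeleted:
        # Genuine tombstone: replicate the delete.
        return 'send_delete'
    return 'send_put'

def raiser(exc):
    def _raise():
        raise exc
    return _raise

assert choose_sync_action(raiser(DiskFileExpired())) == 'send_put'
assert choose_sync_action(raiser(DiskFileDeleted())) == 'send_delete'
```

With this ordering, the sender would replicate t0.data to node B, both nodes would hold matching .data files, and the expirer's conditional DELETE would succeed on both.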
|