Activity log for bug #1691566

Date Who What changed Old value New value Message
2017-05-17 21:16:15 clayg bug added bug
2017-05-17 23:38:39 Tim Burke description
Old value: There's probably more than one reason a remote container-replicator might fail to call the REPLICATE RPC after rsyncing a database over (either the ReplicatorRpc complete_rsync or rsync_then_merge) which could cause a temporary db to stack up (and waste disk space) in the temporary dir. But one common way seems to be a race when two (or more) remote container-replicators are both trying to complete_rsync a remote db onto a new node after rebalance. If the database is large and the network busy - it's not uncommon to hit such a wide race. When it does the the looser will miss some cleanup code: https://github.com/openstack/swift/blob/6e893e228840bc42cfd13546245438832bc2bb46/swift/common/db_replicator.py#L820 While it's probably reasonable to avoid some sort of sync/merge and return the 404 error - before doing so the local container server should cleanup the temporary db which the remote is trying to tell us about. Otherwise it *will* get reaped if it's older than a reclaim age (lp bug #1691565)
New value: There's probably more than one reason a remote container-replicator might fail to call the REPLICATE RPC after rsyncing a database over (either the ReplicatorRpc complete_rsync or rsync_then_merge) which could cause a temporary db to stack up (and waste disk space) in the temporary dir. But one common way seems to be a race when two (or more) remote container-replicators are both trying to complete_rsync a remote db onto a new node after rebalance. If the database is large and the network busy - it's not uncommon to hit such a wide race. When it does then the loser will miss some cleanup code: https://github.com/openstack/swift/blob/6e893e228840bc42cfd13546245438832bc2bb46/swift/common/db_replicator.py#L820 While it's probably reasonable to avoid some sort of sync/merge and return the 404 error - before doing so the local container server should cleanup the temporary db which the remote is trying to tell us about.
Otherwise it *will* get reaped if it's older than a reclaim age (lp bug #1691565)
2017-05-23 00:28:34 Charles Hsu bug added subscriber Charles Hsu
2017-12-28 00:11:50 clayg swift: importance Undecided Medium
2017-12-28 00:11:53 clayg swift: status New Confirmed
2019-09-12 22:04:45 Alexis Deberg bug added subscriber Adeberg
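
The cleanup the description asks for can be illustrated with a minimal sketch. This is not Swift's code: the function name complete_rsync_sketch, its two path arguments, and the bare integer status codes are all hypothetical stand-ins, assuming only that the receiving side is handed the path of the rsynced temporary db and the path where the real db should live. It shows the loser of the race deleting its temporary copy before answering 404.

```python
import os


def complete_rsync_sketch(tmp_path, final_path):
    """Hypothetical receiver for a finished db rsync (not Swift's API).

    Returns 204 if the temporary db was moved into place, 404 otherwise.
    """
    if os.path.exists(final_path):
        # Lost the race: another replicator already installed the db.
        # Remove our temporary copy before reporting 404 so it does not
        # stack up in the temporary directory.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass  # nothing to clean up
        return 404
    if not os.path.exists(tmp_path):
        # The remote told us about a temporary db that is not there.
        return 404
    # Won the race: move the temporary db into its final location.
    os.rename(tmp_path, final_path)
    return 204
```

With two temporary files aimed at the same final path, the first call would return 204 and the second would return 404 while leaving no temporary file behind. The sketch still leaves a window between the existence check and the rename, so it only illustrates where the cleanup belongs, not how to close the race itself.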