In a situation where two nodes do not have the same version of the ring
and each thinks the other node is the primary node for a partition,
a race condition can lead to the loss of some of the objects of the
partition.
The following sequence leads to the loss of some of the objects:
1. A gets and reloads the new ring
2. A starts to replicate/revert the partition P to node B
3. B (with the old ring) starts to replicate/revert the (partial)
partition P to node A
=> replication should be fast as all objects are already on node A
4. B finishes replication of (partial) partition P to node A
5. B removes the (partial) partition P after replication succeeded
6. A finishes replication of partition P to node B
7. A removes the partition P
8. B gets and reloads the new ring
All data transferred between steps 2 and 5 will be lost, as it is no
longer on node B and has also been removed from node A.
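
The sequence above can be replayed with a toy model. This is only an illustrative sketch: each node's copy of partition P is modelled as a Python set, and the function name simulate_race and its parameters are hypothetical, not part of Swift.

```python
def simulate_race(objects, sent_before_revert):
    """Replay steps 2-7 and return the objects that no node holds anymore.

    `sent_before_revert` is the number of objects A has pushed to B
    before B (still on the old ring) reverts its partial copy.
    """
    node_a = set(objects)  # A holds the full partition P
    node_b = set()         # B holds nothing yet

    # Step 2: A starts pushing; only the first objects have arrived so far.
    node_b.update(objects[:sent_before_revert])

    # Steps 3-4: B pushes its partial copy back to A (A already has it all).
    node_a.update(node_b)
    # Step 5: B removes its partial copy after the "successful" revert.
    node_b.clear()

    # Step 6: A finishes pushing the remaining objects to B.
    node_b.update(objects[sent_before_revert:])
    # Step 7: A believes everything was transferred and removes P.
    node_a.clear()

    return set(objects) - (node_a | node_b)


names = ["obj%d" % i for i in range(6)]
print(sorted(simulate_race(names, 3)))  # ['obj0', 'obj1', 'obj2']
```

The lost objects are exactly those that reached B before step 5, which matches the description above.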
This commit makes the replicator/reconstructor hold a replication_lock
on partition P so that the remote node cannot start an opposite replication.
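
The idea can be sketched as follows. This is a toy illustration under stated assumptions, not Swift's implementation: ToyNode, PartitionLocked, revert_to and receive are hypothetical names, and an in-process threading.Lock stands in for whatever locking mechanism the real code uses. The point shown is only that a node holding the lock for its own outgoing job refuses an incoming job for the same partition.

```python
import threading
from contextlib import contextmanager


class PartitionLocked(Exception):
    """The partition is already involved in a replication job."""


class ToyNode:
    """Hypothetical stand-in for a storage node; not Swift's real classes."""

    def __init__(self, name, objects=()):
        self.name = name
        self.partitions = {"P": set(objects)}
        self._locks = {}

    @contextmanager
    def replication_lock(self, partition):
        # Non-blocking acquire: fail fast instead of waiting for the lock.
        lock = self._locks.setdefault(partition, threading.Lock())
        if not lock.acquire(blocking=False):
            raise PartitionLocked("%s: %s is locked" % (self.name, partition))
        try:
            yield
        finally:
            lock.release()

    def receive(self, partition, objects):
        # The receiving side must take the same lock before accepting data.
        with self.replication_lock(partition):
            self.partitions.setdefault(partition, set()).update(objects)

    def revert_to(self, remote, partition):
        # Push the local copy to the remote and remove it only on success.
        with self.replication_lock(partition):
            remote.receive(partition, self.partitions[partition])
            self.partitions[partition] = set()


def demo():
    a = ToyNode("A", {"obj0", "obj1", "obj2"})
    b = ToyNode("B")
    refused = False
    with a.replication_lock("P"):      # A's outgoing job is in progress
        b.partitions["P"].add("obj0")  # a partial transfer has reached B
        try:
            b.revert_to(a, "P")        # B, on the old ring, pushes back
        except PartitionLocked:
            refused = True             # A refuses, so B keeps its copy
    return refused, b.partitions["P"]
```

Calling demo() in this sketch returns (True, {'obj0'}): the opposite replication is refused and B does not delete the objects it has already received.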
Reviewed: https://review.opendev.org/754242
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=8c0a1abf744a11b5c289239e3ac830786a9de4e9
Submitter: Zuul
Branch: master
commit 8c0a1abf744a11b5c289239e3ac830786a9de4e9
Author: Romain LE DISEZ <email address hidden>
Date: Thu Sep 24 20:36:36 2020 -0400
Fix a race condition in case of cross-replication
Change-Id: I29acc1302a75ed52c935f42485f775cd41648e4d
Closes-Bug: #1897177