Reaper only consults last container primary for object list, this could potentially lead to dark data
Bug #1786730 reported by Matthew Oliver
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | New | Undecided | Unassigned |
Bug Description
I was reviewing some code today and noticed something interesting. The account reaper uses the direct client to get the list of objects to reap. However, it only ever asks the last primary node of the given container for this list.
https:/
We've discussed in the past, both generally and in the context of sharding, that the object lists can vary a little between the container DB replicas, so only ever asking one node could lead to lost data.
I have written a probe test to demonstrate the problem: http://
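A minimal sketch of why asking a single primary can miss objects, using plain Python sets rather than Swift's direct client. The node names, the listings, and the helper functions are all hypothetical and only illustrate the failure mode; this is not the reaper's actual code.

```python
# Hypothetical model: each container primary holds its own view of the
# object listing, and the views may differ until replication settles.

def objects_seen_by_last_primary(listings, primaries):
    """Objects found when only the last primary is consulted."""
    return set(listings[primaries[-1]])


def objects_seen_by_any_primary(listings, primaries):
    """Union of the listings from every primary."""
    seen = set()
    for node in primaries:
        seen.update(listings[node])
    return seen


def dark_data_candidates(listings, primaries):
    """Objects that some primary knows about but the last primary does not."""
    return (objects_seen_by_any_primary(listings, primaries)
            - objects_seen_by_last_primary(listings, primaries))


# Illustrative data: node-a has a row that has not yet replicated elsewhere.
primaries = ["node-a", "node-b", "node-c"]
listings = {
    "node-a": ["obj1", "obj2", "obj3"],
    "node-b": ["obj1", "obj2"],
    "node-c": ["obj1", "obj2"],
}

print(sorted(dark_data_candidates(listings, primaries)))
```

In this toy data, obj3 would never be reaped if only node-c is asked, which is the kind of divergence the probe test is meant to demonstrate.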
I wonder if this might be sufficiently covered by recommending that the delay_reaping setting be sufficiently large that container replication will have settled... though the advent of deep containers makes that even harder to reason about.
Looks like once the reaper thinks the container is empty, it issues a DELETE to all replicas of the container. Presumably, any replica that isn't actually empty will 409... but there should be fewer objects in the container, which will eventually translate into fewer rows, which should mean that it will get easier to replicate to the one replica that's been driving the work (and presumably accepted the DELETE) ...
This definitely highlights the need to have a large account reclaim_age -- we probably ought to expand on our comment about
> The sum of [delay_reaping] and the container-updater
> interval should be less than the account-replicator
> reclaim_age.
in etc/account-server.conf-sample; there are all sorts of other factors to consider like
- container replication cycle time,
- container sharder cycle time,
- object updater cycle time,
- whether your updaters/replicators/sharders are actually keeping up...
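As a rough illustration of the timing constraint quoted above, the check below compares delay_reaping plus the container-updater interval against reclaim_age, then subtracts the other cycle times as a deliberately pessimistic budget. The function names, the choice to sum the cycle times, and all of the numbers are assumptions made for this sketch; none of them come from Swift itself.

```python
def reaping_window_ok(delay_reaping, container_updater_interval, reclaim_age):
    """The documented rule: the sum must be less than reclaim_age."""
    return delay_reaping + container_updater_interval < reclaim_age


def slack_seconds(delay_reaping, reclaim_age, cycle_times):
    """Seconds left before reclaim_age once every listed cycle has run once.

    Summing the cycle times is an assumption of this sketch (a worst-case
    budget), not something the sample config prescribes.
    """
    return reclaim_age - (delay_reaping + sum(cycle_times.values()))


# Illustrative values in seconds; measure your own cluster's cycle times.
delay_reaping = 86400            # one day
reclaim_age = 604800             # one week
cycle_times = {
    "container_updater": 300,
    "container_replicator": 3600,
    "container_sharder": 7200,
    "object_updater": 1800,
}

print(reaping_window_ok(delay_reaping,
                        cycle_times["container_updater"],
                        reclaim_age))
print(slack_seconds(delay_reaping, reclaim_age, cycle_times))
```

A small or negative slack would suggest the daemons may not settle before tombstones are reclaimed, though whether they are actually keeping up still has to be checked against real cycle times.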
Ultimately, I think it all comes back to John's great lament: implementing DELETE was the worst idea Swift ever had :P