Hi,
Sorry for the slow response; I've spent a while poking around at this.
What we seem to have ended up with is that one of our container databases (out of three replicas) has a number of additional entries in it - i.e. it has entries in its object table that it thinks are undeleted that are not present in the other two replicas (which agree with each other in terms of the number of undeleted objects).
Those rows in our divergent container database no longer correspond to anything on disk (i.e. the paths reported by swift-get-nodes don't exist at all, with neither a .data nor a .ts file).
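For anyone wanting to repeat the comparison, here is a minimal sketch of the kind of check involved. It assumes you have taken copies of each replica's container database file, and that the SQLite schema has an `object` table with `name` and `deleted` columns; the function names and file paths are hypothetical, not part of any Swift tooling.

```python
import sqlite3


def undeleted_names(db_path):
    """Return the set of object names that one database copy treats as undeleted."""
    # Assumption: rows live in an `object` table with `name` and `deleted` columns.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT name FROM object WHERE deleted = 0")
        return {name for (name,) in rows}
    finally:
        conn.close()


def extra_rows(suspect_path, reference_paths):
    """Return names undeleted in the suspect copy but in none of the reference copies."""
    suspect = undeleted_names(suspect_path)
    seen_elsewhere = set()
    for path in reference_paths:
        seen_elsewhere |= undeleted_names(path)
    return sorted(suspect - seen_elsewhere)
```

Calling `extra_rows("replica1.db", ["replica2.db", "replica3.db"])` on such copies (hypothetical file names) would list the names that only the divergent replica still considers live, which can then be checked against the paths from swift-get-nodes.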
Also, some of them appear in every container listing, but the number of objects returned varies each time (e.g. running swift list | wc -l gives a different answer on each run, and the counts are not a monotonic sequence). Is all of this weirdness likely to be solely a consequence of having had a disconnected node at some point in the past?
[I take your point that clients may rely on the existing behaviour, but I still think 404 on a DELETE that succeeds (i.e. AFAICT the erroneous row in the divergent database goes away, but I should probably check harder given the problems with inconsistent listing) is incorrect]
Thanks,
Matthew