Object replicator doesn't recalculate the checksum of the suffix even when do_listdir is True
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | New | Undecided | Unassigned |
Bug Description
On every run, the object replicator recalculates 10% of the suffix checksums using the following code:
def _do_listdir(
return (((partition + replication_cycle) % 10) == 0)
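To make the selection rule concrete, here is a minimal standalone sketch of the same modulo check. The function name `do_listdir` and its two-argument signature are assumptions for illustration (the quoted definition is truncated); only the return expression comes from the report.

```python
def do_listdir(partition, replication_cycle):
    """Standalone copy of the rule quoted above (signature is assumed)."""
    return ((partition + replication_cycle) % 10) == 0


# On replication cycle 3, only partitions with (partition + 3) % 10 == 0 qualify.
selected = [p for p in range(20) if do_listdir(p, 3)]

# A given partition is picked on exactly one of any 10 consecutive cycles.
cycles_for_42 = [c for c in range(10) if do_listdir(42, c)]

print(selected)       # [7, 17]
print(cycles_for_42)  # [8]
```

So roughly one partition in ten is selected per cycle, and each partition comes up once every 10 cycles.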
But in __get_hashes(), although the listdir is performed, hashes[suff] is not set to None; setdefault(None) is used instead. This doesn't help much when hashes are missing in the suffix (due to xfs_repair etc.).
We need to explicitly set hashes[suff] to None so that the suffix hash is recalculated.
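The difference between the two behaviours can be shown with a plain dict. This is only a sketch: the helper names and the sample hash value are made up, and it assumes the existing call is equivalent to `dict.setdefault(suff, None)`.

```python
def mark_with_setdefault(hashes, suffixes_on_disk):
    """Only adds suffixes that are missing; cached values are left alone."""
    for suff in suffixes_on_disk:
        if len(suff) == 3:
            hashes.setdefault(suff, None)
    return hashes


def mark_with_assignment(hashes, suffixes_on_disk):
    """Overwrites every listed suffix with None, forcing a recalculation."""
    for suff in suffixes_on_disk:
        if len(suff) == 3:
            hashes[suff] = None
    return hashes


cached = {"abc": "d41d8cd9"}          # made-up cached suffix hash
on_disk = ["abc", "def", "tmp_file"]  # "tmp_file" is skipped: not 3 chars

after_setdefault = mark_with_setdefault(dict(cached), on_disk)
after_assignment = mark_with_assignment(dict(cached), on_disk)

print(after_setdefault)  # {'abc': 'd41d8cd9', 'def': None}
print(after_assignment)  # {'abc': None, 'def': None}
```

With `setdefault`, a suffix that already has a cached hash keeps it, so a stale value for "abc" would survive; with plain assignment every suffix found by the listdir is invalidated.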
The following is the proposed change. Please let me know if this change is good.
if do_listdir:
for suff in os.listdir(
if len(suff) == 3:
- hashes.
+ hashes[suff] = None
The intent there was not to force a recalculation on un-dirtied suffixes every 10th replication cycle.
That hook was only meant to do a *single* listdir IO to check whether there are any suffixes on disk that are not in hashes.pkl. If the suffix is in hashes.pkl (and not dirtied), the current understanding of observed operating systems is that the hash would be correct [1].
Honestly, the whole endeavor seemed suspect. I'm not sure anyone really observed production systems syncing around suffixes and failing to invalidate them with a post-REPLICATE request, but it was only *one* syscall/IO, and the idea of a deterministic way to do it periodically was sort of cute, so I guess someone decided it was worth the complexity.
[1] Understood to be correct, barring a bug. Operators (myself among them) have in the past observed mis-hashed suffixes (i.e. the un-dirty hashed suffix value in hashes.pkl doesn't match the hash recalculated from the files in the suffix), but those have all been explainable by bugs introduced with the hashes.invalid change. Those bugs have all since been fixed, and I haven't yet observed the issue on clusters deployed since we fixed them. But because such a class of bugs is known to exist, there is broad support/interest in having an operator tunable that sets a *timer* to force a recalc on any un-dirtied suffix after some... weeks. Doing it every 10 cycles, however, would burn an unacceptable amount of IO on stable systems.
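As a rough illustration of the timer idea, the sketch below picks suffixes whose last hash is older than a configurable age. Everything here is hypothetical: the function name, the `recalc_after_seconds` parameter, and the per-suffix timestamp map are invented for illustration and are not part of Swift as described in this thread.

```python
import time

WEEK = 7 * 24 * 3600  # seconds


def suffixes_due_for_recalc(last_hashed, recalc_after_seconds, now=None):
    """Return suffixes whose last hash time is at least the given age.

    last_hashed: hypothetical map of suffix -> unix timestamp of last hash.
    """
    now = time.time() if now is None else now
    return sorted(
        suff for suff, hashed_at in last_hashed.items()
        if now - hashed_at >= recalc_after_seconds
    )


# "abc" was hashed 4 weeks before `now`, "def" only 1 week before.
last_hashed = {"abc": 0, "def": 3 * WEEK}
due = suffixes_due_for_recalc(last_hashed, 2 * WEEK, now=4 * WEEK)

print(due)  # ['abc']
```

Unlike the every-10-cycles rule, an age threshold measured in weeks would keep the extra IO on a stable system small, which is the trade-off the comment describes.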