Race/backtrace in hash_suffix() - deleting ts files
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Fix Released | Undecided | Unassigned |
Bug Description
While reading the code to see how .ts files are reclaimed, I noticed that both object-replicator and object-server call get_hashes, and hence hash_suffix(). I wondered whether they could race and, sure enough, I found backtraces such as:
May 29 23:45:36 sw-aw2az1-object012 object-replicator STDOUT: ERROR:root:Error hashing suffix#012Traceback (most recent call last):#012 File "/usr/lib/
and
Apr 15 16:10:49 sw-aw2az1-object059 object-replicator STDOUT: ERROR:root:Error hashing suffix#012Traceback (most recent call last):#012 File "/usr/lib/
...and on investigation, the files don't exist anymore.
It's not very common -- on our production system a node will typically report one such backtrace per day, from either the object-server or the object-replicator.
A potential fix is to wrap the rmdir/unlink in a try/except and call os.stat() in the exception handler to see whether the path still exists. However, although I went looking for a race, I'm not certain that is what this is, and I don't want to patch over some other source of the problem.
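A minimal sketch of that idea, for illustration only. It is not the change that was released in Swift, and `remove_if_present` is a hypothetical helper name. Instead of calling os.stat() in the handler, it checks the exception's errno for ENOENT, which answers the same question ("is the path already gone?") without a second filesystem call that could itself race. Any other error is re-raised, so an unrelated problem would still surface rather than being patched over.

```python
import errno
import os


def remove_if_present(remove, path):
    """Call remove(path), tolerating the case where the path is already gone.

    `remove` is expected to be os.unlink or os.rmdir. Returns True if this
    call removed the path, and False if it had already been removed, for
    example by another process working on the same suffix directory.
    """
    try:
        remove(path)
    except OSError as err:
        # ENOENT means the path no longer exists: someone else got there
        # first. Anything else (permissions, I/O error, non-empty
        # directory) is a different problem and is re-raised.
        if err.errno != errno.ENOENT:
            raise
        return False
    return True
```

A caller would use `remove_if_present(os.unlink, ts_path)` for the tombstone file and `remove_if_present(os.rmdir, dir_path)` for the containing directory, where `ts_path` and `dir_path` stand in for whatever paths the caller is cleaning up. Logging the `False` case would help confirm whether the backtraces really come from a race.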
Changed in swift:
status: Expired → Fix Released
Needs to be confirmed on recent versions of swift.