object-updater should be more tolerant of already-removed async pendings
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenStack Object Storage (swift) | Fix Released | Undecided | Unassigned | |
Bug Description
So I'm not entirely sure *why* this happened (since each worker should have been looking only at its own disk), but I've seen some tracebacks in a prod cluster:
May 10 00:21:40 ss0192 object-updater: STDERR: Traceback (most recent call last):
May 10 00:21:40 ss0192 object-updater: STDERR: File "/opt/ss/
May 10 00:21:40 ss0192 object-updater: STDERR: listener.cb(fileno)
May 10 00:21:40 ss0192 object-updater: STDERR: File "/opt/ss/
May 10 00:21:40 ss0192 object-updater: STDERR: result = function(*args, **kwargs)
May 10 00:21:40 ss0192 object-updater: STDERR: File "/opt/ss/
May 10 00:21:40 ss0192 object-updater: STDERR: renamer(
May 10 00:21:40 ss0192 object-updater: STDERR: File "/opt/ss/
May 10 00:21:40 ss0192 object-updater: STDERR: os.rename(old, new)
May 10 00:21:40 ss0192 object-updater: STDERR: OSError: [Errno 2] No such file or directory
May 10 00:21:40 ss0192 object-updater: STDERR: Removing descriptor: 7
May 10 00:22:03 ss0192 object-updater: UNCAUGHT EXCEPTION
Traceback (most recent call last):
File "/opt/ss/
run_
File "/opt/ss/
DaemonStrat
File "/opt/ss/
self.
File "/opt/ss/
return self._run_
File "/opt/ss/
self.
File "/opt/ss/
self.
File "/opt/ss/
self.
File "/opt/ss/
for update in ap_iter:
File "/opt/ss/
next_value = next(self.iterator)
File "/opt/ss/
os.
OSError: [Errno 2] No such file or directory: '/srv/node/
May 10 00:29:36 ss0192 object-updater: UNCAUGHT EXCEPTION
Traceback (most recent call last):
File "/opt/ss/
run_
File "/opt/ss/
DaemonStrat
File "/opt/ss/
self.
File "/opt/ss/
return self._run_
File "/opt/ss/
self.
File "/opt/ss/
self.
File "/opt/ss/
self.
File "/opt/ss/
for update in ap_iter:
File "/opt/ss/
next_value = next(self.iterator)
File "/opt/ss/
os.
OSError: [Errno 2] No such file or directory: '/srv/node/
The renamer failure may or may not be worth logging a traceback for, but the ENOENT bombs us out for the *entire disk*, and we'll have to wait a full updater cycle to retry any of the rest of those pendings. It'd be much better to ignore the error and move on.
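For illustration, the "ignore the error and move on" behaviour could look roughly like the sketch below. This is not the actual Swift patch: `unlink_if_present` and `sweep_stale_pendings` are hypothetical names, and the sketch assumes that only ENOENT should be swallowed while any other OSError still propagates.

```python
import errno
import os


def unlink_if_present(path):
    """Remove path; return False instead of raising if it is already gone.

    Hypothetical helper, not part of Swift.
    """
    try:
        os.unlink(path)
    except OSError as err:
        if err.errno != errno.ENOENT:
            # Permission problems, I/O errors, etc. should still surface.
            raise
        return False
    return True


def sweep_stale_pendings(paths):
    """Try to remove every path, skipping any that were already removed.

    Returns the number of files this call actually removed.
    """
    removed = 0
    for path in paths:
        if unlink_if_present(path):
            removed += 1
    return removed
```

With something like this, one missing file costs a single skipped entry rather than aborting the sweep of the whole disk.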
Reviewed: https://review.opendev.org/726738
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=f57d4cfa71888c887e0e8e0ce349f2a5befb57a5
Submitter: Zuul
Branch: master
commit f57d4cfa71888c887e0e8e0ce349f2a5befb57a5
Author: Tim Burke <email address hidden>
Date: Mon May 11 00:09:49 2020 -0700
object-updater: Ignore ENOENT when trying to unlink stale pending files
Change-Id: Iaac1fb891d70707af38c567d9cca5913b8355b7d
Closes-Bug: #1877924