Object-server worker hung after hitting "Too many open files"
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Object Storage (swift) | New | Undecided | Unassigned | | |
Bug Description
I am running a swift-rocky cluster and was pushing data using 256 clients. After a while I noticed the following errors/warnings in the log:
[4553]: STDERR: WARNING:root:Unable to perform fsync() on directory /swift_
[4553]: STDERR: Traceback (most recent call last):
[4553]: STDERR: File "/usr/lib/
[4553]: STDERR: timer()
[4553]: STDERR: File "/usr/lib/
[4553]: STDERR: cb(*args, **kw)
[4553]: STDERR: File "/usr/lib/
[4553]: STDERR: waiter.switch()
[4553]: STDERR: error: cannot switch to a different thread
No new requests are processed by this worker.
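To check how close a process is to the "Too many open files" condition, its open descriptor count can be compared against its limit. Below is a minimal sketch, assuming Linux (or another system exposing `/proc/self/fd` or `/dev/fd`) and Python 3. `count_open_fds` and `fd_usage` are hypothetical helper names, not part of Swift, and they only inspect the process that runs them:

```python
import os
import resource


def count_open_fds():
    """Count the file descriptors currently open in this process."""
    # /proc/self/fd is Linux-specific; /dev/fd is a common fallback elsewhere.
    for path in ("/proc/self/fd", "/dev/fd"):
        if os.path.isdir(path):
            # listdir() briefly opens one descriptor itself, so the
            # result can be one higher than the steady-state count.
            return len(os.listdir(path))
    raise OSError("no per-process fd directory found on this platform")


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return count_open_fds(), soft, hard


if __name__ == "__main__":
    used, soft, hard = fd_usage()
    print("open fds: %d, soft limit: %d, hard limit: %d" % (used, soft, hard))
```

For an already running worker, the same information is available on Linux from outside the process by listing `/proc/<pid>/fd` and reading `/proc/<pid>/limits`.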
gdb shows a bunch of threads waiting for the GIL and the rest with the following backtrace:
Thread 20 (Thread 0x7f890e7fc700 (LWP 4611)):
Traceback (most recent call first):
File "/usr/lib/
sleep(seconds)
File "/usr/lib/
self.
File "/usr/lib/
return self.greenlet.
File "/usr/lib/
hubs.
File "/usr/lib/
rc = self.__
File "/usr/lib/
self.
File "/usr/lib/
self.acquire()
File "/usr/lib/
hdlr.
File "/usr/lib/
self.
File "/usr/lib/
self.
File "/usr/lib/
self.
File "/usr/lib/
root.error(msg, *args, **kwargs)
File "/usr/lib/
error(msg, *args, **kwargs)
File "/usr/lib/
logging.
File "/usr/lib/
rv = meth(*args, **kwargs)
File "/usr/lib/
self.
File "/usr/lib/
self.run()
File "/usr/lib/
self.
I am attaching the py-bt output for all threads.
Any pointers?
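For comparison with the gdb py-bt output, the Python-level stack of every native thread can also be collected from inside a process. This is a minimal sketch using only the standard library and assuming Python 3. `format_thread_stacks` is a hypothetical helper name, not part of Swift or eventlet. It has to run inside the process being inspected, and it shows native threads only, not individual greenlets:

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text dump of the Python stack of every live native thread."""
    # Map thread idents to names so the dump is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    # sys._current_frames() maps each thread ident to its topmost frame.
    for ident, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s):" % (ident, names.get(ident, "unknown")))
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)


if __name__ == "__main__":
    print(format_thread_stacks())
```

Unlike gdb, this needs the interpreter to be able to run Python code, so it is of limited use once a worker is fully wedged.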