Object-server worker hung after hitting Too many open files

Bug #1830600 reported by Bhaskar Singhal
Affects                            Status   Importance   Assigned to   Milestone
OpenStack Object Storage (swift)   New      Undecided    Unassigned

Bug Description

I am running a Swift (Rocky) cluster and was pushing data using 256 clients. After a while I noticed the following errors and warnings in the log:

[4553]: STDERR: WARNING:root:Unable to perform fsync() on directory /swift_disk/volume-00000e51/objects/11490: Too many open files
[4553]: STDERR: Traceback (most recent call last):
[4553]: STDERR: File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
[4553]: STDERR: timer()
[4553]: STDERR: File "/usr/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
[4553]: STDERR: cb(*args, **kw)
[4553]: STDERR: File "/usr/lib/python2.7/dist-packages/eventlet/semaphore.py", line 145, in _do_acquire
[4553]: STDERR: waiter.switch()
[4553]: STDERR: error: cannot switch to a different thread
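
Since the first warning is "Too many open files", it may help to compare the worker's open descriptor count with its limit. The following is a minimal sketch, assuming a Linux host with /proc mounted; the function name fd_usage is made up for illustration and is not part of Swift or eventlet.

```python
import os


def fd_usage(pid="self"):
    """Return (open_fd_count, soft_limit, hard_limit) for a process.

    Linux only: reads /proc/<pid>/fd and /proc/<pid>/limits.
    The limits are returned as strings, exactly as the kernel prints them.
    """
    # One directory entry per open file descriptor.
    open_fds = len(os.listdir("/proc/%s/fd" % pid))

    soft = hard = None
    with open("/proc/%s/limits" % pid) as limits:
        for line in limits:
            # Example line: "Max open files   1024   4096   files"
            if line.startswith("Max open files"):
                fields = line.split()
                soft, hard = fields[3], fields[4]
                break
    return open_fds, soft, hard
```

Calling fd_usage(4553) as root (or as the swift user) would report on the worker from the log above; fd_usage() with no argument reports on the calling process.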

No new requests are processed by this worker.

gdb shows a bunch of threads waiting for the GIL, and the rest with the following backtrace:

Thread 20 (Thread 0x7f890e7fc700 (LWP 4611)):
Traceback (most recent call first):
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/poll.py", line 82, in wait
    sleep(seconds)
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 346, in run
    self.wait(sleep_time)
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 294, in switch
    return self.greenlet.switch()
  File "/usr/lib/python2.7/dist-packages/eventlet/semaphore.py", line 113, in acquire
    hubs.get_hub().switch()
  File "/usr/lib/python2.7/threading.py", line 174, in acquire
    rc = self.__block.acquire(blocking)
  File "/usr/lib/python2.7/logging/__init__.py", line 708, in acquire
    self.lock.acquire()
  File "/usr/lib/python2.7/logging/__init__.py", line 757, in handle
    self.acquire()
  File "/usr/lib/python2.7/logging/__init__.py", line 1336, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python2.7/logging/__init__.py", line 1296, in handle
    self.callHandlers(record)
  File "/usr/lib/python2.7/logging/__init__.py", line 1286, in _log
    self.handle(record)
  File "/usr/lib/python2.7/logging/__init__.py", line 1193, in error
    self._log(ERROR, msg, args, **kwargs)
  File "/usr/lib/python2.7/logging/__init__.py", line 1603, in error
    root.error(msg, *args, **kwargs)
  File "/usr/lib/python2.7/logging/__init__.py", line 1611, in exception
    error(msg, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/swift/obj/diskfile.py", line 1733, in _finalize_put
    logging.exception(_('Problem cleaning up %s'), self._datadir)
  File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 774, in __bootstrap
    self.__bootstrap_inner()
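
One reading of this backtrace is that logging.exception() runs inside a native tpool thread and blocks on the handler lock, which here is an eventlet semaphore that tries to switch to the hub. The following is a minimal sketch, assuming that reading is correct, of keeping logging out of the worker thread by handing the exception back to the caller. It uses plain threading as a stand-in for eventlet.tpool, and cleanup, run_in_thread and finalize are hypothetical names, not Swift code and not the fix used upstream.

```python
import errno
import logging
import threading


def cleanup(datadir):
    # Stand-in for the real cleanup step; simulates hitting EMFILE.
    raise OSError(errno.EMFILE, "Too many open files", datadir)


def run_in_thread(func, *args):
    """Run func in a native thread and return (result, exception)."""
    outcome = {}

    def target():
        try:
            outcome["result"] = func(*args)
        except Exception as exc:
            # Do not log here: just hand the error back to the caller.
            outcome["error"] = exc

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return outcome.get("result"), outcome.get("error")


def finalize(datadir):
    """Log cleanup failures from the calling thread, not the worker."""
    _, err = run_in_thread(cleanup, datadir)
    if err is not None:
        logging.getLogger(__name__).error(
            "Problem cleaning up %s: %s", datadir, err)
    return err
```

With this shape, the only thread that touches the logging lock is the one that called finalize().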

I am attaching the py-bt output for all threads.

Any pointers?
