Quickly hit nproc limit

Bug #1264561 reported by Caleb Tennis
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

Most stock Linux systems ship with a soft nproc limit of 1024. That is now easy to exceed, because threads_per_disk acts as a multiplier on the number of threads the object server starts, and on Linux every thread counts against nproc. If workers * threads_per_disk * disks > 1024, we get this error:

Dec 26 03:49:28 object-server ERROR __call__ error with HEAD /d95/236790/AUTH_sd-fuse/c0003125/lib/python2.6/site-packages/yum/pkgtag_db.pyc :
Traceback (most recent call last):
  File "/opt/ss/lib/python2.7/site-packages/swift/obj/server.py", line 631, in __call__
    res = method(req)
  File "/opt/ss/lib/python2.7/site-packages/swift/common/utils.py", line 1870, in wrapped
    return func(*a, **kw)
  File "/opt/ss/lib/python2.7/site-packages/swift/common/utils.py", line 686, in _timing_stats
    resp = func(ctrl, *args, **kwargs)
  File "/opt/ss/lib/python2.7/site-packages/swift/obj/server.py", line 514, in HEAD
    obj)
  File "/opt/ss/lib/python2.7/site-packages/swift/obj/server.py", line 114, in _diskfile
    kwargs.setdefault('threadpool', self.threadpools[device])
  File "/opt/ss/lib/python2.7/site-packages/swift/obj/server.py", line 86, in <lambda>
    lambda: ThreadPool(nthreads=self.threads_per_disk))
  File "/opt/ss/lib/python2.7/site-packages/swift/common/utils.py", line 2054, in __init__
    thr.start()
  File "/opt/ss/lib/python2.7/threading.py", line 494, in start
    _start_new_thread(self.__bootstrap, ())
error: can't start new thread
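
To check whether a given node is likely to hit this, the arithmetic can be sketched roughly as below. This is only an illustration: the function names and the sample numbers (8 workers, 4 threads per disk, 40 disks) are made up for the example and are not taken from Swift or from the affected system. It also ignores any other processes or threads owned by the same user, which count against the same limit.

```python
import resource


def estimated_threads(workers, threads_per_disk, disks):
    """Worst-case number of threadpool threads across all object-server workers."""
    return workers * threads_per_disk * disks


def exceeds_soft_nproc(workers, threads_per_disk, disks, soft_limit=None):
    """Return True if the estimate is above the soft nproc limit.

    If soft_limit is not given, read the current process's soft limit.
    """
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    if soft_limit == resource.RLIM_INFINITY:
        # No limit configured, so nothing to exceed.
        return False
    return estimated_threads(workers, threads_per_disk, disks) > soft_limit


if __name__ == "__main__":
    # Hypothetical node: 8 workers, threads_per_disk = 4, 40 disks.
    print(estimated_threads(8, 4, 40))
    print(exceeds_soft_nproc(8, 4, 40, soft_limit=1024))
```

With the hypothetical numbers above the estimate is 1280 threads, which is past a 1024 soft limit before anything else running as that user is counted.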

Since Swift already raises the nofile and data limits for its processes at startup, add some logic to do the same for nproc. The choice of 8192 is arbitrary on my part, but it seems like a reasonable number.
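
For anyone wanting to try the idea outside of Swift, here is a minimal sketch using Python's standard resource module. It is not the merged patch: the helper names are hypothetical, and unlike a plain setrlimit to (8192, 8192) it only raises the soft limit as far as the existing hard limit allows, so it also works without root. The setrlimit call has its own try/except so a failure here would not affect any other limit being set.

```python
import resource

TARGET_NPROC = 8192  # same arbitrary value as proposed in this report


def desired_nproc(soft, hard, target, infinity=resource.RLIM_INFINITY):
    """Compute the (soft, hard) pair to request, never lowering the soft limit."""
    if soft == infinity:
        # Already unlimited; leave it alone.
        return soft, hard
    new_soft = max(soft, target)
    if hard != infinity:
        # An unprivileged process cannot raise soft above hard.
        new_soft = min(new_soft, hard)
    return new_soft, hard


def raise_nproc_limit(target=TARGET_NPROC):
    """Try to raise the soft nproc limit; return the limits in effect afterwards."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    try:
        resource.setrlimit(resource.RLIMIT_NPROC,
                           desired_nproc(soft, hard, target))
    except (ValueError, OSError):
        # Not permitted or rejected by the kernel: keep the existing limits.
        pass
    return resource.getrlimit(resource.RLIMIT_NPROC)
```

Calling raise_nproc_limit() once early in process startup, before any threadpools are created, would be the natural place for it. If the hard limit itself is below the target, the sketch settles for the hard limit, and raising that further needs a privileged process or a change in the system's limits configuration.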

Caleb Tennis (ctennis) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/64297
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=d1dd14395259b78ed356b26ca07c54ba22034aaa
Submitter: Jenkins
Branch: master

commit d1dd14395259b78ed356b26ca07c54ba22034aaa
Author: Caleb Tennis <email address hidden>
Date: Fri Dec 27 17:38:34 2013 -0500

    Up nproc limit on startup.

    Separate out setrlimit calls for specific exception handling.

    Closes-Bug: #1264561
    Change-Id: I5588f19f8d0393409580d17317727977758d5cb3

Changed in swift:
status: New → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/ec)

Fix proposed to branch: feature/ec
Review: https://review.openstack.org/66462

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/ec)

Reviewed: https://review.openstack.org/66462
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=3895441afd1f8ca49a09a483f402a961009a8661
Submitter: Jenkins
Branch: feature/ec

commit bad52f11218a11978d1efb0832f164a60a363cc2
Author: Clay Gerrard <email address hidden>
Date: Fri Jan 10 00:31:55 2014 -0800

    Allow programmatic reloading of Swift hash path config

    New util's function validate_hash_conf allows you to programmatically reload
    swift.conf and hash path global vars HASH_PATH_SUFFIX and HASH_PATH_PREFIX
    when they are invalid.

    When you load swift.common.utils before you have a swift.conf there's no good
    way to force a re-read of swift.conf and repopulate the hash path config
    options - short of restarting the process or reloading the module - both of
    which are hard to unittest. This should be no worse in general and in some
    cases easier.

    Change-Id: I1ff22c5647f127f65589762b3026f82c9f9401c1

commit 7b9c283203479cb9916951e1ce1f466f197dea36
Author: Samuel Merritt <email address hidden>
Date: Fri Jan 10 12:57:53 2014 -0800

    Add missing license header to test file

    All the other tests have license headers, so this one should too.

    I picked 2013 for the copyright year because that's when "git log"
    says it was first and last touched.

    Change-Id: Idd41a179322a3383f6992e72d8ba3ecaabd05c47

commit 47fcc5fca2c5020b69f3c2c7f0a8032f6c77354a
Author: Christian Schwede <email address hidden>
Date: Fri Jan 10 07:14:43 2014 +0000

    Update account quota doc

    A note was added stating that the same limitations apply to
    account quotas as to container quotas. An example of uploads
    without a content-length header was added.

    Related-Bug: 1267659
    Change-Id: Ic29b527cb71bf5903c2823844a1cf685ab6813dd

commit 6426f762d0d87063f9813630c620d880a4191046
Author: Peter Portante <email address hidden>
Date: Mon Dec 9 20:52:58 2013 -0500

    Raise diskfile.py module coverage to > 98%

    We attempt to get the code coverage (with branch coverage) to 100%,
    but fall short due to interactions between coverage.py and
    CPython's peephole optimizer. See:

        https://bitbucket.org/ned/coveragepy/issue/198/continue-marked-as-not-covered

    In the main diskfile module, we remove the check for a valid
    "self._tmppath" since it is only one of a number of fields that could
    be verified and it was not worth trying to get coverage for it. We
    also remove the try / except around the close() method call in the
    DiskFileReader's app_iter_ranges() method since it will never be
    called in a context that will raise a quarantine exception (by
    definition ranges can't generate a quarantine event).

    We also:

    * fix where quarantine messages are checked to ensure the
      generator is actually executed before the check
    * in new and modified tests:
      * use assertTrue in place of assert_
      * use assertEqual in place of assertEquals
    * fix references to the reserved word "object"

    Change-Id: I6379be04adfc5012cb0b91748fb3ba3f11200b48

commit 5196eae...

Thierry Carrez (ttx)
Changed in swift:
milestone: none → 1.12.0
status: Fix Committed → Fix Released