Comment 4 for bug 861742

Robert Collins (lifeless) wrote :

frame 2 is hung on
status = fix_status(sem_wait(thelock));

The two key Python lines:
/srv/launchpad.net/production/launchpad-rev-14560/lib/canonical/launchpad/webapp/sigusr2.py (20): sigusr2_handler
/usr/lib/python2.6/threading.py (117): acquire

sigusr2_handler is simple - it's just calling into reopenFiles

threading.py line 117 is a call to thread.get_ident(), reached via a plain 'lock.acquire' call from logging/__init__.py

however the core thinks we're on line 123:
#22 0x00000000004a1b03 in PyEval_CallObjectWithKeywords (func=<function at remote 0x75ea848>, arg=
    (12, Frame 0x1431c760, for file /usr/lib/python2.6/threading.py, line 123, in acquire (self=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x30c00f0>, _RLock__count=0) at remote 0x30fa250>, blocking=1, me=47168797054336)), kw=0x0) at ../Python/ceval.c:3619
#23 0x00000000004d861f in PyErr_CheckSignals () at ../Modules/signalmodule.c:868
#24 0x00000000004a2037 in Py_MakePendingCalls () at ../Python/ceval.c:452
#25 0x00000000004a337c in PyEval_EvalFrameEx (f=
    Frame 0x1431c760, for file /usr/lib/python2.6/threading.py, line 123, in acquire (self=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x30c00f0>, _RLock__count=0) at remote 0x30fa250>, blocking=1, me=47168797054336), throwflag=<value optimized out>) at ../Python/ceval.c:871

which is much less innocuous - that is a call to the underlying python lock.
Note this *not-threadsafe* code:
        rc = self.__block.acquire(blocking)
        if rc:
            self.__owner = me
            self.__count = 1
            if __debug__:
                self._note("%s.acquire(%s): initial success", self, blocking)
        else:
            if __debug__:
                self._note("%s.acquire(%s): failure", self, blocking)

If the low-level acquire succeeds, but the low-level primitive is not naturally reentrant, then until the 'if rc' code path executes, other calls to acquire from this same thread (e.g. from a signal handler that interrupts it here) will deadlock.

I need to check the low level primitive code path still, but this is a good contender for the bug.
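
A minimal sketch of the window described above, assuming a simplified stand-in for the pure-Python RLock. SketchRLock, on_window and fake_handler are hypothetical names for illustration, not anything in threading.py or Launchpad. The hook plays the part of a signal handler that runs right after the low-level acquire succeeds but before the owner is recorded. It uses a non-blocking acquire so the demo reports the failure rather than hanging; a blocking acquire at that point would never return.

```python
import threading


class SketchRLock(object):
    """Simplified, hypothetical stand-in for the pure-Python RLock quoted above."""

    def __init__(self, on_window=None):
        self._block = threading.Lock()  # non-reentrant low-level lock
        self._owner = None
        self._count = 0
        # Hypothetical hook: simulates a signal handler firing mid-acquire.
        self._on_window = on_window

    def acquire(self, blocking=True):
        me = threading.current_thread().ident
        if self._owner == me:
            # Reentrant fast path: only works once _owner has been set.
            self._count += 1
            return True
        rc = self._block.acquire(blocking)
        if rc:
            # Window: the low-level lock is held, but _owner is still None.
            hook, self._on_window = self._on_window, None
            if hook is not None:
                hook(self)
            self._owner = me
            self._count = 1
        return rc

    def release(self):
        self._count -= 1
        if self._count == 0:
            self._owner = None
            self._block.release()


observed = []


def fake_handler(lock):
    # Same thread re-enters acquire inside the window. Non-blocking, so we
    # can record the result; blocking=True here would deadlock.
    observed.append(lock.acquire(False))


lock = SketchRLock(on_window=fake_handler)
outer = lock.acquire()        # runs fake_handler inside the window
nested = lock.acquire(False)  # after the window, reentrancy works again
print("inside window: %r, nested afterwards: %r" % (observed, nested))
```

The re-entrant call inside the window cannot take the fast path because the owner is not yet recorded, so it falls through to the low-level lock that the same thread already holds.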