"deadlock cycle" in threads.impure.lisp (:HASH-CACHE :SUBTYPEP)+:PARALLEL-DEFCLASS
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
SBCL | Fix Released | Critical | Unassigned |
Bug Description
The (:HASH-CACHE :SUBTYPEP) test in threads.impure.lisp doesn't wait for its threads to exit, so they are still running when the :PARALLEL-DEFCLASS test starts. Something about the combination sometimes triggers:
WARNING: DEADLOCK CYCLE DETECTED:
#<SB-
#<MUTEX "GC lock"
owner: #1=#<SB-
{10040A8431}>
#<SB-
#<MUTEX "World Lock"
owner: #1=#<SB-
{10040A8601}>
END OF CYCLE
fatal error encountered in SBCL pid 8127(tid 140737278113536):
Trapping to run pending handler while GC in progress.
Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb>
with LDB not responding to input.
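For illustration, the missing wait could look roughly like the following. This is only a minimal sketch, not the fix that was committed: RUN-WORKERS-AND-WAIT is a hypothetical helper that does not exist in threads.impure.lisp, and it assumes each worker function terminates on its own. It relies only on SB-THREAD:MAKE-THREAD and SB-THREAD:JOIN-THREAD.

```lisp
;; Hypothetical helper (not part of the SBCL test suite): start
;; WORKER-COUNT threads running WORK-FN and do not return until
;; every one of them has exited.
(defun run-workers-and-wait (worker-count work-fn)
  (let ((threads
          (loop for i below worker-count
                collect (sb-thread:make-thread
                         work-fn
                         :name (format nil "worker-~d" i)))))
    ;; JOIN-THREAD blocks until the thread has finished and returns
    ;; the value of its function, so no worker can still be running
    ;; once this form returns.
    (mapcar #'sb-thread:join-thread threads)))

;; Usage: three trivial workers, all joined before the caller continues.
(run-workers-and-wait 3 (lambda () (+ 1 2)))
```

With something along these lines at the end of the first test, its threads could not overlap with whatever test runs next.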
Running either test in a loop by itself runs to completion, but running both in a loop as in the following test case tends to break fairly reliably (usually within 5-50 iterations here), either with unresponsive LDB or locking up after apparently trying to start the normal debugger:
(with-test (:name :parallel-defclass)
(loop for i below 100
do
(format t "~%~%starting subtypep-
(dotimes (i 10)
(format t "starting :parallel-defclass, pass ~s...~%" i)
(progn
(defclass test-1 () ((a :initform :orig-a)))
(defclass test-2 () ((b :initform :orig-b)))
(defclass test-3 (test-1 test-2) ((c :initform :orig-c)))
(let* ((run t)
(d1 (sb-thread:
(d2 (sb-thread:
(d3 (sb-thread:
(i (sb-thread:
(format t "~%sleeping!~%")
(sleep 3.0)
(format t "~%stopping!~%")
(setf run nil)
(mapc (lambda (th)
(format t " :parallel-defclass, pass ~s done~%" i)))
Tested on SBCL 1.0.49.79-ba12c5c, x86-64 Linux.
Possibly related to Bug #308959?
Changed in sbcl:
importance: High → Critical
Changed in sbcl:
status: Fix Committed → Fix Released
The real question is: how and why does D2 try to acquire World Lock while holding the GC lock? Or does it not really do that, and there is a bug in deadlock detection instead?
(The entry into LDB makes me think there's a bug not related to deadlock detection here, though.)