Bug #308959 “*world-lock* deadlock issues” : Bugs : SBCL

Nikodemus Siivola (nikodemus) on 2008-12-19

Changed in sbcl:
importance:	Undecided → Medium
status:	New → Confirmed

Attila Lendvai (attila-lendvai) wrote on 2009-12-07:

#1

PCL also calls CL:COMPILE for filling its caches, which in turn can block the world.

annoying example:

- asdf is loading something (so, there's a with-compilation-unit on the stack)

- a compile error brings up the debugger

- while investigating/fixing, user tries to use slime fuzzy completion which invokes some generics

- first call, so cache is empty for the generic, thus PCL tries to grab the lock from a random swank worker (to compile a stub?)

- slime hangs
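
The scenario above can be modelled with a small sketch. This is only an illustration under stated assumptions, not SBCL's actual code path: `*demo-lock*`, `try-cache-fill` and `demo` are hypothetical names, and an ordinary mutex stands in for the real world lock. One thread holds the lock (like the loader sitting in the debugger), and a worker that needs the same lock cannot make progress. The worker uses a timeout so the sketch terminates instead of hanging the way slime does.

```lisp
;; Hypothetical stand-in for the world lock; NOT the real SBCL internal lock.
(defvar *demo-lock* (sb-thread:make-mutex :name "demo world lock"))

(defun try-cache-fill ()
  "Model of a worker that needs the lock; gives up after half a second."
  (if (sb-thread:grab-mutex *demo-lock* :timeout 0.5)
      (unwind-protect :acquired
        (sb-thread:release-mutex *demo-lock*))
      ;; Without the timeout this thread would simply block here.
      :would-hang))

(defun demo ()
  ;; The calling thread plays the loader stuck in the debugger:
  ;; it keeps holding the lock while the worker tries to take it.
  (sb-thread:with-mutex (*demo-lock*)
    (sb-thread:join-thread
     (sb-thread:make-thread #'try-cache-fill :name "demo worker"))))
```

Calling `(demo)` should return `:WOULD-HANG`, because the worker's acquisition times out while the caller still owns the lock. Dropping the `:timeout` argument would reproduce the hang instead.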

Nikodemus Siivola (nikodemus) on 2011-12-08

Changed in sbcl:
assignee:	nobody → Nikodemus Siivola (nikodemus)

Nikodemus Siivola (nikodemus) on 2012-05-21

Changed in sbcl:
status:	Confirmed → In Progress

Attila Lendvai (attila-lendvai) wrote on 2012-12-22:

#2

it's nothing urgent on my part, just as a data point that may help: i've updated the sbcl that runs dwim.hu from this (+ small changes):

commit abb03f939ada55bdc1856df5cc48815fd0dff69d
1.0.55: will be tagged as "sbcl-1.0.55"

to the current head (+ the same small changes):

2b29a7c2b236cfab1d4d06311e84414abba71b4c
Dec 21, 2012

and when i want to start the server it hangs quite early in filling method caches:

2012-12-22T04:51:53.228687+01:00: Starting up server, PID is 11740
debugger invoked on a SB-SYS:INTERACTIVE-INTERRUPT in thread #<THREAD "main thread" RUNNING {100C5710A3}>: Interactive interrupt at #x7FFFF7488BE7.

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [CONTINUE         ] Return from SB-UNIX:SIGINT.
  1: [SAVE-CORE-AND-DIE] Save image to /tmp/sbcl.core and die
  2: [ABORT            ] Give up starting the image and quit the VM process with exit code 2

("bogus stack frame")
0] back

  0: ("bogus stack frame")
  1: (SB-THREAD::%%WAIT-FOR-MUTEX
      #<unavailable argument>
      #<unavailable argument>
      #<unavailable argument>
      #<unavailable argument>
      #<unavailable argument>)
  2: (SB-THREAD::%WAIT-FOR-MUTEX
      #<SB-THREAD:MUTEX "World Lock" owner: #<SB-THREAD:THREAD "main thread" RUNNING {10001EA793}>>
      #<SB-THREAD:THREAD "main thread" RUNNING {100C5710A3}>
      NIL
      NIL
      NIL
      NIL
      NIL
      NIL)
  3: ((FLET #:WITHOUT-INTERRUPTS-BODY-465 :IN SB-THREAD::CALL-WITH-RECURSIVE-LOCK))
  4: (SB-THREAD::CALL-WITH-RECURSIVE-LOCK
      #<CLOSURE (FLET SB-THREAD::WITH-RECURSIVE-LOCK-THUNK :IN SB-PCL::CHECK-WRAPPER-VALIDITY) {7FFFF6FEF7EB}>
      #<SB-THREAD:MUTEX "World Lock" owner: #<SB-THREAD:THREAD "main thread" RUNNING {10001EA793}>>
      T
      NIL)
  5: (SB-PCL::CHECK-WRAPPER-VALIDITY #<error printing a HU.DWIM.HOME:HOME-SERVER: #<SB-SYS:INTERACTIVE-INTERRUPT {100C892343}>>)
  6: (SB-PCL::CACHE-MISS-VALUES
      #<STANDARD-GENERIC-FUNCTION HU.DWIM.WEB-SERVER::LISTEN-ENTRIES-OF (1)>
      (#<error printing a CONS: #<SB-SYS:INTERACTIVE-INTERRUPT {100C897103}>> SB-PCL::ACCESSOR)
  7: (SB-PCL::INITIAL-DFUN
      #<STANDARD-GENERIC-FUNCTION HU.DWIM.WEB-SERVER::LISTEN-ENTRIES-OF (1)>
      (#<error printing a CONS: #<SB-SYS:INTERACTIVE-INTERRUPT {100C89B683}>>)
[...]

the behavior is very strange, because if i C-c it and ask for a backtrace, then i get some 4 frames, and then it hangs again. then if i C-c it again a couple of times, then i get some more frames. if i repeat, i can get a full backtrace incrementally.

the application is a saved executable core, and it installs signal handlers on C-c like this:

http://dwim.hu/darcsweb/darcsweb.cgi?r=HEAD%20hu.dwim.util;a=headblob;f=/source/production.lisp#l228

Faré (fahree) wrote on 2012-12-22:

#3

In case it helps, you might want to try the various methods Christophe suggests for pre-compilation. See the November 2012 entries of:
http://www.advogato.org/person/crhodes/

And if something like that works, I'm game to see what exactly.

Stas Boukarev (stassats) on 2017-11-18

Changed in sbcl:
status:	In Progress → Triaged

Christophe Rhodes (csr21-cantab) on 2018-12-30

Changed in sbcl:
status:	Triaged → Fix Released
