dx-let in with-deadline causes memory faults

Bug #2026195 reported by Jake Connor
This bug affects 1 person
Affects: SBCL
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

We're running into memory faults when printing thread objects that are, or recently were, waiting on a mutex while a deadline is active.

with-deadline puts a dynamic-extent declaration on the deadline object itself. It appears that we start printing the thread while the deadline is still part of the cons in the thread-waiting-for slot, and by the time the printer reaches that slot, the deadline object no longer exists, which causes a memory fault.

Running the following function reproduces this.

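(defun bug ()
  (let ((mutex (sb-thread:make-mutex))
        wait)
    ;; Hold the mutex so the new thread blocks on it.
    (sb-thread:grab-mutex mutex)
    (let ((thread (sb-thread:make-thread
                   (lambda ()
                     (sb-impl::with-deadline (:seconds 60 :override t)
                       (sb-thread:with-mutex (mutex :wait-p t)
                         (sleep 2)))))))
      ;; Give the thread time to start waiting, then capture the slot
      ;; while it still references the dynamic-extent deadline.
      (sleep 1.0)
      (setf wait (sb-thread::thread-waiting-for thread))
      (sleep 1.0)
      (sb-thread:release-mutex mutex)
      ;; Let the thread finish and leave with-deadline.
      (sleep 10)
      ;; Returning WAIT makes the REPL print the stale deadline.
      wait)))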

Expected result:
(#S(SB-IMPL::DEADLINE :INTERNAL-TIME 116461087 :SECONDS 60)
 . #<SB-THREAD:MUTEX free owner=0 {1007D08563}>)

Actual Result:
CORRUPTION WARNING in SBCL pid 3713 tid 3713:
Memory fault at 0x7f1b (pc=0x52ca29b6 [code 0x52ca2800+0x1B6 ID 0x40ce], fp=0x7f06eafdf658, sp=0x7f06eafdf648) tid 3713
The integrity of this image is possibly compromised.
Continuing with fingers crossed.

debugger invoked on a SB-SYS:MEMORY-FAULT-ERROR in thread
#<THREAD tid=3713 "main thread" RUNNING {10011A8003}>:
  Unhandled memory fault at #x7F1B.

Changing with-deadline to use let instead of dx-let on the deadline object seems to fix the issue.
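
A minimal sketch of why the plain let variant would be safe, assuming a toy struct in place of SBCL's internal deadline type. The names toy-deadline and capture-heap-deadline are hypothetical and not part of SBCL. An object bound with let and no dynamic-extent declaration has indefinite extent, so a reference stored elsewhere stays valid after the binding form exits; the same escape from a dynamic-extent binding has undefined consequences.

```lisp
;; Hypothetical stand-in for the internal deadline struct.
(defstruct toy-deadline
  internal-time
  seconds)

(defun capture-heap-deadline (seconds)
  "Return a cons holding a heap-allocated deadline, as a waiting-for slot might."
  (let ((deadline (make-toy-deadline
                   :internal-time (get-internal-real-time)
                   :seconds seconds)))
    ;; No DYNAMIC-EXTENT declaration here, so DEADLINE has indefinite
    ;; extent and may safely outlive this LET.
    (cons deadline :some-mutex)))

;; Reading the captured object after the LET has exited is well defined.
(defparameter *captured* (capture-heap-deadline 60))
(toy-deadline-seconds (car *captured*))
```

The cost of this choice is one heap allocation per deadline, which is presumably what the dynamic-extent declaration was meant to avoid.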

Environment stuff:

sbcl --version
> SBCL 2.3.3
and
> SBCL 2.3.6.68-a036ebd21-WIP

*features*
(:CLPM-CLIENT :ASDF3.3 :ASDF3.2 :ASDF3.1 :ASDF3 :ASDF2 :ASDF :OS-UNIX
 :NON-BASE-CHARS-EXIST-P :ASDF-UNICODE :ARENA-ALLOCATOR :X86-64 :GENCGC :64-BIT
 :ANSI-CL :COMMON-LISP :ELF :IEEE-FLOATING-POINT :LINUX :LITTLE-ENDIAN
 :PACKAGE-LOCAL-NICKNAMES :SB-LDB :SB-PACKAGE-LOCKS :SB-THREAD :SB-UNICODE
 :SBCL :UNIX)

uname -a
Linux jconnor-desktop-c3636-service-vnc.service.consul 5.15.0-72-generic #79-Ubuntu SMP Wed Apr 19 08:22:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Stas Boukarev (stassats) wrote :

I don't understand your example. Why does it use sb-thread::thread-waiting-for?

Stas Boukarev (stassats)
Changed in sbcl:
status: New → Incomplete
Jake Connor (jakeconnor) wrote :

We're seeing the memory fault while printing the thread. The printed representation of the thread includes the thread-waiting-for slot, which contains the deadline. My example mocks this to avoid relying on a race condition.

The situation we're seeing appears to be:
1. The thread is waiting on the mutex under a deadline, so the dynamic-extent deadline object is in the cons in the thread-waiting-for slot.
2. We start to print the thread, which will include the dynamic-extent deadline object.
3. The thread gets the mutex and the deadline resolves, so the deadline object ceases to exist because of its dynamic extent.
4. We continue printing the thread and try to print the deadline object, which no longer exists, triggering a memory fault.

An example of a stack trace from the actual fault:
0: (SB-KERNEL:OUTPUT-UGLY-OBJECT #<SB-IMPL::STRING-OUTPUT-STREAM {7F0E213D6663}> #S(SB-IMPL::DEADLINE :INTERNAL-TIME 145931355021 :SECONDS 0.1))
1: ((COMMON-LISP:LABELS SB-IMPL::HANDLE-IT :IN SB-KERNEL:OUTPUT-OBJECT) #<SB-IMPL::STRING-OUTPUT-STREAM {7F0E213D6663}>)
2: (COMMON-LISP:PRINC #S(SB-IMPL::DEADLINE :INTERNAL-TIME 145931355021 :SECONDS 0.1) #<SB-IMPL::STRING-OUTPUT-STREAM {7F0E213D6663}>)
3: (SB-FORMAT::A-INTERPRETER #<SB-IMPL::STRING-OUTPUT-STREAM {7F0E213D6663}> #<~A> (#<~^> #<~2I> #<~> " ") #<unused argument> NIL)
4: (SB-FORMAT::INTERPRET-DIRECTIVE-LIST #<SB-IMPL::STRING-OUTPUT-STREAM {7F0E213D6663}> (#<~A> #<~^> #<~2I> #<~> " ") ("waiting on:" #<SB-THREAD:MUTEX "mailbox lock" free owner=0> "timeout: " #S(SB-IMPL::DEADLINE...
5: ((COMMON-LISP:LABELS SB-FORMAT::DO-LOOP :IN SB-FORMAT::{-INTERPRETER) ("waiting on:" #<SB-THREAD:MUTEX "mailbox lock" free owner=0> "timeout: " #S(SB-IMPL::DEADLINE :INTERNAL-TIME 145931355021 :SECOND..
6: (SB-FORMAT::{-INTERPRETER #<SB-IMPL::STRING-OUTPUT-STREAM {7F0E213D6663}> #<~{> (#<~I> #<~A> #<~^> #<~2I> #<~> " " ...) (1293482 "CONSOLE-LOG" NIL ("waiting on:" #<SB-THREAD:MUTEX "mailbox lock" free..

Stas Boukarev (stassats) wrote :

Ok, but the issue is with the print-object method, so this is a bit too much reduction.

Changed in sbcl:
status: Incomplete → Confirmed
Jake Connor (jakeconnor) wrote :

Ah, sorry about that. Tunnel vision on the dynamic-extent declaration. Thanks for clarifying.

Jake Connor (jakeconnor) wrote :

Since the object in thread-waiting-for may be dynamic-extent, it seems best not to try to print it. I think it's still safe to check whether that slot is null (exactly one access), so the attached patch retains the "waiting" information but drops the detail of what is being waited on.
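
The idea can be sketched roughly as follows. This is not the attached patch: toy-thread and thread-status-string are hypothetical names, and the real thread object has more states than the two shown here. The point is only that the slot is read once and compared against NIL, so the possibly stale object is never dereferenced.

```lisp
;; Hypothetical stand-in for a thread object.
(defstruct toy-thread
  name
  ;; May reference a dynamic-extent object owned by another thread.
  waiting-for)

(defun thread-status-string (thread)
  "Describe THREAD without dereferencing whatever it is waiting for."
  ;; Read the slot exactly once and only test the result against NIL.
  (if (toy-thread-waiting-for thread)
      "waiting"
      "running"))
```

For example, (thread-status-string (make-toy-thread :name "worker" :waiting-for :something)) reports only that the thread is waiting, not what it is waiting on.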

Stas Boukarev (stassats) wrote :

Printing the mutex is still useful information; my solution is to add a new slot, waiting-for-timeout.
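
A rough sketch of the separate-slot approach, not the committed change itself. It assumes the new slot holds plain data such as a number of seconds, so the printer never touches a dynamic-extent object; toy-waiter and describe-waiter are hypothetical names, and only the slot name waiting-for-timeout comes from the comment above.

```lisp
;; Hypothetical stand-in for a thread object with the extra slot.
(defstruct toy-waiter
  name
  ;; The mutex being waited on: an ordinary heap object, safe to print.
  waiting-for
  ;; Assumption: a plain number, never a dynamic-extent struct.
  waiting-for-timeout)

(defun describe-waiter (waiter)
  "Return a description that includes the mutex and timeout when present."
  ;; ~@[ ... ~] emits its clause only when the argument is non-NIL.
  (format nil "~a~@[ waiting on: ~a~]~@[ timeout: ~a~]"
          (toy-waiter-name waiter)
          (toy-waiter-waiting-for waiter)
          (toy-waiter-waiting-for-timeout waiter)))
```

With this layout the mutex detail survives in the printed form, and the timeout is copied out as a value rather than shared as a reference.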

Stas Boukarev (stassats)
Changed in sbcl:
status: Confirmed → Fix Committed
Changed in sbcl:
status: Fix Committed → Fix Released