Losing finalizers

Bug #2029306 reported by Stas Boukarev
Affects: SBCL
Status: Fix Released
Importance: Medium
Assigned to: Douglas Katzman

Bug Description

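;; Holds the object under test; each worker thread rebinds it below.
(defvar *x*)

;; Takes and returns many values, presumably to overwrite registers and
;; stack slots that might still hold a stale reference to the table.
(defun clear-registers (&optional a b c d e f g h i j k l m n o p q)
  (values q p o n m l k j i h g f e d c b a))

(defun test ()
  (setf *x* (make-hash-table))
  (clear-registers)
  ;; Attach a finalizer that should never run, since it is cancelled below.
  (sb-ext:finalize *x* (lambda () (error "final")) :dont-save t)
  (clear-registers)
  (sb-sys:scrub-control-stack)
  ;; The finalizer was just added, so cancellation is expected to succeed.
  (assert (sb-ext:cancel-finalization *x*)))

;; Background thread that forces full collections continuously.
(sb-thread:make-thread (lambda () (loop (gc :full t) (sleep 0.01))))

;; One worker thread that runs TEST in a loop with a thread-local *X*.
(loop repeat 1
      do
      (sb-thread:make-thread (lambda () (let ((*x* nil))
                                          (loop (test))))))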

This quickly fails the assertion on both x86-64 and arm64.

Douglas Katzman (dougk) wrote :

I suspect this is a race between the user thread trying to cancel and the finalizer thread, which, in addition to executing finalizers, has the responsibility of re-hashing the table when a key moves. So it only appears to lose finalizers in this slightly unusual case, but yeah, it needs to be fixed. As support for my theory, introducing the tiniest bit of delay before cancel-finalization seems to make it not fail.
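
For illustration, the delay experiment described in this comment might look roughly like the sketch below. It is not the code that was actually run: the function name `cancel-after-delay` and the 1 ms default are assumptions, and the report does not say how long the delay was. It relies only on the SBCL calls already used in the bug description.

```lisp
;; SBCL-specific sketch. CANCEL-AFTER-DELAY and the default DELAY are
;; hypothetical; the report gives neither a name nor a duration.
(defun cancel-after-delay (object &optional (delay 0.001))
  "Attach a finalizer to OBJECT, wait DELAY seconds, then cancel it.
Returns whatever SB-EXT:CANCEL-FINALIZATION returns."
  ;; The closure deliberately does not reference OBJECT.
  (sb-ext:finalize object (lambda () (error "final")) :dont-save t)
  ;; The pause gives any concurrent table maintenance a chance to finish.
  (sleep delay)
  (sb-ext:cancel-finalization object))
```

Substituting a pause like this before the cancellation in `test` corresponds to the observation above; it only hides the race and is not a fix.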

Stas Boukarev (stassats) wrote :

The actual problem is from fd stream finalizers, where a finalizer does not get cancelled and calls close on the wrong fd, leading to bad results.
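
A rough sketch of why a lost cancellation is dangerous for descriptor-backed objects follows. It is not SBCL's fd-stream code: `demo-handle`, `record-close`, and `*closed-fds*` are hypothetical names, and closing is simulated by recording the descriptor number instead of calling close(2).

```lisp
;; Simulated descriptor handling; every name defined here is illustrative.
(defvar *closed-fds* '()
  "Log of descriptor numbers that have been 'closed'.")

(defun record-close (fd)
  ;; Stand-in for a real close(2) call.
  (push fd *closed-fds*))

(defstruct demo-handle fd)

(defun open-demo-handle (fd)
  (let ((handle (make-demo-handle :fd fd)))
    ;; The finalizer captures only the integer FD, never HANDLE itself,
    ;; since a closure over HANDLE would keep it alive forever.
    (sb-ext:finalize handle (lambda () (record-close fd)) :dont-save t)
    handle))

(defun close-demo-handle (handle)
  "Close explicitly. Returns the result of the cancellation attempt."
  (let ((cancelled (sb-ext:cancel-finalization handle)))
    (record-close (demo-handle-fd handle))
    cancelled))
```

If the cancellation in `close-demo-handle` were silently lost, the finalizer would later close the same descriptor number a second time, and by then the operating system may have handed that number to an unrelated file.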

Douglas Katzman (dougk) wrote :

I have a sense that the problem stems from scan_finalizers not following the complete lockfree list maintenance protocol. It cheats because the world is stopped so there can't be any writers. I need to determine how little of the protocol I can get away with implementing in C, or else the garbage collector has to call into Lisp to implement SB-LOCKLESS:SO-DELETE which sounds like a terrible idea if the world is stopped. I bet I can just mark nodes to ignore, and the finalizer thread can snip them out somehow.
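
The "mark nodes to ignore, snip them out later" idea can be illustrated with a plain singly linked list. This is a single-threaded sketch in portable Common Lisp with hypothetical names; it uses no atomic operations, so it is neither SB-LOCKLESS nor the committed fix, only the two-phase shape of the approach.

```lisp
;; Two-phase deletion on an ordinary linked list (not lock-free).
(defstruct (demo-node (:constructor make-demo-node (key &optional next)))
  key
  (deadp nil)
  next)

(defun mark-dead (head key)
  "Phase 1: flag nodes whose key is KEY without touching any links."
  (loop for node = head then (demo-node-next node)
        while node
        when (eql (demo-node-key node) key)
          do (setf (demo-node-deadp node) t))
  head)

(defun snip-marked (head)
  "Phase 2: unlink every flagged node and return the new head."
  (let* ((sentinel (make-demo-node nil head))
         (prev sentinel))
    ;; NODE is re-read from PREV on every iteration.
    (loop for node = (demo-node-next prev)
          while node
          do (if (demo-node-deadp node)
                 (setf (demo-node-next prev) (demo-node-next node))
                 (setf prev node)))
    (demo-node-next sentinel)))

(defun live-keys (head)
  "Keys of unflagged nodes, as a reader that ignores marked nodes sees them."
  (loop for node = head then (demo-node-next node)
        while node
        unless (demo-node-deadp node)
          collect (demo-node-key node)))
```

In this picture, phase 1 is the cheap step a stopped-world collector could do in C, and phase 2 is the part left to the finalizer thread. Readers that skip flagged nodes see the same contents before and after the snip.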

Douglas Katzman (dougk)
Changed in sbcl:
assignee: nobody → Douglas Katzman (dougk)
status: New → Fix Committed
Changed in sbcl:
status: Fix Committed → Fix Released