with-timeout is broken when using (gc-logfile)

Bug #1754785 reported by Andrew Berkley
Affects: SBCL
Status: Incomplete
Importance: Undecided
Assigned to: Unassigned
Milestone: none listed

Bug Description

The problem is that the post-GC hooks are wrapped in a handler-case that swallows every serious-condition, and they run in the thread that triggered the GC (I think). As a result, anything that relies on asynchronous delivery of signals (for example with-timeout, interrupt-thread, etc.) will occasionally miss a signal, notably when the timeout is shorter than the GC time.
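The mechanism described above can be illustrated with a minimal, portable sketch. It is not SBCL's actual hook-running code: DEMO-TIMEOUT, RUN-HOOK-SWALLOWING-ERRORS and DEMO-MISSED-TIMEOUT are hypothetical names introduced only for illustration, and the "timer" is simulated by signalling the condition synchronously from inside the hook. It assumes, as the warning in the transcript further down suggests, that the timeout condition is a SERIOUS-CONDITION.

```lisp
;; Hypothetical stand-in for the real timeout condition, so that the
;; sketch runs in any conforming Common Lisp. Not an SBCL name.
(define-condition demo-timeout (serious-condition) ())

(defun run-hook-swallowing-errors (hook)
  "Call HOOK; report and discard any SERIOUS-CONDITION it signals.
Returns :SWALLOWED if a condition was discarded, else :COMPLETED."
  (handler-case (progn (funcall hook) :completed)
    (serious-condition (c)
      (format t "~&Problem running hook: ~S~%" (type-of c))
      :swallowed)))

(defun demo-missed-timeout ()
  "Return two values: whether the outer handler saw the timeout,
and what the hook runner returned."
  (let ((outer-saw-timeout nil)
        (hook-result nil))
    (block body
      (handler-bind ((demo-timeout
                       (lambda (c)
                         (declare (ignore c))
                         (setf outer-saw-timeout t)
                         (return-from body))))
        ;; Simulate the timeout arriving while a hook is running.
        ;; The inner HANDLER-CASE is the most recently established
        ;; handler, so it takes the condition before the outer
        ;; HANDLER-BIND is ever consulted.
        (setf hook-result
              (run-hook-swallowing-errors
               (lambda () (error 'demo-timeout))))))
    (values outer-saw-timeout hook-result)))
```

Calling (demo-missed-timeout) in this sketch returns NIL and :SWALLOWED: the code that was waiting for the timeout never sees it, because the wrapper around the hook consumed the condition first.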

Here is code to reproduce the problem on SBCL 1.3.0:

;; Holds the large array allocated on each iteration, to provoke GC activity.
(defparameter *blarg* nil)

(defun test-timeout (&optional (tries 30))
  (let ((fail nil))
    (labels ((check ()
               (let ((iterations 0))
                 (loop :until (or fail (>= iterations 100))
                       :do
                       (block test-block
                         (incf iterations)
                         (handler-bind
                             ;; Expected path: the timeout fires and we leave the block.
                             ((timeout (lambda (e) (declare (ignore e)) (setf fail nil) (return-from test-block))))
                           (with-timeout 0.1
                             (setf *blarg* (make-array 1000000))
                             (sleep 2)
                             ;; Reached only if the timeout was lost.
                             (setf fail t))))))))
      (format t "Trying with gc-logfile off: ")
      (setf (gc-logfile) nil)
      (dotimes (x tries) (check))
      (format t "~A~%" (if fail "FAIL" "PASS"))
      (setf (gc-logfile) "/tmp/blarg")
      (format t "Trying with gc-logfile on: ")
      (dotimes (x tries) (check))
      (format t "~A~%" (if fail "FAIL" "PASS")))))

CL-USER> (test-timeout)
Trying with gc-logfile off: PASS
Trying with gc-logfile on:
WARNING:
Problem running after-GC hook #<CLOSURE SB-KERNEL::LOG-GC-REPORT-TO-FILE>:
Timeout occurred.
FAIL

Douglas Katzman (dougk) wrote:

I built SBCL 1.3.0 and ran this exactly as shown, many times over, and was unable to reproduce the problem.

I doubt that the example is self-contained, as there are no post-GC hooks by default, and I can't imagine that the time it takes to do just about nothing in CALL-HOOKS would affect this.
I'd be willing to accept that stdio using malloc (in relation to the file I/O performed as GC tries to write the log) is a problem, but it's not this problem.

Anyway, 1.3.0 is about three and a half years old and I think this report is probably no longer relevant, so I feel obliged to close it as "Incomplete".

Changed in sbcl:
status: New → Incomplete