With Linux PREEMPT RT kernel condition-wait does not wait
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
SBCL | New | Undecided | Unassigned |
Bug Description
Something in the PREEMPT RT scheduler triggers a pathological case in condition-wait almost every time.
%condition-wait:
#!+sb-futex
;; Now we go to sleep using futex-wait. If anyone else
;; manages to grab MUTEX and call CONDITION-NOTIFY during
;; this comment, it will change the token, and so futex-wait
;; returns immediately instead of sleeping. Ergo, no lost
;; wakeup. We may get spurious wakeups, but that's ok.
The fragment of code above assumes that waitqueue-token is unlikely to be modified between release-mutex and futex-wait. While that is usually true, something in the RT scheduler or its settings breaks this assumption. In a contended case, when 100 threads call condition-wait at the same time, none of them ever gets to futex-wait with an unmodified token, and condition-wait becomes a busy loop. I can't reproduce this problem with a standard Linux kernel even if I use 50000 threads: futex_wait fails in 0.1% of cases and all the threads go to sleep very quickly.
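To make the token check concrete, here is a minimal C sketch of the kernel behaviour the comment relies on, assuming Linux and the raw futex system call. The names token_wait and wait_refused_after_token_change are hypothetical and are not taken from the SBCL runtime; the sketch only shows that FUTEX_WAIT refuses to sleep (EAGAIN) when the word no longer holds the expected value.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: sleep on *token only if it still equals `expected`.
   Returns 0 when woken, or -1 with errno set: EAGAIN if the token had
   already changed, ETIMEDOUT if the relative timeout expired. */
static long token_wait(uint32_t *token, uint32_t expected,
                       const struct timespec *timeout)
{
    assert(token != NULL);
    return syscall(SYS_futex, token, FUTEX_WAIT_PRIVATE, expected,
                   timeout, NULL, 0);
}

/* Replays the race in a single thread: A stores its id, B overwrites the
   token before A reaches the wait. Returns 1 if the kernel refused to
   put A to sleep, 0 otherwise. */
static int wait_refused_after_token_change(void)
{
    uint32_t token = 1;                      /* A sets the token to its id */
    uint32_t seen = token;                   /* value A expects to sleep on */
    struct timespec ts = { 0, 10 * 1000 * 1000 };  /* 10 ms safety timeout */
    long r;

    token = 2;                               /* B sets the token first */
    r = token_wait(&token, seen, &ts);
    return r == -1 && errno == EAGAIN;
}
```

In the reported scenario every call ends up on this EAGAIN path, so the caller goes back around the loop instead of sleeping.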
Here is a pathological sequence:
A grabs mutex
A sets token to its id
A releases mutex
B grabs mutex
B sets token to its id
A fails futex_wait (because the current token is from B)
B releases mutex
A grabs mutex
A sets token to its id
B fails futex_wait (because the current token is from A)
and so on
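The sequence above can be replayed deterministically. The following C sketch is a single-threaded model of that interleaving, not a reproduction of the scheduler behaviour: would_sleep and replay_ping_pong are hypothetical names, and the model simply assumes that the other thread always stores its id before the waiter reaches futex_wait.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* futex_wait only sleeps if the token is still the caller's own id. */
static int would_sleep(uint32_t token, uint32_t my_id)
{
    return token == my_id;
}

/* Replays the A/B interleaving for `rounds` wait attempts.
   Returns how many attempts actually slept and stores the number of
   failed futex_wait attempts in *failed_waits. */
static int replay_ping_pong(int rounds, int *failed_waits)
{
    const uint32_t A = 1, B = 2;
    uint32_t token;
    int sleeps = 0;

    assert(failed_waits != NULL);
    *failed_waits = 0;
    token = A;                 /* A grabs mutex, sets token, releases mutex */

    for (int i = 0; i < rounds; i++) {
        /* The thread that last released the mutex is about to wait. */
        uint32_t waiter = (i % 2 == 0) ? A : B;
        uint32_t intruder = (waiter == A) ? B : A;

        token = intruder;      /* intruder grabs mutex and sets the token */
        if (would_sleep(token, waiter))
            sleeps++;
        else
            (*failed_waits)++; /* waiter fails futex_wait and must retry */
        /* intruder releases the mutex and becomes the next waiter */
    }
    return sleeps;
}
```

Under this assumed schedule no attempt ever sleeps, which matches the busy loop described in the report.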
Environment: Linux 3.10.0-
We could probably offer an option to use pthread condition vars, basically let "someone else" deal with it.
This would be done by pointing to a malloc()ed thing from the lisp structure, doing a pthread_cond_init() on the C object, and attaching a finalizer which does pthread_cond_destroy(), and translating the lisp functions almost directly into the appropriate foreign call for waiting and notifying.
I wonder if the thing preventing that in the past was that finalizers were so bad for GC that having any at all pretty much trashed your application performance.
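The C side of the pthread-based option discussed above might look roughly like the sketch below. All names here (native_waitqueue and the native_* functions) are hypothetical and are not part of SBCL. The sketch also assumes the mutex is a pthread_mutex_t, which a Lisp-level mutex may not be. native_waitqueue_destroy stands in for what the finalizer would call.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical C object that the lisp waitqueue structure would point to. */
typedef struct {
    pthread_cond_t cond;
} native_waitqueue;

/* malloc() the object and run pthread_cond_init() on it. */
native_waitqueue *native_waitqueue_create(void)
{
    native_waitqueue *wq = malloc(sizeof *wq);
    if (wq == NULL)
        return NULL;
    if (pthread_cond_init(&wq->cond, NULL) != 0) {
        free(wq);
        return NULL;
    }
    return wq;
}

/* What the finalizer would call once the lisp structure is unreachable. */
void native_waitqueue_destroy(native_waitqueue *wq)
{
    if (wq == NULL)
        return;
    pthread_cond_destroy(&wq->cond);
    free(wq);
}

/* Near-direct translations of the wait and notify operations. */
int native_condition_wait(native_waitqueue *wq, pthread_mutex_t *mutex)
{
    assert(wq != NULL && mutex != NULL);
    return pthread_cond_wait(&wq->cond, mutex);
}

int native_condition_notify(native_waitqueue *wq)
{
    assert(wq != NULL);
    return pthread_cond_signal(&wq->cond);
}

/* Small demo: one waiter thread, one notifier. */
typedef struct {
    pthread_mutex_t mutex;
    native_waitqueue *wq;
    int ready;                 /* predicate protected by mutex */
    int woke;
} demo_state;

static void *demo_waiter(void *arg)
{
    demo_state *s = arg;
    pthread_mutex_lock(&s->mutex);
    while (!s->ready)          /* loop guards against spurious wakeups */
        native_condition_wait(s->wq, &s->mutex);
    s->woke = 1;
    pthread_mutex_unlock(&s->mutex);
    return NULL;
}

/* Returns 1 if the waiter observed the notification, 0 on any failure. */
int native_waitqueue_demo(void)
{
    demo_state s;
    pthread_t t;

    s.ready = 0;
    s.woke = 0;
    s.wq = native_waitqueue_create();
    if (s.wq == NULL)
        return 0;
    pthread_mutex_init(&s.mutex, NULL);
    if (pthread_create(&t, NULL, demo_waiter, &s) != 0) {
        pthread_mutex_destroy(&s.mutex);
        native_waitqueue_destroy(s.wq);
        return 0;
    }
    pthread_mutex_lock(&s.mutex);
    s.ready = 1;               /* set the predicate under the mutex */
    pthread_mutex_unlock(&s.mutex);
    native_condition_notify(s.wq);
    pthread_join(t, NULL);
    pthread_mutex_destroy(&s.mutex);
    native_waitqueue_destroy(s.wq);
    return s.woke;
}
```

On older glibc versions this needs to be compiled with -pthread. The waiter re-checks its predicate in a loop, so the demo terminates whether the notification arrives before or after the waiter blocks.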