Comment 5 for bug 807475

Nikodemus Siivola (nikodemus) wrote :

commit 5abf3b4b94c8c2315777e63729293395dc54992c
Author: Nikodemus Siivola <email address hidden>
Date: Mon Aug 15 14:33:49 2011 +0300

    fix bogus deadlocks from interrupts and GCs

     lp#807475

     Going in despite the freeze since this is a regression that can
     semi-randomly break correct code. *ouch*

     Thanks to Bart Bortta and #sbcl for the analysis.

     Problem 1:

       T1 holds L1

       T2 is waiting for L1

       T2 is interrupted, interrupt handler grabs L2

       T1 starts waiting on L2

       Prior to this patch, when GET-MUTEX in T2's interrupt handler grabbed
       L2, it left T2 marked as still waiting for L1 -- which is not true until
       the interrupt handler returns. T1 therefore saw the cycle
       T1 -> L2 -> T2 -> L1 -> T1 and reported a bogus deadlock.

     Problem 2:

       T1 holds L1

       T2 is waiting for L1

       GC is triggered in T2 inside GET-MUTEX

       T2 grabs *ALREADY-IN-GC* lock

       GC is triggered in T1, T1 tries to get *ALREADY-IN-GC* lock.

       Prior to this patch, T1 detected a bogus deadlock because T2 was
       still marked as waiting for L1 -- which is not true until the GC is
       finished and normal execution resumes.

     Problem 3:

       T1 holds L1

       T2 is waiting for L1

       GC is triggered in T2 inside GET-MUTEX

       T2 grabs lock L2 due to a finalizer or an after-gc-hook

       GC is triggered in T1

       T1 tries to grab L2 due to a finalizer, etc.

       Same as problem 2, but with a user-lock and POST-GC instead of
       *ALREADY-IN-GC* and SUB-GC.

     This patch fixes the issue by saving, clearing, and restoring
     the waiting-for mark in

      1) interrupt handlers

      2) SUB-GC

      3) POST-GC
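
The save/clear/restore idea can be illustrated with a toy model of Problem 1. This is only a sketch in Python, not SBCL's code: the names `Thread`, `Lock`, `would_deadlock`, `without_waiting_for` and `problem_1` are all hypothetical, and the detector is assumed to work by following "lock owner -> lock that owner is waiting for" links until it either runs out of links or gets back to the thread that is asking.

```python
from contextlib import contextmanager


class Thread:
    def __init__(self, name):
        self.name = name
        self.waiting_for = None  # lock this thread is marked as blocked on


class Lock:
    def __init__(self, name):
        self.name = name
        self.owner = None  # thread currently holding the lock


def would_deadlock(thread, lock):
    """Return True if `thread` waiting on `lock` closes a cycle (assumed detector)."""
    seen = set()
    while lock is not None and lock.owner is not None:
        owner = lock.owner
        if owner is thread:
            return True
        if id(owner) in seen:
            return False  # a cycle that does not involve `thread`
        seen.add(id(owner))
        lock = owner.waiting_for
    return False


@contextmanager
def without_waiting_for(thread):
    """Save, clear, and restore the waiting-for mark around a body of code."""
    saved = thread.waiting_for
    thread.waiting_for = None
    try:
        yield
    finally:
        thread.waiting_for = saved


def problem_1(apply_fix):
    """Replay Problem 1; return (deadlock reported?, mark restored afterwards?)."""
    t1, t2 = Thread("T1"), Thread("T2")
    l1, l2 = Lock("L1"), Lock("L2")
    l1.owner = t1          # T1 holds L1
    t2.waiting_for = l1    # T2 is waiting for L1

    def interrupt_handler():
        l2.owner = t2                  # T2's interrupt handler grabs L2
        return would_deadlock(t1, l2)  # T1 starts waiting on L2

    if apply_fix:
        with without_waiting_for(t2):
            reported = interrupt_handler()
    else:
        reported = interrupt_handler()
    return reported, t2.waiting_for is l1


print(problem_1(apply_fix=False))
print(problem_1(apply_fix=True))
```

In this model the unpatched run reports a deadlock because T2's stale mark links L2 back to L1 and hence to T1. With the mark cleared for the duration of the handler the chain stops at T2, and the mark is back in place once the handler is done. Problems 2 and 3 have the same shape, with SUB-GC or POST-GC in place of the interrupt handler.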