Weak hash-tables are not gc-safe.

Bug #2096998 reported by Stas Boukarev
Affects: SBCL
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

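;; Weak-key EQ table; every fresh cons key becomes garbage right away.
(defvar *h* (make-hash-table :test #'eq :weakness :key))
;; Background thread that forces a full GC roughly every millisecond.
(sb-thread:make-thread (lambda ()
                         (loop (sb-ext:gc :full t) (sleep 0.001)))
                       :name "gc stress")
;; Insert fresh keys in a tight loop while the GC thread runs.
(loop (setf (gethash (cons 1 2) *h*) 10))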

GC invariant lost, file "gc-common.c", line 1794

The failing assertion, in cull_weak_hash_table_bucket:
gc_assert(value != empty_symbol);

Bisected to

https://github.com/sbcl/sbcl/commit/4d328c3324e10e24397603d65ed755403eb365fb

commit 4d328c3324e10e24397603d65ed755403eb365fb (HEAD)
Author: Douglas Katzman <email address hidden>
Date: Sun Jan 28 14:25:55 2024 -0500

    Implement a hash-table chain correctness checker in C

    A later commit will enable the assertions. This piece is separated out
    since there is a small required change to PUTHASH which is trivial
    but interesting, and should not get lost in the noise.

Douglas Katzman (dougk) wrote :

Could not reproduce on x86-64 but could on arm64. Relaxed memory ordering is affecting it; I need to find where.

Stas Boukarev (stassats) wrote :

I can reproduce on both x86-64 and arm64.

Stas Boukarev (stassats) wrote :

4d328c3324e10e24397603d65ed755403eb365fb crashes much quicker.

Douglas Katzman (dougk) wrote :

My x86-64 MacBook was able to get the crash in 4d328c3. Putting (sb-thread:barrier (:write)) at the end of insert-at did not fix it, so I'll keep looking.

Stas Boukarev (stassats) wrote :

There's only one thread using the table, so I don't see what barriers can fix. The GC thread does stop the world, at which point memory reordering doesn't happen.

Robert Brown (robert-brown) wrote :

I have some old SBCL versions on my slow Intel Celeron N3450 laptop. The code crashes recent SBCL releases reliably in a few seconds for me. On my laptop, it looks like some change between 2.3.11 and 2.4.2 introduced a bug.

Douglas Katzman (dougk) wrote :

I think the failing assertion is just wrong, as the code is now. It's possible to interrupt for GC between the store to key and store to value.
The assertion was correct at a much earlier revision, namely up to and including
commit 92bb608a50a2163864b9a445ef333064d4df924f
    Don't inhibit GC during weak table operations except rehash

I want to think a tiny bit more and then will remove the assertion.
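
The race described above can be illustrated with a toy model in C. This is only a sketch under stated assumptions, not SBCL's actual code: `struct pair`, `EMPTY`, `cull_pair`, and `insert_with_gc_between` are hypothetical names, integers stand in for Lisp objects, and the "GC" is an ordinary function call placed between the two stores. With `strict` set, the check modeled on gc_assert(value != empty_symbol) trips on a pair whose key has been stored but whose value has not; with `strict` clear, the half-filled pair is treated as a legal intermediate state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the empty marker; not SBCL's representation. */
#define EMPTY 0

/* One key/value slot of a toy weak table. */
struct pair { int key; int value; };

enum cull_result { KEPT, CULLED, INVARIANT_LOST };

/* Toy version of culling a single pair during GC.
 * strict == true models gc_assert(value != empty_symbol);
 * strict == false tolerates a key whose value is not stored yet. */
enum cull_result cull_pair(struct pair *p, bool key_is_live, bool strict)
{
    if (p->key == EMPTY)
        return KEPT;                      /* unused slot, nothing to do */
    if (strict && p->value == EMPTY)
        return INVARIANT_LOST;            /* where the assertion would fire */
    if (!key_is_live) {
        p->key = EMPTY;                   /* drop the dead entry */
        p->value = EMPTY;
        return CULLED;
    }
    return KEPT;
}

/* Insert using two separate stores, with a simulated GC in between.
 * The key counts as live because the inserting code still holds it. */
enum cull_result insert_with_gc_between(struct pair *p, int key,
                                        int value, bool strict)
{
    assert(key != EMPTY && value != EMPTY);
    p->key = key;                                        /* store to key */
    enum cull_result seen = cull_pair(p, true, strict);  /* GC runs here */
    p->value = value;                                    /* store to value */
    return seen;
}
```

Calling `insert_with_gc_between` on an empty pair with `strict` set reports `INVARIANT_LOST`, while the same call with `strict` clear reports `KEPT` and leaves the pair fully populated. That matches the reasoning in the comment: once GC is no longer inhibited around the insertion, the strict check rejects a state the code can legitimately be in.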

Stas Boukarev (stassats) wrote :

If it is indeed an errant assertion then it would be nice to have it removed before the release.

Douglas Katzman (dougk)
Changed in sbcl:
status: Triaged → Fix Committed
Changed in sbcl:
status: Fix Committed → Fix Released