"GC invariant lost" when loading a fasl file
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
SBCL | Fix Released | High | Unassigned |
Bug Description
My system runs fine in production on SBCL 1.3.9. However, when I try to upgrade to the latest SBCL (1.4.6), the compilation crashes. To rule out the configuration of my development box, I launched a fresh instance on EC2. The problem persists. Next, I did a binary search over the old SBCL releases and found that everything is fine on 1.3.15, but the crash happens on 1.3.16. The crash also occurs with the current head of the Git master branch.
The 1.3.16 release notes contain the entry "optimization: various small tweaks give around 5% faster garbage collection". Maybe this optimization is the culprit?
I managed to isolate the source of the crash to a certain (autogenerated) piece of code (attached). It is not hard to circumvent the problem by modifying the code, but I'm still not upgrading the production servers to 1.4.6 with this bug hiding somewhere.
With the isolated problem code, SBCL 1.3.16 drops to LDB with the following message:
GC invariant lost, file "gc-common.c", line 309
On SBCL 1.4.6, the message is a bit different:
no scavenge function for object 0x358b48fff402733d (widetag 0x3d)
However, when I compile the full system, 1.4.6 also usually gives the "GC invariant lost" error. So, apparently, this is due to some kind of memory corruption.
To reproduce, launch SBCL, evaluate (compile-file "crash-sbcl.lisp") and then (load "crash-sbcl.fasl"). Compilation and loading do not even need to happen in the same session: just starting SBCL and loading the FASL triggers the problem.
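The reproduction steps can be wrapped in a single function. This is only a sketch: `reproduce-crash` is a hypothetical helper name, and it assumes the attached file has been saved as "crash-sbcl.lisp" in the current directory. On an affected SBCL the call is expected to drop into LDB rather than return.

```lisp
;; Hypothetical helper, not part of SBCL or of the attached file.
;; Assumes the attachment from this report is saved as "crash-sbcl.lisp".
(defun reproduce-crash (&optional (source "crash-sbcl.lisp"))
  ;; COMPILE-FILE returns the truename of the fasl it wrote, or NIL.
  (let ((fasl (compile-file source)))
    (unless fasl
      (error "Compilation of ~A produced no fasl file." source))
    ;; Loading the fasl is the step that triggers the crash.
    (load fasl)))
```

Evaluating (reproduce-crash) in a freshly started SBCL performs both steps in one session.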
Note that specifying a larger dynamic space size (e.g. 16G) makes the problem disappear in the test case, but the problem still occurs with the full software. With the default one-gigabyte heap, the test case crashes every time.
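To confirm which heap size a given session is actually running with, the value can be read back from inside SBCL. A minimal sketch: `heap-size-gib` is a hypothetical helper name, while `sb-ext:dynamic-space-size` is SBCL's own accessor and returns the size in bytes. The heap size itself is set when the runtime starts, for example with `sbcl --dynamic-space-size 16GB`.

```lisp
;; Hypothetical helper: report the dynamic space size of this session in GiB.
;; SB-EXT:DYNAMIC-SPACE-SIZE returns the size in bytes.
(defun heap-size-gib ()
  (/ (sb-ext:dynamic-space-size) (expt 1024.0 3)))

;; Print the value so it can be recorded alongside a crash report.
(format t "Dynamic space size: ~,2F GiB~%" (heap-size-gib))
```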
Changed in sbcl:
status: New → Triaged
importance: Undecided → High
Changed in sbcl:
status: Triaged → Fix Released
The test case can be reduced to:
(progn
(eval `(lambda () ,@(loop for i below 40000 collect `(setf * ,(float i)))))
(gc))
The crash is caused by the constants overflowing the header.