SBCL

Difficult to reproduce bug

Bug #2059888 reported by Paul F. Dietz on 2024-04-01

Affects  Status        Importance  Assigned to  Milestone
SBCL     Fix Released  Undecided   Unassigned

Bug Description

I'm seeing a failure in random testing that I can't convert to code that reliably shows the bug. This report describes what I know so far.

It manifests as a simple lambda expression that is compiled incorrectly, but compiles correctly when it is recompiled soon afterwards. So I can't just give you the lambda and have it fail reliably. Producing the failure takes a random amount of time running one of the RTP tests in the ansi-tests random test suite (specifically, RPT.RATIONALIZE).

Here's an example of such a lambda:

(LAMBDA (P1)
  (DECLARE (OPTIMIZE (SPEED 3) (SAFETY 2) (DEBUG 1) (SPACE 0))
           (TYPE
            (OR
             (MEMBER 576460752303423488 -1327529088978 2147483656
                     -618503095027453586 -10206226986129535)
             (MEMBER 24325/2704 3405008/1402941 -51275393
                     -158996776224437617/397337938509631333
                     -4665664940572965/4106957828165858
                     12300800207741/32429720173890))
            P1))
  (RATIONALIZE (THE (OR T (EQL C)) P1)))

The failure was that it reported the value -51275393 was not in the declared type (although it is). When I disassemble the compiled form of this lambda, the result is this:

; disassembly for (LAMBDA (P1))
; Size: 5 bytes. Origin: #x554EFCDA ; (LAMBDA (P1))
; A: C9 LEAVE
; B: F8 CLC
; C: C3 RET
; D: CC10 INT3 16 ; Invalid argument count trap

When I recompile that lambda and disassemble it, I get slightly different code that doesn't show the bug:

; disassembly for (LAMBDA (P1))
; Size: 8 bytes. Origin: #x554A5BDA ; (LAMBDA (P1))
; DA: 488BD3 MOV RDX, RBX
; DD: C9 LEAVE
; DE: F8 CLC
; DF: C3 RET
; E0: CC10 INT3 16 ; Invalid argument count trap

This is SBCL "2.4.3.31-297df3308" on x86-64.
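
The statement that -51275393 belongs to the declared type can be checked independently of the compiled lambda. The following is a minimal sketch; the variable name *DECLARED-TYPE* is introduced here for illustration and is not part of the original report or test suite.

```lisp
;; The type specifier copied from the DECLARE form in the lambda above.
;; *DECLARED-TYPE* is an illustrative name, not something from the report.
(defparameter *declared-type*
  '(or
    (member 576460752303423488 -1327529088978 2147483656
            -618503095027453586 -10206226986129535)
    (member 24325/2704 3405008/1402941 -51275393
            -158996776224437617/397337938509631333
            -4665664940572965/4106957828165858
            12300800207741/32429720173890)))

;; MEMBER types compare with EQL, which compares integers by value,
;; so this is expected to be true.
(typep -51275393 *declared-type*)

;; RATIONALIZE returns a rational argument unchanged, so a correctly
;; compiled lambda should simply return its argument.
(rationalize -51275393)
```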

I will try to distill the random testing that generates this failure down to a smaller piece of code that can reliably stimulate the bug, but I haven't done that yet.

Paul F. Dietz (paul-f-dietz) wrote on 2024-04-01:

Additional information: the compiled code that fails on that input works fine on the other elements of that declared type. It's just -51275393 that causes the error.
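
A per-element check of this kind might look like the sketch below. It compiles the lambda afresh, calls it on every member of the declared type, and collects the values that are mishandled. The names *MEMBERS*, *FORM*, CHECK-ONCE and HUNT are hypothetical and do not come from the attached file or from ansi-tests. Since the failure reportedly appears only after a random amount of other testing, a loop like this run in a fresh image may well never trigger it; it is meant only as a way to test one freshly compiled function.

```lisp
;; All names below are illustrative assumptions, not part of the report.
(defparameter *members*
  '(576460752303423488 -1327529088978 2147483656
    -618503095027453586 -10206226986129535
    24325/2704 3405008/1402941 -51275393
    -158996776224437617/397337938509631333
    -4665664940572965/4106957828165858
    12300800207741/32429720173890))

;; The lambda expression exactly as given in the bug description.
(defparameter *form*
  '(lambda (p1)
     (declare (optimize (speed 3) (safety 2) (debug 1) (space 0))
              (type
               (or
                (member 576460752303423488 -1327529088978 2147483656
                        -618503095027453586 -10206226986129535)
                (member 24325/2704 3405008/1402941 -51275393
                        -158996776224437617/397337938509631333
                        -4665664940572965/4106957828165858
                        12300800207741/32429720173890))
               p1))
     (rationalize (the (or t (eql c)) p1))))

(defun check-once ()
  "Compile *FORM* afresh and return the members of *MEMBERS* for which the
compiled function signals an error or returns something other than its
argument (RATIONALIZE of a rational should be that rational)."
  (let ((fn (compile nil *form*)))
    (loop for x in *members*
          unless (handler-case (eql (funcall fn x) x)
                   (error () nil))
            collect x)))

(defun hunt (&optional (tries 100))
  "Call CHECK-ONCE up to TRIES times. Return the first non-empty list of
mishandled values and the iteration index, or NIL if none was found."
  (dotimes (i tries nil)
    (let ((bad (check-once)))
      (when bad
        (return (values bad i))))))
```

On a correctly working compiler, (check-once) should return NIL, and (hunt) should return NIL after exhausting its tries.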

Christophe Rhodes (csr21-cantab) wrote on 2024-04-01:

Can you give step-by-step instructions to attempt to reproduce elsewhere?

Paul F. Dietz (paul-f-dietz) wrote on 2024-04-01:

bug376.lisp (4.6 KiB, text/plain)

I've attached a file that stimulates the bug. Load this file, then do (bug376:bug376). On my computer (running under Slime) this causes a failure in a few minutes.

Stas Boukarev (stassats) on 2024-04-02