cannot compile sbcl on 32-bit linux

Bug #1432529 reported by Andrey Grozin
This bug affects 2 people
Affects: SBCL
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Trying to compile sbcl-1.2.9 (and 1.2.7) on two 32-bit Linux computers
(Gentoo), I get:

+ echo //doing warm init - compilation phase
//doing warm init - compilation phase
+ ./src/runtime/sbcl --core output/cold-sbcl.core --lose-on-corruption --no-sysinit --no-userinit
This is SBCL 1.2.9, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
COLD-INIT... (PACKAGE = "COMMON-LISP")
("Length(TLFs)= " 24692)
CORRUPTION WARNING in SBCL pid 2441:
Memory fault at (nil) (pc=0x80625d0, sp=0xb39becb8)
The integrity of this image is possibly compromised.
Exiting.
Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb>

On a 64-bit Gentoo computer compilation succeeds. In the past, I
successfully compiled sbcl on both of these 32-bit systems (at least up to
1.2.7). I don't know which software upgrade led to this problem; maybe
a newer version of glibc? I'm sure I upgraded many things between the last
successful build of 1.2.7 and the first unsuccessful builds of 1.2.9 and
1.2.7.

Stas Boukarev (stassats) wrote :

32-bit SBCL obviously builds fine for others, so it's about your setup (or Gentoo). Things like hardened gcc and other security tricks may affect it.

Andrey Grozin (a-g-grozin) wrote :

In the past, I had no problems building sbcl on the same two x86 boxes. I don't use hardened. Recent software upgrades were glibc 2.19 -> 2.20 and gcc 4.8.3 -> 4.9.2. The failure is identical on both computers, so hardware memory faults are out of the question.
Is there any way to investigate what's going on? From the buildlog I see that the (downloaded) binary sbcl-1.2.7 has done its work normally (or so it seems), and has produced an sbcl-1.2.9 executable. But warm init fails:

./src/runtime/sbcl --core output/cold-sbcl.core --lose-on-corruption --no-sysinit --no-userinit
<skipped>
COLD-INIT... (PACKAGE = "COMMON-LISP")
("Length(TLFs)= " 24692)
CORRUPTION WARNING in SBCL pid 2441:
Memory fault at (nil) (pc=0x80625d0, sp=0xb39becb8)
The integrity of this image is possibly compromised.

What can I investigate further? I'm the maintainer of maxima and fricas, as well as common lisps which can compile them, in Gentoo. I *have* to find out what's going wrong on x86.

Andrey Grozin (a-g-grozin) wrote :

ldb> backtrace
Backtrace:
   0: Foreign function ldb_monitor, fp = 0xb3c02a38, ra = 0x805bfca
ldb>

Stas Boukarev (stassats) wrote :

And the gdb backtrace?

Andrey Grozin (a-g-grozin) wrote :

Why are text/plain files shown in this bug tracker as if they contain no \n's?

Andrey Grozin (a-g-grozin) wrote :

Sorry for the noise, this last attached text shows normally.

Douglas Katzman (dougk) wrote :

urgh. building with sb-show fails for a different reason.
Please pull in http://sourceforge.net/p/sbcl/sbcl/ci/0c45fb49d4b92d5a15a2cb92f1e8514315d1f236/ and try again.

Andrey Grozin (a-g-grozin) wrote :

Without --disable-debugger, starting sbcl as in make-host-2.sh I get

../sbcl-binary/run-sbcl.sh --no-sysinit --no-userinit < make-host-2.lisp
(running SBCL from: ../sbcl-binary)
This is SBCL 1.2.7, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
*
5
* target features *SHEBANG-FEATURES*=(:SB-CORE-COMPRESSION :OS-PROVIDES-POLL
                                      :OS-PROVIDES-GETPROTOBY-R
                                      :OS-PROVIDES-SUSECONDS-T
                                      :OS-PROVIDES-BLKSIZE-T :OS-PROVIDES-PUTWC
                                      :OS-PROVIDES-DLADDR :OS-PROVIDES-DLOPEN
                                      :LITTLE-ENDIAN :LINKAGE-TABLE
                                      :SYMBOL-INFO-VOPS :ASH-RIGHT-VOPS
                                      :MULTIPLY-HIGH-VOPS :MEMORY-BARRIER-VOPS
                                      :PRECISE-ARG-COUNT-ERROR
                                      :INLINE-CONSTANTS :CYCLE-COUNTER
                                      :ALIEN-CALLBACKS
                                      :STACK-ALLOCATABLE-FIXED-OBJECTS
                                      :STACK-ALLOCATABLE-LISTS
                                      :STACK-ALLOCATABLE-VECTORS
                                      :STACK-ALLOCATABLE-CLOSURES
                                      :RAW-INSTANCE-INIT-VOPS
                                      :UNWIND-TO-FRAME-AND-CALL-VOP
                                      :COMPARE-AND-SWAP-VOPS
                                      :C-STACK-IS-CONTROL-STACK
                                      :STACK-GROWS-DOWNWARD-NOT-UPWARD :GENCGC
                                      :LARGEFILE :SB-FUTEX :LINUX :ELF :UNIX
                                      :X86 :IEEE-FLOATING-POINT
                                      :PACKAGE-LOCAL-NICKNAMES
                                      :SB-SOURCE-LOCATIONS :SB-EVAL :SB-UNICODE
                                      :SB-PACKAGE-LOCKS :SB-LDB :SB-SHOW
                                      :SB-DOC :SBCL :COMMON-LISP :ANSI-CL)
target backend-subfeatures *SHEBANG-BACKEND-FEATURES*=NIL
T
*
#<PACKAGE "SB-COLD">
*
"obj/from-xc/"

<skipped>

; compilation unit finished
; Undefined type:
; CODE-DELETION-NOTE
; Undefined functions:
; SB!C::DEBUG-INFO-FOR-COMPONENT SB!C::DEBUG-SOURCE-FOR-INFO DUMP-CHARACTERS-OF-STRING DUMP-MULTI-DIM-ARRAY DUMP-RAW-BYTES DUMP-SIMPLE-CHARACTER-STRING FIXUP-FLAVOR FIXUP-NAME FIXUP-NOTE-FIXUP FIXUP-NOTE-KIND FIXUP-NOTE-POSITION NATIVE-NAMESTRING STRING-TO-OCTETS
; caught 74 WARNING conditions
; caught 18 STYLE-WARNING conditions
; printed 74 notes

; /var/tmp/portage/dev-lisp/sbcl-1.2.9/work/sbcl-1.2.9/obj/from-xc/src/compiler/dump.lisp-obj-tmp written
; compilation finished in 0:00:00.630

debugger invoked on a SIMPLE-ERROR in thread
#<THREAD "main thread" RUNNING {AF43971}>:
  FAILURE-P was set when creating "obj/from-x...


Andrey Grozin (a-g-grozin) wrote :

Thanks. After applying the patch http://sourceforge.net/p/sbcl/sbcl/ci/0c45fb49d4b92d5a15a2cb92f1e8514315d1f236/ the previous failure has disappeared, and the build proceeds further. We are back at the original failure in make-target-2.

Andrey Grozin (a-g-grozin) wrote :

How do I produce a meaningful gdb backtrace?

elrond sbcl-1.2.9 # gdb --args src/runtime/sbcl --core output/cold-sbcl.core --lose-on-corruption --no-sysinit --no-userinit < make-target-2-load.lisp
GNU gdb (Gentoo 7.9 vanilla) 7.9
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from src/runtime/sbcl...done.
(gdb) Undefined command: "". Try "help".
(gdb) Undefined command: "". Try "help".
<skipped>
(gdb) quit
elrond sbcl-1.2.9 #

Andrey Grozin (a-g-grozin) wrote :

Sorry for my stupidity. Here we are:

elrond sbcl-1.2.9 # gdb --args src/runtime/sbcl --core output/cold-sbcl.core --lose-on-corruption --no-sysinit --no-userinit
GNU gdb (Gentoo 7.9 vanilla) 7.9
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from src/runtime/sbcl...done.
(gdb) run
Starting program: /var/tmp/portage/dev-lisp/sbcl-1.2.9/work/sbcl-1.2.9/src/runtime/sbcl --core output/cold-sbcl.core --lose-on-corruption --no-sysinit --no-userinit

Program received signal SIGUSR1, User defined signal 1.
0xb7fdc424 in __kernel_vsyscall ()
(gdb) backtrace
#0 0xb7fdc424 in __kernel_vsyscall ()
#1 0xb7db3d56 in kill () from /lib/libc.so.6
#2 0x0805754c in see_if_sigaction_nodefer_works () at interrupt.c:1804
#3 0x0805af6a in interrupt_init () at interrupt.c:2046
#4 0x080511a4 in main (argc=6, argv=0xbfffede4, envp=0xbfffee00) at runtime.c:458
(gdb)

Stas Boukarev (stassats) wrote :

Still not right, you need gdb -p pid-of-sbcl

Douglas Katzman (dougk) wrote :

In comment #11, gdb stopped where it did simply because it intercepted a signal that occurs well before the point of failure in your original comment. To make it proceed past this, you'd have to enter 'handle sigusr1 nostop pass' in gdb.
Stas is suggesting instead that you simply let it crash in the build and then attach gdb.

Do you have the build log with 'sb-show' but without attempting to run it under gdb? That should reveal something.
You just snip everything above LENGTH(TLFS) = ... and show from there onward.
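
The two approaches described in this comment (attach to the crashed process, or run under gdb while passing SIGUSR1 through) might be scripted roughly as follows. This is only a sketch: the binary and core paths are copied from the commands quoted earlier in this report, while the function names and the GDB override variable are made up for illustration. The pid is the one printed in the CORRUPTION WARNING line.

```shell
#!/bin/sh
# Sketch only: paths and sbcl options are copied from the commands quoted in
# this report. attach_backtrace, run_under_gdb and the GDB variable are
# hypothetical names chosen for this example.
SBCL=src/runtime/sbcl
CORE=output/cold-sbcl.core

# Option 1: let the build crash into ldb, then attach to the live process
# and print a backtrace. $1 is the pid from the CORRUPTION WARNING line.
attach_backtrace() {
    "${GDB:-gdb}" -p "$1" -batch -ex backtrace
}

# Option 2: start the runtime under gdb, telling gdb not to stop on
# SIGUSR1 and to pass it on to the program.
run_under_gdb() {
    "${GDB:-gdb}" -ex 'handle SIGUSR1 nostop pass' -ex run \
        --args "$SBCL" --core "$CORE" --lose-on-corruption \
        --no-sysinit --no-userinit
}
```

For example, `attach_backtrace 2441` would attach to the process from the original report while it sits at the ldb> prompt. With option 2, gdb stops on SIGSEGV by default before the runtime's own handler runs, so typing `backtrace` at that stop should show the faulting C frame.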

Douglas Katzman (dougk) wrote :

oh, sorry I didn't see the log...
So it's making it as far as DEBUG-INFO-COLD-INIT.
Here is a small patch to narrow down the failure within there:

--- a/src/code/debug-info.lisp
+++ b/src/code/debug-info.lisp
@@ -303,9 +303,15 @@
 (defvar *!initial-debug-sources*)

 (defun !debug-info-cold-init ()
+ (write-string "Before G-U-T") (terpri)
   (let ((now (get-universal-time)))
+ (write-string "After G-U-T") (terpri)
     (dolist (debug-source *!initial-debug-sources*)
+ (write-string "Before FILE-WRITE-DATE ")
+ (write (debug-source-namestring debug-source))
+ (terpri)
       (let* ((namestring (debug-source-namestring debug-source))
              (timestamp (file-write-date namestring)))
+ (write-string "After FILE-WRITE-DATE") (terpri)
         (setf (debug-source-created debug-source) timestamp
               (debug-source-compiled debug-source) now)))))

You can't use FORMAT - the patch must be applied exactly as-is.
If successful, it should print:
COLD-INIT... (PACKAGE = "COMMON-LISP")
("Length(TLFs)= " 14243)
Before G-U-T
After G-U-T
Before FILE-WRITE-DATE "SYS:SRC;PCL;WALK.LISP"
After FILE-WRITE-DATE
... and many more

Andrey Grozin (a-g-grozin) wrote :

> Still not right, you need gdb -p pid-of-sbcl

elrond sbcl-1.2.9 # gdb -p 10175
GNU gdb (Gentoo 7.9 vanilla) 7.9
<skipped>
Attaching to process 10175
Reading symbols from /var/tmp/portage/dev-lisp/sbcl-1.2.9/work/sbcl-1.2.9/src/runtime/sbcl...done.
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Reading symbols from /lib/libz.so.1...(no debugging symbols found)...done.
Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done.
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
0xb7fdc424 in __kernel_vsyscall ()
(gdb) backtrace
#0 0xb7fdc424 in __kernel_vsyscall ()
#1 0xb7e60e73 in read () from /lib/libc.so.6
#2 0xb7df46c7 in _IO_file_underflow () from /lib/libc.so.6
#3 0xb7df55e7 in _IO_default_uflow () from /lib/libc.so.6
#4 0xb7df53fc in __uflow () from /lib/libc.so.6
#5 0xb7de9a45 in _IO_getline_info () from /lib/libc.so.6
#6 0xb7de9b6e in _IO_getline () from /lib/libc.so.6
#7 0xb7de8952 in fgets () from /lib/libc.so.6
#8 0x0805bef2 in fgets (__stream=<optimized out>, __n=256, __s=0xb38d191c "\370\276\375\267\001") at /usr/include/bits/stdio2.h:263
#9 sub_monitor () at monitor.c:480
#10 ldb_monitor () at monitor.c:524
#11 0x08056ddc in call_lossage_handler () at interr.c:106
#12 0x08056ffe in corruption_warning_and_maybe_lose (fmt=0x806dda4 "Memory fault at %p (pc=%p, sp=%p)") at interr.c:148
#13 0x0805b0d9 in lisp_memory_fault_error (context=0xb38d1c8c, addr=0x0) at interrupt.c:2090
#14 0x08057402 in low_level_handle_now_handler (signal=11, info=0xb38d1c0c, void_context=0xb38d1c8c) at interrupt.c:1925
#15 <signal handler called>
#16 copy_to_stat_wrapper (from=0xb368dd48, to=0xb388dfb4) at wrap.c:184
#17 stat_wrapper (file_name=0xc6b1268 "/var/tmp/portage/dev-lisp/sbcl-1.2.9/work/sbcl-1.2.9/src/pcl/walk.lisp", buf=0xb388dfb4)
    at wrap.c:225
#18 0x092d7f27 in ?? ()
#19 0x0a95c8de in ?? ()
#20 0x0a95e9e3 in ?? ()
#21 0x09e77412 in ?? ()
#22 0x0af5ae67 in ?? ()
#23 0x0806a2eb in Ldone () at x86-assem.S:449
#24 0x0af5a4d5 in ?? ()
Backtrace stopped: frame did not save the PC
(gdb)

Andrey Grozin (a-g-grozin) wrote :

With the patch from comment #14:

Before G-U-T
After G-U-T
Before FILE-WRITE-DATE "SYS:SRC;PCL;WALK.LISP"
CORRUPTION WARNING in SBCL pid 12583:
Memory fault at (nil) (pc=0x8062490, sp=0xb39bed48)
The integrity of this image is possibly compromised.
Exiting.
Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb>

Stas Boukarev (stassats) wrote :

Show us disassemble copy_to_stat_wrapper

Andrey Grozin (a-g-grozin) wrote :

Stas, Douglas, maybe, it would be easier to give you access to elrond.inp.nsk.su? I can send you username/password in private mails.

Stas Boukarev (stassats) wrote :

That could work.

Stas Boukarev (stassats) wrote :

Did I do something wrong? I just built sbcl from git, and the only failures were ipv6 sb-bsd-sockets tests. What patches does gentoo apply?

Andrey Grozin (a-g-grozin) wrote :

Gentoo always uses the user's CFLAGS rather than the ones recommended by upstream. I found that
-march=native -O2
-march=pentium4 -O2
-O3
are OK, but
-march=native -O3
-march=pentium4 -O3
lead to the error described above. And this happens for gcc-4.9.2; I suspect that for gcc-4.8.4 (before the last gcc upgrade on my x86 box) all these combinations were OK.
I think this bug can be closed. This is more a gcc problem than an sbcl one.
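
A flag matrix like the one above could be re-checked with a small loop such as this. It is a sketch under assumptions: BUILD_CMD is a hypothetical placeholder for whatever rebuilds SBCL and runs the warm-init step with the CFLAGS found in its environment (it is not a script shipped with SBCL), and how CFLAGS actually reach the runtime build depends on the ebuild or makefile in use.

```shell
#!/bin/sh
# Sketch only. BUILD_CMD is a hypothetical placeholder: a command that
# rebuilds and smoke-tests SBCL using $CFLAGS and exits non-zero on failure.
: "${BUILD_CMD:=./build-and-test.sh}"

# The five CFLAGS combinations listed in the comment above, one per line.
flag_sets='-march=native -O2
-march=pentium4 -O2
-O3
-march=native -O3
-march=pentium4 -O3'

# Run BUILD_CMD once per combination and print OK or FAIL for each.
try_flag_sets() {
    printf '%s\n' "$flag_sets" | while IFS= read -r flags; do
        if CFLAGS="$flags" "$BUILD_CMD" >/dev/null 2>&1; then
            echo "OK   $flags"
        else
            echo "FAIL $flags"
        fi
    done
}
```

Calling `try_flag_sets` after pointing BUILD_CMD at a real build wrapper prints one OK or FAIL line per combination, which makes it easy to repeat the comparison after a compiler upgrade.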

Stas Boukarev (stassats)
Changed in sbcl:
status: New → Incomplete
Douglas Katzman (dougk)
Changed in sbcl:
status: Incomplete → Invalid