Failed to compile SBCL under Termux (Android)

Bug #1856377 reported by Xin Wang
This bug affects 1 person
Affects: SBCL
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

When trying to bootstrap SBCL under Termux (a Linux environment for Android) using ECL:

% sh make.sh --xc-host=ecl

The following error is encountered:

cc -g -o sbcl alloc.o backtrace.o breakpoint.o coalesce.o coreparse.o dynbind.o funcall.o gc-common.o globals.o hopscotch.o interr.o interrupt.o largefile.o main.o monitor.o murmur_hash.o os-common.o parse.o print.o purify.o pthread-futex.o regnames.o run-program.o runtime.o safepoint.o save.o sc-offset.o search.o thread.o time.o validate.o var-io.o vars.o wrap.o arm64-arch.o linux-os.o linux-mman.o arm64-linux-os.o fullcgc.o gencgc.o traceroot.o arm64-assem.o ldso-stubs.o -ldl -lpthread -lm
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: run-program.o: in function `spawn':
/data/data/com.termux/files/home/Live/Lisp/code/sbcl/src/runtime/run-program.c:201: undefined reference to `getdtablesize'
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: arm64-assem.o: in function `call_into_lisp':
/data/data/com.termux/files/home/Live/Lisp/code/sbcl/src/runtime/arm64-assem.S:91: undefined reference to `current_thread'
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: BFD (GNU Binutils) 2.33.1 assertion fail /home/builder/.termux-build/binutils/src/bfd/elfnn-aarch64.c:5088
clang-9: error: unable to execute command: Segmentation fault
clang-9: error: linker command failed due to signal (use -v to see invocation)
make: *** [GNUmakefile:98: sbcl] Error 254

% uname -a
Linux localhost 3.18.120-perf #1 SMP PREEMPT Tue Jan 22 18:45:05 CST 2019 aarch64 Android

Stas Boukarev (stassats) wrote :

aarch64-linux-android-ld actually segfaults, so not really the fault of sbcl itself.

Changed in sbcl:
status: New → Invalid

Xin Wang (dramwang) wrote :

If sb-thread is disabled, the linker no longer segfaults, but linking still fails:

cc -g -o sbcl alloc.o backtrace.o breakpoint.o coalesce.o coreparse.o dynbind.o funcall.o gc-common.o globals.o hopscotch.o interr.o interrupt.o largefile.o main.o monitor.o murmur_hash.o os-common.o parse.o print.o purify.o pthread-futex.o regnames.o run-program.o runtime.o safepoint.o save.o sc-offset.o search.o thread.o time.o validate.o var-io.o vars.o wrap.o arm64-arch.o linux-os.o linux-mman.o arm64-linux-os.o fullcgc.o gencgc.o traceroot.o arm64-assem.o ldso-stubs.o -ldl -lm
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: run-program.o: in function `spawn':
/data/data/com.termux/files/home/Live/Lisp/code/sbcl/src/runtime/run-program.c:201: undefined reference to `getdtablesize'
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: ldso-stubs.o: in function `ldso_stub__tcsetattr':
/data/data/com.termux/files/home/Live/Lisp/code/sbcl/src/runtime/ldso-stubs.S:138: undefined reference to `getdtablesize'
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: /data/data/com.termux/files/home/Live/Lisp/code/sbcl/src/runtime/ldso-stubs.S:138: undefined reference to `wait3'
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: warning: creating a DT_TEXTREL in a shared object
clang-9: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [GNUmakefile:98: sbcl] Error 1

Douglas Katzman (dougk) wrote :

See if hardwiring in a value for getdtablesize(), maybe 1024, and commenting out the call to wait3 allows the C linkage step to complete.
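
A minimal sketch of that workaround, assuming the goal is only to get an upper bound on open file descriptors without referencing `getdtablesize`. The helper name `fd_table_size` is hypothetical and is not part of the SBCL runtime; it asks `sysconf(_SC_OPEN_MAX)` first and falls back to the hardwired 1024 suggested above.

```c
#include <limits.h>
#include <unistd.h>

/* Hypothetical helper (not SBCL code): returns an upper bound on the
 * number of open file descriptors without calling getdtablesize(). */
static int fd_table_size(void)
{
    /* sysconf() returns -1 when the limit is indeterminate or on error. */
    long n = sysconf(_SC_OPEN_MAX);

    if (n <= 0 || n > INT_MAX)
        return 1024;    /* hardwired fallback, as suggested in the comment */
    return (int)n;
}
```

A call site such as the fd-closing loop in `spawn` could then iterate from `fd_table_size() - 1` down to 3; whether that is appropriate for the real runtime is an assumption here.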

Xin Wang (dramwang) wrote :

With the following change, and running `make.sh' with `--with-android', there is no longer any error related to `getdtablesize()':

diff --git a/src/runtime/run-program.c b/src/runtime/run-program.c
index 9be922cd1..00c5b823e 100644
--- a/src/runtime/run-program.c
+++ b/src/runtime/run-program.c
@@ -194,7 +194,7 @@ int spawn(char *program, char *argv[], int sin, int sout, int serr,
         dup2(serr, 2);
     }
     /* Close all other fds. */
-#ifdef SVR4
+#if defined SVR4 || defined LISP_FEATURE_ANDROID
     for (fd = sysconf(_SC_OPEN_MAX)-1; fd >= 3; fd--)
         if (fd != channel[1]) close(fd);
 #else
diff --git a/tools-for-build/ldso-stubs.lisp b/tools-for-build/ldso-stubs.lisp
index d08c620a1..e96196718 100644
--- a/tools-for-build/ldso-stubs.lisp
+++ b/tools-for-build/ldso-stubs.lisp
@@ -250,7 +250,7 @@ ldso_stub__ ## fct: ; \\
                    "fsync"
                    "ftruncate"
                    "getcwd"
-                   "getdtablesize"
+                   #-android "getdtablesize"
                    "getegid"
                    "getenv"
                    "getgid"

The error messages are now reduced to:

cc -g -o sbcl alloc.o backtrace.o breakpoint.o coalesce.o coreparse.o dynbind.o funcall.o gc-common.o globals.o hopscotch.o interr.o interrupt.o largefile.o main.o monitor.o murmur_hash.o os-common.o parse.o print.o purify.o pthread-futex.o regnames.o run-program.o runtime.o safepoint.o save.o sc-offset.o search.o thread.o time.o validate.o var-io.o vars.o wrap.o arm64-arch.o linux-os.o linux-mman.o arm64-linux-os.o fullcgc.o gencgc.o traceroot.o arm64-assem.o ldso-stubs.o -ldl -lm
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: ldso-stubs.o: in function `ldso_stub__dladdr':
/data/data/com.termux/files/home/Live/Lisp/code/sbcl/src/runtime/ldso-stubs.S:124: undefined reference to `wait3'
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: warning: creating a DT_TEXTREL in a shared object
clang-9: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [GNUmakefile:98: sbcl] Error 1

This is quite confusing, as `wait3' exists on Android; for example, the following code compiles:

#include <sys/wait.h>

int main()
{
        wait3(0, 0, 0);
}

Douglas Katzman (dougk) wrote :

I've removed the reference to wait3 from ldso-stubs; it's no longer needed.

Out of curiosity, can you try "cc -S" on your test program to see what symbol it calls for wait3?
I suspect that there may be a preprocessor macro which turns it into something else, or some similar substitution in the linker.

Xin Wang (dramwang) wrote :

Thanks.

The following is the result of "cc -S":

        .text
        .file "foo.c"
        .globl main // -- Begin function main
        .p2align 2
        .type main,@function
main: // @main
// %bb.0:
        sub sp, sp, #32 // =32
        stp x29, x30, [sp, #16] // 16-byte Folded Spill
        add x29, sp, #16 // =16
        mov x8, #0
        mov x0, x8
        mov w9, #0
        mov w1, w9
        mov x2, x8
        stur w9, [x29, #-4] // 4-byte Folded Spill
        bl wait3
        ldur w9, [x29, #-4] // 4-byte Folded Reload
        mov w0, w9
        ldp x29, x30, [sp, #16] // 16-byte Folded Reload
        add sp, sp, #32 // =32
        ret
.Lfunc_end0:
        .size main, .Lfunc_end0-main
                                        // -- End function
        .p2align 2 // -- Begin function wait3
        .type wait3,@function
wait3: // @wait3
// %bb.0:
        sub sp, sp, #48 // =48
        stp x29, x30, [sp, #32] // 16-byte Folded Spill
        add x29, sp, #32 // =32
        mov w8, #-1
        stur x0, [x29, #-8]
        stur w1, [x29, #-12]
        str x2, [sp, #8]
        ldur x1, [x29, #-8]
        ldur w2, [x29, #-12]
        ldr x3, [sp, #8]
        mov w0, w8
        bl wait4
        ldp x29, x30, [sp, #32] // 16-byte Folded Reload
        add sp, sp, #48 // =48
        ret
.Lfunc_end1:
        .size wait3, .Lfunc_end1-wait3
                                        // -- End function

        .ident "clang version 9.0.0 (tags/RELEASE_900/final)"
        .section ".note.GNU-stack","",@progbits
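
In the listing above, `wait3` is emitted as a local function (there is no `.globl wait3`) that forwards to `wait4` with -1 as the first argument. That would be consistent with the header supplying `wait3` as an inline wrapper rather than the C library exporting a `wait3` symbol, which could explain why the test program compiles while the reference in ldso-stubs.S fails to link. The sketch below illustrates that pattern under this assumption; `wait3_via_wait4` and `reap_child_exit_code` are hypothetical names, not the actual Android header contents.

```c
#define _DEFAULT_SOURCE   /* ask glibc to declare wait4(); harmless elsewhere */
#include <sys/types.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a header-only wait3: a static inline function
 * creates no exported symbol, so assembly that references `wait3` by name
 * has nothing to link against. */
static inline pid_t wait3_via_wait4(int *status, int options,
                                    struct rusage *ru)
{
    return wait4(-1, status, options, ru);   /* -1: wait for any child */
}

/* Fork a child that exits with `code`, reap it through the wrapper, and
 * return the exit status that was observed, or -1 on any failure. */
static int reap_child_exit_code(int code)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(code);                          /* child: exit immediately */

    int status = 0;
    pid_t reaped = wait3_via_wait4(&status, 0, NULL);
    if (reaped != child || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because the wrapper is `static inline`, each translation unit that includes it gets its own private copy, matching the local `wait3` seen in the compiler output.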

Xin Wang (dramwang) wrote :

With the `wait3()` fix, the C linkage step passes, but the following error occurs afterwards:

[218/320] src/compiler/arm64/insts
;;; OPTIMIZE levels: Safety=2, Space=0, Speed=3, Debug=0
;;;
;;; End of Pass 1.
;;; OPTIMIZE levels: Safety=2, Space=0, Speed=3, Debug=0
;;;
;;; End of Pass 1.
Condition of type: FORMAT-ERROR
Error in format: Cannot mix ~W, ~_, ~<...~:>, ~I, or ~T with ~<...~:;...~>
  ~:[This~;~:*~A~] is not a ~
                       ~<~%~9T~:;~/sb-impl:print-type/:~>~% ~S
                                                           ^
while processing indirect format string:
  ~?
   ^

Douglas Katzman (dougk) wrote :

Superficially this would be an ECL bug, because you can use ~T inside ~<...~>.
Can you try using a host lisp that works? Otherwise this becomes a yak-shaving exercise.

Xin Wang (dramwang) wrote :

OK, I filed a bug for ECL: https://gitlab.com/embeddable-common-lisp/ecl/issues/540

Will go further after it is fixed.

Xin Wang (dramwang) wrote :

With ECL's fix for the FORMAT-ERROR, bootstrapping now stops with the following error in the `make-genesis-2.sh` step:

Condition of type: SIMPLE-ERROR
The foreign symbol "software_version" is undefined.
Available restarts:

1. (ABORT-BUILD) Abort building SBCL.
2. (RESTART-TOPLEVEL) Go back to Top-Level REPL.

Broken at SI:BYTECODES. [Evaluation of: (SB-COLD:GENESIS :OBJECT-FILE-NAMES (LET (LIST) (SB-COLD::DO-STEMS-AND-FLAGS (SB-COLD::STEM SB-COLD::FLAGS 2) (UNLESS (MEMBER :NOT-TARGET SB-COLD::FLAGS) (PUSH (SB-COLD::STEM-OBJECT-PATH SB-COLD::STEM SB-COLD::FLAGS :TARGET-COMPILE) LIST))) (NREVERSE LIST)) :TLS-INIT (SB-COLD:READ-FROM-FILE (SB-COLD::STEM-OBJECT-PATH "tls-init.lisp-expr" '(:EXTRA-ARTIFACT) :TARGET-COMPILE)) :C-HEADER-DIR-NAME "output/genesis-2" :SYMBOL-TABLE-FILE-NAME "src/runtime/sbcl.nm" :CORE-FILE-NAME "output/cold-sbcl.core" :MAP-FILE-NAME "output/cold-sbcl.map")] In: #<process TOP-LEVEL 0x5a1ff04f80>.
 File: #P"/data/data/com.termux/files/home/Live/Lisp/code/sbcl/make-genesis-2.lisp" (Position #380)

Douglas Katzman (dougk) wrote :

Try hacking the DEFUN of SOFTWARE-VERSION in 'src/code/android-os.lisp' to return a constant string as a workaround. There are enough things to fix; this one doesn't need to stop the build.

Xin Wang (dramwang) wrote :

With the SOFTWARE-VERSION hack, the build stopped with a "duplicate DEFUN for SOFTWARE-TYPE" error. It seems that both android-os.lisp and linux-os.lisp define SOFTWARE-TYPE.

Currently I'm building SBCL with the following command:

  sh make.sh --xc-host=ecl --without-sb-thread --with-android

Looking through the commit log, it seems that Android support in SBCL has not been updated for quite a long time, so maybe it's better to try without `--with-android`?

Stas Boukarev (stassats) wrote :

Your environment is not a real Android, is it? That was for running directly as a process, without any additional layers.

Xin Wang (dramwang) wrote :

I do not know the mechanism behind Termux[1], but according to this list[2], many packages are supported. It would be nice if SBCL could be supported.

[1] https://termux.com/
[2] https://github.com/termux/termux-packages/tree/master/packages
