fakeroot Illegal instruction in lucid armv7 on the beagleboard

Reported by Robert Nelson on 2009-12-11
This bug affects 1 person
Affects: fakeroot (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: fakeroot

Test Setup:

Rev Bx Beagleboard running Debian Squeeze with multiple chroots.

Test script:

wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.32.tar.bz2
tar xjf linux-2.6.32.tar.bz2
cd linux-2.6.32
make omap3_beagle_defconfig
fakeroot --version
fakeroot make ARCH=arm deb-pkg

In Karmic:

(karmic)voodoo@beagle-256mb-0:~/linux-2.6.32$ fakeroot --version
fakeroot version 1.12.4

(karmic)voodoo@beagle-256mb-0:~/linux-2.6.32$ fakeroot make ARCH=arm deb-pkg
scripts/kconfig/conf -s arch/arm/Kconfig
make KBUILD_SRC=
  CHK include/linux/version.h
  UPD include/linux/version.h
  Generating include/asm-arm/mach-types.h
  CHK include/linux/utsrelease.h
  UPD include/linux/utsrelease.h
  SYMLINK include/asm -> include/asm-arm
....

In Lucid:

(lucid)voodoo@beagle-256mb-0:~/linux-2.6.32$ fakeroot --version
fakeroot version 1.12.4

(lucid)voodoo@beagle-256mb-0:~/linux-2.6.32$ fakeroot make ARCH=arm deb-pkg
Illegal instruction
semop(2): encountered an error: Invalid argument
/usr/bin/fakeroot: line 1: kill: (4476) - No such process

Robert Nelson (robertcnelson) wrote :

It's going to take more than just a simple rebuild. After copying the tar.gz and *.dsc to the updated lucid chroot:

apt-get build-dep fakeroot
dpkg-source -x fakeroot_1.12.4ubuntu1.dsc
cd fakeroot-1.12.4ubuntu1/
dpkg-buildpackage -uc -b

make[3]: Entering directory `/home/voodoo/fakeroot-1.12.4ubuntu1/obj-sysv/test'
PASS: t.chmod_dev
PASS: t.echoarg
PASS: t.falsereturn
PASS: t.mknod
PASS: t.no_ld_preload
PASS: t.no_ld_preload_link
PASS: t.option
FAIL: t.tar
PASS: t.touchinstall
PASS: t.truereturn
==================================
1 of 10 tests failed
Please report to <email address hidden>
==================================

Robert Nelson (robertcnelson) wrote :

Debian Squeeze's fakeroot_1.14.4-1 is a little worse.

make[3]: Entering directory `/home/voodoo/debian/fakeroot-1.14.4/obj-sysv/test'
FAIL: t.chmod_dev
PASS: t.echoarg
PASS: t.falsereturn
PASS: t.mknod
PASS: t.no_ld_preload
PASS: t.no_ld_preload_link
PASS: t.option
FAIL: t.tar
PASS: t.touchinstall
PASS: t.truereturn
==================================
2 of 10 tests failed
Please report to <email address hidden>
==================================

Robert Nelson (robertcnelson) wrote :

Adding -marm makes it worse.

Added to debian/rules:

ifeq ($(DEB_HOST_ARCH),armel)
CFLAGS += -marm
CXXFLAGS += -marm
endif

make[3]: Entering directory `/home/voodoo/fakeroot-1.12.4ubuntu1+nmu1/obj-sysv/test'
FAIL: t.chmod_dev
PASS: t.echoarg
PASS: t.falsereturn
PASS: t.mknod
PASS: t.no_ld_preload
PASS: t.no_ld_preload_link
PASS: t.option
FAIL: t.tar
PASS: t.touchinstall
PASS: t.truereturn
==================================
2 of 10 tests failed
Please report to <email address hidden>
==================================

Robert Nelson (robertcnelson) wrote :

After an upgrade this morning to "gcc version 4.4.2 (Ubuntu 4.4.2-5ubuntu1)"

It's now:

(lucid)root@beagle-128mb-0:~/linux-2.6.32# fakeroot --version
fakeroot version 1.12.4

(lucid)root@beagle-128mb-0:~/linux-2.6.32# fakeroot make ARCH=arm deb-pkg
semop(2): encountered an error: Invalid argument
make[1]: *** [scripts_basic] Error 1
semop(1): encountered an error: Invalid argument
/usr/bin/fakeroot: line 1: kill: (10181) - No such process

Robert Nelson (robertcnelson) wrote :

Just for clarification, without fakeroot...

(lucid)root@beagle-128mb-0:~/linux-2.6.32# make ARCH=arm deb-pkg
scripts/kconfig/conf -s arch/arm/Kconfig
make KBUILD_SRC=
  CHK include/linux/version.h
  UPD include/linux/version.h
  Generating include/asm-arm/mach-types.h
  CHK include/linux/utsrelease.h
  UPD include/linux/utsrelease.h
  SYMLINK include/asm -> include/asm-arm
  CC kernel/bounds.s
  GEN include/linux/bounds.h
  CC arch/arm/kernel/asm-offsets.s
  GEN include/asm/asm-offsets.h
  CALL scripts/checksyscalls.sh
  HOSTCC scripts/genksyms/genksyms.o

Robert Nelson (robertcnelson) wrote :

Thanks

It looks like it's fixed with 1.14.4... Will do some more testing before asking to close this...

(lucid)voodoo@beagle-256mb-0:~/linux-2.6.32$ fakeroot --version
fakeroot version 1.14.4
(lucid)voodoo@beagle-256mb-0:~/linux-2.6.32$ fakeroot make ARCH=arm deb-pkg
make KBUILD_SRC=
  CHK include/linux/version.h
make[3]: `include/asm-arm/mach-types.h' is up to date.
  CHK include/linux/utsrelease.h
  SYMLINK include/asm -> include/asm-arm
  HOSTCC scripts/basic/fixdep

Robert Nelson (robertcnelson) wrote :

Well fakeroot seems to be working... Safe to close this bug report...

Although I'm now ICE'ing gcc... But only randomly; I will attempt to replicate it with a small testcase and file a bug against gcc..

(B5 ES2.1 Beagle): http://rcn-ee.homeip.net:81/dl/farm/log/2.6.32.3-x3.0_1.0-lucid-gcc-ICE.txt
(C2 ES3.0 Beagle): http://rcn-ee.homeip.net:81/dl/farm/log/2.6.32.3-x3.0_1.0-lucid-gcc-ICE2.txt

Robert Nelson (robertcnelson) wrote :

Just an update with lucid, as things are worse...
(lucid)voodoo@beagle-256mb-0:~/linux-2.6.32$ fakeroot --version
fakeroot version 1.14.4

(lucid)voodoo@beagle-256mb-0:~/linux-2.6.32$ fakeroot make ARCH=arm deb-pkg
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Segmentation fault
Illegal instruction
Illegal instruction
make KBUILD_SRC=
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
Illegal instruction
  CHK include/linux/version.h
Illegal instruction
make[3]: `include/asm-arm/mach-types.h' is up to date.
Illegal instruction
Illegal instruction
  CHK include/linux/utsrelease.h
  SYMLINK include/asm -> include/asm-arm
Illegal instruction
Illegal instruction
  CC kernel/bounds.s
  GEN include/linux/bounds.h
  CC arch/arm/kernel/asm-offsets.s
Illegal instruction
make[3]: *** [arch/arm/kernel/asm-offsets.s] Error 132
make[2]: *** [prepare0] Error 2
make[1]: *** [deb-pkg] Error 2
make: *** [deb-pkg] Error 2

The missing instructions seem to be:

[92740.005920] Alignment trap: not handling instruction f1b80f00 at [<0000a628>]
[92740.014373] Unhandled fault: alignment exception (0x801) at 0xbea444dd
[92740.179504] Alignment trap: not handling instruction f1b80f00 at [<0000a628>]
[92740.187561] Unhandled fault: alignment exception (0x801) at 0xbedae4eb
[92740.363128] Alignment trap: not handling instruction d002 at [<0000a62c>]
[92740.370880] Unhandled fault: alignment exception (0x801) at 0xbef4b4c1
[92740.535339] Alignment trap: not handling instruction 2b2f at [<0000a630>]
[92740.543060] Unhandled fault: alignment exception (0x801) at 0xbe9e74cb
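
A side note on the "Error 132" that make reports in the output above: by shell convention, an exit status above 128 means the child was killed by signal number (status - 128), so 132 corresponds to signal 4, which is SIGILL on Linux. That agrees with the "Illegal instruction" messages. A minimal Python sketch of the arithmetic (the helper name is made up for illustration):

```python
import signal

def signal_from_exit_status(status):
    """Return the signal name for a shell-style exit status > 128, else None."""
    if status <= 128:
        return None  # ordinary exit code, not a signal death
    return signal.Signals(status - 128).name

# 132 is the status make printed for arch/arm/kernel/asm-offsets.s
print(signal_from_exit_status(132))
```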

Dave Martin (dave-martin-arm) wrote :

Can you confirm what kernel you're using?

Also, do you know which binary the fault is occurring in? Can you attach the affected binary, or a disassembly of the relevant part (around address 0xa628-0xa630)?

The instruction codes printed out by the alignment fault handler do not appear to be load or store instructions at all and should not be generating alignment faults, but it might be that the information printed out by the alignment fault handler is misleading for some reason. It may be that you are missing some required kernel patches.

Also, ensure that your kernel is definitely built with CONFIG_ARM_THUMB=y
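
For reference, the "(0x801)" in the "Unhandled fault" lines can be decoded by hand. The sketch below assumes the value is the raw ARMv7 data fault status register in the short-descriptor format (fault status in bits 3..0 plus bit 10, write-not-read flag in bit 11); the function name is hypothetical. Under that assumption, 0x801 reads as an alignment fault on a write, and the reported fault addresses are odd stack addresses.

```python
def decode_dfsr(fsr):
    """Split an ARMv7 short-descriptor DFSR value into its main fields."""
    # FS[4] lives in bit 10, FS[3:0] in bits 3..0.
    status = (((fsr >> 10) & 1) << 4) | (fsr & 0xF)
    return {
        "status": status,
        "is_alignment_fault": status == 0b00001,
        "is_write": bool(fsr & (1 << 11)),  # WnR bit
        "domain": (fsr >> 4) & 0xF,
    }

fields = decode_dfsr(0x801)
print(fields)

# Fault addresses quoted in the kernel log earlier in this report.
fault_addresses = [0xbea444dd, 0xbedae4eb, 0xbef4b4c1, 0xbe9e74cb]
print([addr % 2 for addr in fault_addresses])  # 1 means an odd address
```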

Robert Nelson (robertcnelson) wrote :

Hi Dave, this is with the current stable mainline release, 2.6.32.5 plus omap3-related patches. I'm queuing up a 2.6.33-rc5 build for testing, but I don't see any alignment changes there...

Sure, I'll get that disassembly info..

THUMB is enabled in my defconfig
cat defconfig | grep THUMB
CONFIG_ARM_THUMB=y
CONFIG_ARM_THUMBEE=y
# CONFIG_THUMB2_KERNEL is not set

Thinking that maybe I should enable THUMB2 support, I've also tried making this change..
CONFIG_THUMB2_KERNEL=y

But that's failing to build... (building in karmic chroot)
arch/arm/kernel/relocate_kernel.S: Assembler messages:
arch/arm/kernel/relocate_kernel.S:10: Error: invalid offset, value too big (0xFFFFFFFC)
arch/arm/kernel/relocate_kernel.S:11: Error: invalid offset, value too big (0xFFFFFFFC)
arch/arm/kernel/relocate_kernel.S:52: Error: invalid offset, value too big (0xFFFFFFFC)
arch/arm/kernel/relocate_kernel.S:53: Error: invalid offset, value too big (0xFFFFFFFC)
make[3]: *** [arch/arm/kernel/relocate_kernel.o] Error 1
make[2]: *** [arch/arm/kernel] Error 2
make[1]: *** [deb-pkg] Error 2
make: *** [deb-pkg] Error 2

Thanks.

Dave Martin (dave-martin-arm) wrote :

OK... it looks like the relevant alignment fixup patches are in there.

CONFIG_THUMB2_KERNEL is for building the kernel itself in Thumb-2. That may not work out of the box; I suggest you don't try that for now. It shouldn't be related to the problem you're having.

Did you get anywhere finding which binary the fault is happening in?

Robert Nelson (robertcnelson) wrote :

Thanks Dave,

Here is a little more debugging info, strace log..

http://pastebin.com/f3da6952d
line 417 is the first illegal instruction..

I had forgotten that /usr/bin/fakeroot is actually a shell script that eventually calls faked-sysv, so I haven't been able to run it through gdb to get the disassembly from that address...

Dave Martin (dave-martin-arm) wrote :

Hmmm, I can see no execve, and no open("/usr/bin/make") so I'm guessing we may still be in bash.

What's your version of bash, make and fakeroot? I looked at the relevant addresses in /bin/bash (and /usr/bin/make and /usr/sbin/faked-sysv for good measure) but the content didn't seem to match what's printed in the alignment fault messages.

Robert Nelson (robertcnelson) wrote :

Hi Dave,

Here's the version info, lucid updated as of this morning...

(lucid)voodoo@beagle-256mb-0:~/linux-2.6.32$ bash --version
GNU bash, version 4.1.0(1)-release (arm-unknown-linux-gnueabi)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

(lucid)voodoo@beagle-256mb-0:~/linux-2.6.32$ make --version
GNU Make 3.81
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.

This program built for arm-unknown-linux-gnueabi

(lucid)voodoo@beagle-256mb-0:~/linux-2.6.32$ fakeroot --version
fakeroot version 1.14.4

This was a little interesting: I ran 'make' through gdb after entering the fakeroot environment.. However, I had to stop it manually (ctrl-c) after the illegal instruction, even though gdb was set up to stop in that situation (see 'info handle' below).

(lucid)voodoo@beagle-256mb-0:~/linux-2.6.32$ fakeroot
(lucid)root@beagle-256mb-0:~/linux-2.6.32# gdb make
GNU gdb (GDB) 7.0.1-ubuntu
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabi".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/make...(no debugging symbols found)...done.
(gdb) set args ARCH=arm deb-pkg
(gdb) r
Starting program: /usr/bin/make ARCH=arm deb-pkg
[Thread debugging using libthread_db enabled]
Illegal instruction
Illegal instruction
^C
Program received signal SIGINT, Interrupt.
0x400c560c in read () at ../sysdeps/unix/syscall-template.S:82
82 ../sysdeps/unix/syscall-template.S: No such file or directory.
 in ../sysdeps/unix/syscall-template.S
Current language: auto
The current source language is "auto; currently asm".
(gdb)

(gdb) info handle
Signal Stop Print Pass to program Description
<snip>
SIGILL Yes Yes Yes Illegal instruction
EXC_BAD_INSTRUCTION Yes Yes Yes Illegal instruction/operand

Dave Martin (dave-martin-arm) wrote :

If you're getting alignment fault errors in the kernel in parallel with this, maybe you could dump the instructions at the unhandled fault address? i.e., where the kernel log says "not handling instruction <blah> at [<address>]"

x/i might work better than the disassemble command, but it depends on the process memory map.

Robert Nelson (robertcnelson) wrote :

Here's what i get...

[87841.441467] Alignment trap: not handling instruction d002 at [<0000a62c>]
[87841.449401] Unhandled fault: alignment exception (0x801) at 0xbeda720b

(gdb) x/i 0x0000a62c
0xa62c: undefined instruction 0xf9b8f002

(gdb) x/i 0xbeda720b
0xbeda720b: Cannot access memory at address 0xbeda720a

Robert Nelson (robertcnelson) wrote :

(gdb) x/3 0x0000a628
0xa628: strdcs sp, [r1], -r8
0xa62c: undefined instruction 0xf9b8f002
0xa630: eorle r2, r4, r3, lsl #26

Dave Martin (dave-martin-arm) wrote :

Hmmm, I'm still none the wiser. Dumping make, I get the data corresponding to what you just disassembled:

 0a618 236833b1 01200021 06f036fd 2368002b
 0a628 f8d10120 02f0b8f9 032d24d0 fff7f2e9
 0a638 2946fff7 32ea0028 20db70bd 134c2368

I get the same data whether inside or outside fakeroot.

...but it still doesn't match the alignment fault messages: the instructions f1b80f00 and d002 don't appear at that location, so I still don't see a clue as to why the faults are occurring.

Maybe boot with cachepolicy=uncached and see if that makes any difference (apart from slowness), or add a printk in arch/arm/mm/fault.c:do_DataAbort() to print out the process name

Robert Nelson (robertcnelson) wrote :

Thanks Dave... Still a mystery.. cachepolicy=uncached really makes it slow on the omap3530...

[27564.130035] Alignment trap: not handling instruction 2b2f at [<0000a630>]
[27564.142974] Unhandled fault: alignment exception (0x801) at 0xbeb7237d

(gdb) x/12x 0x0000a618
0xa618: 0xb1336823 0x21002001 0xfd36f006 0x2b006823
0xa628: 0x2001d1f8 0xf9b8f002 0xd0242d03 0xe9f2f7ff
0xa638: 0xf7ff4629 0x2800ea32 0xbd70db20 0x68234c13

(gdb) x/12i 0x0000a618
0xa618: teqlt r3, r3, lsr #16
0xa61c: tstcs r0, r1
0xa620: ldc2 0, cr15, [r6, #-24]! ; 0xffffffe8
0xa624: blcs 0x246b8
0xa628: strdcs sp, [r1], -r8
0xa62c: undefined instruction 0xf9b8f002
0xa630: eorle r2, r4, r3, lsl #26
0xa634:
    ldmib r2!, {r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10, r12, sp, lr, pc}^
0xa638: undefined instruction 0xf7ff4629
0xa63c: stmdacs r0, {r1, r4, r5, r9, r11, sp, lr, pc}
0xa640: vldmdblt r0!, {d29-<overflow reg d44>}
0xa644: stmdavs r3!, {r0, r1, r4, r10, r11, lr}
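
A note on comparing the two dumps of make: the section dump quoted earlier lists bytes in file order, while gdb's x/x prints little-endian 32-bit words, so "f8d10120 02f0b8f9" and "0x2001d1f8 0xf9b8f002" are the same eight bytes. If the code is Thumb (an assumption here, though likely for lucid armel), the meaningful units are 16-bit halfwords, and the ARM-mode disassembly above would not be the real instruction stream. A small Python sketch of the conversion:

```python
import struct

# The eight bytes at 0xa628 as they appear in the file-order dump of make.
dump_bytes = bytes.fromhex("f8d10120" "02f0b8f9")

# gdb's x/x view: two little-endian 32-bit words.
words = struct.unpack("<2I", dump_bytes)

# Thumb view: four little-endian 16-bit halfwords.
halfwords = struct.unpack("<4H", dump_bytes)

print([format(w, "#010x") for w in words])
print([format(h, "04x") for h in halfwords])
```

None of the resulting halfwords is f1b8, 0f00, d002 or 2b2f, which fits the observation that the kernel messages do not describe make's code at that address.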

Heading to work, so it'll take a bit, but I'll add that printk..

Thanks...

Robert Nelson (robertcnelson) wrote :

Hmm, a possible result, but I'm not sure if it's valid, so I'm attaching the patch I used to output the PID in do_DataAbort()... It's similar to something that was posted but never merged into the sh fault.c... After running it a couple of times, I see I really should have printed the task name...

[53080.361297] Alignment trap: not handling instruction 2b2f at [<0000a630>]
[53080.369415] Unhandled fault: alignment exception (0x801) at 0xbe92537f
[53080.376617] Task pid 2672
[53083.439971] Alignment trap: not handling instruction 2b2f at [<0000a630>]
[53083.447692] Unhandled fault: alignment exception (0x801) at 0xbebf556f
[53083.454895] Task pid 2788

Logging the output of "ps -Af":

UID PID PPID C STIME TTY TIME CMD
root 1 0 0 13:48 ? 00:00:01 init [2]

voodoo@beagle-256mb-0:~$ cat test.log | grep 2672
voodoo 2672 2671 0 14:29 pts/0 00:00:00 gcc -D__KERNEL__ -mlittle-endian -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks -Os -marm -mtune=xscale -c -xc /dev/null -o .2671.tmp

Dave Martin (dave-martin-arm) wrote :

Aha, looks like you found the right process:

Looking more closely at /usr/bin/gcc:

$ objdump -d /usr/bin/gcc
[...]
    a606: f7fe eff6 blx 95f4 <error-0x2fc>
    a60a: 1963 adds r3, r4, r5
    a60c: f813 2c01 ldrb.w r2, [r3, #-1]
    a610: 2a2f cmp r2, #47 ; 0x2f
    a612: d003 beq.n a61c <error+0xd2c>
    a614: 3301 adds r3, #1
    a616: 222f movs r2, #47 ; 0x2f
    a618: f805 2036 strb.w r2, [r5, r6, lsl #3]
    a61c: 461a mov r2, r3
    a61e: 212e movs r1, #46 ; 0x2e
    a620: f802 1b01 strb.w r1, [r2], #1
    a624: 2100 movs r1, #0
    a626: 7059 strb r1, [r3, #1]
    a628: f1b8 0f00 cmp.w r8, #0
    a62c: d002 beq.n a634 <error+0xd44>
    a62e: 7823 ldrb r3, [r4, #0]
    a630: 2b2f cmp r3, #47 ; 0x2f
    a632: d013 beq.n a65c <error+0xd6c>
    a634: 2003 movs r0, #3
    a636: 4621 mov r1, r4
    a638: 463a mov r2, r7
    a63a: f7fe ef04 blx 9444 <error-0x4ac>
[...]

Oddly, there are no instructions in this sequence which should be causing
alignment faults (as already appeared to be the case).
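
To make the match with the kernel log explicit: the "not handling instruction" lines appear to print a 32-bit Thumb-2 instruction as its two halfwords joined together (f1b8 0f00 becomes f1b80f00) and a 16-bit instruction as a single halfword. The Python sketch below checks the logged values against the gcc listing above; the table and helper names are made up for illustration, and the data is copied from this report.

```python
# Thumb encodings from the objdump listing of /usr/bin/gcc, keyed by address.
GCC_LISTING = {
    0xa628: (0xf1b8, 0x0f00),  # cmp.w r8, #0
    0xa62c: (0xd002,),         # beq.n
    0xa62e: (0x7823,),         # ldrb r3, [r4, #0]
    0xa630: (0x2b2f,),         # cmp r3, #47
}

# (instruction, address) pairs from the "Alignment trap" kernel messages.
KERNEL_LOG = [(0xf1b80f00, 0xa628), (0xd002, 0xa62c), (0x2b2f, 0xa630)]

def as_kernel_prints(halfwords):
    """Join one or two Thumb halfwords into a single number, first halfword on top."""
    value = 0
    for hw in halfwords:
        value = (value << 16) | hw
    return value

results = {}
for instr, addr in KERNEL_LOG:
    results[addr] = as_kernel_prints(GCC_LISTING[addr]) == instr
    print(f"{addr:#x}: {instr:x} {'matches' if results[addr] else 'differs'}")
```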

One thing you could try is to remove the problem instructions and emulate
them in gdb.

$ cp /usr/bin/gcc ~
# Use a hex editor (hexedit works well) to fill the bytes at file offsets
# 0x2628..0x262d and 0x2630..0x2631 with nops (0x00, 0xBF).
# The binary should now look like this:
...
00002620 02 f8 01 1b 00 21 59 70 00 bf 00 bf 00 bf 23 78 |.....!Yp......#x|
00002630 00 bf 13 d0 03 20 21 46 3a 46 fe f7 04 ef 00 28 |..... !F:F.....(|
...
(The modified bytes correspond to the faulting instructions at 0xa628...
assuming your binary is still the same as mine.)
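
The same patch can be sketched in Python instead of a hex editor. This is only an illustration: it assumes the text segment is mapped at 0x8000 from file offset 0 (which is what the 0xa628 to 0x2628 correspondence above implies; readelf -l on the real binary would confirm it), and it works on a 30-byte excerpt rebuilt from the objdump listing rather than on the real file. All names are hypothetical.

```python
LOAD_VADDR = 0x8000   # assumed virtual address of the first loadable segment
LOAD_OFFSET = 0x0     # assumed file offset of that segment
THUMB_NOP = bytes([0x00, 0xBF])  # 16-bit Thumb nop (0xbf00), little-endian

def file_offset(vaddr):
    """Translate a virtual address into a file offset under the assumptions above."""
    return vaddr - LOAD_VADDR + LOAD_OFFSET

def nop_out(image, start, end):
    """Overwrite image[start..end] (inclusive indexes) with Thumb nops."""
    length = end - start + 1
    if length % 2:
        raise ValueError("range must cover whole 16-bit units")
    image[start:end + 1] = THUMB_NOP * (length // 2)

# Bytes at file offsets 0x2620..0x263d, rebuilt from the objdump listing.
BASE = 0x2620
original = bytes.fromhex(
    "02f8011b" "0021" "5970" "b8f1000f" "02d0" "2378" "2f2b" "13d0"
    "0320" "2146" "3a46" "fef704ef"
)

image = bytearray(original)
for first, last in [(0xa628, 0xa62d), (0xa630, 0xa631)]:
    nop_out(image, file_offset(first) - BASE, file_offset(last) - BASE)

print(" ".join(f"{b:02x}" for b in image))
```

The printed bytes should agree with the first 30 bytes of the hex dump shown above.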

The attached GDB script contains some macro magic which breaks on the NOPs
and emulates the effect of the instructions instead using GDB commands.

I tested it by compiling a silly C program:

$ cat <<EOF >hello.c
#include <stdio.h>

int main(void)
{
        char name[80];

        fputs("Hello, what's your name? ", stderr);
        fgets(name, sizeof name, stdin);
        printf("Hello, %s", name);

        return 0;
}
EOF

$ gdb -x gdbscript.txt --args ~/gcc -B/usr/lib/gcc/arm-linux-gnueabi/4.4.3 -O99 -o hello hello.c

(The -B argument tells gcc where its brain is... the default location search
is relative to the location of the gcc binary and didn't work for me since
my copy is in a different place from the installed one.)

This appeared to run correctly for me, creating a working output file. See
gdb.log.

(You can type "set $nostop=1" in GDB before continuing to stop breaking on
every emulated instruction.)

It would be interesting to see whether this works for you, or whether you
still get faults or other bad things happening...

Dave Martin (dave-martin-arm) wrote :

One other thing...

I've been told that cachepolicy=uncached may not do anything useful :/

So if we want to eliminate cache problems from our enquiries, can you
rebuild your kernel with CONFIG_CPU_ICACHE_DISABLE=y and
CONFIG_CPU_DCACHE_DISABLE=y and retry?

Also, you should make sure you enable all errata workaround config options
that are relevant to Cortex-A8.

Robert Nelson (robertcnelson) wrote :

Thanks Dave for all your help!

So far enabling the errata workarounds seems to be the secret, I built 4 combinations with those defconfig changes...

I'm going to run a few more tests on the other beagles in my farm, but those 3 errata workarounds seem to have fixed it...

CONFIG_ARM_ERRATA_430973=y
CONFIG_ARM_ERRATA_458693=y
CONFIG_ARM_ERRATA_460075=y

Running:
fakeroot
gdb make
set args ARCH=arm deb-pkg
r

It's interesting to note, while reading the errata notes here:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/Kconfig;h=4c33ca82f9b1f268bfdae70f0aee29fefaaf9cf7;hb=HEAD

The ARM_ERRATA_430973 notes should probably be extended to r1p3 (they currently mention only r1p0..r1p2), as the OMAP3530s in early Bx boards were ES2.0 r1p2 parts, and B5+/C2/3 were ES3.0 r1p3 parts... I'll disable the other two to prove it's that erratum, and ping rmk with a quick documentation patch..
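
One way to check which core revision a given board has: on 32-bit ARM Linux, /proc/cpuinfo reports "CPU variant" (the rN part) and "CPU revision" (the pM part). The Python sketch below parses those two fields and compares them with the r1p0..r1p2 range mentioned above. The sample text is illustrative, not captured from any board in this report, and the function names are made up.

```python
import re

def core_revision(cpuinfo_text):
    """Return (variant, revision) parsed from ARM /proc/cpuinfo text, or None."""
    variant = re.search(r"^CPU variant\s*:\s*(\S+)", cpuinfo_text, re.M)
    revision = re.search(r"^CPU revision\s*:\s*(\S+)", cpuinfo_text, re.M)
    if not (variant and revision):
        return None
    # base 0 accepts both "0x1" and plain decimal
    return int(variant.group(1), 0), int(revision.group(1), 0)

def label(rev):
    return f"r{rev[0]}p{rev[1]}"

# Range quoted from the errata help text, as described in this comment.
DOCUMENTED = ((1, 0), (1, 2))

# Illustrative sample; on a board, read the text from /proc/cpuinfo instead.
sample = "CPU variant\t: 0x1\nCPU revision\t: 3\n"

rev = core_revision(sample)
in_range = DOCUMENTED[0] <= rev <= DOCUMENTED[1]
print(label(rev), "is inside" if in_range else "is outside", "the documented range")
```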

Thanks

Dave Martin (dave-martin-arm) wrote :

OK, let me know if you can confirm that CONFIG_ARM_ERRATA_430973=y fixes it and then we can retire this bug.

I found out how to do some exciting new things in gdb anyway :)

Robert Nelson (robertcnelson) wrote :

This bug is safe to close...

After a day of testing on multiple boards i haven't been able to get lucid to fail... Sounds like a good time for me to generate a demo alpha-2 rootfs and dump it on the beagleboard community.. ;)

Dave Martin (dave-martin-arm) wrote :

OK, great.

Changed in fakeroot (Ubuntu):
status: New → Invalid