system_server crashes when using Browser

Bug #978060 reported by Tixy (Jon Medhurst) on 2012-04-10
This bug affects 2 people
Affects                  Status        Importance  Assigned to  Milestone
Linaro Android           Fix Released  High        vishal
Linaro big.LITTLE MP     Fix Released  Undecided   Unassigned
linaro-landing-team-arm  Fix Released  High        Unassigned

Bug Description

I can consistently crash the system by opening the browser, clicking on the 'News' tab on the Google home page, then selecting either of the menus to change region or appearance. On my board, they are shown as 'U.K. Edition' and 'Compact'.

I observed this on a Versatile Express A9 board, build https://android-build.linaro.org/builds/~linaro-android/vexpress-ics-gcc46-armlt-stable-open-12.04-release/#build=3

Here's an example crash dump; the full logcat, with repeated crashes and reboots, is attached.

I/DEBUG ( 1718): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
I/DEBUG ( 1718): Build fingerprint: 'vexpress/vexpress/vexpress:4.0.4/IMM76D/3:eng/test-keys'
I/DEBUG ( 1718): pid: 1835, tid: 1846 >>> system_server <<<
I/DEBUG ( 1718): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 00000020
I/DEBUG ( 1718): r0 02080f40 r1 00000000 r2 00000000 r3 986e2dd7
I/DEBUG ( 1718): r4 9ee63122 r5 9b9b2c88 r6 01fa41a0 r7 000000f0
I/DEBUG ( 1718): r8 b661d980 r9 a007e7e0 10 9b9b2c3c fp 9bab2ca0
I/DEBUG ( 1718): ip 000000f0 sp 9bab2ba8 lr 986e2de9 pc b6621594 cpsr 20000010
I/DEBUG ( 1718): d0 000000100000004c d1 00002725000009ac
I/DEBUG ( 1718): d2 0000000000000044 d3 9bab30f09bab30ac
I/DEBUG ( 1718): d4 006c0061006e0072 d5 006500690076002e
I/DEBUG ( 1718): d6 00490049002e0077 d7 007400750070006e
I/DEBUG ( 1718): d8 0000000000000000 d9 0000000000000000
I/DEBUG ( 1718): d10 0000000000000000 d11 0000000000000000
I/DEBUG ( 1718): d12 0000000000000000 d13 0000000000000000
I/DEBUG ( 1718): d14 0000000000000000 d15 0000000000000000
I/DEBUG ( 1718): d16 000000002329a000 d17 00000000094ec000
I/DEBUG ( 1718): d18 0000000000000000 d19 0000000000000000
I/DEBUG ( 1718): d20 00000028fad93a01 d21 bfb1be5a93a83e1d
I/DEBUG ( 1718): d22 3f4de16b9c24a98f d23 be206435816bc5ca
I/DEBUG ( 1718): d24 3fede16b9c24a98f d25 3f733c3b597385d3
I/DEBUG ( 1718): d26 bf66c0c55ca9076a d27 3f1155e54e7e8408
I/DEBUG ( 1718): d28 bebbbc6c1a570a20 d29 3e66376972bea4d0
I/DEBUG ( 1718): d30 0000022200000222 d31 000000620000006c
I/DEBUG ( 1718): scr 60000012
I/DEBUG ( 1718):
I/DEBUG ( 1718): #00 pc 00022594 /system/lib/libdvm.so
I/DEBUG ( 1718): #01 pc 00034eb4 /system/lib/libdvm.so (_Z12dvmInterpretP6ThreadPK6MethodP6JValue)
I/DEBUG ( 1718): #02 pc 0007c12c /system/lib/libdvm.so (_Z14dvmCallMethodVP6ThreadPK6MethodP6ObjectbP6JValueSt9__va_list)
I/DEBUG ( 1718): #03 pc 0006163e /system/lib/libdvm.so
I/DEBUG ( 1718): #04 pc 0005106e /system/lib/libdvm.so
I/DEBUG ( 1718): #05 pc 00043d84 /system/lib/libandroid_runtime.so
I/DEBUG ( 1718): #06 pc 00063c16 /system/lib/libandroid_runtime.so
I/DEBUG ( 1718): #07 pc 000170fe /system/lib/libbinder.so (_ZN7android7BBinder8transactEjRKNS_6ParcelEPS1_j)
I/DEBUG ( 1718): #08 pc 0001aca0 /system/lib/libbinder.so (_ZN7android14IPCThreadState14executeCommandEi)
I/DEBUG ( 1718): #09 pc 0001b13e /system/lib/libbinder.so (_ZN7android14IPCThreadState14joinThreadPoolEb)
I/DEBUG ( 1718): #10 pc 00020698 /system/lib/libbinder.so
I/DEBUG ( 1718): #11 pc 0002432a /system/lib/libutils.so (_ZN7android6Thread11_threadLoopEPv)
I/DEBUG ( 1718): #12 pc 00040728 /system/lib/libandroid_runtime.so (_ZN7android14AndroidRuntime15javaThreadShellEPv)
I/DEBUG ( 1718): #13 pc 00023eda /system/lib/libutils.so
I/DEBUG ( 1718): #14 pc 000120b8 /system/lib/libc.so (__thread_entry)
I/DEBUG ( 1718): #15 pc 00011bd4 /system/lib/libc.so (pthread_create)
I/DEBUG ( 1718):
I/DEBUG ( 1718): code around pc:
I/DEBUG ( 1718): b6621574 e1f470b6 e207c0ff e088f30c e1d410b4 .p..............
I/DEBUG ( 1718): b6621584 e7950101 e3500000 0a0038ce e5901000 ......P..8......
I/DEBUG ( 1718): b6621594 e5912020 e3120102 1a001470 e1d612b8 ......p.......
I/DEBUG ( 1718): b66215a4 e2111008 1a001473 e1f470b6 e207c0ff ....s....p......
I/DEBUG ( 1718): b66215b4 e088f30c e320f000 e320f000 f57ff05e ...... ... .^...
I/DEBUG ( 1718):
I/DEBUG ( 1718): code around lr:
I/DEBUG ( 1718): 986e2dc8 9eb5f636 b6ecbe6c 020fb8e8 f85f0020 6...l....... ._.
I/DEBUG ( 1718): 986e2dd8 68010008 60013101 0028f8df 47886ef1 ...h.1.`..(..n.G
I/DEBUG ( 1718): 986e2de8 4300e000 47806e70 9ee63128 00000001 ...Cpn.G(1......
I/DEBUG ( 1718): 986e2df8 b6690000 9f8d21d8 ffff0101 00000001 ..i..!..........
I/DEBUG ( 1718): 986e2e08 00000000 9ee63122 020fb8ec f85f005c ...."1......\._.
I/DEBUG ( 1718):
I/DEBUG ( 1718): stack:
I/DEBUG ( 1718): 9bab2b68 9b9b2f70
I/DEBUG ( 1718): 9bab2b6c 01fa41a0 [heap]
I/DEBUG ( 1718): 9bab2b70 01fa41b0 [heap]
I/DEBUG ( 1718): 9bab2b74 b661d980 /system/lib/libdvm.so
I/DEBUG ( 1718): 9bab2b78 00000000
I/DEBUG ( 1718): 9bab2b7c 9b9b2f40
I/DEBUG ( 1718): 9bab2b80 9bab2ca0
I/DEBUG ( 1718): 9bab2b84 b66460c9 /system/lib/libdvm.so
I/DEBUG ( 1718): 9bab2b88 00000001
I/DEBUG ( 1718): 9bab2b8c 9b9b2f70
I/DEBUG ( 1718): 9bab2b90 9ee43122 /data/dalvik-cache/system@<email address hidden>@classes.dex
I/DEBUG ( 1718): 9bab2b94 9b9b2c88
I/DEBUG ( 1718): 9bab2b98 01fa41a0 [heap]
I/DEBUG ( 1718): 9bab2b9c 000000b2
I/DEBUG ( 1718): 9bab2ba0 df0027ad
I/DEBUG ( 1718): 9bab2ba4 00000000
I/DEBUG ( 1718): #00 9bab2ba8 a0678038 /dev/ashmem/dalvik-heap (deleted)
I/DEBUG ( 1718): 9bab2bac 01fa41a0 [heap]
I/DEBUG ( 1718): 9bab2bb0 9f84b548 /dev/ashmem/dalvik-LinearAlloc (deleted)
I/DEBUG ( 1718): 9bab2bb4 9bab2c1c
I/DEBUG ( 1718): 9bab2bb8 00000000
I/DEBUG ( 1718): 9bab2bbc b66d0f48 /system/lib/libdvm.so
I/DEBUG ( 1718): 9bab2bc0 fffffe64
I/DEBUG ( 1718): 9bab2bc4 00000000
I/DEBUG ( 1718): 9bab2bc8 9bab2ca0
I/DEBUG ( 1718): 9bab2bcc b6633eb8 /system/lib/libdvm.so
I/DEBUG ( 1718): #01 9bab2bd0 00000000
I/DEBUG ( 1718): 9bab2bd4 00000011
I/DEBUG ( 1718): 9bab2bd8 00000000
I/DEBUG ( 1718): 9bab2bdc 00000000
I/DEBUG ( 1718): 9bab2be0 00000000
I/DEBUG ( 1718): 9bab2be4 00000000
I/DEBUG ( 1718): 9bab2be8 00000000
I/DEBUG ( 1718): 9bab2bec 00000000
I/DEBUG ( 1718): 9bab2bf0 00000000
I/DEBUG ( 1718): 9bab2bf4 00000000
I/DEBUG ( 1718): 9bab2bf8 00000000
I/DEBUG ( 1718): 9bab2bfc 00000000
I/DEBUG ( 1718): 9bab2c00 00000000
I/DEBUG ( 1718): 9bab2c04 00000000
I/DEBUG ( 1718): 9bab2c08 00000000
I/DEBUG ( 1718): 9bab2c0c 00000000
I/DEBUG ( 1718): 9bab2c10 00000000
I/DEBUG ( 1718): 9bab2c14 00000000
I/DEBUG ( 1718): 9bab2c18 b66aff00 /system/lib/libdvm.so
I/DEBUG ( 1718): 9bab2c1c 00000000
I/DEBUG ( 1718): 9bab2c20 9b9b2f98
I/DEBUG ( 1718): 9bab2c24 00000000
I/DEBUG ( 1718): 9bab2c28 00000000
I/DEBUG ( 1718): 9bab2c2c 00000000
I/DEBUG ( 1718): 9bab2c30 00000000
I/DEBUG ( 1718): 9bab2c34 00000000
I/DEBUG ( 1718): 9bab2c38 00000000
I/DEBUG ( 1718): 9bab2c3c 00000000
I/DEBUG ( 1718): 9bab2c40 b66d5f64 /system/lib/libdvm.so
I/DEBUG ( 1718): 9bab2c44 01fa41a0 [heap]
I/DEBUG ( 1718): 9bab2c48 9f84b548 /dev/ashmem/dalvik-LinearAlloc (deleted)
I/DEBUG ( 1718): 9bab2c4c 00000001
I/DEBUG ( 1718): 9bab2c50 9bab2ca0
I/DEBUG ( 1718): 9bab2c54 9b9b2fc4
I/DEBUG ( 1718): 9bab2c58 9bab2d24
I/DEBUG ( 1718): 9bab2c5c 9f30127a /data/dalvik-cache/system@<email address hidden>@classes.dex
I/DEBUG ( 1718): 9bab2c60 a0678038 /dev/ashmem/dalvik-heap (deleted)
I/DEBUG ( 1718): 9bab2c64 b667b12f /system/lib/libdvm.so
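
A hedged aside on reading the dump above: the word at the faulting pc can be decoded by hand to see which register supplied the bad address. The sketch below assumes the word is an ARM (A32) load/store with a 12-bit immediate offset and only handles that encoding class; the helper names are made up for illustration, and the constants are copied from the register dump and the "code around pc" lines.

```python
# Values copied from the tombstone above (pid 1835, tid 1846).
REGS = {"r0": 0x02080F40, "r1": 0x00000000}
WORD_AT_PC = 0xE5912020       # first word on the "b6621594" line
WORD_BEFORE_PC = 0xE5901000   # last word on the "b6621584" line
REPORTED_FAULT_ADDR = 0x00000020


def decode_ldr_str_imm(word):
    """Decode an A32 LDR/STR (immediate offset) word.

    Returns None for anything outside that encoding class.
    """
    cond = word >> 28
    if cond == 0xF or (word >> 25) & 0b111 != 0b010:
        return None
    return {
        "pre_index": bool((word >> 24) & 1),  # P bit
        "add": bool((word >> 23) & 1),        # U bit
        "load": bool((word >> 20) & 1),       # L bit
        "rn": (word >> 16) & 0xF,             # base register
        "rd": (word >> 12) & 0xF,             # destination register
        "imm12": word & 0xFFF,                # immediate offset
    }


def effective_address(insn, regs):
    """Address the instruction accesses, given the dumped register values."""
    base = regs["r%d" % insn["rn"]]
    offset = insn["imm12"] if insn["add"] else -insn["imm12"]
    # Pre-indexed forms access base+offset; post-indexed forms access the base.
    addr = base + offset if insn["pre_index"] else base
    return addr & 0xFFFFFFFF


prev = decode_ldr_str_imm(WORD_BEFORE_PC)
insn = decode_ldr_str_imm(WORD_AT_PC)
print("before pc: r%d <- [r%d + %s]" % (prev["rd"], prev["rn"], hex(prev["imm12"])))
print("at pc:     r%d <- [r%d + %s]" % (insn["rd"], insn["rn"], hex(insn["imm12"])))
print("computed fault address:", hex(effective_address(insn, REGS)))
```

Under these assumptions the faulting word is a load of r2 from [r1 + 0x20], and r1 is zero in the register dump, which agrees with the reported fault addr 00000020. The word just before it is a load of r1 from [r0], so the zero appears to have been read from memory at r0 rather than being a null r0 itself.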

Tixy (Jon Medhurst) (tixy) wrote :

I can reproduce this on Snowball build https://android-build.linaro.org/builds/~linaro-android/snowball-ics-gcc46-igloo-stable-blob-12.04-release/#build=2

It seems that to trigger it, you don't actually need to click on anything on the Google News page; moving the mouse around, or even just going to the page, seems to be enough.

Changed in linaro-landing-team-arm:
status: New → Triaged
importance: Undecided → High
Zach Pfeffer (pfefferz) on 2012-07-05
Changed in linaro-android:
assignee: nobody → Zach Pfeffer (pfefferz)
milestone: none → 12.07
Paul Larson (pwlars) wrote :

Yeah, it took a bit of clicking around, but I was able to reproduce this on snowball as well by selecting UK as the region, then clicking on world news on the left-hand side. Initially, my board locked up when trying this, but after upping the vmalloc to 400M I got behavior closer to what is described here. It went back to the "android" graphic on the screen, then to the home screen.

As a control, people can use:

https://android-build.linaro.org/builds/~linaro-android/panda-ics-gcc47-omapzoom-stable-blob/

If that is solid, then it's probably kernel related.


Paul Larson (pwlars) wrote :

I hit what appears to be a fairly similar bug in the browser when trying to get youtube.com to load on panda 4430 with the 12.06 release. It doesn't seem to happen every time though. I also tried reproducing on panda with news.google.com but was unable to do so.

Zach Pfeffer (pfefferz) on 2012-07-09
Changed in linaro-android:
importance: Undecided → Medium
Zach Pfeffer (pfefferz) on 2012-07-23
Changed in linaro-android:
assignee: Zach Pfeffer (pfefferz) → nobody
milestone: 12.07 → 12.08
Zach Pfeffer (pfefferz) wrote :

Went back to https://android-build.linaro.org/builds/~linaro-android/panda-ics-gcc47-tilt-tracking-blob-12.05-release/#build=5 with acceleration and everything worked okay. I did these experiments:

After turning on Ethernet with a workaround:

ifconfig eth0 192.168.1.8 netmask 255.255.255.0 up
route add default gw 192.168.1.1 dev eth0
setprop net.dns1 192.168.1.1
setprop net.dns2 8.8.8.8

I went to news.google.com

Changed the region to Japan, then to Cuba, then to the UK, then clicked on the World News link.

Everything worked okay.

Zach Pfeffer (pfefferz) wrote :

Tested on https://android-build.linaro.org/builds/~linaro-android/panda-ics-gcc47-tilt-tracking-blob-12.06-release/#build=4 which is not accelerated. Did the same experiment and it failed on switching to Japan.

Zach Pfeffer (pfefferz) wrote :

Relevant log:

F/libc ( 1265): Fatal signal 11 (SIGSEGV) at 0x000c6312 (code=1)
I/DEBUG ( 1164): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
I/DEBUG ( 1164): Build fingerprint: 'pandaboard/pandaboard/pandaboard:4.0.4/IMM76L/4:eng/test-keys'
I/DEBUG ( 1164): pid: 1265, tid: 1276 >>> system_server <<<
I/DEBUG ( 1164): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 000c6312
I/DEBUG ( 1164): r0 b01d8298 r1 000c62f2 r2 00000000 r3 00000001
I/DEBUG ( 1164): r4 9ee62122 r5 9b7bbc88 r6 012f5608 r7 000000f0
I/DEBUG ( 1164): r8 b659a940 r9 a007eaa0 10 9b7bbc3c fp 9b7bbfc4
I/DEBUG ( 1164): ip 000000f0 sp 9b8bbb88 lr 9848d9d9 pc b659e554 cpsr a00f0110
I/DEBUG ( 1164): d0 006800740065004d d1 000a00290064006f
I/DEBUG ( 1164): d2 0000000000000000 d3 4000000000000000
I/DEBUG ( 1164): d4 bfd4e54853e39056 d5 3f800000c2ccccd0
I/DEBUG ( 1164): d6 4b05008800850088 d7 42c800004b281fe0
I/DEBUG ( 1164): d8 0000000000000000 d9 0000000000000000
I/DEBUG ( 1164): d10 0000000000000000 d11 0000000000000000
I/DEBUG ( 1164): d12 0000000000000000 d13 0000000000000000
I/DEBUG ( 1164): d14 0000000000000000 d15 0000000000000000
I/DEBUG ( 1164): d16 0000000000000000 d17 003a00730069006c
I/DEBUG ( 1164): d18 3ff0000000000000 d19 bfd607da0a89d841
I/DEBUG ( 1164): d20 be628b2d20911660 d21 3fc691be91f1192a
I/DEBUG ( 1164): d22 3f114a9ed41c505e d23 bebbaa4f9d702bc9
I/DEBUG ( 1164): d24 3fe62e4300000000 d25 3fc7161b2540b34f
I/DEBUG ( 1164): d26 3fd1d3c9b57e9962 d27 3e66376972bea4d0
I/DEBUG ( 1164): d28 bfa88bf4f327b79c d29 c002c0fb41513b08
I/DEBUG ( 1164): d30 bfa88bf4f327b79e d31 be58be6055b2d877
I/DEBUG ( 1164): scr 60000012

I/DEBUG ( 1164):
I/DEBUG ( 1164): #00 pc 00022554 /system/lib/libdvm.so
I/DEBUG ( 1164): #01 pc 00035000 /system/lib/libdvm.so (_Z12dvmInterpretP6ThreadPK6MethodP6JValue)
I/DEBUG ( 1164): #02 pc 000807e2 /system/lib/libdvm.so (_Z14dvmCallMethodVP6ThreadPK6MethodP6ObjectbP6JValueSt9__va_list)
I/DEBUG ( 1164): #03 pc 00065500 /system/lib/libdvm.so
I/DEBUG ( 1164): #04 pc 0004c246 /system/lib/libdvm.so
I/DEBUG ( 1164): #05 pc 00048852 /system/lib/libandroid_runtime.so
I/DEBUG ( 1164): #06 pc 00069c84 /system/lib/libandroid_runtime.so
I/DEBUG ( 1164): #07 pc 00016fbc /system/lib/libbinder.so (_ZN7android7BBinder8transactEjRKNS_6ParcelEPS1_j)
I/DEBUG ( 1164): #08 pc 0001ad60 /system/lib/libbinder.so (_ZN7android14IPCThreadState14executeCommandEi)
I/DEBUG ( 1164): #09 pc 0001b2d2 /system/lib/libbinder.so (_ZN7android14IPCThreadState14joinThreadPoolEb)
I/DEBUG ( 1164): #10 pc 00020b1c /system/lib/libbinder.so
I/DEBUG ( 1164): #11 pc 00025090 /system/lib/libutils.so (_ZN7android6Thread11_threadLoopEPv)
I/DEBUG ( 1164): #12 pc 00044e08 /system/lib/libandroid_runtime.so (_ZN7android14AndroidRuntime15javaThreadShellEPv)
I/DEBUG ( 1164): #13 pc 00024c28 /system/lib/libutils.so
I/DEBUG ( 1164): #14 pc 0001223c /system/lib/libc.so (__thread_...


We see the same problem on vexpress with TC2 tile with an ICS 4.0.1 build. It can be triggered by running BBench 2.0 in the webkit browser. We traced this issue down: it happens when the browser sends one of multiple GET_MEMORY_INFO_TRANSACTION messages to the system_server. The signal 11 (SIGSEGV), code 1 (SEGV_MAPERR) happens in the DVM interpreter in one of system_server's Binder threads. The same problem can be reproduced with a simple test application that sends this message constantly.

Log files are: system_server_16072012.txt, logcat_16072012.txt, tombstone_00_16072012.txt (see attachments)

Callstack of '56 tid: 1544 Binder Thread #5 native #00 pc 00022c1c /system/lib/libdvm.so <-- signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)'

#00 pc 00022c1c /system/lib/libdvm.so
#01 pc 000342b4 /system/lib/libdvm.so (_Z12dvmInterpretP6ThreadPK6MethodP6JValue)
#02 pc 0006c7d8 /system/lib/libdvm.so (_Z14dvmCallMethodVP6ThreadPK6MethodP6ObjectbP6JValueSt9__va_list)
#03 pc 0005818c /system/lib/libdvm.so
#04 pc 0004c450 /system/lib/libdvm.so
#05 pc 000434f6 /system/lib/libandroid_runtime.so
#06 pc 0005dbd6 /system/lib/libandroid_runtime.so
#07 pc 00017ec0 /system/lib/libbinder.so (_ZN7android7BBinder8transactEjRKNS_6ParcelEPS1_j)
#08 pc 0001b202 /system/lib/libbinder.so (_ZN7android14IPCThreadState14executeCommandEi)
#09 pc 0001b3de /system/lib/libbinder.so (_ZN7android14IPCThreadState14joinThreadPoolEb)
#10 pc 000206bc /system/lib/libbinder.so
#11 pc 00020cd6 /system/lib/libutils.so (_ZN7android6Thread11_threadLoopEPv)
#12 pc 00040ac4 /system/lib/libandroid_runtime.so (_ZN7android14AndroidRuntime15javaThreadShellEPv)
#13 pc 0002131c /system/lib/libutils.so
#14 pc 00012bf4 /system/lib/libc.so (__thread_entry)
#15 pc 00012748 /system/lib/libc.so (pthread_create)
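
As a rough illustration of how frames like these can be prepared for symbolization, here is a minimal sketch that extracts the (frame, pc, library) triples from tombstone-style lines and builds one addr2line command per frame. It is an assumption that addr2line is the tool of choice here; the `-f`, `-C` and `-e` options are standard GNU binutils flags, while `SYMBOLS_ROOT` is a hypothetical directory holding unstripped copies of the libraries.

```python
import re

# Hypothetical location of unstripped libraries; adjust to the actual build tree.
SYMBOLS_ROOT = "symbols"

# Matches "#NN pc XXXXXXXX /path/to/lib.so", with or without a logcat prefix.
FRAME_RE = re.compile(r"#(\d+)\s+pc\s+([0-9a-f]{8})\s+(\S+)")


def parse_frames(text):
    """Return a list of (frame_number, pc_offset, library_path) tuples."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((int(match.group(1)), int(match.group(2), 16), match.group(3)))
    return frames


def addr2line_commands(frames, symbols_root=SYMBOLS_ROOT):
    """Build one addr2line argument list per frame (nothing is executed here)."""
    return [
        ["addr2line", "-f", "-C", "-e", symbols_root + lib, hex(pc)]
        for _, pc, lib in frames
    ]


# Two frames copied from the call stack above.
SAMPLE = """\
#00 pc 00022c1c /system/lib/libdvm.so
#01 pc 000342b4 /system/lib/libdvm.so (_Z12dvmInterpretP6ThreadPK6MethodP6JValue)
"""

frames = parse_frames(SAMPLE)
commands = addr2line_commands(frames)
for cmd in commands:
    print(" ".join(cmd))
```

The pc values in a tombstone are offsets within each library, so they can be passed to the tool directly without subtracting a load address.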

#00 dalvik_inst
/home/dieegg01/work/repo/aosp/dalvik/vm/mterp/out/InterpAsm-armv7-a-neon.S:7484

#01 dvmInterpret(Thread*, Method const*, JValue*)
/home/dieegg01/work/repo/aosp/dalvik/vm/interp/Interp.cpp:1965

#02 dvmCallMethodV(Thread*, Method const*, Object*, bool, JValue*, std::__va_list)
/home/dieegg01/work/repo/aosp/dalvik/vm/interp/Stack.cpp:523

#03 CallBooleanMethodV
/home/dieegg01/work/repo/aosp/dalvik/vm/Jni.cpp:2018

#04 Check_CallBooleanMethodV
/home/dieegg01/work/repo/aosp/dalvik/vm/CheckJni.cpp:1666

#05 _JNIEnv::CallBooleanMethod(_jobject*, _jmethodID*, ...)
/home/dieegg01/work/repo/aosp/dalvik/libnativehelper/include/nativehelper/jni.h:633

#06 JavaBBinder::onTransact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)
/home/dieegg01/work/repo/aosp/frameworks/base/core/jni/android_util_Binder.cpp:290

#07 android::BBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)
/home/dieegg01/work/repo/aosp/frameworks/base/libs/binder/Binder.cpp:107

#08 android::IPCThreadState::executeCommand(int)
/home/dieegg01/work/repo/aosp/frameworks/base/libs/binder/IPCThreadState.cpp:1027

#09 android::IPCThreadState::joinThreadPool(bool)
/home/dieegg01/work/repo/aosp/frameworks/base/libs/binder/IPCThreadState.cpp:468

#10 android::PoolThread::threadLoop()
/home/dieegg01/work/repo/aosp/frameworks/base/libs/binder/ProcessState.cpp:67

#11 android:...


David Zinman (dzinman) on 2012-07-24
Changed in linaro-android:
importance: Medium → High
Zach Pfeffer (pfefferz) on 2012-07-25
Changed in linaro-android:
assignee: nobody → Zach Pfeffer (pfefferz)
status: New → Confirmed
Zach Pfeffer (pfefferz) wrote :

Here's a bugreport, generated via:

adb bugreport

after catching the SIGSEGV

Zach Pfeffer (pfefferz) wrote :

I think this post may point the right direction: http://stackoverflow.com/questions/11298274/strange-jvm-error-with-ics

Zach Pfeffer (pfefferz) wrote :

Added

adb shell setprop debug.checkjni 1

Got the SIGSEGV, but no other output. Saw

D/AndroidRuntime( 1158): CheckJNI is ON

but nothing else. Attaching log.

I see two different error scenarios; I get both of them on vexpress/TC2 tile with the ICS 4.0.1 build too.

(1) signal 11 (SIGSEGV), code 1 (SEGV_MAPERR) in one of the binder threads of system_server (#9, #10)

(2) signal 11 (SIGSEGV), code 1 (SEGV_MAPERR) in one of the threads of com.android.browser (#11)

If I change the execution mode of the Dalvik VM from int:jit (default) to int:fast or int:portable, (1) does not happen anymore.

adb shell setprop dalvik.vm.execution-mode int:fast
adb shell stop; adb shell start <== so that system_server is affected by this change

Paul Larson (pwlars) wrote :

Talked to ARM this morning and they are able to work around it by disabling JIT, but obviously this isn't a nice workaround.
The recommendation from the Android team is to test this under Jelly Bean, as all the builds are moving to it and they believe it will work better there.

Zach Pfeffer (pfefferz) wrote :

based on this post:

http://code.google.com/p/android/issues/detail?id=14498

Seems to be a basic thread-safety issue (plus the fact that it only happens every so often). Perhaps some more locking around accesses to /dev/ashmem/dalvik-LinearAlloc would help. LinearAlloc is managed in dalvik/vm/LinearAlloc.cpp. We just need to make sure all accesses through all fds to this region are properly locked.

I'm pretty sure this buffer is what the JIT actually compiles code into, but I need to check. If so, it makes sense that turning off JIT helps.

Dietmar, on the TC2, what's the performance penalty look like?

Zach Pfeffer (pfefferz) wrote :

Turning off the active wallpaper had no effect. The unit still crashed:

D/AlarmManagerService( 1253): Kernel timezone updated to 0 minutes west of GMT
V/AlarmClock( 1569): AlarmInitReceiver finished
V/tiny_hw ( 1159): out_standby(0x1d87040) closing PCM
D/dalvikvm( 1736): GC_CONCURRENT freed 392K, 7% free 7747K/8263K, paused 2ms+4ms
D/dalvikvm( 1253): GC_CONCURRENT freed 398K, 12% free 8442K/9543K, paused 2ms+4ms
F/libc ( 1253): Fatal signal 11 (SIGSEGV) at 0x64f2042f (code=1)
I/DEBUG ( 1154): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
I/DEBUG ( 1154): Build fingerprint: 'pandaboard/pandaboard/pandaboard:4.0.4/IMM76L/eng.pfefferz.20120725.133443:eng/test-keys'
I/DEBUG ( 1154): pid: 1253, tid: 1265 >>> system_server <<<
I/DEBUG ( 1154): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 64f2042f
I/DEBUG ( 1154): r0 b02a12aa r1 64f2040f r2 00000000 r3 00000001
I/DEBUG ( 1154): r4 9ee63122 r5 9b74bc88 r6 007f2818 r7 000000f0
I/DEBUG ( 1154): r8 b6663940 r9 a007eb00 10 9b74bc3c fp 9b74bfc4
I/DEBUG ( 1154): ip 000000f0 sp 9b84bb88 lr 98504ded pc b6667554 cpsr a0030110
I/DEBUG ( 1154): d0 006800740065004d d1 000a00290064006f
I/DEBUG ( 1154): d2 0000000000000000 d3 394377ce858a5d48
I/DEBUG ( 1154): d4 3ff0000000000000 d5 3f800000c2ccccd0
I/DEBUG ( 1154): d6 3f4ccccd3f99999a d7 3f8000003f800000
I/DEBUG ( 1154): d8 0000000000000000 d9 0000000000000000
I/DEBUG ( 1154): d10 0000000000000000 d11 0000000000000000
I/DEBUG ( 1154): d12 0000000000000000 d13 0000000000000000
I/DEBUG ( 1154): d14 0000000000000000 d15 0000000000000000
I/DEBUG ( 1154): d16 0000000000000000 d17 003a00730069006c
I/DEBUG ( 1154): d18 3ff0000000000000 d19 bf56c16c16c15177
I/DEBUG ( 1154): d20 3efa01a019cb1590 d21 be927e4f809c52ad
I/DEBUG ( 1154): d22 3e21ee9ebdb4b1c4 d23 3fe0000000000000
I/DEBUG ( 1154): d24 392f1976b7ed8fc0 d25 b94377ce858a5d48
I/DEBUG ( 1154): d26 3ff0000000000000 d27 4000000000000000
I/DEBUG ( 1154): d28 3dd0b4611a600000 d29 3ba3198a2e037073
I/DEBUG ( 1154): d30 3de0b4611a600000 d31 bca1a60000000000
I/DEBUG ( 1154): scr 80000012
I/DEBUG ( 1154):
I/DEBUG ( 1154): #00 pc 00022554 /system/lib/libdvm.so
I/DEBUG ( 1154): #01 pc 00035000 /system/lib/libdvm.so (_Z12dvmInterpretP6ThreadPK6MethodP6JValue)
I/DEBUG ( 1154): #0

Zach Pfeffer (pfefferz) wrote :

I don't have any test data right now to figure out the performance penalty when changing the execution mode of DalvikVM.

OTOH, by applying the following patch:

diff --git a/core/java/android/webkit/JniUtil.java b/core/java/android/webkit/JniUtil.java
index 7759ff3..ae92b8b 100644
--- a/core/java/android/webkit/JniUtil.java
+++ b/core/java/android/webkit/JniUtil.java
@@ -176,13 +176,14 @@ class JniUtil {
     }

     private static boolean canSatisfyMemoryAllocation(long bytesRequested) {
- checkInitialized();
- ActivityManager manager = (ActivityManager) sContext.getSystemService(
- Context.ACTIVITY_SERVICE);
- ActivityManager.MemoryInfo memInfo = new ActivityManager.MemoryInfo();
- manager.getMemoryInfo(memInfo);
- long leftToAllocate = memInfo.availMem - memInfo.threshold;
- return !memInfo.lowMemory && bytesRequested < leftToAllocate;
+// checkInitialized();
+// ActivityManager manager = (ActivityManager) sContext.getSystemService(
+// Context.ACTIVITY_SERVICE);
+// ActivityManager.MemoryInfo memInfo = new ActivityManager.MemoryInfo();
+// manager.getMemoryInfo(memInfo);
+// long leftToAllocate = memInfo.availMem - memInfo.threshold;
+// return !memInfo.lowMemory && bytesRequested < leftToAllocate;
+ return true;
     }

we're able to run BBench 2.0 in the default browser with execution mode of DalvikVM set to int:jit.

The patch just prevents the GET_MEMORY_INFO_TRANSACTION storm (mentioned under #10) which the browser sends to the system_server which eventually brings system_server down.

The method 'canSatisfyMemoryAllocation' is called in ImageBuffer::ImageBuffer and PassRefPtr<Image> ImageBuffer::copyImage() [external/webkit/Source/WebCore/platform/graphics/android/ImageBufferAndroid.cpp].

I monitored the calls to 'canSatisfyMemoryAllocation' when running BBench 2.0 and the argument 'bytesRequested' is either 92, 100 or 104. system_server always answers that there is no memory shortage and that plenty of memory is available.

With this kludge, I never saw the crash in system_server or browser again when running BBench 2.0.

I still have to figure out which graphic elements webkit uses the class ImageBuffer for.

Zach Pfeffer (pfefferz) wrote :

Hey cool. Maybe the request input queue on the system_server could also be increased. binder calls are all synchronous (so says the documentation) so I would expect things to be stuck on: manager.getMemoryInfo(memInfo);

I tried a few experiments:

1. Just click the edition button in the browser, over and over. I didn't see any crash.
2. Click between US and World. I see the crash pretty quickly.
3. Set dalvik.vm.heapgrowthlimit to 128m, then click between US and World. I see the crash almost immediately.

Seems interesting that the storm would cause the system_server to crash and not just cause things to back up. Perhaps by the time the system_server gets around to satisfying the call the callee is no longer alive.

Zach Pfeffer (pfefferz) wrote :

Looking back over some crashes I do see:

"WebViewCoreThread" prio=5 tid=11 NATIVE
  | group="main" sCount=1 dsCount=0 obj=0xa06b19b0 self=0x143c400
  | sysTid=1765 nice=0 sched=0/0 cgrp=[fopen-error:2] handle=21231720
  | schedstat=( 0 0 0 ) utm=887 stm=43 core=1
  at android.os.BinderProxy.transact(Native Method)
  at android.app.ActivityManagerProxy.getMemoryInfo(ActivityManagerNative.java:2726)
  at android.app.ActivityManager.getMemoryInfo(ActivityManager.java:970)
  at android.webkit.JniUtil.canSatisfyMemoryAllocation(JniUtil.java:189)
  at android.webkit.WebViewCore.nativeRecordContent(Native Method)
  at android.webkit.WebViewCore.webkitDraw(WebViewCore.java:2042)
  at android.webkit.WebViewCore.access$900(WebViewCore.java:55)
  at android.webkit.WebViewCore$EventHub$1.handleMessage(WebViewCore.java:1107)
  at android.os.Handler.dispatchMessage(Handler.java:99)
  at android.os.Looper.loop(Looper.java:137)
  at android.webkit.WebViewCore$WebCoreThread.run(WebViewCore.java:728)
  at java.lang.Thread.run(Thread.java:856)

in each instance (via adb bugreport command)

Also got the bug on Jelly Bean.

I attached the source code for an apk that triggers the bug.


vishal (vishalbhoj) wrote :

It's reproducible on JB on origen but doesn't show up on galaxy nexus running JB with the test apk.

12.06 rebuilt with AOSP's toolchain on panda crashes as well

Some tests I've done:
- Running on only one A15 (uniprocessor, all other cores removed from the dtb): crash
- Running on JB I built from AOSP + armboard driver: crash
- Replacing the flush_cache() syscall of the kernel by flushing all icache and user cache: crash
- Modified Binder so that the same thread in system_server always processes the request: crash
- Running without JIT: NO CRASH

I checked:
- that the object pointer used to call the Binder java class from JNI JavaBBinder::onTransact() is valid:
  - It is always valid, even just before a SIGSEGV or any Exception.
  - Worse, it's reused successfully just after.
- that the parcel used in java is right before and after the exception or the SIGSEGV.
- that no Binder or object inherited from Binder was created or finalized before/after the Exception or the crash.
- that no other threads of system_server were running at the same time; this thread was the only one to run in system_server for at least 0.5s.

The crash could be:
- NullPointerException in android.app.ActivityManagerNative.onTransact(ActivityManagerNative.java), where the pointer given by JNI in JavaBBinder::onTransact() was valid!
- ArithmeticException: divide by zero in android.app.ActivityManagerNative.onTransact(ActivityManagerNative.java), where there is no divide!
- java.lang.IncompatibleClassChangeError in android.app.ActivityManagerNative.onTransact(ActivityManagerNative.java), where the method is obviously there!
- SIGSEGV in dvmInterpret() called from JNI BBinder::transact() -> JavaBBinder::onTransact(). But I checked the object just before, so the method/object should be there.
- Heap corruption (abort==SIGSEGV deadbaad) in mspace_free() called from dvmCollectGarbageInternal().

The SIGSEGV in dvmInterpret() is in the jit code.
Most of the time the current JIT instruction is OP_IGET_OBJECT_QUICK (0xf4) or invoke-virtual-quick (0xf8).

In the first case, the object pointer can be 0x1 (r3 == 0x1, as it was in the stack of the interpreter), so the check for a NULL pointer passes and the SIGSEGV happens just afterwards, while accessing the field (r1 == 8).

In the second case, the object pointer is valid (r9), so there is no problem with the check, but the clazz field is -1, so the SIGSEGV happens while accessing the vtable.
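
The address arithmetic behind these two cases can be sketched as follows. This is only a model of a 32-bit pointer-plus-offset access guarded by a null check, not Dalvik's actual code; `VTABLE_OFFSET_HYPOTHETICAL` is an invented placeholder, since the real offset of the vtable pointer inside the class object is not given in this report.

```python
MASK32 = 0xFFFFFFFF

# Placeholder only: the real offset of the vtable pointer is not stated here.
VTABLE_OFFSET_HYPOTHETICAL = 0x40


def guarded_access_address(pointer, offset):
    """Return the 32-bit address touched, or None if the null check trips.

    Only an exactly-zero pointer is caught by the check; any other bogus
    value goes on to be dereferenced.
    """
    if pointer == 0:
        return None
    return (pointer + offset) & MASK32


# Case 1: object pointer 0x1, field offset 8 (the r3 and r1 values quoted above).
case1 = guarded_access_address(0x1, 8)

# Case 2: clazz field of -1, i.e. 0xFFFFFFFF as an unsigned 32-bit value.
case2 = guarded_access_address(-1 & MASK32, VTABLE_OFFSET_HYPOTHETICAL)

print("case 1 touches", hex(case1))
print("case 2 touches", hex(case2))
```

In both cases a corrupted but non-zero value slips past the NULL check and the access lands at a very low address, which would normally be unmapped and therefore reported as SEGV_MAPERR.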

vishal (vishalbhoj) wrote :

Unable to reproduce on Galaxy nexus built with linaro toolchain using the apk:
https://android-build.linaro.org/builds/~linaro-android/galaxynexus-jb-gcc47-aosp-blob/#build=20

Zach Pfeffer (pfefferz) on 2012-08-09
Changed in linaro-android:
milestone: 12.08 → 12.09
assignee: Zach Pfeffer (pfefferz) → vishal (vishalbhoj)
vishal (vishalbhoj) wrote :

This bug seems to have to do with the kernel version. Here are the observations with the bug apk:

Android JB with AOSP 3.0.8 kernel on galaxy nexus doesn't crash:
https://android.googlesource.com/kernel/omap/+/android-omap-tuna-3.0

Android JB with the tilt 3.2 kernel doesn't crash:
http://android.git.linaro.org/gitweb?p=kernel/panda.git;a=shortlog;h=refs/heads/linaro-tilt-android-tracking

Android JB with tilt 3.4 kernel with software graphics does crash:
http://git.linaro.org/gitweb?p=landing-teams/working/ti/kernel.git;a=shortlog;h=refs/heads/tilt-3.4

Android JB with samsunglt 3.4 kernel with hardware graphics crashes:
http://git.linaro.org/gitweb?p=landing-teams/working/samsung/kernel.git;a=shortlog;h=refs/heads/android-stable

The bug seems to be present only on recent kernels (3.4+).

Anmar Oueja (anmar) on 2012-08-21
Changed in linaro-android:
status: Confirmed → In Progress
Anmar Oueja (anmar) wrote :

Turning off JIT as discussed in comment #16 is the current workaround until the matter is resolved.

vishal (vishalbhoj) wrote :

There seems to be some difference in the JITted code. I ran the bug application with dalvik in self-verification mode and it reports a control divergence:

D/dalvikvm( 3376): ~~~ DbgIntp(62): CONTROL DIVERGENCE!
D/dalvikvm( 3376): startPC: 0x9e449ee8 endPC: 0x9e449ee6 currPC: 0x9f0e784c
D/dalvikvm( 3376): ********** SHADOW STATE DUMP **********
D/dalvikvm( 3376): CurrentPC: 0x9f0e784c, Offset: 0x001c
D/dalvikvm( 3376): Class: Landroid/app/ActivityManagerNative;
D/dalvikvm( 3376): Method: onTransact
D/dalvikvm( 3376): Dalvik PC: 0x9e449ee8 endPC: 0x9e449ee6
D/dalvikvm( 3376): Interp FP: 0x9a0e5c38 endFP: 0x9a0e5afc
D/dalvikvm( 3376): Shadow FP: 0xb7c61710 endFP: 0xb7c61710
D/dalvikvm( 3376): Frame1 Bytes: 744 Frame2 Local: 28 Bytes: 288
D/dalvikvm( 3376): Trace length: 100 State: 9
D/dalvikvm( 3376): ********** SHADOW TRACE DUMP **********
D/dalvikvm( 3376): 0x9e449ee8: (0xff9b136a) packed-switch
D/dalvikvm( 3376): 0x9e44bfa8: (0xff9b23ca) const-string

vishal (vishalbhoj) wrote :

This shows up on the origen JB stable build (which has a 3.4 kernel) but doesn't show up on the stable panda JB build (which has a 3.2 kernel).

No need to continue looking for this bug.
We are in the process of validating our patch that solves the root of the problem; it should be sent soon.

Zach Pfeffer (pfefferz) wrote :

Cool. Are you going to send it to AOSP?


Probably :-)

Zach Pfeffer (pfefferz) wrote :

Can you guys give a little more info and share the patch?

The bug is described in https://code.google.com/p/android/issues/detail?id=37561
That issue contains the patch that solves it.

Tixy (Jon Medhurst) (tixy) wrote :
Changed in linaro-android:
status: In Progress → Fix Committed
Changed in linaro-landing-team-arm:
status: Triaged → Fix Committed
milestone: none → 2012.09
Changed in linaro-landing-team-arm:
status: Fix Committed → Fix Released
Changed in linaro-android:
status: Fix Committed → Fix Released
Changed in linaro-big.little.mp:
status: New → Fix Released