firefox fails to build from source with Linaro toolchain

Bug #604874 reported by Michael Hope
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Linaro GCC
Fix Released
High
Ulrich Weigand
4.4
Fix Released
High
Ulrich Weigand
4.5
Fix Released
High
Ulrich Weigand
Linaro GCC Tracking
Fix Released
Undecided
Unassigned
Mozilla Firefox
Invalid
Medium
gcc
Fix Released
Medium
firefox (Ubuntu)
Fix Released
High
Unassigned
gcc-4.4 (Ubuntu)
Fix Released
High
Unassigned
gcc-4.5 (Ubuntu)
Fix Released
High
Unassigned
xulrunner-1.9.2 (Ubuntu)
Fix Released
High
Unassigned

Bug Description

Seen with gcc 4.4.4-6ubuntu5~ppa2 and firefox 3.6.6+nobinonly-0ubuntu1

Fails when running a Javascript file under an interpreter as part of the build process. I suspect this chunk of Javascript is a template that generates code used in the rest of the build.

http://launchpadlibrarian.net/51627119/buildlog_ubuntu-maverick-armel.firefox_3.6.6%2Bnobinonly-0ubuntu1_FAILEDTOBUILD.txt.gz

gcc -c -x c -E -P -I. imacro_asm.js.in > imacro_asm.js
./../../dist/bin/run-mozilla.sh ./../../dist/bin/js imacro_asm.js ./imacros.jsasm > imacros.c.tmp
Segmentation fault
make[4]: *** [libs] Error 139

Related branches

Loïc Minier (lool)
Changed in gcc-linaro:
importance: Undecided → High
Loïc Minier (lool)
Changed in gcc-linaro:
milestone: none → 4.4-2010.07-1
Revision history for this message
Yao Qi (yao-codesourcery) wrote :

Seg fault still exist in gcc-4.4 Ubuntu 4.4.4-7ubuntu1~ppa2.

gcc -c -x c -E -P -I. imacro_asm.js.in > imacro_asm.js
./../../dist/bin/run-mozilla.sh ./../../dist/bin/js imacro_asm.js ./imacros.jsasm > imacros.c.tmp
Segmentation fault
make[4]: *** [libs] Error 139

# gcc --version
gcc-4.4.real (Ubuntu 4.4.4-7ubuntu1~ppa2) 4.4.4 20100712 (Linaro) [release 2010.07-0]

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

Seg fault is in

# pwd
/home/yao/maverick/root/firefox-3.6.7+build2+nobinonly/build-tree/mozilla/js/src
# ./../../dist/bin/js imacro_asm.js ./imacros.jsasm
Segmentation fault

Run js with gdb,
Program received signal SIGSEGV, Segmentation fault.
js_CheckForStringIndex (id=705479548) at jsobj.cpp:3412
3412 jschar ch = *s;
(gdb) bt
#0 js_CheckForStringIndex (id=705479548) at jsobj.cpp:3412
#1 0x2a044156 in js_DefineNativeProperty (cx=0x2a0d6b80, obj=0x2a0f80c0, id=705479548, value=22,
    getter=0x2a03fb65 <block_getProperty>, setter=0x2a03fb01 <block_setProperty>, attrs=69, flags=4,
    shortid=0, propp=0x0, defineHow=0) at jsobj.cpp:3605
#2 0x2a044572 in js_DefineBlockVariable (cx=0x2a0cc37c, obj=0x2a0cc378, id=705523208,
    index=<value optimized out>) at jsobj.cpp:2667
#3 0x2a053b8c in BindLet (cx=0x2a0d6b80, data=0xbefbd59c, atom=0x2a0cc37c, tc=<value optimized out>)
    at jsparse.cpp:3067
#4 0x2a0581bc in Variables (cx=0x2a0d6b80, ts=0xbefbd818, tc=0xbefbd6a4,
    inLetHead=<value optimized out>) at jsparse.cpp:5623
#5 0x2a05898c in Statement (cx=0x2a0d6b80, ts=0xbefbd818, tc=<value optimized out>)
    at jsparse.cpp:4632
....
....
(gdb) p s
$1 = (const jschar *) 0x20000001
(gdb) p *s
Cannot access memory at address 0x20000001
(gdb)

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

Relace -O2 with -O0 in FLAGS, and rebuild. Segfault still exists, and every time, it fails on accessing magic address '0x20000001'.

Revision history for this message
Loïc Minier (lool) wrote :

Looks similar to https://bugzilla.mozilla.org/show_bug.cgi?id=560464

I saw a bunch of commits landing on the mozilla-central hg repo, under js/; I wonder whether we can disable JIT when calling the /js interpreter to see whether it's the jit or not

Revision history for this message
Loïc Minier (lool) wrote :
Changed in gcc-linaro:
status: New → In Progress
Changed in firefox (Ubuntu):
importance: Undecided → High
status: New → In Progress
Revision history for this message
Loïc Minier (lool) wrote :

I tried building mozilla-central tip as of today; in the js subdir I ran autoconf, configure, make, then dist/bin/js would still crash.

I can't get a clean backtrace though:
(gdb) bt
#0 0x00061536 in js_SetPropertyHelper ()
#1 0x00130716 in js_Interpret ()
#2 0x00130716 in js_Interpret ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Changed in xulrunner-1.9.2 (Ubuntu):
status: New → In Progress
importance: Undecided → High
Revision history for this message
Loïc Minier (lool) wrote :

Built with:
autoconf
CFLAGS="-g -ggdb -O0 -fno-strict-aliasing -Wall" CXXFLAGS="$CFLAGS" ./configure --disable-optimize --enable-debug
make -j2 MODULE_OPTIMIZE_FLAGS="" INTERP_OPTIMIZER="" MOZ_OPTIMIZE_FLAGS=""

I still get the same issue

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

Configured and built firefox on pavo1 with different options,

./configure --disable-optimize --enable-debug
CFLAGS="-g -ggdb -O0 -fno-strict-aliasing -Wall" CXXFLAGS="$CFLAGS" make MODULE_OPTIMIZE_FLAGS="" INTERP_OPTIMIZER="" MOZ_OPTIMIZE_FLAGS=""
No segfault.

./configure --enable-debug
CFLAGS="-g -ggdb -O0 -fno-strict-aliasing -Wall" CXXFLAGS="$CFLAGS" make MODULE_OPTIMIZE_FLAGS="" INTERP_OPTIMIZER="" MOZ_OPTIMIZE_FLAGS=""
No segfault.

./configure --enable-debug
CFLAGS="-g -ggdb -O0 -fno-strict-aliasing -Wall" CXXFLAGS="$CFLAGS" make
No segfault.

./configure
CFLAGS="-g -ggdb -O0 -fno-strict-aliasing -Wall" CXXFLAGS="$CFLAGS" make
Segfault

./configure
CFLAGS="-g -ggdb -O0 -fno-strict-aliasing -Wall" CXXFLAGS="$CFLAGS" make MODULE_OPTIMIZE_FLAGS="" INTERP_OPTIMIZER="" MOZ_OPTIMIZE_FLAGS=""
Segfault.

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

./configure --enable-debug
make
No segfault. Attached generated imacros.c.tmp

However,
./configure
Add -fno-inline -fno-strict-aliasing in INTERP_OPTIMIZER, MODULE_OPTIMIZE_FLAGS, and MOZ_OPTIMIZE_FLAGS. Segfaults.

Revision history for this message
Loïc Minier (lool) wrote :

Cross-compile recipe:
export HOST_CC=gcc HOST_CXX=g++ CROSS_COMPILE=1 CC=arm-linux-gnueabi-gcc CXX=arm-linux-gnueabi-g++
mkdir foobuild; cd foobuild
../configure --build=x86_64-linux-gnu --host=arm-linux-gnueabi --target=arm-linux-gnueabi --enable-debug
make -j2

Revision history for this message
Loïc Minier (lool) wrote :

I used Marcin's cross packages, and confirm that --enable-debug in the above recipe makes the segfault go away.

Revision history for this message
Loïc Minier (lool) wrote :

I tried cross-building with -marm (this was a bit fragile) but it didn't help the segfault.

Revision history for this message
Michael Hope (michaelh1) wrote :

Runs OK using gcc 4.4.4-6ubuntu2 (i.e. not the Linaro GCC) and ./configure --disable-debug --enable-optimize

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

Checkout firefox code from mozilla.org, configured, and built. Still segfaults.
Here is a list of steps I did,
1. hg clone http://hg.mozilla.org/releases/mozilla-1.9.2/ 192src
2. cd 192src/js/src
3. ~/autoconf-2.13-install/bin/autoconf
4. ./configure
5. make
./dist/bin/js imacro_asm.js ./imacros.jsasm > imacros.c.tmp
Segmentation fault

Revision history for this message
Ulrich Weigand (uweigand) wrote :

Confirmed the build failure and segfault in firefox-3.6.8+build1+nobinonly with 4.4.4-7ubuntu3.
I'm looking into this now.

Changed in gcc-linaro:
assignee: nobody → Ulrich Weigand (uweigand)
Revision history for this message
Ulrich Weigand (uweigand) wrote :

Ok, so the immediate cause of the bug is that a global array is only 4-byte aligned, but the code silently assumes 8-byte alignment.

The problem triggers in jsatom.cpp:js_AtomizeString:

            return (JSAtom *) STRING_TO_JSVAL(JSString::unitString(c));

jsapi.h:STRING_TO_JSVAL uses the low 3 bits of the pointer value to encode the type of the atom. This works only if all pointers passed to STRING_TO_JSVAL are guaranteed to be 8-byte aligned.

This is usually not a problem if the value is dynamically allocated. But in this particular instance, JSString::unitString returns the address of a global variable (jsstrinlines.h):

inline JSString *
JSString::unitString(jschar c)
{
    JS_ASSERT(c < UNIT_STRING_LIMIT);
    return &unitStringTable[c];
}

The array unitString Table is a static member of the JSString class (jsstr.h):

    static JSString unitStringTable[];

The compiler assumes that the alignment requirement of that static variable derive from the alignment requirement of the JSString type. Since this type has only two members, a size_t and a union of two pointer types, the total alignment requirement on a platform with 32-bit pointers like ARM is 4 bytes.

As it so happens, in the build of the "js" executable with this compiler and options, the variable does actually turn out to reside at an address that is only 4-byte aligned, but not 8-byte aligned:

000cc034 d JSString::unitStringTable

It seems to me that this is not a compiler bug, just something that can be triggered by random changes in the compiler ...

Interestingly enough, there is code in jsstr.h to ensure 8-byte alignment for unitStringTable, but only on Solaris:

#ifdef __SUNPRO_CC
#pragma align 8 (__1cIJSStringPunitStringTable_, __1cIJSStringOintStringTable_)
#endif

    static JSString unitStringTable[];
    static JSString intStringTable[];
    static const char *deflatedIntStringTable[];

I guess it looks like a more generic fix for this problem is required. It would be good to get some feedback from Firefox developers on how this is supposed to work ...

Revision history for this message
In , Ulrich Weigand (uweigand) wrote :
Download full text (3.3 KiB)

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.8) Gecko/20100723 Ubuntu/9.04 (CK-IBM) (CK-IBM) Firefox/3.6.8
Build Identifier: Ubuntu firefox-3.6.8+build1+nobinonly on armel-linux

The Ubuntu firefox-3.6.8+build1+nobinonly package fails to build from source on armv7l-unknown-linux-gnueabi using the GCC 4.4.4-7ubuntu3 compiler, due to a crash (segmentation fault) of the "js" interpreter used during the build.

Analysis seems to point to a bug in the underlying Mozilla sources that just happen to trigger due to random compiler and/or code changes.

The immediate cause of the crash is that a global array is only 4-byte aligned, but the code silently assumes 8-byte alignment.

The problem triggers in jsatom.cpp:js_AtomizeString:

            return (JSAtom *) STRING_TO_JSVAL(JSString::unitString(c));

jsapi.h:STRING_TO_JSVAL uses the low 3 bits of the pointer value to encode the type of the atom. This works only if all pointers passed to STRING_TO_JSVAL are guaranteed to be 8-byte aligned.

This is usually not a problem if the value is dynamically allocated. But in this particular instance, JSString::unitString returns the address of a global variable (jsstrinlines.h):

inline JSString *
JSString::unitString(jschar c)
{
    JS_ASSERT(c < UNIT_STRING_LIMIT);
    return &unitStringTable[c];
}

The array unitString Table is a static member of the JSString class (jsstr.h):

    static JSString unitStringTable[];

The compiler assumes that the alignment requirement of that static variable derive from the alignment requirement of the JSString type. Since this type has only two members, a size_t and a union of two pointer types, the total alignment requirement on a platform with 32-bit pointers like ARM is 4 bytes.

As it so happens, in the build of the "js" executable with this compiler and options, the variable does actually turn out to reside at an address that is only 4-byte aligned, but not 8-byte aligned:

000cc034 d JSString::unitStringTable

It seems to me that this is not a compiler bug, just something that can be triggered by random changes in the compiler ...

Interestingly enough, there is code in jsstr.h to ensure 8-byte alignment for unitStringTable, but only on Solaris:

#ifdef __SUNPRO_CC
#pragma align 8 (__1cIJSStringPunitStringTable_, __1cIJSStringOintStringTable_)
#endif

    static JSString unitStringTable[];
    static JSString intStringTable[];
    static const char *deflatedIntStringTable[];

I guess it looks like a more generic fix for this problem is required. I'm wondering why this never triggered elsewhere ... Is there anything I'm missing that would make this work reliably?

Reproducible: Always

Steps to Reproduce:
1. Install Ubuntu maverick on armel-linux
2. Build the firefox-3.6.8+build1+nobinonly package from source
3. Build aborts with a JS crash:
gcc -c -x c -E -P -I. imacro_asm.js.in > imacro_asm.js
./../../dist/bin/run-mozilla.sh ./../../dist/bin/js imacro_asm.js ./imacros.jsasm > imacros.c.tmp
Segmentation fault
make[4]: *** [libs] Error 139
make[4]: Leaving directory `/build/buildd/firefox-3.6.6+nobinonly/build-tree/mozilla/js/src'

Actual Results:
http://launchpadlibrarian.net/51627119/buildl...

Read more...

Changed in firefox:
status: Unknown → New
Revision history for this message
Ulrich Weigand (uweigand) wrote :

Hmm ... actually there *is* code in the Mozilla sources to attempt to ensure proper alignment, in jsstr.cpp:

#ifdef __SUNPRO_CC
#pragma pack(8)
#else
#pragma pack(push, 8)
#endif

JSString JSString::unitStringTable[]
#ifdef __GNUC__
__attribute__ ((aligned (8)))
#endif
= {
    U(0x00), U(0x01), U(0x02), U(0x03), U(0x04), U(0x05), U(0x06), U(0x07),

However, the compiler does not respect the attribute.

Minimal test case is:

struct JSString
{
  unsigned int mLength;
  static JSString unitStringTable[];
};

#pragma pack(push, 8)
JSString JSString::unitStringTable[] __attribute__ ((aligned (8))) = { 1 };
#pragma pack(pop)

Building this with g++ -S results in:

        .global _ZN8JSString15unitStringTableE
        .data
        .align 2
        .type _ZN8JSString15unitStringTableE, %object
        .size _ZN8JSString15unitStringTableE, 4
_ZN8JSString15unitStringTableE:
        .word 1

.align 2 is wrong; this should be .align 3. Vanilla FSF GCC 4.4.4 gets this correct, so it does look like a Linaro GCC bug after all.

Interestingly enough, the bug only happens with C++ class static variables; for "normal" global variables, the alignment attribute is respected correctly. Looking into GCC now ...

Revision history for this message
Ulrich Weigand (uweigand) wrote :

Linaro GCC shows this same problem not just on ARM, but also on i386. So it seems this may now cause problem elsewhere as well ...

Revision history for this message
Ulrich Weigand (uweigand) wrote :

Looks like the culprit may be this patch, which also went upstream. I checked upstream GCC, and it seems broken on the test case (on all platforms) as well ...

2009-06-11 Nathan Froyd <email address hidden>

       Backport from upstream:

       2009-06-10 Nathan Froyd <email address hidden>
       gcc/
       * tree.h (tree_base): Add packed_flag and user_align fields.
       Decrease size of spare field.
       (TYPE_USER_ALIGN): Use user_align from tree_base.
       (DECL_USER_ALIGN): Likewise.
       (TYPE_PACKED): Use packed_flag from tree_base.
       (DECL_PACKED): Likewise.
       (tree_type): Delete packed_flag and user_align fields. Widen
       precision field. Widen mode field and shuffle fields to align
       mode on an 8-bit boundary.
       (tree_decl_common): Delete decl_flag_1 and user_align fields.
       Renumber decl_flag_* fields. Fix comments. Widen
       decl_common_unused field.
       (DECL_HAS_VALUE_EXPR_P): Adjust for renumbering of decl_flag_*
       fields.
       (DECL_EXTERNAL): Likewise.
       (DECL_BIT_FIELD): Likewise.
       (DECL_NONADDRESSABLE_P): Likewise.
       (TYPE_DECL_SUPRESS_DEBUG): Likewise.
       * config/arm/arm-modes.def (XImode): Make it an INT_MODE.

http://gcc.gnu.org/ml/gcc-patches/2009-06/msg00763.html

Revision history for this message
Ulrich Weigand (uweigand) wrote :

The problem is that in cp/decl.c:duplicate_decls, when merging two decls for one variable, the code uses a memcpy to copy all fields in tree_decl_common *except* those in tree_common (which are handled manually). Since the patch moves the user_align and packed_flag bits from tree_decl_common into tree_base (which is part of tree_common), these bits -which used to be copied automatically by the mempy- are now no longer handled automatically and should have been handled manually. No such code was added however.

Revision history for this message
Ulrich Weigand (uweigand) wrote :

Yes, this does indeed look like the root cause. With the following patch I'm getting correct behavior back:

--- gcc/cp/decl.c.orig 2010-07-28 19:26:52.000000000 +0200
+++ gcc/cp/decl.c 2010-07-28 19:28:13.000000000 +0200
@@ -2068,6 +2068,10 @@
       SET_DECL_INIT_PRIORITY (olddecl, DECL_INIT_PRIORITY (newdecl));
       DECL_HAS_INIT_PRIORITY_P (olddecl) = 1;
     }
+ /* Likewise for DECL_USER_ALIGN and DECL_PACKED. */
+ DECL_USER_ALIGN (olddecl) = DECL_USER_ALIGN (newdecl);
+ if (TREE_CODE (newdecl) == FIELD_DECL)
+ DECL_PACKED (olddecl) = DECL_PACKED (newdecl);

   /* The DECL_LANG_SPECIFIC information in OLDDECL will be replaced
      with that from NEWDECL below. */

Revision history for this message
In , Alexander Sack (asac) wrote :
Revision history for this message
In , Jorendorff (jorendorff) wrote :

In tracemonkey tip, we have this in jsstr.cpp:

  JSString JSString::unitStringTable[]
  #ifdef __GNUC__
  __attribute__ ((aligned (8)))
  #endif
  = {
      U(0x00), U(0x01), U(0x02), U(0x03), U(0x04), U(0x05), U(0x06), U(0x07),
      U(0x08), U(0x09), U(0x0a), U(0x0b), U(0x0c), U(0x0d), U(0x0e), U(0x0f),
      ...

Is that __attribute__ in your jsstr.cpp?

If not, try updating to tip and building that.

Revision history for this message
In , Alexander Sack (asac) wrote :

Thanks Jason.

Ulrich, if this isn't in your current code, I can probably make a tip build available for you more or less soonish. Let me know.

Revision history for this message
In , Ulrich Weigand (uweigand) wrote :

Thanks for the hint, Jason! I do have the aligned attribute in the code, I just missed it since I was looking only at the declaration site ...

Unfortunately, it seems we have a GCC bug here: starting with GCC 4.5, the C++ front-end appears to simply ignore the aligned attribute in code like this. (We have apparently back-ported the patch that introduced this regression to the Ubuntu/Linaro compiler, which is why we're seeing it on a 4.4-based compiler too.)

I've now opened a GCC bugzilla to track the bug there:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45112

Revision history for this message
In , Ulrich Weigand (uweigand) wrote :

Setting this bug to INVALID since it's not a Mozilla problem after all.

Revision history for this message
Ulrich Weigand (uweigand) wrote :

Opened GCC bugzilla PR c++/45112 to track this now.

Changed in gcc:
status: Unknown → New
Changed in firefox:
status: New → Invalid
Changed in gcc:
status: New → In Progress
Revision history for this message
Ulrich Weigand (uweigand) wrote :

Fix approved for GCC mainline and checked in. Will check in to 4.5 branch after 4.5.1 release.
Preparing backport to Linaro GCC now.

Revision history for this message
Ulrich Weigand (uweigand) wrote :

Fix checked in to 4.5 branch as well.

Revision history for this message
Matthias Klose (doko) wrote :

fixed in gcc-4.5 4.5.1-1ubuntu1

Changed in gcc-4.5 (Ubuntu):
importance: Undecided → High
milestone: none → maverick-alpha-3
status: New → Fix Released
Revision history for this message
Matthias Klose (doko) wrote :

fixed in gcc-4.4 4.4.4-8ubuntu1

Changed in gcc-4.4 (Ubuntu):
importance: Undecided → High
milestone: none → maverick-alpha-3
status: New → Fix Released
Changed in gcc:
status: In Progress → Fix Released
Revision history for this message
Matthias Klose (doko) wrote :

firefox and xulrunner built on armel

Changed in firefox (Ubuntu):
status: In Progress → Fix Released
Changed in xulrunner-1.9.2 (Ubuntu):
status: In Progress → Fix Released
Michael Hope (michaelh1)
Changed in gcc-linaro-tracking:
status: New → Fix Committed
milestone: none → 4.5.2
Revision history for this message
Ulrich Weigand (uweigand) wrote :

Merge requests for both Linaro GCC 4.4 and Linaro GCC 4.5 now pending.

Revision history for this message
Michael Hope (michaelh1) wrote :
Revision history for this message
Michael Hope (michaelh1) wrote :
Changed in firefox:
importance: Unknown → Medium
Changed in gcc-linaro-tracking:
status: Fix Committed → Fix Released
Changed in gcc:
importance: Unknown → Medium
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.