Firefox 39 crashes on startup or within a few seconds on Precise/x86

Bug #1471949 reported by Chris Coulson on 2015-07-06
24
This bug affects 3 people
Affects Status Importance Assigned to Milestone
firefox (Ubuntu)
Undecided
Unassigned
Precise
Critical
Chris Coulson
Trusty
Undecided
Unassigned
gcc-4.8 (Ubuntu)
Undecided
Unassigned
Trusty
Undecided
Unassigned
gcc-mozilla (Ubuntu)
Undecided
Unassigned
Precise
Critical
Chris Coulson

Bug Description

This is blocking publication of Firefox 39.

The build for x86 on 12.04 currently crashes on startup, or within a few seconds of startup. It's basically unusable. An example crash report is: https://crash-stats.mozilla.com/report/index/d0d97dbb-f6bc-4e4d-88ff-e5fff2150702.

Unfortunately, despite the warning in the PPA description for https://launchpad.net/~ubuntu-mozilla-security/+archive/ubuntu/ppa, ordinary users are still installing pre-release packages from it.

It works on all other releases and on Precise/x86-64. I did test Firefox 39 with this toolchain when it was still the nightly version whilst preparing the switch to GCC 4.8, and it worked fine.

Changed in firefox (Ubuntu Precise):
importance: Undecided → Critical
status: New → Triaged
Changed in firefox (Ubuntu):
status: New → Invalid
Chris Coulson (chrisccoulson) wrote :

I've tried this now with binutils 2.24 from Trusty (which produced a good build), and also the gold linker in Precise and I get the same issue. So it looks like this is specific to the compiler.

Chris Coulson (chrisccoulson) wrote :

This broken build is also responsible for https://bugzilla.mozilla.org/show_bug.cgi?id=1172059

Chris Coulson (chrisccoulson) wrote :
Download full text (7.2 KiB)

The crash occurs because js::ScopeCoordinateName returns nullptr. The only way it can do that is if |id| is zero at http://hg.mozilla.org/releases/mozilla-release/file/7665b8d4d51f/js/src/vm/ScopeObject.cpp#l96.

Unfortunately, it works fine in a debug build.

I added a "MOZ_RELEASE_ASSERT(!JSID_IS_ZERO(id))" after http://hg.mozilla.org/releases/mozilla-release/file/7665b8d4d51f/js/src/vm/ScopeObject.cpp#l89 so I could crash Firefox when this happens and catch it in gdb.

At the point we hit this assert in gdb, |sc| has already been optimised out. However, by examining the incoming bytecode:

(gdb) p *pc
$2 = 136 '\210'
(gdb) x pc+1
0xb68b2e2: 0x00000000

... we see that this is a JSOP_GETALIASEDVAR instruction and the |slot| argument is 0x000000. After adding some printf's to http://hg.mozilla.org/releases/mozilla-release/file/7665b8d4d51f/js/src/vm/ScopeObject.cpp#l72, I verified that no properties with slot=0 ever exist.

I added further asserts in js::frontend::BytecodeEmitter::emitScopeCoordOp() and verified that this instruction sequence was not being emitted here. After creating a small helper function to scan the bytecode at various stages in BytecodeEmitter, I managed to narrow the point at which this instruction sequence is emitted to js::frontend::BytecodeEmitter::emitNameOp().

Here, at http://hg.mozilla.org/releases/mozilla-release/file/7665b8d4d51f/js/src/frontend/BytecodeEmitter.cpp#l2272:

        if (!pn->pn_cookie.isFree()) {
             MOZ_ASSERT(JOF_OPTYPE(op) != JOF_ATOM);
             if (!emitVarOp(pn, op))
                 return false;
         } else {
             if (!emitAtomOp(pn, op))
                 return false;
         }

... despite |op| being equal to JSOP_GETALIASEDVAR, this code calls emitAtomOp() because the "if (!pn->pn_cookie.isFree()) {" check evaluates false. This is wrong - it should be calling emitVarOp().

Changing the MOZ_ASSERT at the top of js::frontend::BytecodeEmitter::emitAtomOp() to a MOZ_RELEASE_ASSERT verifies that we hit this. In gdb:

(gdb) f 1
#1 0xf58099bf in emitAtomOp (op=<optimised out>, pn=0x825bc60, this=0xffffa408) at /home/chr1s/src/firefox/build-area/firefox-39.0+build5/js/src/frontend/BytecodeEmitter.cpp:1071
1071 return emitAtomOp(pn->pn_atom, op);
(gdb) p pn->pn_u.name.cookie
$2 = {level_ = 255, slot_ = 4, static FREE_LEVEL = 255}

So, the correct branch is taken. Which means that |pn| is invalid.

In fact, it goes wrong in js::frontend::BytecodeEmitter::tryConvertFreeName(), just here at http://hg.mozilla.org/releases/mozilla-release/file/7665b8d4d51f/js/src/frontend/BytecodeEmitter.cpp#l1567:

                    JSOp op;
                    switch (pn->getOp()) {
                      case JSOP_GETNAME: op = JSOP_GETALIASEDVAR; break;
                      case JSOP_SETNAME: op = JSOP_SETALIASEDVAR; break;
                      default: return false;
                    }

                    pn->setOp(op);
                    JS_ALWAYS_TRUE(pn->pn_cookie.set(parser->tokenStream, hops, slot));
                    return true;

If I add a "MOZ_RELEASE_ASSERT(!pn->pn_cookie.isFree());" just before the "return true", then we hit this in our broken build. T...

Read more...

Karlchen (karlchen) wrote :

Hello, Chris.

Today the ppa:ubuntu-mozilla-security/ppa has brought me firefox 39.0+build5-0ubuntu0.12.04.2.
On Mint13 32-bit xfce, based on Ubuntu 12.04.5 32-bit, the reported problem seems to have been solved.
Firefox 39.0 started up without crashing.
Thanks a lot for all your time and efforts.

Cheers,
Karl

Daniel (hackie) wrote :

Same problem on an up-to-date ubuntu trusty. Now I downgraded from firefox:amd64 39.0+build5-0ubuntu0.14.04.1 back to 38.0+build3-0ubuntu0.14.04.1 and everything works again

darko (darko) wrote :

Thanks for all infos..!

Chris Coulson (chrisccoulson) wrote :

Note, this bug isn't fixed but we have worked around it by building Firefox with -fno-tree-pre, which disables the optimization that breaks it.

Chris Coulson (chrisccoulson) wrote :

Building Firefox with the gcc-4.8 package in trusty-updates also produces the same broken build, so it's a regression between 4.8.2 and our build of gcc 4.8.4

Chris Coulson (chrisccoulson) wrote :

Building with a vanilla gcc 4.8.4 does work though. It turns out that the svn-updates.diff patch that is applied to our build of gcc introduces the bug - so it's a regression between 4.8.4 and 4.8.5

Chris Coulson (chrisccoulson) wrote :

Here's the disassembly from a good build with vanilla gcc 4.8.4. It's basically identical, but it contains 3 extra instructions that are missing from the broken build.

   0xf57fe991 <+1729>: mov 0xa0(%ebp),%edx
   0xf57fe997 <+1735>: mov 0x84(%esp),%esi // %esi now points to |pn|
   0xf57fe99e <+1742>: add $0x18,%edx
   0xf57fe9a1 <+1745>: cmpl $0xfe,0x28(%esp) // Compare |hops| with 254 (FREE_LEVEL - 1)
   0xf57fe9a9 <+1753>: mov %al,0x2(%esi) // Calls pn->SetOp(op)
   0xf57fe9ac <+1756>: mov 0x34(%esp),%eax // %eax now contains |slot|
   0xf57fe9b0 <+1760>: ja 0xf57fea10 <js::frontend::BytecodeEmitter::tryConvertFreeName(js::frontend::ParseNode*)+1856> // Jump if |hops| > 254
   0xf57fe9b2 <+1762>: cmp $0xffffff,%eax // Compare |slot| with 0xffffff
   0xf57fe9b7 <+1767>: ja 0xf57fe9f9 <js::frontend::BytecodeEmitter::tryConvertFreeName(js::frontend::ParseNode*)+1833> // Jump if |slot| > 0xffffff
   0xf57fe9b9 <+1769>: mov 0x84(%esp),%esi // %esi now points to |pn|
   0xf57fe9c0 <+1776>: shl $0x8,%eax // Left shift new |slot| value by 8-bits

// These next 3 instructions are missing in the broken build
   0xf57fe9c3 <+1779>: mov $0x1,%edi
   0xf57fe9c8 <+1784>: movzbl 0x28(%esp),%edx // %edx now contains |hops|
   0xf57fe9cd <+1789>: mov %dl,0x20(%esi) // Save |hops| in to |level_| in pn->pn_u.name.cookie

   0xf57fe9d0 <+1792>: mov %eax,%edx // %edx now contains |slot|
   0xf57fe9d2 <+1794>: movzbl 0x20(%esi),%eax // Load |level_| from pn->pn_u.name.cookie in to %eax
   0xf57fe9d6 <+1798>: or %edx,%eax // %eax now contains the bitwise-OR of |level_| and new |slot| value
   0xf57fe9d8 <+1800>: mov %eax,0x20(%esi) // Save the new values to |level_| and |slot_| in pn->pn_u.name.cookie

Chris Coulson (chrisccoulson) wrote :

In particular, the fix for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64557 is the culprit

Karlchen (karlchen) wrote :

Which gcc version do the Mozilla developers use, because it seems as if their builds of Firefox 39 do not crash right on startup or within a few seconds thereafter.
Would it not be acceptable for the moment to drop back to using gcc 4.8.2, till the bug introduced by gcc > 4.8.2 has been fixed?

Changed in gcc-mozilla (Ubuntu):
status: New → Invalid
Changed in gcc-mozilla (Ubuntu Precise):
status: New → Fix Committed
importance: Undecided → Critical
Changed in firefox (Ubuntu Precise):
status: Triaged → Fix Released
assignee: nobody → Chris Coulson (chrisccoulson)
Changed in gcc-mozilla (Ubuntu Precise):
assignee: nobody → Chris Coulson (chrisccoulson)
Matthias Klose (doko) on 2016-05-06
no longer affects: gcc-4.8 (Ubuntu Precise)
Matthias Klose (doko) on 2016-05-06
no longer affects: gcc-mozilla (Ubuntu Trusty)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gcc-4.8 - 4.8.5-4ubuntu3

---------------
gcc-4.8 (4.8.5-4ubuntu3) yakkety; urgency=medium

  * Fix libjava testsuite with dejagnu 1.6, taken from the trunk.
  * Fix PR rtl-optimization/68955, PR rtl-optimization/64557, taken from 4.9.
    LP: #1471949.
  * Bump standards version.

 -- Matthias Klose <email address hidden> Fri, 06 May 2016 19:03:29 +0200

Changed in gcc-4.8 (Ubuntu):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Related questions

Remote bug watches

Bug watches keep track of this bug in other bug trackers.