firefox crash on javascript page

Bug #429476 reported by Chris Bainbridge on 2009-09-14
16
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Mozilla Firefox
Fix Released
Medium
firefox (Ubuntu)
Medium
Unassigned
firefox-3.0 (Ubuntu)
Medium
Unassigned
firefox-3.5 (Ubuntu)
Medium
Unassigned

Bug Description

On a Thinkpad T42p with 1.5GB memory, no swap space, running Jaunty the following page causes firefox-3.0.14 to crash:

http://bbarker.co.uk/animate/map.php?file=Edinburgh_to_Holy_Island_cycle.gpx

It appears that memory usage spikes until resident memory is used up.

Additionally, under KDE the desktop bar disappears once firefox has terminated, and occasionally xorg will also hang.

Clearly firefox should not be using so much memory, Chromium under the same page shows only around 80MB of resident memory used. It would also be nice if the KDE desktop and xorg were a bit more resilient to low memory conditions - one app crashing shouldn't leave the desktop in an unusable state.

The same happens with firefox-3.6 from PPA http://ppa.launchpad.net/fta/ppa/ubuntu

Same for today's 3.7 nightly build.

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.0.14) Gecko/2009090216 Ubuntu/9.04 (jaunty) Firefox/3.0.14
Build Identifier: firefox-3.7a1pre.en-US.linux-i686.tar.bz2

I'm filing this as a Javascript bug as it seems likely, but the fault may lie elsewhere.

Viewing this URL on a Thinkpad T42p with 1.5GB memory running Ubuntu Jaunty: http://bbarker.co.uk/animate/map.php?file=Edinburgh_to_Holy_Island_cycle.gpx

results in resident memory rising until desktop becomes unresponsive at around 1GB resident, usage is still increasing (highest I've seen is 1.2GB resident then top becomes unresponsive). At this point either firefox is killed, or xorg hangs.

Confirmed with firefox-3.0.14 and firefox-3.7a1pre nightly.

I already reported this bug in Ubuntu: https://bugs.launchpad.net/ubuntu/+source/firefox-3.0/+bug/429476

Reproducible: Always

Steps to Reproduce:
1.go to http://bbarker.co.uk/animate/map.php?file=Edinburgh_to_Holy_Island_cycle.gpx
2. that's it
3.
Actual Results:
resident memory increases, eventually firefox is killed or desktop hangs

Expected Results:
Chromium uses about 80MB resident memory for the same page.

Changed in firefox:
status: Unknown → New

Firefox 3.5.3 with 32-bit Windows XP SP 3: Memory consumption rises to 200 MB, then goes down to 20-30 MB, but virtual memory size is higher than 700 MB.

Confirmed. For some reason the GC doesn't kick in and our RSS size grows out of control. We have a patch for much better GC scheduling. Either we land that, or we have to investigate how to better address this problem here with the current heuristics. I saw a JSON failure instead of a crash, so my suspicion would be a ton of strings being created.

I tried it on MacOS 10.5 with a debug build and it worked (mostly). When I first loaded the page, CPU went to 100% and FF became unresponsive for at least 10 seconds. During this time memory usage (Real Memory in Activity Monitor) went up to about 1GB but then it went back down to reasonable levels. I clicked "Continue" to the slow script dialog. Eventually the map with the track did come up, and then it worked normally.

I found that Safari had somewhat similar behavior: slow startup, but then worked normally. I tried Chrome on Windows and startup was pretty quick, and then it worked normally.

So what actually should be the bug here? Is it that memory usage goes up to ~1GB during the startup phase?

Do we GC during that 1GB phase? If we don't, its a bug. If we do, I am not sure its a bug.

In a run of this page from start until the animated map appears, this is the log of the times at which a GC was done:

2:214:GC t=0.175912
2:370:GC t=0.925824
2:595:GC t=1.469851
2:645:GC t=1.532591
2:2284:GC t=23.293265
2:10953:GC t=85.411238
2:11016:GC t=87.652958
2:11034:GC t=87.910751
2:11038:GC t=88.201925
2:11039:GC t=88.234485
2:11040:GC t=88.316330
2:11041:GC t=88.328484

As you can see, we go 20 seconds without any GC, and then 60 more after that until the next GC. Here are the last few calls to IsGCThresholdReached before the 23s GC:

IsGCThresholdReached t=18.488329 mb=7437487/33554432, gb=2427140/4390912
IsGCThresholdReached t=18.640668 mb=7439791/33554432, gb=2427140/4390912
IsGCThresholdReached t=18.869077 mb=7443295/33554432, gb=2427140/4390912
IsGCThresholdReached t=19.843841 mb=7446751/33554432, gb=2431236/4390912
IsGCThresholdReached t=19.930377 mb=7447423/33554432, gb=2431236/4390912
IsGCThresholdReached t=21.518867 mb=7450891/33554432, gb=2431236/4390912

'mb' is the mallocBytes (cur/max) and 'gb' is the gcBytes (cur/max). The last few calls just before the 85s GC are:

IsGCThresholdReached t=84.934737 mb=7226436/33554432, gb=7086000/34246720
IsGCThresholdReached t=84.935768 mb=7230504/33554432, gb=7090096/34246720
IsGCThresholdReached t=84.936678 mb=7234144/33554432, gb=7090096/34246720
IsGCThresholdReached t=84.937600 mb=7238212/33554432, gb=7094192/34246720
IsGCThresholdReached t=84.938453 mb=7241852/33554432, gb=7094192/34246720
IsGCThresholdReached t=84.942108 mb=7245272/33554432, gb=7098288/34246720

Whatever is causing the memory usage, it doesn't show up in the metrics used by IsGCThresholdReached, so we are not GC'ing.

Micah Gersten (micahg) wrote :

Upstream is currently determining if this is a bug or not.

Changed in firefox-3.0 (Ubuntu):
importance: Undecided → Medium
status: New → Triaged
Changed in firefox-3.5 (Ubuntu):
importance: Undecided → Medium
status: New → Triaged

Here is the key part of a trace-malloc snapshot with 1.4GB allocated. Almost all of the memory is created through nsXMLHttpRequest::ConvertBodyToText, which I believe is called to access the property |requestText|.

What I don't understand is why the nsAStrings returned by ConvertBodyToText are not freed in XPCWrappedNative::CallMethod after the call. Who would know about this?

1411376623 malloc
  1407026664 my_malloc_logger (pldhash.c:)
    1394617050 malloc_zone_malloc (/usr/lib/libSystem.B.dylib)
      1393036801 malloc (/usr/lib/libSystem.B.dylib)
        1376888066 nsAString_internal::MutatePrep(unsigned int, unsigned short**, unsigned int*) (pldhash.c:)
          1376865590 nsAString_internal::ReplacePrep(unsigned int, unsigned int, unsigned int) (pldhash.c:)
            1376853800 nsAString_internal::Assign(unsigned short const*, unsigned int) (pldhash.c:)
              1376771840 nsXMLHttpRequest::ConvertBodyToText(nsAString_internal&) (nsUnicharUtils.cpp:)
                1376771840 NS_InvokeByIndex_P (pldhash.c:)
                  1376771840 XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/xpconnect/src/dom_quickst
ubs.h:)
                    1376771840 XPC_WN_GetterSetter(JSContext*, JSObject*, unsigned int, long*, long*) (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/xpconnect/src/dom_quickstu
bs.h:)
                      1376771840 js_Invoke (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)
                        1376771840 js_InternalInvoke (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)
                          1376771840 js_InternalGetOrSet (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)
                            1376771840 js_NativeGet (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)
                              1375691234 js_Interpret (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)
                              1080606 js_GetPropertyHelper (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)

Changed in firefox:
status: New → Confirmed

Nasty. Sounds like bug 506125 would fix this very nicely.

So do we know if the data is leaked in the sense that we loose track of it without freeing it, or is it leaked in the sense that we are keeping track of it but think it's being used and so we don't free it. Or finally, are things leaking in the sense that if we only ran a GC it would get freed, but for some reason a GC isn't running.

cc'ing jst who is the resident expert on XPConnect string stuff :)

From my observation memory use drops back down. The problem is that we don't count this in our current cheesy memory pressure heuristics so we don't know its time to GC (which would free all this stuff). Hence my pointer to the actual working-set based heuristics bug.

(In reply to comment #8)
> So do we know if the data is leaked in the sense that we loose track of it
> without freeing it, or is it leaked in the sense that we are keeping track of
> it but think it's being used and so we don't free it. Or finally, are things
> leaking in the sense that if we only ran a GC it would get freed, but for some
> reason a GC isn't running.

I don't know the full answer yet. The memory does get freed eventually, so presumably we don't lose track of it. But I haven't yet figured out exactly what event causes the memory to get freed.

In particular, what I don't know is whether these DOM strings are (a) sent back to JS as wrapped DOM strings, or (b) converted to JS strings. If (a), then running js_GC would do the trick, and the problem is that the JS GC doesn't know it could free a lot of memory if it ran. If (b), then something weird is going on. So I guess it's probably (a).

The DOM strings are probably wrapped in JSStrings and I don't think the JS engine gets notified about that memory (JS_NewExternalString just sets the length using initFlat). The wrapping would happen in XPCStringConvert::ReadableToJSVal.

Created attachment 401579
Patch

(In reply to comment #11)
> The DOM strings are probably wrapped in JSStrings and I don't think the JS
> engine gets notified about that memory (JS_NewExternalString just sets the
> length using initFlat). The wrapping would happen in
> XPCStringConvert::ReadableToJSVal.

Thanks! That's exactly what I needed to know. This patch accounts for the memory in the 'gcMallocBytes' statistic. It prevents the memory spike on this test--it doesn't go over 125MB or so on my machine with this site. The only thing I'm not sure about is if I need to take a lock or increment a thread-local counter instead. I was hoping Igor could educate me on that. :-)

Changed in firefox:
status: Confirmed → In Progress

Comment on attachment 401579
Patch

>diff -r 7e985a173416 -r 892ea3ad67be js/src/jsapi.cpp
>--- a/js/src/jsapi.cpp Fri Sep 18 14:27:07 2009 -0700
>+++ b/js/src/jsapi.cpp Fri Sep 18 17:33:53 2009 -0700
>@@ -2626,6 +2626,7 @@
> if (!str)
> return NULL;
> str->initFlat(chars, length);
>+ cx->runtime->gcMallocBytes += length * sizeof(jschar);
> return str;

This is not thread-safe. Use cx->updateMallocCounter(size) here. r+ with that fixed. Also it would be nice to have a a diff with -p -U8.

gcMallocBytes is just an approximate counter. Precision is not required here so collisions and loss of increments can be tolerated.

(In reply to comment #14)
> gcMallocBytes is just an approximate counter. Precision is not required here so
> collisions and loss of increments can be tolerated.

updateMallocCounter is the function to use here. Optimizing that function is for another bug.

Created attachment 401890
Patch 2 (updates per review)

Updated per review & pushed to TM as 9131d01c2d2a.

That should be (length + 1) * sizeof(jschar), right? rs=me on followup.

/be

(In reply to comment #17)
> That should be (length + 1) * sizeof(jschar), right? rs=me on followup.

I intentionally left the 1 out because it's really just an approximation anyway so I figured I'd save a cycle (which I admit is equally pointless if not more). I suppose it does hurt accuracy notably if someone allocates a million short strings, so pushed to TM as ee059a50f204.

Micah Gersten (micahg) on 2009-09-29
tags: added: fixed-3.7
Changed in firefox:
status: In Progress → Fix Released
Micah Gersten (micahg) on 2009-10-12
tags: added: fixed-3.6
Micah Gersten (micahg) on 2009-11-26
Changed in firefox:
milestone: none → 3.6
Micah Gersten (micahg) wrote :

Firefox will be the source for Firefox 3.6, so adding this task

Changed in firefox (Ubuntu):
importance: Undecided → Medium
status: New → Triaged
Launchpad Janitor (janitor) wrote :
Download full text (9.6 KiB)

This bug was fixed in the package firefox - 3.6+nobinonly-0ubuntu1

---------------
firefox (3.6+nobinonly-0ubuntu1) lucid; urgency=low

  * New upstream release v3.6 (FIREFOX_3_6_RELEASE)
    + fix LP: #449744 - Firefox crashes when attempting to load Firebug 1.5
    + fix LP: #66015 - Duplicate spell checking dictionaries for every entry
    + fix LP: #132938 - tooltips dont work in sidebar
    + fix LP: #195698 - Password asked separately for each tab that requires it
                        (proxy)
    + fix LP: #239462 - tooltips disappear too fast
    + fix LP: #385816 - Resize corner grab stays visible after maximize
    + fix LP: #429476 - firefox crash on javascript page
    + fix LP: #432876 - Icons missing in Firefox searchbox drop down list
    + fix LP: #486284 - maxlength on input box can be overriden by autocomplete
    + fix LP: #501393 - Integrate Firefox notifications with notify-osd bling

  [ H. Montoliu <email address hidden> ]
  * fix LP: #361052 - firefox apport hook fails to retrieve pluginreg.dat file
  * update debian/apport/firefox-3.6.py - removed unused code and minor refactoring.

  [ Fabien Tassin <email address hidden> ]
  * Update the location of the upsteam branch now that 3.6/Namoroka has its own
    branch, and trunk moved on to 3.7
    - update debian/mozclient/firefox-3.6.conf
  * Use Namoroka instead of Shiretoko as brand name and use it for snapshots.
    Name it Namoroka in the Preferred Application UI too
    - update debian/firefox-3.6-shiretoko.desktop => debian/firefox-3.6-namoroka.desktop
    - update debian/firefox-3.6.xml
    - update debian/rules
  * Target the 'default' branch instead of tip
    - add debian/moz-rev.sh
    - update debian/mozclient/firefox-3.6.conf
  * Add firefox 3.6 to the list of Preferred Applications in Gnome
    - add debian/firefox-3.6.xml
    - update debian/firefox-3.6-gnome-support.install
  * Add ${misc:Depends} to all non-transitional packages, make firefox-3.6-dbg
    depend on firefox-3.6 with the exact same version, move -dbg packges to
    priority extra and add firefox-3.6-gnome-support-dbg
    - update debian/control
  * Update diverged patches:
    - update debian/patches/browser_branding.patch
    - update debian/patches/firefox-profilename
    - update debian/patches/ubuntu_bookmarks.patch
    - update debian/patches/lp185622_system_path_default_browser.patch
    - update debian/patches/dont_depend_on_nspr_sources.patch

  [ Alexander Sack <email address hidden> ]
  * add libnotify-dev to build-depends
    - update debian/control
  * add libiw-dev to build-depends to fix build failure
    - update debian/control
  * until we move searchplugins to a separate package provided only by the current default
    firefox, we need to make firefox-3.6 replace all the older firefox binary packages:
    firefox-3.5, firefox-3.2, firefox-3.1, firefox-3.0
    - update debian/control
  * implement MIN_SYS_DEPS approach that does not use system xulrunner
    and only a minimal set of system dependencies.
    + drop patches not required anymore:
      - delete debian/patches/dont_depend_on_nspr_sources.patch
      - update debian/patches/series
    + update browser directory provider...

Read more...

Changed in firefox (Ubuntu):
status: Triaged → Fix Released
Changed in firefox:
importance: Unknown → Medium
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.