MASTER firefox crash [@nsHTMLDocument::GetElementById] [@XPTC_InvokeByIndex]

Bug #91584 reported by dim6003
38
Affects Status Importance Assigned to Milestone
Mozilla Firefox
Fix Released
Critical
firefox (Ubuntu)
Fix Released
High
Mozilla Bugs

Bug Description

Binary package hint: firefox

I have a bug report file (~23.4mb) but don't know how to send it ...

From retraced stack trace:

...
#3 <signal handler called>
 #4 ?? ()
 #5 nsHTMLDocument::GetElementById (this=0xa89b848, aElementId=@0x9992310,
 #6 XPTC_InvokeByIndex () at xptcinvoke_gcc_x86_unix.cpp:50
 #7 XPCWrappedNative::CallMethod (ccx=@0xbfde184c, mode=XPCWrappedNative::CALL_METHOD)
 #8 XPC_WN_CallMethod (cx=0x8420820, obj=0x8c42818, argc=1, argv=0xaa59dd0,
 #9 js_Invoke (cx=0x8420820, argc=1, flags=2) at jsinterp.c:1396
 #10 js_InternalInvoke (cx=0x8420820, obj=0x8c42818, fval=160952368, flags=2, argc=1,
 #11 JS_CallFunctionValue (cx=0x8420820, obj=0x8c42818, fval=160952368, argc=1,
 #12 XPC_NW_FunctionWrapper (cx=0x8420820, obj=0x8c427b8, argc=1, argv=0xaa59dbc,
 #13 js_Invoke (cx=0x8420820, argc=1, flags=0) at jsinterp.c:1396
...

Revision history for this message
In , Nrthomas (nrthomas) wrote :

Please give a few talkback ID's so we can trace the cause of the crashes. See http://kb.mozillazine.org/Talkback for how to get them.

Revision history for this message
In , Stavandesai (stavandesai) wrote :

Here are the Incident IDs so far
TB25657363Q, TB25651372H, TB25650554W, TB25624849K, TB25590893G, TB25590064Z, TB25572066W, TB25564431H, TB25563660X, TB25504882Z, TB25503020X, TB25481533Q, TB25480652M, TB25479335Z, TB25428351Y, TB25380519W

Revision history for this message
In , Nrthomas (nrthomas) wrote :

Looks like the crashes all occur at
  http://www.gamespot.com/xbox360
I couldn't reproduce the crash surfing around a bit and leaving it sitting for 15 minutes or so (using Firefox 2 w/ blank profile, Flash 8.0 r24, win32)

The stacks end at a random memory address, or at nsHTMLDocument::GetElementById - about a 50/50 split, with 3 other stacks that are empty. We need someone who knows stack-fu to say more.

Revision history for this message
In , Adam Guthrie (ispiked) wrote :

Are you using any extensions? If so, do you still crash in safe mode (http://kb.mozillazine.org/Safe_mode).

Incident ID: 25480652
Stack Signature nsHTMLDocument::GetElementById cb202024
Product ID Firefox2
Build ID 2006101023
Trigger Time 2006-11-03 20:15:07.0
Platform Win32
Operating System Windows NT 5.1 build 2600
Module firefox.exe + (00196e33)
URL visited http://www.gamespot.com/xbox360
User Comments
Since Last Crash 182 sec
Total Uptime 75242 sec
Trigger Reason Access violation
Source File, Line No. c:/builds/tinderbox/Fx-Mozilla1.8-release/WINNT_5.2_Depend/mozilla/content/html/document/src/nsHTMLDocument.cpp, line 2479
Stack Trace
nsHTMLDocument::GetElementById [mozilla/content/html/document/src/nsHTMLDocument.cpp, line 2479]
XPTC_InvokeByIndex [mozilla/xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp, line 102]
XPCWrappedNative::CallMethod [mozilla/js/src/xpconnect/src/xpcwrappednative.cpp, line 2169]
XPC_WN_CallMethod [mozilla/js/src/xpconnect/src/xpcwrappednativejsops.cpp, line 1455]
js_Invoke [mozilla/js/src/jsinterp.c, line 1377]
js_InternalInvoke [mozilla/js/src/jsinterp.c, line 1471]
JS_CallFunctionValue [mozilla/js/src/jsapi.c, line 4419]
XPC_NW_FunctionWrapper [mozilla/js/src/xpconnect/src/XPCNativeWrapper.cpp, line 387]
js_Invoke [mozilla/js/src/jsinterp.c, line 1377]
js_Interpret [mozilla/js/src/jsinterp.c, line 4121]
js_Invoke [mozilla/js/src/jsinterp.c, line 1396]
js_InternalInvoke [mozilla/js/src/jsinterp.c, line 1471]
JS_CallFunctionValue [mozilla/js/src/jsapi.c, line 4419]
nsJSContext::CallEventHandler [mozilla/dom/src/base/nsJSEnvironment.cpp, line 1493]
nsGlobalWindow::RunTimeout [mozilla/dom/src/base/nsGlobalWindow.cpp, line 6714]
nsGlobalWindow::TimerCallback [mozilla/dom/src/base/nsGlobalWindow.cpp, line 7086]
nsAppStartup::Run [mozilla/toolkit/components/startup/src/nsAppStartup.cpp, line 152]
main [mozilla/browser/app/nsBrowserApp.cpp, line 61]
kernel32.dll + 0x16fd7 (0x7c816fd7)

Revision history for this message
In , Stavandesai (stavandesai) wrote :

I am using extensions, but I have used them all for a long time, and it still does crash in safemode. Also, someone mentioned that all the crashes were at www.gamespot.com/xbox360. I have had crashes at other sites as well, such as my.yahoo.com, www.cnn.com etc. Sometimes, the talkback feedback agent doesn't come up, just the windows send error report.

Revision history for this message
In , Olli-pettay (olli-pettay) wrote :

This is #67 FF2 crasher.

Revision history for this message
In , Olli-pettay (olli-pettay) wrote :

Boris, you may remember something about getelementbyid ;)

Revision history for this message
In , Bzbarsky (bzbarsky) wrote :

That line is a CallQI call on something that's guaranteed to not be null. Hence I conclude that it's dead. Possibly the document is dead as well, though I'd expect that to crash sooner.

As for how that would happen... no idea. This code is pretty different on the trunk.

Revision history for this message
In , Adam Guthrie (ispiked) wrote :

*** Bug 359973 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mstrumyla (mstrumyla) wrote :

Another thing to check is to try with a clean profile.

Revision history for this message
In , Dveditz (dveditz) wrote :

Would love to see a fix, moved up to #23 on the topcrash list.

Jonas: any thoughts on who to assign this to if not you?

Revision history for this message
In , Laurabean113 (laurabean113) wrote :

I just installed Firefox 2.0 on my laptop and it is now crashing as well. It crashed 2 times within the first 10 minutes of having 2.0 installed. I am using the default theme.

Revision history for this message
In , Robert-dominy (robert-dominy) wrote :

We are also seeing random, somewhat frequent exceptions being thrown by getElementById in our web application. We haven't been able to boil it down to a simple test case, but would love to see this one get fixed. Our web application also interacts with Flash on the page.

Revision history for this message
In , Bzbarsky (bzbarsky) wrote :

Robert, I assume you're seeing these in 2.0 and not 1.5? If so, would you possibly be willing to try some of the intermediate builds and see whether you can narrow down when the problem started? Or give someone access to your web app?

Revision history for this message
In , Robert-dominy (robert-dominy) wrote :

Yes we are seeing the crashes in the 2.0 release. Which intermediate releases would you suggest we test against?

Revision history for this message
In , Bzbarsky (bzbarsky) wrote :

Robert, see http://archive.mozilla.org/pub/firefox/nightly/ -- it might take a bit to load the first time, sadly. :(

The directories you're looking for have names like "2006-09-15-04-mozilla1.8/" -- that would be a build from the 1.8 Gecko branch (which Firefox 2 is based on), compiled on September 15, 2006, at 4am Pacific time. There will generally be 2-3 directories like that per day (Linux, Windows, Mac builds). These aren't releases, but development snapshots, but they should be pretty stable in general.

The earliest 1.8 branch build (pretty identical to Firefox 1.5) looks like 2005-08-09-12-mozilla1.8/ and the most recent one on archive right now is 2006-10-16-05-mozilla1.8/ -- more recent builds would be at <ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/>.

I've had good luck in the past with a binary search on the builds in cases like this. For about a year's worth of builds, as in this case, only about 8-9 need to be tested...

Revision history for this message
In , Stavandesai (stavandesai) wrote :

I am the original poster of this bug. The problem used to occur mostly at www.gamespot.com/xbox360. However, recently, it has been happening more often at different locations, such as www.engadget.com, my.yahoo.com etc. Also, I have never done this before. Can I actually expect this bug to get fixed?

Revision history for this message
In , Bzbarsky (bzbarsky) wrote :

As soon as someone with a debugger manages to reproduce it, probably...

Revision history for this message
In , Robert-dominy (robert-dominy) wrote :

Our QA group went back through the nightly builds and have narrowed the bug to beginning on the 7/07 build. Builds prior to that do not exhibit the bug and ones from that build onward exhibit the bug. Our testers report that it seems more reproducible with Flash 8 installed (and using Flash on the page).

I can't be sure the bug we are seeing is identical to the original reported bug, so it may be helpful if some corroborates the build dates.

Revision history for this message
In , Nrthomas (nrthomas) wrote :

Assuming the date is good, the checkins between the 06/July and 07/July Windows nightly are listed at
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=AviarySuiteBranchTinderbox&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=1152181140&maxdate=1152272400&cvsroot=%2Fcvsroot
which includes the changes to nsHTMLDocument from bug 333038 and the JS 1.7 landing.

Is it possible to get useful Talkback reports for a 07/July build ? Or are the symbols long gone ...

Revision history for this message
In , Bzbarsky (bzbarsky) wrote :

Jay, see comment 20?

Stavan Desai, can you confirm the regression range?

Revision history for this message
In , Jay-mozilla (jay-mozilla) wrote :
Revision history for this message
In , Bzbarsky (bzbarsky) wrote :

Robert, could you paste in links to the first build you guys see issues with and the last one you don't, just to make sure we're all on the same page?

Also, does your application use designmode iframes?

Revision history for this message
In , Robert-dominy (robert-dominy) wrote :

Here are the file links we tracked the bug we're seeing to.

The bug begins occuring in this build:
 http://archive.mozilla.org/pub/firefox/nightly/2006-07-07-04-mozilla1.8/
 The file link: firefox-2.0a3.en-US.win32.installer.exe

The bug does NOT occur in this build:
 http://archive.mozilla.org/pub/firefox/nightly/2006-07-06-03-mozilla1.8/
 The file link: firefox-2.0a3.en-US.win32.installer.exe

Our application does not use designmode iframes. It does use regular iframes and embedded Flash object.s

Revision history for this message
In , Bzbarsky (bzbarsky) wrote :

OK. If designmode is not being used, then bug 333038 is not too likely to be the problem.

Robert, I assume your stacks look like those cited in comment 2?

Looking at those stacks again, what's happening is that chrome JS is calling getElementById on a content document off a timeout (note the XPC_NW_FunctionWrapper on the stack). Given the regression range, this makes bug 189039 a possible suspect... at the very least it's a change to chrome code that does in fact make getElementById calls (not sure about the timeout part).

Stavan, Robert, are you using typeahead find on pages with textboxes?

What would be ideal here would be someone catching this crash in a debugger and looking up the JS callstack... (e.g. by calling the DumpJSStack() function in the debugger). That would certainly help with sorting out how we're getting to the crash site...

Revision history for this message
In , Olli-pettay (olli-pettay) wrote :

Just wondering if this can do something wrong:
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/toolkit/components/typeaheadfind/src/nsTypeAheadFind.cpp&rev=&cvsroot=/cvsroot&mark=496,497#496
I'd write that using nsCOMPtr and .swap

Still reading bug 189039...

Revision history for this message
In , Bzbarsky (bzbarsky) wrote :

That line could do the wrong thing if we're holding the only reference to the node, for sure. But then we'd crash at this point...

Revision history for this message
In , Adam Guthrie (ispiked) wrote :

This crash has jumped up to the #14 topcrash for Firefox 2.0.0.1. I don't know if we're willing to reconsider blocking status based on that information.

Revision history for this message
dim6003 (dm-martin) wrote : Crashed while opening an eBay link in a new tab

Binary package hint: firefox

I have a bug report file (~23.4mb) but don't know how to send it ...

Revision history for this message
dim6003 (dm-martin) wrote :

Ok, it's in the second step. didn't know at first!

Revision history for this message
Freddy Martinez (freddymartinez9) wrote :

Taking for retrace

Changed in firefox:
assignee: nobody → freddymartinez9
status: Unconfirmed → Needs Info
Revision history for this message
Hilario J. Montoliu (hjmf) (hmontoliu) wrote : Retraced Stacktrace

Retrace done:
...
 #4 0x00000028 in ?? ()
 #5 0xb5c58f6e in nsHTMLDocument::GetElementById (this=0xa89b848, aElementId=@0x9992310,
     aReturn=0xbfde172c) at ../../../../dist/include/xpcom/nsISupportsUtils.h:225
  entry = (IdAndNameMapEntry *) 0xaa26284
  e = (nsIContent *) 0xabc8070
 #6 0xb7effbb9 in XPTC_InvokeByIndex () at xptcinvoke_gcc_x86_unix.cpp:50
 No locals.
 #7 0xb68cbea4 in XPCWrappedNative::CallMethod (ccx=@0xbfde184c, mode=XPCWrappedNative::CALL_METHOD)
     at xpcwrappednative.cpp:2169
...

Tagging as mt-confirm for further processing

Revision history for this message
Hilario J. Montoliu (hjmf) (hmontoliu) wrote : Retraced Thread Stacktrace

Retraced Thread Stacktrace

Changed in firefox:
assignee: freddymartinez9 → mozilla-bugs
Alexander Sack (asac)
Changed in firefox:
importance: Undecided → High
status: Needs Info → In Progress
Changed in firefox:
status: Unknown → Confirmed
Revision history for this message
In , Ostrj (ostrj) wrote :

(In reply to comment #28)
> This crash has jumped up to the #14 topcrash for Firefox 2.0.0.1. I don't know
> if we're willing to reconsider blocking status based on that information.
>

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

I've been running into an issue that looks very similar with an extension I'm developing. It's crashing immediately below the CallQueryInterface function in GetElementById(). I've run it with my own build of BonEchoDebug, and MallocDebug with scribbling on free turned on, and I see the pointer to the element that's in in the e variable points to an area filled with 0x55, which means it's been freed.

I did also notice that mIsGoingAway always seems to be true in the document when I get this crash.

I can reproduce this with the following steps:
- Install my extension, which listens for google search result loads, and calls XMLHttpRequest's on the result links to determine if they exist, and have the search terms in their text
- Load a page of search results in google, and rapidly click next while there's a lot of requests in flight
- After two or three pages, I usually crash with a callstack that looks just like the one in comment #4. It dies right when I click the next button

It dies in my onreadystatechange callback from my logging, right after I've added an element to the current document by appending to another element's innerHTML. I make a call to retrieve the element I just added with that ID, and it dies in there.

I will attach my extension code if anyone else wants to help debug this. My next steps are going to be adding logging to the getElementById() code to see where the pointer to the element is being retrieved from (eg the hash map?), log the closing of the document, and try to trace the life of the element I add so I can see why it's getting freed. Oh, and I'm testing on OS X if that makes any difference.

Revision history for this message
In , Frenchfrog (frenchfrog) wrote :

Also attach the .xpi so someone could try it on Win32/Linux ?

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

Created attachment 261402
Pre-alpha extension xpi that crash can be repro-ed with

This is the extension xpi I mention in my comment, that's needed to reproduce the crash I'm seeing.

You can look at the source code by unzipping the xpi, and then unzipping the content.jar that's inside the xpi. The crashing function is sendPageRequest in petesearch.js

I've included heavy logging for that function, to narrow down where the crash occurs, using dump(). Logging indicates that the crash occurs inside the getPreviewElement(), which looks like this:

function getPreviewElement(document, target)
{
 return document.getElementById(getPreviewIDFromTarget(target));
}

Revision history for this message
In , Frenchfrog (frenchfrog) wrote :

Pete, have you tryed to reproduce in _stock_ 2.0.0.3 or the new 1.9 nightly builds?

Revision history for this message
In , Olli-pettay (olli-pettay) wrote :

Sounds like htmldocument doesn't get notified that element is removed
from document. Should we remove content objects in ::Destroy (not just unbind) to get notifications or just clear mIdAndNameHashTable there.

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

(In reply to comment #33)
> Pete, have you tryed to reproduce in _stock_ 2.0.0.3 or the new 1.9 nightly
> builds?

Yes, it was in stock 2.0.3 that I saw this. I only built my own version to try and get some more debug information. I haven't tried it on top-of-tree yet.

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

I don't think we want to remove the content objects in ::Destroy on branch, that's too big of a change. Instead lets just clear the ID cache and use nsContentUtils::MatchElementId

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

To test the theory that this is caused by the hash map holding on to destroyed pointers, I added some logging. I had nsDocument::Destroy() log when it was called for each document, and made the following changes to nsHTMLDocument::GetElementById:

NS_IMETHODIMP
nsHTMLDocument::GetElementById(const nsAString& aElementId,
                               nsIDOMElement** aReturn)
{
...
fprintf(stderr, "GetElementById called on document %8x\n", this);
...
  if (e == ID_NOT_IN_DOCUMENT) {
...
fprintf(stderr, "GetElementById:: 0x%8x found in hash after flush\n",e);
  }
else
{
fprintf(stderr, "GetElementById:: 0x%8x found in hash before flush\n",e);
}
...
fprintf(stderr, "GetElementById:: 0x%8x found, but not in hash\n",e);
}

Logging output just before the crash:

Document 0x1d40ee00 destroyed
<... other unrelated log statements ...>
GetElementById called on document 1d40ee00
GetElementById:: 0x3b8aec10 found in hash before flush

This seems to support the idea that the hash map is the cause.

It seems like I could further test this by calling nsHTMLDocument::InvalidateHashTables(). This isn't accessible from the nsDocument base class though, should I create a derived version of Destroy() for nsHTMLDocument that calls nsDocument::Destroy() and then calls InvalidateHashTables(), or is there some other mechanism I could use to make sure it happens at the right time?

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

Created attachment 261555
Proposed patch

Here's a proposed fix. I've added an implementation of Destroy to nsHTMLDocument, that overrides the original virtual nsDocument implementation.
void
nsHTMLDocument::Destroy()
{
  nsDocument::Destroy();
  InvalidateHashTables();
}

It calls back to the the nsDocument Destroy, and then invalidates the hash as an additional step. I've tested this, and I'm no longer able to reproduce the crash with the steps that caused it before.

This patch is against BonEcho. I'm a mozilla newbie, going from http://www.mozilla.org/hacking/life-cycle.html it sounds like I should really do this against top-of-tree, and then maybe it could get backported if necessary for a 2.0.x release?

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

Please swap the calls and invalidate the hashes first. This is because we're still in a bad state after the Destroy call, so we should do as little as possible then.

description: updated
Revision history for this message
Patterson (vinpatt44-deactivatedaccount) wrote : Re: [Bug 91584] Re: MASTER firefox crash [@nsHTMLDocument::GetElementById] [@XPTC_InvokeByIndex]

I allready fix the bug, Please send no more mail.

----- Original Message ----
From: Hilario J. Montoliu (hjmf) <email address hidden>
To: <email address hidden>
Sent: Sunday, April 15, 2007 1:58:41 PM
Subject: [Bug 91584] Re: MASTER firefox crash [@nsHTMLDocument::GetElementById] [@XPTC_InvokeByIndex]

** Description changed:

  Binary package hint: firefox

  I have a bug report file (~23.4mb) but don't know how to send it ...
+
+ From retraced stack trace:
+
+ ...
+ #3 <signal handler called>
+ #4 ?? ()
+ #5 nsHTMLDocument::GetElementById (this=0xa89b848, aElementId=@0x9992310,
+ #6 XPTC_InvokeByIndex () at xptcinvoke_gcc_x86_unix.cpp:50
+ #7 XPCWrappedNative::CallMethod (ccx=@0xbfde184c, mode=XPCWrappedNative::CALL_METHOD)
+ #8 XPC_WN_CallMethod (cx=0x8420820, obj=0x8c42818, argc=1, argv=0xaa59dd0,
+ #9 js_Invoke (cx=0x8420820, argc=1, flags=2) at jsinterp.c:1396
+ #10 js_InternalInvoke (cx=0x8420820, obj=0x8c42818, fval=160952368, flags=2, argc=1,
+ #11 JS_CallFunctionValue (cx=0x8420820, obj=0x8c42818, fval=160952368, argc=1,
+ #12 XPC_NW_FunctionWrapper (cx=0x8420820, obj=0x8c427b8, argc=1, argv=0xaa59dbc,
+ #13 js_Invoke (cx=0x8420820, argc=1, flags=0) at jsinterp.c:1396
+ ...

--
MASTER firefox crash [@nsHTMLDocument::GetElementById] [@XPTC_InvokeByIndex]
https://bugs.launchpad.net/bugs/91584
You received this bug notification because you are a bug contact for
firefox in ubuntu.

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

Lets block on this since it's a topcrasher.

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

Created attachment 262060
Updated Patch

Thanks Jonas, that makes sense. I originally placed it last in case anything in the base destroy added objects to the hash, but looking through the code it's pretty clear nothing does or should.

Here's the updated patch, the only change from the previous one is the move of the invalidate call. I've tested it locally with my repro steps, and it no longer crashes. I've only tested on BonEcho, I will add a comment once I've built and run with it on the latest trunk. I assumed that won't block review?

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

I've now tested with the latest Mac nightly (21st April). I can still repro the crash in the nightly builds, though the crash location seems to vary more. I've seen it in nsHTMLDocument::getAttribute() for example.

I've also built my own version of Minefield, with my patch applied, and I no longer see the crash when following the repro steps.

The callstack for the nightly crashes starts from XMLHttpRequest, so I believe it's the same underlying issue (a freed object pulled from the hash table and used).

Based on this, it seems like the fix is still needed, and should go into 1.9 as well as 1.8.x

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

Comment on attachment 262060
Updated Patch

You should also set some flag indicating that this doc has been destroyed, and then make the implementation of getElementById not put new things into the hash if that flag is set.

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

Sorry for not mentioning that earlier :(

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

(In reply to comment #44)
> Sorry for not mentioning that earlier :(

np :)

I'll add some code using the existing NSDocument::mIsGoingAway flag in getElementById(). That only gets set in NSDocument::destroy(), and looks like it's used for similar "do something different when the document's destroyed" behaviors in a couple of other functions. Sound reasonable?

Once I've implemented and tested that change, I'll put up a new patch.

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

Sounds great.

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

Created attachment 263474
Proposed patch with hash adding disabled on destroyed documents

Here's a patch with all of the places where an object might be added to the hash table guarded with checks on mIsGoingAway.

I had to touch several functions in nsHtmlDocument, in addition to GetElementById. I searched the code for any use of mIdAndNameHashTable with the PL_DHASH_ADD operation to decide where to put the check.

I made AddToIdTable, UpdateIdTableEntry and ResolveName return early without doing anything if the document was destroyed.

The one function I didn't touch was ReserveNameInHash, since this is only used in the init and resetURI methods, and the entries it creates are not normal elements at risk of being deallocated.

The changes I made to GetElementById were more involved than I'd like, because the path we want to take (calling MatchElementId and then CallQueryInterface) is at the bottom of the function, and interleaved with hash table accesses that I had to guard.

An alternative for GetElementById is to duplicate that code in an early returning block at the top of the function, so we don't have to guard all the subsequent hash code. This has the disadvantage of making it harder to maintain and keep the two duplicate code blocks in sync going forward. My current approach seems lower risk and more maintainable. A third alternative would be to return null from GetElementById when the document had been destroyed, but this seems too big a change in behavior.

I've tested this against my original steps, and it does fix the problem. It does slightly change the behavior of some of the document functions after destruction, so it would be good to see it tested against some other DOM intensive pages.

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

Reassigning since Pete is the one actually working on this.

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

As a note for anyone looking for a workaround until we get this in a release, I switched my extension over to retrieving the element by name rather than ID. Since this doesn't use the hash table, it appears to avoid the crash and work on destroyed documents.

The released plugin is up at http://petesearch.com/

Changed in firefox:
status: Confirmed → In Progress
Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

Created attachment 265080
Updated patch, same as above but using -w to hide whitespace differences

After chatting to Jonas, I've reformatted the patch to hide the large number of whitespace differences caused by the indenting, to make it easier to understand, and put it back in for review.

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

Comment on attachment 265080
Updated patch, same as above but using -w to hide whitespace differences

>@@ -2503,34 +2518,38 @@ nsHTMLDocument::GetElementById(const nsA
> if (e == ID_NOT_IN_DOCUMENT) {
> // We've looked for this id before and we didn't find it, so it
> // won't be in the document now either (since the
> // mIdAndNameHashTable is live for entries in the table)
>
> return NS_OK;
> }
>
>+ }

No need for the empty line between the two '}' lines.

> // the fact that it's not in the document
>+ if (entry)
> entry->mIdContent = ID_NOT_IN_DOCUMENT;
>
> return NS_OK;
> }
>
> // We found an element with a matching id, store that in the hash
>+ if (entry)
> entry->mIdContent = e;
> }

Please use { } around the two above new if-statements.

> nsresult
> nsHTMLDocument::ResolveName(const nsAString& aName,
> nsIDOMHTMLFormElement *aForm,
> nsISupports **aResult)
> {
> *aResult = nsnull;
>
>- if (IsXHTML()) {
>+ if (IsXHTML()||mIsGoingAway) {

Put spaces around the ||.

r/sr=me with that fixed.

Revision history for this message
Alexander Sack (asac) wrote :

Patterson, you can unsubscribe from this bug if you don't want any bugmail. If you fixed it, please tell us know how.

Thanks,

  - Alexander

Revision history for this message
In , Olli-pettay (olli-pettay) wrote :

Do we need a separate patch for branch or does the patch work there too?
This is still a 2.0.0.x topcrasher.
Though, first the patch needs to land on trunk.
Pete, could you update the patch, I can then check it in.

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

Created attachment 267546
Updated patch, as above but with conforming brace style

Here's the cleaned up patch, diffed against the 2.0.3 branch. I'll update and build against the main trunk, test it there, and then post another patch.

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

Created attachment 267717
Patch as above, but against the trunk rather than the 2.0.3 branch

Here's the diff against the trunk. The internals of HTMLDocument have changed enough that I had to do a pretty manual merge, so it might be easier to use the previous patch for the 2.0.x branch when it needs to be added there.

I've tested it, and I got the crash with the attached xpi and described steps without this change on last night's trunk, and couldn't repro it after my changes.

I do wonder if we shouldn't change the hash so that it takes ownership of the objects added to it as a longer-term fix? Anyway, this patch is a solution for the immediate crash at least.

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

Please hold off landing this on trunk as I'm working on a different fix there. Please do land on branch asap though (after getting approval of course).

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

This should be fixed on trunk with the patch from bug 348156

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

Reopening since bug 348156 was backed out.

Changed in firefox:
status: In Progress → Confirmed
Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

Created attachment 269341
Patch as 267546, but against 2.0.4 branch, with whitespace shown

Here's the patch against FIREFOX_2_0_0_4_RELEASE
I've included whitespace differences, as per Olli's request, and tested locally.

Revision history for this message
In , Mozilla-petewarden (mozilla-petewarden) wrote :

Comment on attachment 269341
Patch as 267546, but against 2.0.4 branch, with whitespace shown

Requested approval on this patch.

Revision history for this message
In , Dveditz (dveditz) wrote :

Comment on attachment 269341
Patch as 267546, but against 2.0.4 branch, with whitespace shown

Jonas, would you mind a brief re-review here to make sure the branch patch isn't going to stumble on some trunk/branch difference?

Revision history for this message
In , Dveditz (dveditz) wrote :

Comment on attachment 269341
Patch as 267546, but against 2.0.4 branch, with whitespace shown

approved for 1.8.1.5 and 1.8.0.13, a=dveditz

Revision history for this message
In , Olli-pettay (olli-pettay) wrote :

attachment 269341 checked in to branches.

Revision history for this message
In , Martijn-martijn (martijn-martijn) wrote :

This is now a trunk only bug, afaict, so changing the Version field to Trunk.

Revision history for this message
In , Mtschrep (mtschrep) wrote :

Is there an updated patch for trunk needed here?

Revision history for this message
In , Olli-pettay (olli-pettay) wrote :

No. Bug 348156 should fix this one too.
But better to test after that has landed.

Revision history for this message
In , Jonas-sicking (jonas-sicking) wrote :

Should be fixed by patch in bug 348156

Changed in firefox:
status: Confirmed → Fix Released
Revision history for this message
Alexander Sack (asac) wrote :

fixed for quite some time now.

Changed in firefox:
status: In Progress → Fix Released
Changed in firefox:
importance: Unknown → Critical
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.