firefox crashed with SIGSEGV in NSSRWLock_LockRead_Util()

Bug #236853 reported by Thomas N on 2008-06-02
114
This bug affects 13 people
Affects Status Importance Assigned to Milestone
Mozilla Firefox
Fix Released
Critical
Nominated for 3.0 by squirrly69
firefox-3.0 (Ubuntu)
High
Unassigned
firefox-3.5 (Ubuntu)
High
Unassigned

Bug Description

Binary package hint: firefox-3.0

Don't really know what I did but it obviously crashed...

ProblemType: Crash
Architecture: i386
Date: Mon Jun 2 19:16:26 2008
DistroRelease: Ubuntu 8.04
ExecutablePath: /usr/lib/firefox-3.0/firefox
Package: firefox-3.0 3.0~rc1+nobinonly-0ubuntu0.8.04.1
PackageArchitecture: i386
ProcCmdline: /usr/lib/firefox-3.0/firefox -height 100 -width 100
ProcEnviron:
 PATH=/home/username/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
 LANG=en_DK.UTF-8
 SHELL=/bin/bash
Signal: 11
SourcePackage: firefox-3.0
StacktraceTop:
 NSSRWLock_LockRead_Util () from /usr/lib/libnssutil3.so.1d
 SECMOD_GetReadLock () from /usr/lib/libnss3.so.1d
 PK11_GetAllTokens () from /usr/lib/libnss3.so.1d
 PK11_GetBestSlotMultiple () from /usr/lib/libnss3.so.1d
 PK11_GetBestSlot () from /usr/lib/libnss3.so.1d
Title: firefox crashed with SIGSEGV in NSSRWLock_LockRead_Util()
Uname: Linux 2.6.24-17-generic i686
UserGroups: adm admin audio cdrom dialout dip floppy fuse lpadmin plugdev sambashare vboxusers video

Explain how it is that this process is not initializing NSS until after it
has run so long that it is already in an out-of-memory condition.

Nelson, Mozilla software will not access PSM unless it has a need to. And only when PSM gets accessed the first time it will init NSS.

According to my understanding, it seems reasonable to start by getting a lock, and only afterwards try to get the module list.

But I don't know! I don't know this code well.

Looking at http://lxr.mozilla.org/seamonkey/ident?i=SECMOD_GetDefaultModuleList
it appears that usually the module list is obtained first, and only afterwards the lock is obtained and locked.

I suspect that the lock is only required for modifying the contents of the lock?

The crash report comment quoted here and the reports i read show this probably *is* early in startup.

I think we use the hashing bits as part of the update service and it seems likely it'd be nearly the first crypto bit to be hit.

please note that i use systems where memory is often but not always scarce. It isn't legal to assume early allocs will succeed any more than it is to assume late ones will.

is there a public contract for "HASH_Create" somewhere?

Ouch...

The NSS contract is NO nss calls are made until NSS is initialized (period). We guarrentee you a crash if you do.

To date, PSM has been the only user of NSS inside a mozilla app.

If someone outside PSM is calling NSS directly, they, at the very least, need to do the minimal xpcom calls to access the crypto object (exported by PSM). Simply setting up that object will guarrentee NSS is initialized properly.

That does not necessarily mean you are home free (PSM keeps track of the NSS objects it creates to make sure it closes them all down if mozilla wants to switch users, for instance, as well as keeps track that mozilla isn't in the middle of shutting down (to make sure no calls are made after NSS is shutdown).

bob

> I suspect that the lock is only required for modifying the contents of the
> lock?

No, the lock is required for both reading and writing. Looking at the code, it's clear it should be acquired before getting the list (as adding, and removal could in theory change the pointer).

In practice the first module is the softoken, which is never removed (though it may change character), so such a bug is unlikely to show up. (Actually changes to the module list is extremely rarer anyway. Adding and deleting pkcs #11 modules in mozilla are the only way it happens in any real application that I know of.

right, I claim that the patch I'm providing here is correct based on your comments. And I'd like it to be taken.

I think that PSM itself may be violating the NSS initialization constraint, but this bug is about flaws in the NSS code. I'll deal w/ any PSM errors in another bug.

Kaie: i think comment 6 should be sufficient for you to r+ attachment 314290

See Bug 410897 Comment 7 (and 8)

Binary package hint: firefox-3.0

Don't really know what I did but it obviously crashed...

ProblemType: Crash
Architecture: i386
Date: Mon Jun 2 19:16:26 2008
DistroRelease: Ubuntu 8.04
ExecutablePath: /usr/lib/firefox-3.0/firefox
Package: firefox-3.0 3.0~rc1+nobinonly-0ubuntu0.8.04.1
PackageArchitecture: i386
ProcCmdline: /usr/lib/firefox-3.0/firefox -height 100 -width 100
ProcEnviron:
 PATH=/home/username/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
 LANG=en_DK.UTF-8
 SHELL=/bin/bash
Signal: 11
SourcePackage: firefox-3.0
StacktraceTop:
 NSSRWLock_LockRead_Util () from /usr/lib/libnssutil3.so.1d
 SECMOD_GetReadLock () from /usr/lib/libnss3.so.1d
 PK11_GetAllTokens () from /usr/lib/libnss3.so.1d
 PK11_GetBestSlotMultiple () from /usr/lib/libnss3.so.1d
 PK11_GetBestSlot () from /usr/lib/libnss3.so.1d
Title: firefox crashed with SIGSEGV in NSSRWLock_LockRead_Util()
Uname: Linux 2.6.24-17-generic i686
UserGroups: adm admin audio cdrom dialout dip floppy fuse lpadmin plugdev sambashare vboxusers video

Bob's comment 6 should be interpreted as saying this:

Crashing is the DEFINED BEHAVIOR of NSS when NSS functions are called
before NSS is initialized. It is the caller's responsibility to ensure
that NSS is initialized before calling other NSS functions (besides
NSS initialization functions).

So, by definition, this is NOT an NSS bug.
TWO NSS module owners have now spoken on this subject.

Nelson, where should we move this bug to to get the needed attention (and an acceptable patch)?

Samuel: I think I just moved it to the right place when I added comment 10.

Oh, sorry. I missed that in the Bug Activity. Thanks!

I think we've got the same topcrash on branch, would be nice to fix this there as well if the patch gets reviews and such.

nsNSSModule.cpp has its own copied+modified XPCOM constructors.
Those are designed to do the following:
- if the interface requested means we create an instance of
  class nsNSSComponent, then proceed (because that class
  implements the code that is needed to init NSS)
- if the above is not true, then we first jump through function
  EnsureNSSInitialized and if necessary create the nsNSSComponent singleton.

The crashing code is in nsCryptoHash, which uses the correct macro that tries to achieve the latter.

If this bug is because NSS was not initialized, it means either
- the EnsureNSSInitialized code does not work (or regressed)
or
- you have shut NSS. You might have sent out a profile changed notification, but not yet switched to a new profile

The existing PSM code assumes that you'll never use any of PSM services while you are in the middle of a profile switch.

We should trace the calls to
  NSS_InitReadWrite
  NSS_Shutdown
  HASH_Create

If we see calls in the above order, then we've found the cause of the bug.

We must find a way to ensure it never happens.

Oops. We missed something when nsICryptoHash got implemented in bug 415799.

=> nsNSSShutDownObject

nsCryptoHash and nsCryptoHMAC contain a handle to an NSS object that gets invalid at time of NSS shutdown.

We should have made these classes derive from nsNSSShutDownObject, in order to track/invalidate the references when necessary.

timeless: it is really pity you didn't get the JS stack when it crashed. Do you know about a way to reproduce it?

The theory with the update service seems to me right. I experienced some crashes while closing firefox and its autoupdate was in progress (there were active download).

Just for report: I created fake update file on my local server and I emulate the auto update being in progress in background. When I shutdown Minefield it does NOT crash - tried 10 times...

The theory with update service will probably be wrong because: nsUpdateService.js!_verifyDownload that creates cryptohash instance is called only from OnStopRequest with status NS_OK. When shutdown with download in progress, the download request is canceled from "xpcom-shutdown" notification with NS_ERROR_ABORT status - so _verifyDownload is not called. Anyway, NSS is shutdown by nsNSSComponent in "profile-before-change" event that comes earlier then "xpcom-shutdown". If the download would succeed between these two events and OnStopRequest would be passed to JS somehow we theoretically may crash. But it is highly unlikely to happen.

this was a crash report based bug. i didn't experience it. i saw it in reports and picked a series of paths which aren't correct to patch. they're still not correct. fwiw, it's easy to tell if a crash isn't mine. I only run XP, so if the crash report i pick is vista (6.*) then it can't be mine :).

that psm is broken is quite likely (mentioned in comment 0).

do note that the crash report i picked has a very nice comment in the comments field. :)

Kai: is a patch based on comment 17 likely soon, and if so would it be small? I guess we won't 'block' the current release (since we're trying to wrap it up) but we'd really like to make this crash go away if we can.

(In reply to comment #21)
> Kai: is a patch based on comment 17 likely soon, and if so would it be small? I
> guess we won't 'block' the current release (since we're trying to wrap it up)
> but we'd really like to make this crash go away if we can.

We don't know if implementing what I mention in comment 17 will cure the crash.
It might only help us to understand more.

Created attachment 323866
derivation from nss shutdown object

Tested using firefox restart from add-ons dialog. This won't fix the crash.

I see this popping up to #6 on the topcrash list using a one day sample for RC2.

If anyone can reproduce this, please, list the JS callstack using xpc3250!DumpJSStack(). This prints it to stdout. This should help us.

Comment on attachment 323866
derivation from nss shutdown object

Honza, thanks for working on this patch.

Could you please follow the style that other code implementing nsNSSShutdownObject is using?

For example, please look at nsNSSCertCache.

In the destructor, you should use the locker, check for isAlreadyShutDown.

In virtualDestroyNSSReference all other code checks for isAlreadyShutDown, too. This prevents destroying objects that belong to a shutdown NSS session.

I would recommend you rearrance the code so that you have the same functions called virtualDestroyNSSReference, destructorSafeDestroyNSSReference. As a result, you'll have your calls to destroy the objects only once.

Created attachment 325550
derivation from nss shutdown object, 2

StacktraceTop:NSSRWLock_LockRead_Util () from /usr/lib/libnssutil3.so.1d
SECMOD_GetReadLock () from /usr/lib/libnss3.so.1d
PK11_GetAllTokens () from /usr/lib/libnss3.so.1d
PK11_GetBestSlotMultiple () from /usr/lib/libnss3.so.1d
PK11_GetBestSlot () from /usr/lib/libnss3.so.1d

Changed in firefox-3.0:
importance: Undecided → Medium

Not blocking at this point, but topcrash+ (#1 topcrash on OSX) so let's see if we can get a safe patch in.

Got this crash on Windows once today:
bp-e34de6cf-428e-11dd-862f-001cc45a2ce4

Kai, does this patch look good? This is currently the #4 topcrash and we'd love to see it fixed in 3.0.2 (and, really 2.0.0.x).

Would be really nice to get this into 1.9.0.2/1.8.1.17

Not going to block on this for 1.9.0.2 (and likely 1.8.1.17). The patches aren't reviewed or landed and this will likely need bake time.

Note that Bug 450468 has a rather similar stack, but not identical.
That appears to be in PRNG code, while this bug is in hash code.

update: we still don't know what causes the crash, the patches in this bug are not a cure for the crash.

Alexander Sack (asac) on 2008-09-24
Changed in firefox-3.0:
status: New → Triaged
Changed in firefox:
status: Unknown → Confirmed

I know this bug was originally filed as a Windows Vista bug, but I'm getting [@ NSSRWLock_LockRead_Util] crashes on Windows XP. Namely XP SP 3.

If it's alright, I can provide my crash reports (though I don't have any stack traces available =/ ). If I'm in the wrong place or you'd like a separate bug to be filed, do let me know.

Comment on attachment 314289
don't assume allocs won't fail and lock before reading

Clearing review request per comment 34.

Changed in firefox:
status: Confirmed → Fix Released
Changed in firefox:
status: Fix Released → In Progress
Changed in firefox:
status: In Progress → Fix Released
Changed in firefox:
status: Fix Released → In Progress
44 comments hidden view all 124 comments

I wonder if going off-line several seconds before shutting down the browser
would help.

Created attachment 387852
v2 for trunk, 1.9.1

Marking haveLoaded = false when we shutdown nss from nsNSSComponent. This really should prevent to get any psm component while nss is not at the moment initialized.

We should also apply the attachment 325550.

Created attachment 387853
v2 for 1.8.1

Bug 505223 is almost certainly a duplicate of this bug. It contains a stack showing that nsCryptoHash::Init is being called while in NS_ShutdownXPCOM_P .

*** Bug 505223 has been marked as a duplicate of this bug. ***

Kai: any ETA on a fix here? This is blocking our next release which we're trying to square away earlier rather than later.

Comment on attachment 384744
v1 for trunk [Checkin comment 73]

I'm assuming these two are obsolete...

Comment on attachment 325550
derivation from nss shutdown object, 2

(In reply to comment #86)
> (From update of attachment 384744 [details])
> I'm assuming these two are obsolete...

No, it is not, I point out in comment 81 we should also take it.

Uh... then we should really do a combined patch instead of separate ones...

Nominating this topcrash for 1.9.0.13 since it looks like these fixes will land on 1.9.1 before then.

Comment on attachment 325550
derivation from nss shutdown object, 2

This patch is good. It adds protection for the "NSS shutdown race" to the nsCryptoHash and nsCryptoHMAC objects, the same protection we already use for other PSM objects that store references to NSS objects.

r=kaie

Comment on attachment 387852
v2 for trunk, 1.9.1

r=kaie

Comment on attachment 387852
v2 for trunk, 1.9.1

http://hg.mozilla.org/mozilla-central/rev/4679ac688c56

Created attachment 391336
v2, as laned on m-c [Checkin comment 92]

Created attachment 391359
v1+v2 merged for 1.9.1

attachment 384744 and attachment 391336 folded and merged to 1.9.1.

Created attachment 391368
v1+v2 merged for 1.9.1 [Checkin comment 101]

This is the correct one...

Comment on attachment 391368
v1+v2 merged for 1.9.1 [Checkin comment 101]

This *just* landed on mozilla-central. I don't think we can take it in 1.9.1.2.

Created attachment 391397
v1+v2 merged for 1.9.0 [Checkin comment 102]

Changed in firefox:
status: In Progress → Fix Released

Created attachment 391554
v1+v2 merged for 1.8.1

Comment on attachment 391368
v1+v2 merged for 1.9.1 [Checkin comment 101]

Approved for 1.9.1.3, a=dveditz for release-drivers

Comment on attachment 391397
v1+v2 merged for 1.9.0 [Checkin comment 102]

Approved for 1.9.0.14, a=dveditz for release-drivers

Comment on attachment 391368
v1+v2 merged for 1.9.1 [Checkin comment 101]

http://hg.mozilla.org/releases/mozilla-1.9.1/rev/11bdfa7b7c58

Comment on attachment 391397
v1+v2 merged for 1.9.0 [Checkin comment 102]

Checking in security/manager/ssl/src/nsNSSComponent.cpp;
/cvsroot/mozilla/security/manager/ssl/src/nsNSSComponent.cpp,v <-- nsNSSComponent.cpp
new revision: 1.170; previous revision: 1.169
done
Checking in security/manager/ssl/src/nsNSSComponent.h;
/cvsroot/mozilla/security/manager/ssl/src/nsNSSComponent.h,v <-- nsNSSComponent.h
new revision: 1.55; previous revision: 1.54

Micah Gersten (micahg) wrote :

This is set to be released in Firefox 3.5.3 and Firefox 3.0.14

Changed in firefox-3.5 (Ubuntu):
importance: Undecided → High
status: New → Triaged
Changed in firefox-3.0 (Ubuntu):
importance: Medium → High
Micah Gersten (micahg) on 2009-08-14
tags: added: fixed-3.0.14 fixed-3.5.3

It was fixed for Gecko 1.9.1.3 per the flags above, meaning that Firefox 3.5.3 should no longer show the issue.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package firefox-3.0 - 3.0.14+build2+nobinonly-0ubuntu0.8.04.1

---------------
firefox-3.0 (3.0.14+build2+nobinonly-0ubuntu0.8.04.1) hardy-security; urgency=low

  * security/stability v3.0.14 build2 (FIREFOX_3_0_14_RELEASE)
    - see USN-821-1
    - fix LP: #236853 - firefox crashed with SIGSEGV in NSSRWLock_LockRead_Util()

 -- Alexander Sack <email address hidden> Mon, 07 Sep 2009 14:27:50 +0200

Changed in firefox-3.0 (Ubuntu):
status: Triaged → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package firefox-3.5 - 3.5.3+build1+nobinonly-0ubuntu2

---------------
firefox-3.5 (3.5.3+build1+nobinonly-0ubuntu2) karmic; urgency=low

  * security/stability update v3.5.3 build1 (FIREFOX_3_5_3_BUILD1)
    - see USN-821-1
    - fix LP: #333127 - Firefox 3.5 and above crash on full screen flash video
    - fix LP: #236853 - firefox crashed with SIGSEGV in NSSRWLock_LockRead_Util()

  [ Alexander Sack <email address hidden> ]
  * fix LP: #422365 - apport hook fails because profiles_d is not initialized
    in add_info if no profiles.ini exist; we ensure that profiles_d gets instantiated
    as an empty map even if no profiles.ini exist.
    - update debian/apport/firefox-3.5.py
  * hook firefox-addons/searchplugins as the distribution/searchplugins
    directory to support localized distro search engines.
    - update debian/rules
  * in case localized search engines are available the main searchplugins
    directory is not scanned anymore; to fix this we provide a compatibility
    link /usr/lib/firefox-addons/searchplugins/common => /usr/lib/firefox-addons/searchplugins
    - update debian/firefox-3.5.links
  * fix localized search engine upstream code to properly deal with general.useragent.locale
    being a complex pref; also change plugin dir order to allow locale specific searchplugins
    to overlay the ones shipped in "searchplugins/common"
    - add debian/patches/fix_complex_locale_distro_searchplugins.patch
    - update debian/patches/series

  [ Jamie Strandboge <email address hidden> ]
  * add AppArmor profile (disabled by default) (LP: #382917)
    - debian/firefox-3.5.dirs: add etc/apparmor.d/disable
    - add debian/firefox-3.5.preinst.in: disable the profile on new installs
      and upgrades to this version
    - debian/firefox-3.5.postinst.in: reload profile
    - add debian/firefox-3.5.postrm.in: cleanup force-complain and disable
      directories
    - add debian/usr.bin.firefox.apparmor.in
    - debian/rules: install profile
    - add debian/README.Debian.in with note about AppArmor
    - debian/apport/firefox-3.5.py: add AppArmor information if the profile is
      not disabled
    - debian/firefox-3.5.preinst.in: allow for when apparmor is not installed

 -- Alexander Sack <email address hidden> Thu, 03 Sep 2009 10:03:08 +0200

Changed in firefox-3.5 (Ubuntu):
status: Triaged → Fix Released
leeshih49@msn.com (leeshih49) wrote :

JUST DO IT

Changed in firefox-3.0 (Ubuntu):
assignee: nobody → leeshih49@msn.com (leeshih49)
status: Fix Released → New
Changed in firefox-3.5 (Ubuntu):
status: Fix Released → New
Changed in firefox-3.0 (Ubuntu):
status: New → Won't Fix
Changed in firefox-3.5 (Ubuntu):
status: New → Fix Released
Changed in firefox-3.0 (Ubuntu):
assignee: leeshih49@msn.com (leeshih49) → nobody
status: Won't Fix → Fix Released
John Vivirito (gnomefreak) wrote :

Please see https://bugzilla.mozilla.org/show_bug.cgi?id=427715#c105
it was fixed for 3.5
This should have been fixed in 3.0 see:
firefox-3.0 (3.0.14+build2+nobinonly-0ubuntu0.8.04.1) hardy-security; urgency=low

  * security/stability v3.0.14 build2 (FIREFOX_3_0_14_RELEASE)
    - see USN-821-1
    - fix LP: #236853 - firefox crashed with SIGSEGV in NSSRWLock_LockRead_Util()

 -- Alexander Sack <email address hidden> Mon, 07 Sep 2009 14:27:50 +0200

hardy and up has been fixed. please make sure you are at latest version and try using a new profile following the instructions on the following link:
https://wiki.ubuntu.com/MozillaTeam/Bugs#Try%20with%20a%20new%20profile

John Vivirito (gnomefreak) wrote :

Also please do not assign bugs to yourself unless you plan on fixing them.
when you reopen a bug please set it to incomplete or confirmed if it was confirmed.

Make sure you have hardy-security repos enabled

Changed in firefox-3.0 (Ubuntu):
assignee: nobody → Peter J Castle (p-j-castle)
John Vivirito (gnomefreak) wrote :

Please do not assign yourself to bugs unless you plan on fixing it. thanks
Please see my comment above your last comment

Changed in firefox-3.0 (Ubuntu):
assignee: Peter J Castle (p-j-castle) → nobody
coachlaw10 (coachlaw10) wrote :

how do I fix this problem?

Changed in firefox-3.0 (Ubuntu):
status: Fix Released → Fix Committed
Micah Gersten (micahg) wrote :

@Carl Cameron

Please do not change the status of Fix Released bugs.

Changed in firefox-3.0 (Ubuntu):
status: Fix Committed → Fix Released
Mark (markd1) on 2009-10-06
Changed in firefox-3.0 (Ubuntu):
assignee: nobody → Mark (markd1)
Micah Gersten (micahg) wrote :

Please don't assign yourself bugs unless you are actually working on them. This fix has already been released.

Changed in firefox-3.0 (Ubuntu):
assignee: Mark (markd1) → nobody
squirrly69 (squirrly69) on 2009-10-06
Changed in firefox-3.0 (Ubuntu):
status: Fix Released → Confirmed
status: Confirmed → Fix Released
adsa_a (adsa-a) wrote :

good!

Changed in firefox-3.0 (Ubuntu):
assignee: nobody → adsa_a (adsa-a)
Changed in firefox-3.0 (Ubuntu):
status: Fix Released → Fix Committed
Changed in firefox-3.5 (Ubuntu):
status: Fix Released → Fix Committed
Micah Gersten (micahg) wrote :

Please don't assign yourself bugs unless you are actually working on them. This fix has already been released.

Changed in firefox-3.0 (Ubuntu):
assignee: adsa_a (adsa-a) → nobody
status: Fix Committed → Fix Released
Changed in firefox-3.5 (Ubuntu):
status: Fix Committed → Fix Released
Changed in firefox-3.0 (Ubuntu):
status: Fix Released → Fix Committed
status: Fix Committed → Fix Released
status: Fix Released → Confirmed
status: Confirmed → Fix Committed
status: Fix Committed → Fix Released
Changed in firefox:
importance: Unknown → Critical

Flag clean up.

Comment on attachment 391554
v1+v2 merged for 1.8.1

Flag clean up.

Displaying first 40 and last 40 comments. View all 124 comments or add a comment.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.