thunderbird shredder always segfaults on startup with LDAP auth in nsswitch

Bug #507089 reported by Bruce Edge on 2010-01-13
This bug affects 53 people
Affects               Status     Importance  Assigned to  Milestone
Mozilla Thunderbird   Confirmed  Critical    -
SeaMonkey             New        Undecided   Unassigned
thunderbird (Ubuntu)  -          High        Unassigned
Nominated for Lucid by Christian

Bug Description

Binary package hint: thunderbird

If nsswitch.conf is set with:

passwd: compat ldap
group: compat ldap
shadow: compat ldap

My ldap config does work as I'm using it for login authentication.

thunderbird-3.0 always segfaults:

0 %> thunderbird-3.0
Segmentation fault
0 %> thunderbird-3.0 --g-fatal-warnings
Segmentation fault
139 %> thunderbird-3.0 -options
Segmentation fault
139 %> thunderbird-3.0 -safe-mode
Segmentation fault
139 %> thunderbird-3.0 -ProfileManager
Segmentation fault
139 %>

...even with no .thunderbird-3.0 dir:

 %> thunderbird-3.0 -ProfileManager
*INFO* No /users/bedge/.thunderbird-3.0 detected. Create it from /users/bedge/.mozilla-thunderbird
Segmentation fault
0 9:12:52 bedge@ice ~
139 %>

Changing nsswitch back to NIS works:

passwd: compat nis
group: compat nis
shadow: compat nis

1 %> dpkg -l thunderbird\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Cfg-files/Unpacked/Failed-cfg/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Description
+++-===========================-===========================-======================================================================
ii thunderbird 2.0.0.23+build1+nobinonly-0 mail/news client with RSS and integrated spam filter support
ii thunderbird-3.0 3.0~rc3~micahg+nobinonly-0u mail/news client with RSS and integrated spam filter support
ii thunderbird-3.0-gnome-suppo 3.0~rc3~micahg+nobinonly-0u Support for Gnome in Mozilla Thunderbird 3.0
ii thunderbird-dispmua 1.6.4.3-1ubuntu1 Display Mail User Agent extension (transitional package)
ii thunderbird-gnome-support 2.0.0.23+build1+nobinonly-0 Support for Gnome in Mozilla Thunderbird
ii thunderbird-locale-en-gb 1:2.0.0.14+1-0ubuntu2 Thunderbird English language/region package
ii thunderbird-nostalgy 0.2.16+svn151-1ubuntu1 keyboard shortcut extension for thunderbird
ii thunderbird-quickfile 0.17.0.0011-0ubuntu4 faster mail filing for the Thunderbird mail client
ii thunderbird-traybiff 1.2.3-4.2ubuntu2 traybiff - new mail alert for thunderbird

WORKAROUND: Installing the nscd package solves this issue in most cases.

Here's the output from one of the segfaults:
jmarco[~] $ /usr/thunderbird/thunderbird
/usr/thunderbird/run-mozilla.sh: line 451: 2193 Segmentation fault "$prog" ${1+"$@"}

Created an attachment (id=181988)
Attached an 'strace -f' of the failure.

I did an 'strace -f /usr/thunderbird/thunderbird' and attached the results.

Created an attachment (id=181996)
Output from a simple gdb session on the core dump

Enabled coredumps and did a gdb on the resulting corefile.
I could do more if you'd like.

I've also submitted this issue as bug#203 for nss_ldap on the padl.com bugzilla
system.

The same problem happens to me. In my case, running "nscd" does not make the problem go away.

Created an attachment (id=210379)
New stack trace that points to: libldap50 conflict.

Here's the related bug for PADL nss_ldap:
http://bugzilla.padl.com/show_bug.cgi?id=203
I did more investigation on the problem at their request and
found that there seems to be a conflict between the libldap50.so
included with the binary version of Mozilla, and whatever the
default libldapxxx.so that is installed on the user's distribution.
This occurs with nss_ldap because it causes libc to drag in the
default ldap library via NSS for anything that does user name
translation. This seems to react poorly with the Mozilla libldap50
library. Included is the text snippet from my most recent comment
on the PADL bug, and the stack trace from that bug.

From PADL Bug:
Problem is not in nss_ldap. It's a Thunderbird bug

I brought up thunderbird under gdb on my desktop with LDAP user env and not
nscd. Thunderbird died a messy death in strtok() with corrupted stack, so no
trace. No problem. Turns out this is luckily the first call to strtok(), so
was able to 'break strtok' and get a trace.

It turns out that thunderbird has its own version of libldap.so called
libldap50.so with a version of ldap_str2charray that conflicts with that in
/usr/lib/libldap-2.2.so.7. The version in Thunderbird's libldap50 is called
unexpectedly and it looks like this is causing libldap-2.2 to poop its pants.

As an experiment, I moved /usr/thunderbird/libldap50.so aside and symlinked it
to the linux /usr/lib/libldap-2.2.so.7, and sure enough Thunderbird worked
perfectly.

Therefore, this is a problem with the binary release of Thunderbird not
handling conflicts in system LDAP libraries. Nothing wrong with nss_ldap from
the looks of it.

If you read my previous comment, you'll see that a possible workaround for this bug right now is to go into /usr/thunderbird and:
    mv libldap50.so moved-libldap50.so
    ln -s /usr/lib/libldap-2.2.so.7 libldap50.so
Of course, change the locations of Thunderbird and your currently
installed /usr/lib/libldapXXX as required for your distro/version.
Then, restart Thunderbird.

the other implementations of this function do a dupe before using strtok. reporter, if someone here posted a patch, could you test it? i have absolutely no interest in setting up your configuration, something which has no use for me, but i might be willing to post patches for you to test and provide feedback. note: i'm not a mozilla ldap dev, i'm just someone who flags crash bugs.

Changing one function in the Mozilla libldap will probably not solve the entire problem here. Why not? Because there are undoubtedly dozens of small differences in behavior between the OpenLDAP libldap and the Mozilla libldap. I am not yet sure how to solve this problem in a way that is bulletproof. I added Rich Megginson to the CC in case he has any ideas/experience in dealing with this kind of conflict on Linux.

Yes, there are many large and small and incompatible differences between the OpenLDAP API and the Mozilla API. We had the same problem with newer binary versions of Apache on linux because they are linked directly with OpenLDAP, and we have some modules that depend on the Mozilla API. We solved that problem by using LD_PRELOAD to make sure the Mozilla API is loaded first. However, in this case, you may need to do the reverse and do a LD_PRELOAD to make sure the OpenLDAP API is loaded first. While that might solve the first problem, it will probably break other LDAP features of thunderbird like type down addressing, etc. So I'm not really sure how you can force PAM/NSS to use exclusively OpenLDAP calls while forcing the rest of Thunderbird to use exclusively Mozilla calls.

What we really need is a unified API between OpenLDAP and Mozilla. There are several impediments to this happening:
1) OpenLDAP uses OpenSSL for crypto, while Mozilla uses NSS. My preference would be to have the ability for OpenLDAP to use NSS for crypto, especially if running in a Mozilla client app.
2) Each API has extensions lacking in the other.
3) The command line tools are incompatible.
4) No one in either of the communities has either the time or the inclination to do the work.

I would be willing to test an updated libldap50 library if supplied as a binary, but I don't have the spare time to build from source.

It's been a while since I've looked at this kind of stuff. From the glibc source code, it appears that the NSS code opens its database modules using dlopen(libnames[x], RTLD_LAZY). The problem is that Thunderbird is compile-time linked with libldap50.so, and so brings in its own version of any number of identically named but incompatible functions. By the time NSS does its dlopen() it's too late. Some of its internal function calls are going to resolve to already-bound functions from libldap50 and blow up.

One way to work around this issue would be to implement a thin LDAP glue library that only contains functions called by Thunderbird. The glue library would internally dlopen("libldap50.so", RTLD_LAZY|RTLD_LOCAL) so as to not globally export loaded symbols for binding by other libraries. The glue versions of the API calls would dlsym() for the real versions and pass through.
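A rough sketch of what such a glue layer might look like, under stated assumptions: the names `glue_load`, `glue_symbol` and `glue_ldap_unbind` are made up for illustration, the LDAP handle is passed as a plain `void *` to avoid depending on either SDK's headers, and only one pass-through is shown. The key point is the RTLD_LOCAL flag, which keeps the loaded library's symbols out of the global scope.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Handle for the privately loaded library; NULL until glue_load() succeeds. */
static void *glue_handle;

/* Load `path` with RTLD_LOCAL so that its symbols are not added to the
 * global scope and cannot be bound by other libraries (e.g. an NSS module). */
int glue_load(const char *path)
{
    assert(path != NULL);
    if (glue_handle == NULL)
        glue_handle = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
    return glue_handle != NULL ? 0 : -1;
}

/* Look up the real implementation of `name` inside the private library. */
void *glue_symbol(const char *name)
{
    if (glue_handle == NULL)
        return NULL;
    return dlsym(glue_handle, name);
}

/* Hypothetical pass-through for ldap_unbind(). The glue function has a
 * different name from the real one so that it does not itself collide. */
int glue_ldap_unbind(void *ld)
{
    /* POSIX requires the void * returned by dlsym() to be convertible
     * to a function pointer. */
    int (*real)(void *) = (int (*)(void *))glue_symbol("ldap_unbind");
    return real != NULL ? real(ld) : -1;
}
```

The application would call `glue_load()` once at startup with the path of its bundled library and then use only the glue functions.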

My workaround of replacing libldap50.so with OpenLDAP "works" for me, since I don't use any of the LDAP related stuff in Thunderbird. It just keeps getpwuid() type lookups from blowing up. I'd not be surprised to find that some of the LDAP related functionality is actually broken.

Confirmed bug on my setup - Changing Shared lib to OpenLDAP does resolve issue with startup, but does kill addressbook ldap usage.

*** Bug 333571 has been marked as a duplicate of this bug. ***

The same problem exists for thunderbird 2.
The workaround to create a symlink to the local libldap-2.2.so.7 still fixes the issue.

*** Bug 348506 has been marked as a duplicate of this bug. ***

(In reply to comment #10)
> What we really need is a unified API between OpenLDAP and Mozilla.

Yes. More to the point, we need a *good* LDAP API. Interested developers are invited to add comments here
http://scratchpad.wikia.com/wiki/LDAP_C_API

> There are
> several impediments to this happening:
> 1) OpenLDAP uses OpenSSL for crypto, while Mozilla uses NSS. My preference
> would be to have the ability for OpenLDAP to use NSS for crypto, especially if
> running in a Mozilla client app.

That probably makes sense from a Mozilla perspective, but I'm not sure it's worth the overhead of carrying NSPR around everywhere. Also some interesting commentary here:

http://markmail.org/message/z3sf37vnryypdko4#query:openssl%20vs%20nss+page:2+mid:xvw5nybqrhkw6w7n+state:results

> 2) Each API has extensions lacking in the other.

Not relevant, since Mozilla's use of LDAP is quite plain-jane.

> 3) The command line tools are incompatible.

I don't see how associated tools are relevant to the Thunderbird/Mozilla apps..

> 4) No one in either of the communities has either the time or the inclination
> to do the work.

Well, out of boredom, I spent 2 hours this afternoon patching my Mozilla build tree to use OpenLDAP. I think the difficulties have been overstated, because it's working fine on my OpenSUSE laptop.

Note that I haven't looked at the necessary autoconf changes, just edited my build tree after configure was already run. As such, edit config/autoconf.mk:

#LDAP_CFLAGS = -I${DIST}/public/ldap
#LDAP_LIBS = -L${DIST}/bin -L${DIST}/lib -lldap60 -lprldap60 -lldif60
LDAP_CFLAGS = -I/usr/local/include -DLDAP_DEPRECATED
LDAP_LIBS= -L/usr/local/lib -lldap_r -llber

and use the attached patch. A more thorough adaptation would go through and eliminate the use of LDAPv2/deprecated APIs but this was quick and dirty...

Created an attachment (id=333135)
Quick'n'dirty patch

Works with all ldap URLs that OpenLDAP supports (cldap, ldap, ldapi, ldaps); someone should add an option for choosing StartTLS...

Oh, you also need to turn off the MOZ_PSM stuff in directory/xpcom/base/src/Makefile:

#ifdef MOZ_PSM
#DEFINES += -DMOZ_PSM
#CPPSRCS += \
# nsLDAPSecurityGlue.cpp \
# $(NULL)
#endif

This leaves you with a Mozilla build that uses OpenLDAP's SSL support, whatever it may be linked to (OpenSSL or GnuTLS, currently). It's worth noting that OpenSSL is already loaded in the process under Linux, due to various other system libraries included in the build, so this isn't really making any situation worse. Since OpenSSL has been a standard system library on Linux for so long and pretty much everything uses it, it would make more sense to replace NSS with OpenSSL here.

It should be noted that NSS is being considered for inclusion in the LSB,
and OpenSSL is not, due in part to commitment to ABI compatibility in NSS.

Created an attachment (id=333197)
Cleaned up patch

This patch is properly ifdef'd so it won't break the existing MozLDAP functionality...

Created an attachment (id=333905)
OpenLDAP+PSM support

This patch also supports PSM with OpenLDAP, using new callback hooks that were just added to OpenLDAP's CVS HEAD. (Those hooks probably will be released in OpenLDAP 2.4.12; 2.4.11 is current.)

The PSM support just mimics the existing MozLDAP behavior. It's worth noting that the existing behavior will typically break when chasing referrals: The hostname that's passed in persists until the LDAP* handle is closed and is used for all Connection attempts. If a referral is received which points to ldaps:// on a different host, the hostname will not match and the connection should fail. If the referral points to the same host (as is common on MSAD) then it will probably succeed.

To fix this problem the Connect callback should record a bit more info, to answer two questions:
  1) whether it successfully connected once before - that will allow distinguishing referral chasing from the first successful connection.
  2) whether the IP address of the current connection attempt matches the previous successful attempt - that will distinguish referrals to the same host from referrals to a different host.

Then when it's determined that this connect attempt is chasing a secure referral on a different server, it can just use the name provided in the callback argument list.
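The decision described above can be sketched like this. It is an illustration only: `struct session_info` and both function names are hypothetical and do not reflect the actual closure used in nsLDAPSecurityGlue.cpp, and the peer address is assumed to be available as a printable string. The function is meant to be consulted only for connection attempts that requested ldaps://.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical per-session state recorded by the Connect callback. */
struct session_info {
    const char *orig_hostname;  /* name given when the LDAP handle was created */
    int connected_before;       /* question 1: has a connect succeeded yet? */
    char last_ip[64];           /* question 2: address of that successful connect */
};

/* Returns the hostname that the server certificate should be checked
 * against for the current connection attempt. */
const char *hostname_to_verify(const struct session_info *s,
                               const char *callback_hostname,
                               const char *ip)
{
    assert(s != NULL && callback_hostname != NULL && ip != NULL);
    /* A later attempt to a different address is a referral to another
     * server, so use the name supplied in the callback arguments. */
    if (s->connected_before && strcmp(s->last_ip, ip) != 0)
        return callback_hostname;
    /* First connection, or a referral back to the same host. */
    return s->orig_hostname;
}

/* Record a successful connection so later attempts can be classified. */
void record_success(struct session_info *s, const char *ip)
{
    assert(s != NULL && ip != NULL);
    s->connected_before = 1;
    strncpy(s->last_ip, ip, sizeof s->last_ip - 1);
    s->last_ip[sizeof s->last_ip - 1] = '\0';
}
```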

This whole referral issue probably belongs in a separate bug report, but I'm commenting here because the details only surfaced while investigating this report.

Another obvious problem with the current PSM support: if the initial connection is plaintext but a referral to an ldaps:// URL is received and chased, the subsequent connection will not have the PSM layer installed. The fix for this is to always install the callback, and just have it pass-thru without pushing the PSM layer if the current connection didn't request ldaps://.

Created an attachment (id=334053)
Fix referral issues

Also noticed, in the current code there's a potential memory leak in nsLDAPSSLInstall if prldap_set_sessioninfo fails; it will leak the dup'd hostname because it calls the wrong free function before returning.

(nsLDAPSecurityGlue.cpp:369 should be calling nsLDAPSSLFreeSessionClosure()...)

The socketClosure stuff doesn't seem to accomplish anything. It should probably be ripped out; there's no special handling needed for closure of individual sockets. It's only needed for closing the session handle.

The attached patch fixes these two issues in the existing code. It also fixes the referral issues I mentioned before, for both MozLDAP and OpenLDAP.

(Mark said "be my guest" ...)

(In reply to comment #15)
> What do Mozilla LDAP people think about using the same approach as is done for
> cairo:
> http://lxr.mozilla.org/seamonkey/source/gfx/cairo/cairo/src/filterpublic.awk
> http://lxr.mozilla.org/seamonkey/source/gfx/cairo/cairo/src/cairo-rename.h
>
It seems this would make the app more dependent on having these specific libraries bundled with the app. It would be nice to be able to use the library already present on a system, instead.

An alternative approach, along similar lines, would be to avoid direct references to these library functions in any particular code. Instead, use dlopen (or its analogue) to find any suitable version of the desired library, and use dlsym to build up a table of function pointers for all of the needed entry points. Then wrap macros around all of the invocations in the main source, to always invoke these functions through your table of pointers.
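A minimal sketch of the function-pointer-table idea, assuming hypothetical names (`struct ldap_api`, `ldap_api_load`, `LDAP_UNBIND`) and showing only two entry points, ldap_unbind() and ldap_err2string(), with the LDAP handle treated as an opaque `void *`. The list of candidate library names is supplied by the caller; nothing here is taken from an actual patch.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Table of the entry points the application needs (two shown here). */
struct ldap_api {
    void *handle;
    int (*unbind)(void *ld);        /* ldap_unbind */
    char *(*err2string)(int err);   /* ldap_err2string */
};

/* Call sites go through the table instead of referencing the library. */
#define LDAP_UNBIND(api, ld) ((api)->unbind(ld))

/* Try each candidate library name in turn and fill the table from the
 * first one that provides every required symbol. Returns 0 on success. */
int ldap_api_load(struct ldap_api *api, const char *const *candidates)
{
    assert(api != NULL && candidates != NULL);

    for (size_t i = 0; candidates[i] != NULL; i++) {
        void *h = dlopen(candidates[i], RTLD_LAZY | RTLD_LOCAL);
        if (h == NULL)
            continue;
        /* POSIX requires dlsym() results to convert to function pointers. */
        api->unbind = (int (*)(void *))dlsym(h, "ldap_unbind");
        api->err2string = (char *(*)(int))dlsym(h, "ldap_err2string");
        if (api->unbind != NULL && api->err2string != NULL) {
            api->handle = h;
            return 0;
        }
        dlclose(h);                 /* incomplete library, try the next one */
    }
    api->handle = NULL;
    api->unbind = NULL;
    api->err2string = NULL;
    return -1;
}
```

A caller might pass a NULL-terminated list of library names in order of preference and fall back to disabling LDAP features if the load fails.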

On a separate note, in my current patches I left nsLDAPService::CreateFilter unimplemented because a quick grep thru the source tree didn't turn up anyone using this function. But now I see that the AddressBook actually does try to use it for autocomplete, so I guess we'll have to provide an OpenLDAP version of ldap_create_filter() before this patch can be considered complete.

NSPR provides an analogue of dlopen that works on all Mozilla/Firefox/TBird
platforms and is present in every FF browser and TB mail client (SM too).
See documentation here
http://mxr.mozilla.org/nspr/source/nsprpub/pr/include/prlink.h#94
http://mxr.mozilla.org/nspr/source/nsprpub/pr/include/prlink.h#181

Another approach that sometimes works is to link these libraries with -Bsymbolic, to restrict them to resolving their symbol references to within their own shared objects. Unfortunately, it also requires whoever built the conflicting library to use the same option. I.e., it's not sufficient to link Mozilla's libldap with this flag; the platform's libldap must be linked this way as well. (The symbol conflict confusion is bi-directional; only linking one of the conflicting libraries only eliminates the conflict in one direction.) It also doesn't help when the shared library has other external dependencies (e.g. OpenLDAP's libldap depends on liblber).

Had to mention this because the dlopen approach is still vulnerable to the problem of the dlopen'd libldap referencing the wrong liblber if another one was implicitly loaded into the process by some other library dependency.

Created an attachment (id=334117)
Add ldap_create_filter

I note that in Mozilla's libldap/getfilter.c, which provides ldap_create_filter(), the header comment says "getfilter.c -- optional add-on to libldap". It's not a part of the libldap API spec, and it's totally self-contained - it has no dependencies on anything else in libldap. IMO it doesn't really belong in there, someone just tossed it in there for lack of a more obvious place. So for this patch, I've copied the necessary bits out of getfilter.c and pasted them in here where they're actually used.

Just for your information:
- The bug still exists in OpenSuse 11.0 x86_64
   kernel 2.6.25.18-0.2-default
   MozillaThunderbird-2.0.0.17-3.1
   nscd-2.8-14.1

   nscd crashes as soon as thunderbird is launched

# ps -ef |grep nscd
root 4905 1 0 09:56 ? 00:00:00 /usr/sbin/nscd
root 4915 4844 0 09:56 pts/2 00:00:00 grep nscd
# logout
begou@thor: thunderbird
Registering Enigmail account manager extension.
Enigmail account manager extension registered.
/usr/bin/thunderbird: line 134: 4918 Erreur de segmentation $MOZ_PROGRAM $@
begou@thor: ps -ef |grep nscd
begou 4927 4818 0 09:57 pts/2 00:00:00 grep nscd

Using /usr/lib64/libldap-2.4.so.2 instead of /usr/lib64/thunderbird/libldap50.so seems to provide a good work-around.

Given that we have a patch, maybe this should block Thunderbird 3. It would be really nice to have an idea of how prevalent this is...

Though, despite having a patch, it seems like there's still some discussion to be had on whether it uses the optimal approach, or if one of the other approaches suggested here would make more sense.

Whatever the effect of this specific patch is, I'd like to voice my opinion that, unless Thunderbird gains useful LDAP support for reading and writing address books, there is no way to place Thunderbird onto the corporate desktop, although there are other limiting factors around as well.

*** Bug 470451 has been marked as a duplicate of this bug. ***

Removing the flag that I mistakenly set: this isn't part of Gecko, so it can't block a Gecko release. I'd love to get this for Thunderbird 3, but it feels like there's still a non-trivial amount of work to do here. Not adding [tb3needs], because if this were the last bug standing, I don't think we would hold the release for it. Sorry I haven't been able to get back to this yet, Howard. :-(

I was just bitten by it on Tbird 3 (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1pre) Gecko/20090607 Shredder/3.0b3pre). Oddly, it hangs early in startup when connecting over remote X (ssh-tunnel) but not when invoked locally.

*** Bug 496861 has been marked as a duplicate of this bug. ***

*** Bug 431145 has been marked as a duplicate of this bug. ***

Given that we have a patch, should really try to drive this in for tb3.
I'm not sure it's a problem that the code is in m-c, if it's all NPOTB for firefox anyway.

Changed in thunderbird (Ubuntu):
status: New → Confirmed
Micah Gersten (micahg) on 2010-03-18
Changed in thunderbird (Ubuntu):
status: Confirmed → Incomplete
Micah Gersten (micahg) on 2010-03-18
Changed in thunderbird (Ubuntu):
importance: Undecided → High
status: Incomplete → Triaged
Changed in thunderbird:
status: Unknown → Invalid
Changed in thunderbird:
status: Invalid → Unknown
Changed in thunderbird:
status: Unknown → In Progress
Changed in thunderbird:
importance: Unknown → Critical
Ralph Janke (txwikinger) on 2010-10-07
description: updated

This also works fine on Ubuntu 8.04 and TB 3.1.7

I've reproduced this bug in both Ubuntu Maverick and Natty, using Thunderbird 3.1.7.

I'll dig a bit deeper, and keep you all posted.

Comment on attachment 334117
Add ldap_create_filter

I'm not convinced by this solution.

If I understand it correctly, then this is trying to make our API the same as OpenLDAP's version. So depending on the set-up of the (Linux) system, we could be using either the OpenLDAP library, or our own. We don't know what is in OpenLDAP's library, nor will we have done extensive testing with it. If we get crashes or strange results, we may not even realise that we're using OpenLDAP's library. This would make support very difficult. I think this is what Mark was saying in comment 9.

Given that we ship this library in Thunderbird, intending that Thunderbird is going to use this library, then maybe we should consider re-naming the library when we ship it within Thunderbird. This idea is from a similar approach Firefox took with SQLite in bug 513747.

So for instance, we could ship libmozldap60.so etc where we build LDAP as part of Thunderbird. Hence, changing the name should resolve the conflicts we're seeing, and ensure that Thunderbird runs with what we intended.

The LDAP c-sdk could still default to libldap60.so, and if building with the system LDAP c-sdk, then we could still use libldap60.so. If Linux distributions want to use the system LDAP for shipping Thunderbird, then I would expect them to verify/handle bugs with LDAP, especially if it isn't the LDAP c-sdk that we're shipping with Thunderbird.

Obviously we may still want to move the two sets of LDAP APIs closer together, but I'm not convinced doing it as a result of this bug is the right thing to do. For example, it really does feel like ldap_create_filter should be in the c-sdk, and therefore maybe it needs adding to OpenLDAP's version, not removing from ours.

If I've misunderstood things, then please correct me.

What is the status on this bug after 7 years?

From what I understand (correct me if I am wrong) the solution is to install either nscd or libnss-ldapd. While both of these seem to work, they are not acceptable solutions because it affects the rest of the system.
And why should thunderbird even care what the controlling backend auth module is in the first place?

(In reply to comment #70)
> What is the status on this bug after 7 years?
>
> From what I understand (correct me if I am wrong) the solution is to install
> either nscd or libnss-ldapd. While both of these seem to work, they are not
> acceptable solutions because it affects the rest of the system.
> And why should thunderbird even care what the controlling backend auth module
> is in the first place?

You're right that Thunderbird or some other app *shouldn't* ever need to care about this, but the fact is that the old nss-ldap design causes these types of problems, and libnss-ldapd corrects the design flaw.

Dan Woodard (dan-e-woodard) wrote :

I have a similar problem. I'm adding what may be more details to aid in solving. I use LDAP for authentication and I have nscd installed. Thunderbird has been working without issue until yesterday when I changed hosts in nsswitch.conf to use ldap as shown below;

Changed:
hosts: files dns [NOTFOUND=return]
to the following;
hosts: files ldap dns [NOTFOUND=return]

This change resulted in seg faults with no other warning. I was able to create an alternate user profile and that would work, but when trying to get back to the current user, the seg fault would return. Once I changed nsswitch.conf back to using dns, the app starts without issue.

My release;
Linux bari.iqanalog.com 2.6.32-33-generic #72-Ubuntu SMP Fri Jul 29 21:07:13 UTC 2011 x86_64 GNU/Linux
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.3 LTS"
Thunderbird 3.1.13

still a problem in 12.04 amd64 desktop and the default thunderbird provided.

airtonix (airtonix-gmail) wrote :

although it's still a problem in ubuntu 12.04 64bit desktop and the default thunderbird (not sure if any thunderbird), applying the workaround in #7 (https://bugs.launchpad.net/ubuntu/+source/thunderbird/+bug/507089/comments/7) works.

we use 10 ubuntu 12.04 64bit workstations here at work, that require ldap login from a ubuntu 10.04 server managed by zentyal.

I apply the zentyal-desktop.deb to each of the workstations.

*** Bug 756782 has been marked as a duplicate of this bug. ***

Created attachment 628378
rename libldap60.so to libmozldap60.so

While that change makes sense in general I'm wondering what it's supposed to fix? I'm pretty sure that the filename is not the issue.

Comment on attachment 628378
rename libldap60.so to libmozldap60.so

From discussions about this sort of thing previously (which admittedly were a while ago), I believe that changing the library name wouldn't actually resolve all the problems.

Additionally, I don't think it is really right to change the library name unless the developers of the Mozilla LDAP c-sdk really want to, as it would impact on all the users of it, and potentially the use of libraries on existing systems.

I think that we should really go for changing the LDAP c-sdk that we use, and possibly replacing it with OpenLDAP as Howard was intending (or something else). To this effect I've put a proposal to tb-planning about this change:

http://groups.google.com/group/tb-planning/browse_thread/thread/342164ae0db9b21a
(https://wiki.mozilla.org/Thunderbird/tb-planning)

Comment on attachment 334117
Add ldap_create_filter

I'm rescinding my previous feedback- on this. Per previous comment on this bug, discussions have moved on, and we're considering moving away from the LDAP c-sdk, so this patch may therefore be heading in the right direction. Obviously, it would need to be updated and re-tested etc, but see the tb-planning discussion first.

The problem is still present in Thunderbird 15! I get a backtrace like the one in https://bugzilla.mozilla.org/show_bug.cgi?id=433530:

(gdb) bt
#0 strtok_r () at ../sysdeps/x86_64/strtok.S:190
#1 0x00007ffff6ad3b3a in ldap_str2charray (str=0x7fffe3781ced "ldap://localhost/", brkstr=0x7fffe3781a4b ", ")
    at /usr/src/debug/mail-client/thunderbird-15.0.1/comm-release/ldap/sdks/c-sdk/ldap/libraries/libldap/charray.c:218
#2 0x00007fffe376c216 in ldap_url_parselist_int (ludlist=0x7fffe398be80, url=<optimized out>, sep=<optimized out>, flags=11) at url.c:1293
#3 0x00007fffe376da8b in ldap_int_initialize_global_options (gopts=0x7fffe398bdc0, dbglvl=<optimized out>) at init.c:537
#4 0x00007fffe376dc0d in ldap_int_initialize (gopts=0x7fffe398bdc0, dbglvl=<optimized out>) at init.c:653
#5 0x00007fffe3753309 in ldap_create (ldp=0x7fffffff9cb8) at open.c:108

By looking at
(gdb) info sharedlibrary
0x00007ffff6ad2040 0x00007ffff6af6558 Yes /usr/lib64/thunderbird/libldap60.so
0x00007fffe3752fd0 0x00007fffe377e0a8 Yes /usr/lib64/libldap-2.4.so.2

you can see that the openldap routine is jumping into a mozilla routine, causing a segfault by applying strtok to "ldap://localhost/", which is a built in string in the openldap lib. A solution would be nice, because currently I can't use Thunderbird at all.
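For anyone who wants to check for this kind of clash from inside a process, here is a rough diagnostic sketch using only POSIX dlopen/dlsym. The function name `resolves_to_same` is hypothetical, and the check is an approximation of what the dynamic linker does when it binds a library's own calls. It would have to run inside the affected process (for example from a small test program linked against the same libraries) to be meaningful.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if a process-wide lookup of `sym` finds no definition or the
 * same definition as a lookup restricted to `libpath`; 0 if a different
 * definition wins globally (so the library's own calls to `sym` may land
 * in someone else's code); -1 if the library or symbol cannot be found. */
int resolves_to_same(const char *libpath, const char *sym)
{
    assert(libpath != NULL && sym != NULL);

    void *global = dlopen(NULL, RTLD_LAZY);               /* global scope */
    void *lib = dlopen(libpath, RTLD_LAZY | RTLD_LOCAL);  /* one library */
    int result = -1;

    if (global != NULL && lib != NULL) {
        void *in_lib = dlsym(lib, sym);
        void *in_global = dlsym(global, sym);
        if (in_lib != NULL)
            result = (in_global == NULL || in_global == in_lib);
    }
    if (lib != NULL)
        dlclose(lib);
    if (global != NULL)
        dlclose(global);
    return result;
}
```

In a process that is linked against libldap60.so, a result of 0 for the OpenLDAP library path and the symbol ldap_str2charray would be consistent with the backtrace above.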

The problem is also in Thunderbird 16. It's a clash of symbols from libldap-2.4.so and libldap60.so.

(gdb) bt
#0 0x00007fffe708a100 in ldap_str2charray () from /usr/lib64/libldap-2.4.so.2
#1 0x00007fffe70816c6 in ldap_url_parselist_int () from /usr/lib64/libldap-2.4.so.2
#2 0x00007fffe7082f1b in ldap_int_initialize_global_options () from /usr/lib64/libldap-2.4.so.2
#3 0x00007fffe7083016 in ldap_int_initialize () from /usr/lib64/libldap-2.4.so.2
#4 0x00007fffe706a6ab in ldap_create () from /usr/lib64/libldap-2.4.so.2
#5 0x00007fffe706aa81 in ldap_initialize () from /usr/lib64/libldap-2.4.so.2
#6 0x00007fffe72a79c0 in do_init () from /lib64/libnss_ldap.so.2
#7 0x00007fffe72a9d1c in _nss_ldap_search_s () from /lib64/libnss_ldap.so.2
#8 0x00007fffe72ab580 in _nss_ldap_getbyname () from /lib64/libnss_ldap.so.2
#9 0x00007fffe72abd07 in _nss_ldap_getpwnam_r () from /lib64/libnss_ldap.so.2
#10 0x00007ffff70c5685 in getpwnam_r () from /lib64/libc.so.6

Removing/renaming libldap60.so caused some errors in finding the library, so this does not seem to be a solution:
  XPCOMGlueLoad error for file /usr/lib64/thunderbird/libxpcom.so:
  libxul.so: cannot open shared object file: No such file or directory
  Couldn't load XPCOM.

We brute-forced renaming the symbol via
   sed -e 's:ldap_str2charray:ldap_str2xharray:' /usr/lib64/thunderbird/libldap60.so
in order to make it work.

The workaround in post #7 no longer seems to work for my Ubuntu 12.04 x86_64 system, as of thunderbird package version 11.0.1+build1-0ubuntu2. After upgrade, thunderbird immediately jumps to a Mozilla bug reporting screen on start. I've run this command as suggested:

"cp /usr/lib/x86_64-linux-gnu/libldap-2.4.so.2 /usr/lib/thunderbird/libldap60.so"

And while this does change the issue somewhat (I no longer get a bug reporting screen now, Thunderbird just exits silently), it doesn't resolve the issue. Please advise on other things I might try, or other information I could gather that will be useful for people who could resolve this.

I've tried all the proposed workarounds without success. This is affecting all of our users in our LDAP-based network.

I'm running Ubuntu 12.04.1 x86_64 with thunderbird package version 17.0+build2-0ubuntu0.12.04.1.

Previously, the workaround was to run the "nscd" service -- however that is no longer an option due to this bug:

https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1085957

I do not see how bug #1085957 is in any way related to this bug… running nscd to cache LDAP responses is itself a crucial service, and mozilla should not allow its internal LDAP library to export symbols overlapping with openldap’s ABI unless it is binary-compatible with openldap.

Bug #1085957 is clearly a different bug that is not related to #506089 in any other way than it invalidates (one of) the proposed workarounds for #506089.

s/#506089/#507089

*** Bug 708222 has been marked as a duplicate of this bug. ***

I'm getting an error in jemalloc.c on Ubuntu 12.04.1 (this error is reproducible). Any pointers on how to work around this problem are very welcome (nscd is not an option due to bug #1085957).

Program received signal SIGSEGV, Segmentation fault.
arena_dalloc (ptr=0x7fffead2f190, offset=<optimized out>) at /build/buildd/thunderbird-17.0.2+build1/mozilla/memory/mozjemalloc/jemalloc.c:4626
4626 /build/buildd/thunderbird-17.0.2+build1/mozilla/memory/mozjemalloc/jemalloc.c: No such file or directory.
(gdb) bt
#0 arena_dalloc (ptr=0x7fffead2f190, offset=<optimized out>) at /build/buildd/thunderbird-17.0.2+build1/mozilla/memory/mozjemalloc/jemalloc.c:4626
#1 0x00007ffff5a1eb5f in ldap_ld_free (ld=0x7ffff6cab5e0, serverctrls=0x0, clientctrls=<optimized out>, close=<optimized out>)
    at /build/buildd/thunderbird-17.0.2+build1/./ldap/sdks/c-sdk/ldap/libraries/libldap/unbind.c:158
#2 0x00007fffead2e955 in ?? () from /lib/x86_64-linux-gnu/libnss_ldap.so.2
#3 0x00007fffead2fb5b in ?? () from /lib/x86_64-linux-gnu/libnss_ldap.so.2
#4 0x00007fffead31192 in ?? () from /lib/x86_64-linux-gnu/libnss_ldap.so.2
#5 0x00007fffead32819 in ?? () from /lib/x86_64-linux-gnu/libnss_ldap.so.2
#6 0x00007fffead32e09 in _nss_ldap_getpwuid_r () from /lib/x86_64-linux-gnu/libnss_ldap.so.2
#7 0x00007ffff70bcbfd in __getpwuid_r (uid=537, resbuf=0x7ffff73b9380, buffer=0x7ffff6c09000 "+", buflen=1024, result=0x7fffffffc6e0) at ../nss/getXXbyYY_r.c:256
#8 0x00007ffff70bc4f3 in getpwuid (uid=537) at ../nss/getXXbyYY.c:117
#9 0x00007ffff1afe0ef in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#10 0x00007ffff1afe99d in g_get_home_dir () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x00007ffff027cb67 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#12 0x00007ffff0280f9f in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#13 0x00007ffff0230b2a in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#14 0x00007ffff1addfa0 in g_option_context_parse () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#15 0x00007ffff0231120 in gtk_parse_args () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#16 0x00007ffff3280fb4 in XREMain::XRE_mainStartup (this=0x7fffffffce30, aExitFlag=0x7fffffffcdff) at /build/buildd/thunderbird-17.0.2+build1/mozilla/toolkit/xre/nsAppRunner.cpp:3247
#17 0x00007ffff3283a19 in XREMain::XRE_main (this=0x7fffffffce30, argc=<optimized out>, argv=0x7fffffffe228, aAppData=0x7ffff6c26680)
    at /build/buildd/thunderbird-17.0.2+build1/mozilla/toolkit/xre/nsAppRunner.cpp:3871
#18 0x00007ffff3283c8d in XRE_main (argc=1, argv=0x7fffffffe228, aAppData=0x7ffff6c26680, aFlags=<optimized out>)
    at /build/buildd/thunderbird-17.0.2+build1/mozilla/toolkit/xre/nsAppRunner.cpp:3965
#19 0x000000000040225b in do_main (argv=0x7fffffffe228, argc=1, exePath=0x7fffffffd108 "/usr/lib/thunderbird/") at /build/buildd/thunderbird-17.0.2+build1/mail/app/nsMailApp.cpp:111
#20 main (argc=1, argv=0x7fffffffe228) at /build/buildd/thunderbird-17.0.2+build1/mail/app/nsMailApp.cpp:200
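
The backtrace above shows the pattern behind this bug: frame #2 is inside libnss_ldap.so.2, yet the function it calls (frame #1, ldap_ld_free) resolves to a source path in the Thunderbird build tree (ldap/sdks/c-sdk) rather than to the system LDAP library. A minimal sketch for spotting that pattern in a saved gdb backtrace follows. The function name and the path heuristic are assumptions made for illustration; this is not part of any gdb or Mozilla tooling.

```python
import re

# Matches gdb frame lines such as "#1 0x... in func (...)" or "#0 func (...)".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+) \(")


def find_interposed_frames(backtrace):
    """Return (frame_number, function) pairs where a libnss_ldap frame
    calls into code built from the bundled LDAP C SDK.

    Hypothetical heuristic: the callee's frame text mentions
    "ldap/sdks/c-sdk" and the caller's frame text mentions "libnss_ldap".
    """
    frames = []  # each entry: [number, function, full text of the frame]
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append([int(match.group(1)), match.group(2), line])
        elif frames:
            # Continuation line ("    at /path/file.c:158") of the last frame.
            frames[-1][2] += " " + line.strip()

    hits = []
    # gdb lists the innermost frame first, so frames[i + 1] called frames[i].
    for callee, caller in zip(frames, frames[1:]):
        if "libnss_ldap" in caller[2] and "ldap/sdks/c-sdk" in callee[2]:
            hits.append((callee[0], callee[1]))
    return hits
```

Fed the trace in this comment, the sketch should flag frame #1 (ldap_ld_free). It only recognises builds with debug source paths, so an empty result does not rule the problem out.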

Ro (robert-markula) wrote :

If you have the possibility, try SSSD [1]. It works fine with Thunderbird and does away with all the cruft that nscd and co. brought along. One single and easy to understand config file to configure it.
It's here in production with Ubuntu 12.04.* and never had problems again - which is in sharp contrast to ns(l)cd. Packages are available for Ubuntu in standard repositories.
A small howto is at [2].

[1] https://fedorahosted.org/sssd/
[2] http://labs.opinsys.com/blog/2010/03/26/user-management-with-sssd-on-shared-laptops/

*** Bug 874029 has been marked as a duplicate of this bug. ***

The problem exists in Thunderbird 17 too. Crash reports relevant to this issue, from all versions of Thunderbird, can be found here: https://crash-stats.mozilla.com/report/list?product=Thunderbird&query_search=signature&query_type=contains&reason_type=contains&date=2013-07-26&range_value=28&range_unit=days&hang_type=any&process_type=any&signature=arena_dalloc+|+ldap_x_free+|+ldap_set_lderrno

After upgrading to Thunderbird 22, the error is still reproducible, but the signature has changed from:
arena_dalloc | ldap_x_free | ldap_set_lderrno
to
arena_dalloc | ld-2.15.so@0x214e4

- is this the same error or some other problem?

Howard is no longer working on this

(In reply to Murz from comment #83)
> After upgrading to Thunderbird 22, the error is reproducible too, but
> signature is changed from:
> arena_dalloc | ldap_x_free | ldap_set_lderrno
> to
> arena_dalloc | ld-2.15.so@0x214e4
>
> - is this the same error or some other problem?

seems likely.
https://crash-stats.mozilla.com/query/?product=Thunderbird&version=ALL%3AALL&range_value=4&range_unit=weeks&date=08%2F06%2F2013+17%3A00%3A00&query_search=signature&query_type=contains&query=arena_dalloc+|+ld&reason=&release_channels=&build_id=&process_type=any&hang_type=any
arena_dalloc | ldap_x_free | ldap_set_lderrno
arena_dalloc | ldap_ld_free | libnss_ldap-2.13.so@0x3955
arena_dalloc | ldap_set_lderrno
arena_dalloc | ld-2.15.so@0x214e4
arena_dalloc | ld-2.15.so@0xe774

Changed in thunderbird:
status: In Progress → Confirmed

The bug is present in Thunderbird 24.2.0 running on Kubuntu 12.04.4. Running nscd appears to work around the issue, but I haven't tested it thoroughly for side effects.

I find it somewhat ironic that a nearly nine year old bug of this magnitude has status: NEW.

Software versions (all from Ubuntu repos):
$ aptitude show thunderbird | grep Version
Version: 1:24.2.0+build1-0ubuntu0.12.04.1
$ aptitude show libldap-2.4-2 | grep Version
Version: 2.4.28-1.1ubuntu4.4
$ uname -a
Linux tiny 3.2.0-58-generic #88-Ubuntu SMP Tue Dec 3 17:37:58 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

(In reply to Maciej Puzio from comment #85)
> I find it somewhat ironic that a nearly nine year old bug of this magnitude
> has status: NEW.

Actually, a better label would be CONFIRMED rather than NEW. That's what NEW really means, it does not refer to the bug's age.

(In reply to Tony Mechelynck [:tonymec] from comment #86)
> Actually, a better label would be CONFIRMED rather than NEW. That's what NEW
> really means, it does not refer to the bug's age.

I am very well aware of that; my point was to draw attention to unacceptable quality control and a record-breaking bug-fix cycle. Anyway, my further testing revealed several more issues with libldap, libpam-ldap and libnss-ldap, and I decided that this software as a whole does not meet my quality requirements. Instead I am deploying sssd as the LDAP client for PAM and NSS, and this is my recommendation for readers of this page.

nslcd/nss-pam-ldapd would be the best choice: the code is quite mature, since the basic LDAP functionality is ported from the old PADL code and is well proven. It's also quite compact; it does just LDAP and nothing else. SSSD is unproven and quite overloaded feature-wise. For security/authentication software, complexity is the enemy of reliability. I shouldn't have to roll out that lecture again...

Ro (robert-markula) wrote :

nslcd/nss-pam-ldapd has its own share of problems. I would say that calling SSSD unproven is unjustified. It has existed for quite some time, is actively developed, and solves many problems that are still present, partly by design, in nslcd/nss-pam-ldapd. Finally, configuration with SSSD is much easier and much less error-prone than with the old nslcd/nss-pam-ldapd combo.

Maciej Puzio and Howard Chu - thanks for the info, moving to ldapd or sssd solves this problem.

Chiming in with the info that I first encountered this bug in Mint 13 (Ubuntu Precise), and it still applies in Mint 17 (Ubuntu Trusty). And while I can understand all the issues involved with deciding the "right way to go", I am somewhat miffed to find that a decade-old bug still expresses itself as a SIGSEGV. Expecting the user to strace / google / eventually find this bug entry if he's lucky? Is it really that difficult to check for the condition and at least give a meaningful message (perhaps including a workaround recommendation) before exiting gracefully?
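
As a rough illustration of the kind of pre-flight check suggested here, the sketch below scans nsswitch.conf-style text for databases that list `ldap` as a source, which is the configuration reported at the top of this bug. It is a heuristic under stated assumptions: the function name is hypothetical, and an `ldap` entry does not by itself prove a crash will occur (several commenters report that a running nscd avoids it).

```python
def databases_using_ldap(nsswitch_text, databases=("passwd", "group", "shadow")):
    """Return the databases whose source list includes the 'ldap' service.

    nsswitch_text is the content of an nsswitch.conf-style file.
    """
    found = []
    for raw_line in nsswitch_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        name, sources = line.split(":", 1)
        name = name.strip()
        # Sources may also contain action items such as [NOTFOUND=return];
        # only bare service names are compared here.
        if name in databases and "ldap" in sources.split():
            found.append(name)
    return found
```

An application could call this on the contents of /etc/nsswitch.conf at startup and print a hint pointing at this bug report when the result is non-empty, rather than dying with a bare SIGSEGV.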

(In reply to Martin Baute from comment #90)
> Chiming in with the info that I first encountered this bug in Mint 13
> (Ubuntu Precise), and it still applies in Mint 17 (Ubuntu Trusty). And while
> I can understand all the issues involved with deciding the "right way to
> go", I am somewhat miffed to find that a decade-old bug still expresses
> itself as a SIGSEGV. Expecting the user to strace / google / eventually find
> this bug entry if he's lucky? Is it really that difficult to check for the
> condition and at least give a meaningful message (perhaps including a
> workaround recommendation) before exiting gracefully?

It is a constant of Electronic Data Processing that no program is bug-free before it is obsolete. Even once a bug is identified, fixing it is not always easy. Complaining that "after so many years, no fix has been found" doesn't push the bug any nearer to being fixed, while it adds to the lot of useless rubbish (please excuse my language) that developers must wade through in order to find what the problem really is.

Another constant of EDP is that there are never enough coding hands to do all that needs doing, even when, as at Mozilla, a lot of volunteers selflessly donate part of their time to help the people whose paid job it is to try and fix these bugs. Any help is always welcome, and the code is anyone's to look into.

Do you know how to fix the bug? Good! Write a patch, ASSIGN the bug to yourself, find an appropriate reviewer by browsing https://wiki.mozilla.org/Modules and off you go. Once you get a positive review, set the checkin-needed flag, and someone will push your patch into the permanent source.

You mean you don't know how to fix the bug? Ah, too bad. Neither do I. So let us wait patiently, even years if that's what it takes, until someone comes around who does, and in the meantime let's have a look at the "rules of the house", https://bugzilla.mozilla.org/page.cgi?id=etiquette.html

(In reply to Tony Mechelynck [:tonymec] from comment #91)
> ...lots of the usual deleted...

So your answer to a bug that's been confirmed, and after nine years still expresses itself as SIGSEGV, is basically, "go fix it yourself"?

You think *that* is a useful contribution to this bug report?

Sometimes I'm really ashamed of my peers in the trade. And no, I won't wade through Thunderbird sources, because I've got other projects. I am a Thunderbird *user*, not a *maintainer*, so...

...go fix it yourself.

Confirming this bug for 31.1.1 Linux (Xubuntu 14.04): User accounts through ldap authentication make Thunderbird crash when trying to print. Installing nscd makes that go away.

Yo (yleduc) on 2015-04-10
Changed in thunderbird (Ubuntu):
status: Triaged → Fix Released
Steve Kowalik (stevenk) on 2015-04-14
Changed in thunderbird (Ubuntu):
status: Fix Released → Triaged
psnizek (psnizek-i) wrote :

Bug #507089
"My (jcollins) home directory is sshfs mounted to a remote server on my network using pam_mount. This is what happens when I try to run thunderbird:
jcollins@joewks:~$ thunderbird
Segmentation fault
jcollins@joewks:~$

When I log in as a user (jcollins-local) without a sshfs mounted home directory, thunderbird runs fine."

>> I ran into the exact same symptoms as described by jcollins today after replacing the mainboard in my PC. Before that, Thunderbird was working reliably and had been in use every day since 2013 on this machine. When I log in to my network, where /home is on an NFS share, Thunderbird quits with "segmentation fault" during program start. When started locally (logged in as a local user), Thunderbird starts successfully. De- and re-installation didn't help. NSCD is on the newest version.
We use LDAP for network user authentication (not for email).

psnizek (psnizek) wrote :

Update on #145:

I just noticed that the nscd daemon is crashing during login. After restarting the service, firefox starts and functions again. I don't know the cause of nscd crashing, but I believe it is off topic here. I don't mind this post and #145 being removed by the admin.

