thunderbird shredder always segfaults on startup with LDAP auth in nsswitch

Bug #507089 reported by Bruce Edge on 2010-01-13
This bug affects 53 people
Affects               Status      Importance  Assigned to  Milestone
Mozilla Thunderbird   Confirmed   Critical
SeaMonkey             New         Undecided   Unassigned
thunderbird (Ubuntu)              High        Unassigned
Nominated for Lucid by Christian

Bug Description

Binary package hint: thunderbird

If nsswitch.conf is set with:

passwd: compat ldap
group: compat ldap
shadow: compat ldap

My ldap config does work as I'm using it for login authentication.

thunderbird-3.0 always segfaults:

0 %> thunderbird-3.0
Segmentation fault
0 %> thunderbird-3.0 --g-fatal-warnings
Segmentation fault
139 %> thunderbird-3.0 -options
Segmentation fault
139 %> thunderbird-3.0 -safe-mode
Segmentation fault
139 %> thunderbird-3.0 -ProfileManager
Segmentation fault
139 %>

...even with no .thunderbird-3.0 dir:

 %> thunderbird-3.0 -ProfileManager
*INFO* No /users/bedge/.thunderbird-3.0 detected. Create it from /users/bedge/.mozilla-thunderbird
Segmentation fault
0 9:12:52 bedge@ice ~
139 %>

Changing nsswitch back to NIS works:

passwd: compat nis
group: compat nis
shadow: compat nis

1 %> dpkg -l thunderbird\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Cfg-files/Unpacked/Failed-cfg/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Description
+++-===========================-===========================-======================================================================
ii thunderbird 2.0.0.23+build1+nobinonly-0 mail/news client with RSS and integrated spam filter support
ii thunderbird-3.0 3.0~rc3~micahg+nobinonly-0u mail/news client with RSS and integrated spam filter support
ii thunderbird-3.0-gnome-suppo 3.0~rc3~micahg+nobinonly-0u Support for Gnome in Mozilla Thunderbird 3.0
ii thunderbird-dispmua 1.6.4.3-1ubuntu1 Display Mail User Agent extension (transitional package)
ii thunderbird-gnome-support 2.0.0.23+build1+nobinonly-0 Support for Gnome in Mozilla Thunderbird
ii thunderbird-locale-en-gb 1:2.0.0.14+1-0ubuntu2 Thunderbird English language/region package
ii thunderbird-nostalgy 0.2.16+svn151-1ubuntu1 keyboard shortcut extension for thunderbird
ii thunderbird-quickfile 0.17.0.0011-0ubuntu4 faster mail filing for the Thunderbird mail client
ii thunderbird-traybiff 1.2.3-4.2ubuntu2 traybiff - new mail alert for thunderbird

WORKAROUND: Installing the nscd package solves this issue in most cases.

Here's the output from one of the segfaults:
jmarco[~] $ /usr/thunderbird/thunderbird
/usr/thunderbird/run-mozilla.sh: line 451: 2193 Segmentation fault "$prog"
${1+"$@"}

Created an attachment (id=181988)
Attached an 'strace -f' of the failure.

I did an 'strace -f /usr/thunderbird/thunderbird' and attached the results.

Created an attachment (id=181996)
Output from a simple gdb session on the core dump

Enabled coredumps and did a gdb on the resulting corefile.
I could do more if you'd like.

I've also submitted this issue as bug#203 for nss_ldap on the padl.com bugzilla
system.

The same problem happens to me. This time, running "nscd" does not make the problem go away.

Created an attachment (id=210379)
New stack trace that points to: libldap50 conflict.

Here's the related bug for PADL nss_ldap:
http://bugzilla.padl.com/show_bug.cgi?id=203
I did more investigation on the problem at their request and
found that there seems to be a conflict between the libldap50.so
included with the binary version of Mozilla, and whatever the
default libldapxxx.so that is installed on the user's distribution.
This occurs with nss_ldap because it causes libc to drag in the
default ldap library via NSS for anything that does user name
translation. This seems to react poorly with the Mozilla libldap50
library. Included is the text snippet from my most recent comment
on the PADL bug, and the stack trace from that bug.

From PADL Bug:
Problem is not in nss_ldap. It's a Thunderbird bug

I brought up thunderbird under gdb on my desktop with LDAP user env and not
nscd. Thunderbird died a messy death in strtok() with corrupted stack, so no
trace. No problem. Turns out this is luckily the first call to strtok(), so
was able to 'break strtok' and get a trace.

It turns out that thunderbird has its own version of libldap.so called
libldap50.so with a version of ldap_str2chararray that conflicts with that in
/usr/lib/libldap-2.2.so.7. The version in Thunderbird's libldap50 is called
unexpectedly and it looks like this is causing libldap-2.2 to poop its pants.

As an experiment, I moved /usr/thunderbird/libldap50.so aside and symlinked it
to the linux /usr/lib/libldap-2.2.so.7, and sure enough Thunderbird worked
perfectly.

Therefore, this is a problem with the binary release of Thunderbird not
handling conflicts in system LDAP libraries. Nothing wrong with nss_ldap from
the looks of it.

If you read my previous comment, you'll see that a possible workaround for this bug right now is to go into /usr/thunderbird and:
    mv libldap50.so moved-libldap50.so
    ln -s /usr/lib/libldap-2.2.so.7 libldap50.so
Of course, change the locations of Thunderbird and your currently
installed /usr/lib/libldapXXX as required for your distro/version.
Then, restart Thunderbird.
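
For anyone who wants the swap above to be easy to undo, here is a minimal sketch in shell. The function names are hypothetical (they are not part of Thunderbird or any package), and the directory and library paths are whatever applies to your install; the ones in the comments are just the examples from this bug.

```shell
# Hypothetical helper: replace Thunderbird's bundled LDAP library with a
# symlink to the system one, keeping the original so it can be restored.
#   $1 = Thunderbird install dir (e.g. /usr/thunderbird)
#   $2 = bundled library name    (e.g. libldap50.so)
#   $3 = system OpenLDAP library (e.g. /usr/lib/libldap-2.2.so.7)
swap_bundled_ldap() {
    tb_dir=$1
    bundled=$2
    system_lib=$3
    [ -e "$system_lib" ] || { echo "missing: $system_lib" >&2; return 1; }
    # Refuse to run twice so the saved copy is never overwritten.
    if [ -e "$tb_dir/moved-$bundled" ]; then
        echo "already swapped" >&2
        return 1
    fi
    [ -e "$tb_dir/$bundled" ] || { echo "missing: $tb_dir/$bundled" >&2; return 1; }
    mv "$tb_dir/$bundled" "$tb_dir/moved-$bundled" &&
        ln -s "$system_lib" "$tb_dir/$bundled"
}

# Undo the swap by putting the saved library back in place.
restore_bundled_ldap() {
    tb_dir=$1
    bundled=$2
    [ -e "$tb_dir/moved-$bundled" ] || { echo "nothing to restore" >&2; return 1; }
    rm -f "$tb_dir/$bundled" &&
        mv "$tb_dir/moved-$bundled" "$tb_dir/$bundled"
}
```

Typical use would be `swap_bundled_ldap /usr/thunderbird libldap50.so /usr/lib/libldap-2.2.so.7` as root, and `restore_bundled_ldap /usr/thunderbird libldap50.so` to go back. Thunderbird's own LDAP features may stop working while the symlink is in place.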

the other implementations of this function do a dupe before using strtok. reporter, if someone here posted a patch, could you test it? i have absolutely no interest in setting up your configuration, something which has no use for me, but i might be willing to post patches for you to test and provide feedback. note: i'm not a mozilla ldap dev, i'm just someone who flags crash bugs.

Changing one function in the Mozilla libldap will probably not solve the entire problem here. Why not? Because there are undoubtedly dozens of small differences in behavior between the OpenLDAP libldap and the Mozilla libldap. I am not yet sure how to solve this problem in a way that is bulletproof. I added Rich Megginson to the CC in case he has any ideas/experience in dealing with this kind of conflict on Linux.

Yes, there are many large and small and incompatible differences between the OpenLDAP API and the Mozilla API. We had the same problem with newer binary versions of Apache on linux because they are linked directly with OpenLDAP, and we have some modules that depend on the Mozilla API. We solved that problem by using LD_PRELOAD to make sure the Mozilla API is loaded first. However, in this case, you may need to do the reverse and do a LD_PRELOAD to make sure the OpenLDAP API is loaded first. While that might solve the first problem, it will probably break other LDAP features of thunderbird like type down addressing, etc. So I'm not really sure how you can force PAM/NSS to use exclusively OpenLDAP calls while forcing the rest of Thunderbird to use exclusively Mozilla calls.

What we really need is a unified API between OpenLDAP and Mozilla. There are several impediments to this happening:
1) OpenLDAP uses OpenSSL for crypto, while Mozilla uses NSS. My preference would be to have the ability for OpenLDAP to use NSS for crypto, especially if running in a Mozilla client app.
2) Each API has extensions lacking in the other.
3) The command line tools are incompatible.
4) No one in either of the communities has either the time or the inclination to do the work.
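
The LD_PRELOAD experiment mentioned above could be tried with a small wrapper like the following. This is only a sketch: `run_with_preload` is a hypothetical name, the library path is whatever OpenLDAP library your distribution installs, and, as noted above, preloading OpenLDAP will probably break Thunderbird's own LDAP features.

```shell
# Hypothetical wrapper: run a command with one shared library preloaded.
#   $1 = library to preload; remaining arguments = command to run
run_with_preload() {
    lib=$1
    shift
    # Fail early instead of letting the dynamic loader warn and carry on.
    [ -r "$lib" ] || { echo "cannot read $lib" >&2; return 1; }
    # Setting the variable on the command line limits it to this one process.
    LD_PRELOAD="$lib" "$@"
}
```

For example: `run_with_preload /usr/lib/libldap-2.4.so.2 thunderbird`. Nothing is changed on disk, so this is a cheap way to test whether load order matters on a given system.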

I would be willing to test an updated libldap50 library if supplied as a binary, but I don't have the spare time to build from source.

It's been a while since I've looked at this kind of stuff. From the glibc source code, it appears that the NSS code opens its database modules using dlopen(libnames[x], RTLD_LAZY). The problem is that Thunderbird is compile-time linked with libldap50.so, and so brings in its own version of any number of identically named but incompatible functions. By the time NSS does its dlopen() it's too late. Some of its internal function calls are going to resolve to already-bound functions from libldap50 and blow up.
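
To see how many identically named functions the two libraries actually export, something like the following can be used. It is a rough sketch that assumes GNU `nm` from binutils is installed; the two function names are made up for this example.

```shell
# List the dynamic symbols a shared object defines, one name per line.
# Any "@VERSION" suffix printed by nm is stripped so names compare cleanly.
defined_symbols() {
    nm -D --defined-only "$1" | awk '{ print $NF }' | sed 's/@.*//' | sort -u
}

# Print the names that appear in both of two sorted symbol lists.
#   $1, $2 = files produced by defined_symbols
common_symbols() {
    comm -12 "$1" "$2"
}
```

For example, run `defined_symbols /usr/thunderbird/libldap50.so > moz.txt` and `defined_symbols /usr/lib/libldap-2.2.so.7 > sys.txt`, then `common_symbols moz.txt sys.txt`. Every name printed is a candidate for the wrong-library binding described above.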

One way to work around this issue would be to implement a thin LDAP glue library that only contains functions called by Thunderbird. The glue library would internally dlopen("libldap50.so", RTLD_LAZY|RTLD_LOCAL) so as to not globally export loaded symbols for binding by other libraries. The glue versions of the API calls would dlsym() for the real versions and pass through.

My workaround of replacing libldap50.so with OpenLDAP "works" for me, since I don't use any of the LDAP related stuff in Thunderbird. It just keeps getpwuid() type lookups from blowing up. I'd not be surprised to find that some of the LDAP related functionality is actually broken.

Confirmed bug on my setup - Changing Shared lib to OpenLDAP does resolve issue with startup, but does kill addressbook ldap usage.

*** Bug 333571 has been marked as a duplicate of this bug. ***

The same problem exists for thunderbird 2.
The workaround to create a symlink to the local libldap-2.2.so.7 still fixes the issue.

*** Bug 348506 has been marked as a duplicate of this bug. ***

(In reply to comment #10)
> What we really need is a unified API between OpenLDAP and Mozilla.

Yes. More to the point, we need a *good* LDAP API. Interested developers are invited to add comments here
http://scratchpad.wikia.com/wiki/LDAP_C_API

> There are
> several impediments to this happening:
> 1) OpenLDAP uses OpenSSL for crypto, while Mozilla uses NSS. My preference
> would be to have the ability for OpenLDAP to use NSS for crypto, especially if
> running in a Mozilla client app.

That probably makes sense from a Mozilla perspective, but I'm not sure it's worth the overhead of carrying NSPR around everywhere. Also some interesting commentary here:

http://markmail.org/message/z3sf37vnryypdko4#query:openssl%20vs%20nss+page:2+mid:xvw5nybqrhkw6w7n+state:results

> 2) Each API has extensions lacking in the other.

Not relevant, since Mozilla's use of LDAP is quite plain-jane.

> 3) The command line tools are incompatible.

I don't see how associated tools are relevant to the Thunderbird/Mozilla apps.

> 4) No one in either of the communities has either the time or the inclination
> to do the work.

Well, out of boredom, I spent 2 hours this afternoon patching my Mozilla build tree to use OpenLDAP. I think the difficulties have been overstated, because it's working fine on my OpenSUSE laptop.

Note that I haven't looked at the necessary autoconf changes, just edited my build tree after configure was already run. As such, edit config/autoconf.mk:

#LDAP_CFLAGS = -I${DIST}/public/ldap
#LDAP_LIBS = -L${DIST}/bin -L${DIST}/lib -lldap60 -lprldap60 -lldif60
LDAP_CFLAGS = -I/usr/local/include -DLDAP_DEPRECATED
LDAP_LIBS= -L/usr/local/lib -lldap_r -llber

and use the attached patch. A more thorough adaptation would go through and eliminate the use of LDAPv2/deprecated APIs but this was quick and dirty...

Created an attachment (id=333135)
Quick'n'dirty patch

Works with all ldap URLs that OpenLDAP supports (cldap, ldap, ldapi, ldaps); someone should add an option for choosing StartTLS...

Oh, you also need to turn off the MOZ_PSM stuff in directory/xpcom/base/src/Makefile:

#ifdef MOZ_PSM
#DEFINES += -DMOZ_PSM
#CPPSRCS += \
# nsLDAPSecurityGlue.cpp \
# $(NULL)
#endif

This leaves you with a Mozilla build that uses OpenLDAP's SSL support, whatever it may be linked to (OpenSSL or GnuTLS, currently). It's worth noting that OpenSSL is already loaded in the process under Linux, due to various other system libraries included in the build, so this isn't really making any situation worse. Since OpenSSL has been a standard system library on Linux for so long and pretty much everything uses it, it would make more sense to replace NSS with OpenSSL here.

It should be noted that NSS is being considered for inclusion in the LSB,
and OpenSSL is not, due in part to commitment to ABI compatibility in NSS.

Created an attachment (id=333197)
Cleaned up patch

This patch is properly ifdef'd so it won't break the existing MozLDAP functionality...

Created an attachment (id=333905)
OpenLDAP+PSM support

This patch also supports PSM with OpenLDAP, using new callback hooks that were just added to OpenLDAP's CVS HEAD. (Those hooks probably will be released in OpenLDAP 2.4.12; 2.4.11 is current.)

The PSM support just mimics the existing MozLDAP behavior. It's worth noting that the existing behavior will typically break when chasing referrals: The hostname that's passed in persists until the LDAP* handle is closed and is used for all Connection attempts. If a referral is received which points to ldaps:// on a different host, the hostname will not match and the connection should fail. If the referral points to the same host (as is common on MSAD) then it will probably succeed.

To fix this problem the Connect callback should record a bit more info, to answer two questions:
  1) whether it successfully connected once before - that will allow distinguishing referral chasing from the first successful connection.
  2) whether the IP address of the current connection attempt matches the previous successful attempt - that will distinguish referrals to the same host from referrals to a different host.

Then when it's determined that this connect attempt is chasing a secure referral on a different server, it can just use the name provided in the callback argument list.

This whole referral issue probably belongs in a separate bug report, but I'm commenting here because the details only surfaced while investigating this report.

Another obvious problem with the current PSM support: if the initial connection is plaintext but a referral to an ldaps:// URL is received and chased, the subsequent connection will not have the PSM layer installed. The fix for this is to always install the callback, and just have it pass-thru without pushing the PSM layer if the current connection didn't request ldaps://.

Created an attachment (id=334053)
Fix referral issues

Also noticed, in the current code there's a potential memory leak in nsLDAPSSLInstall if prldap_set_sessioninfo fails; it will leak the dup'd hostname because it calls the wrong free function before returning.

(nsLDAPSecurityGlue.cpp:369 should be calling nsLDAPSSLFreeSessionClosure()...)

The socketClosure stuff doesn't seem to accomplish anything. It should probably be ripped out; there's no special handling needed for closure of individual sockets. It's only needed for closing the session handle.

The attached patch fixes these two issues in the existing code. It also fixes the referral issues I mentioned before, for both MozLDAP and OpenLDAP.

(Mark said "be my guest" ...)

(In reply to comment #15)
> What do Mozilla LDAP people think about using the same approach as is done for
> cairo:
> http://lxr.mozilla.org/seamonkey/source/gfx/cairo/cairo/src/filterpublic.awk
> http://lxr.mozilla.org/seamonkey/source/gfx/cairo/cairo/src/cairo-rename.h
>
It seems this would make the app more dependent on having these specific libraries bundled with the app. It would be nice to be able to use the library already present on a system, instead.

An alternative approach, along similar lines, would be to avoid direct references to these library functions in any particular code. Instead, use dlopen (or its analogue) to find any suitable version of the desired library, and use dlsym to build up a table of function pointers for all of the needed entry points. Then wrap macros around all of the invocations in the main source, to always invoke these functions through your table of pointers.

On a separate note, in my current patches I left nsLDAPService::CreateFilter unimplemented because a quick grep thru the source tree didn't turn up anyone using this function. But now I see that the AddressBook actually does try to use it for autocomplete, so I guess we'll have to provide an OpenLDAP version of ldap_create_filter() before this patch can be considered complete.

NSPR provides an analogue of dlopen that works on all Mozilla/Firefox/TBird
platforms and is present in every FF browser and TB mail client (SM too).
See documentation here
http://mxr.mozilla.org/nspr/source/nsprpub/pr/include/prlink.h#94
http://mxr.mozilla.org/nspr/source/nsprpub/pr/include/prlink.h#181

Another approach that sometimes works is to link these libraries with -Bsymbolic, to restrict them to resolving their symbol references to within their own shared objects. Unfortunately, it also requires whoever built the conflicting library to use the same option. I.e., it's not sufficient to link Mozilla's libldap with this flag; the platform's libldap must be linked this way as well. (The symbol conflict confusion is bi-directional; only linking one of the conflicting libraries only eliminates the conflict in one direction.) It also doesn't help when the shared library has other external dependencies (e.g. OpenLDAP's libldap depends on liblber).

Had to mention this because the dlopen approach is still vulnerable to the problem of the dlopen'd libldap referencing the wrong liblber if another one was implicitly loaded into the process by some other library dependency.

Created an attachment (id=334117)
Add ldap_create_filter

I note that in Mozilla's libldap/getfilter.c, which provides ldap_create_filter(), the header comment says "getfilter.c -- optional add-on to libldap". It's not a part of the libldap API spec, and it's totally self-contained - it has no dependencies on anything else in libldap. IMO it doesn't really belong in there, someone just tossed it in there for lack of a more obvious place. So for this patch, I've copied the necessary bits out of getfilter.c and pasted them in here where they're actually used.

Just for your information:
- The bug still exists in OpenSuse 11.0 x86_64
   kernel 2.6.25.18-0.2-default
   MozillaThunderbird-2.0.0.17-3.1
   nscd-2.8-14.1

   nscd crashes as soon as thunderbird is launched

# ps -ef |grep nscd
root 4905 1 0 09:56 ? 00:00:00 /usr/sbin/nscd
root 4915 4844 0 09:56 pts/2 00:00:00 grep nscd
# logout
begou@thor: thunderbird
Registering Enigmail account manager extension.
Enigmail account manager extension registered.
/usr/bin/thunderbird: line 134: 4918 Erreur de segmentation $MOZ_PROGRAM $@
begou@thor: ps -ef |grep nscd
begou 4927 4818 0 09:57 pts/2 00:00:00 grep nscd

Using /usr/lib64/libldap-2.4.so.2 instead of /usr/lib64/thunderbird/libldap50.so seems to provide a good work-around.

Given that we have a patch, maybe this should block Thunderbird 3. It would be really nice to have an idea of how prevalent this is...

Though, despite having a patch, it seems like there's still some discussion to be had on whether it uses the optimal approach, or if one of the other approaches suggested here would make more sense.

Whatever the effect of this specific patch is, I'd like to voice my opinion that, unless Thunderbird gains useful LDAP support for reading and writing address books, there is no way to place Thunderbird onto the corporate desktop, although there are other limiting factors around as well.

*** Bug 470451 has been marked as a duplicate of this bug. ***

Removing the flag that I mistakenly set: this isn't part of Gecko, so it can't block a Gecko release. I'd love to get this for Thunderbird 3, but it feels like there's still a non-trivial amount of work to do here. Not adding [tb3needs], because if this were the last bug standing, I don't think we would hold the release for it. Sorry I haven't been able to get back to this yet, Howard. :-(

I was just bitten by it on Tbird 3 (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1pre) Gecko/20090607 Shredder/3.0b3pre). Oddly, it hangs early in startup when connecting over remote X (ssh-tunnel) but not when invoked locally.

*** Bug 496861 has been marked as a duplicate of this bug. ***

*** Bug 431145 has been marked as a duplicate of this bug. ***

Given that we have a patch, should really try to drive this in for tb3.
I'm not sure it's a problem that the code is in m-c, if it's all NPOTB for firefox anyway.

NPOTB? Let me guess: Not part of the browser ?

Not part of the build.

(From update of attachment 334117)
is this patch still wanted/needed?

Unless something has changed, I suspect it's still necessary. Comment 37 still applies, though.

I heard there are distributions which patched glibc's name service switch components to avoid these crashes.
One comment about that is
https://bugzilla.novell.com/show_bug.cgi?id=503151#c5

I don't know more details though.

(In reply to comment #44)
> (From update of attachment 334117 [details])
> is this patch still wanted/needed?

Independent of the OpenLDAP functionality, the bugs / memory leaks in the current code are still issues.

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.5) Gecko/20091109 Ubuntu/9.10 (karmic) Firefox/3.5.5
Build Identifier: Thunderbird 3 RC2 (http://download.mozilla.org/?product=thunderbird-3.0rc2&os=linux&lang=en-US)

Thunderbird doesn't start at all (it seems to hang). When running under gdb (thunderbird -g), I get a SIGSEGV and the following stack trace:

Program received signal SIGSEGV, Segmentation fault.
0x08090f3c in free ()
(gdb) bt
#0 0x08090f3c in free ()
#1 0x004872b8 in ldap_x_free () from ./bin/thunderbird/libldap60.so
#2 0x0047edc4 in ldap_set_lderrno () from ./bin/thunderbird/libldap60.so
#3 0x00493e5e in ldap_set_option () from ./bin/thunderbird/libldap60.so
#4 0x041d20f5 in ?? () from /lib/libnss_ldap.so.2
#5 0x041d2d70 in ?? () from /lib/libnss_ldap.so.2
#6 0x041d30fa in ?? () from /lib/libnss_ldap.so.2
#7 0x041d38d0 in _nss_ldap_getpwnam_r () from /lib/libnss_ldap.so.2
#8 0x01167a95 in getpwnam_r () from /lib/tls/i686/cmov/libc.so.6
#9 0x00a10802 in ?? () from /lib/libglib-2.0.so.0
#10 0x00a12805 in g_get_home_dir () from /lib/libglib-2.0.so.0
#11 0x00e984d8 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#12 0x00e9abfb in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#13 0x00e48d65 in ?? () from /usr/lib/libgtk-x11-2.0.so.0
#14 0x009eac57 in g_option_context_parse () from /lib/libglib-2.0.so.0
#15 0x00e4895c in gtk_parse_args () from /usr/lib/libgtk-x11-2.0.so.0
#16 0x0807c1e6 in ?? ()
#17 0x08079f06 in ?? ()
#18 0x010e8b56 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#19 0x08079dd1 in ?? ()

Reproducible: Always

Steps to Reproduce:
1. configure libnss to use ldap backend to get current user information (user should not be in local /etc/passwd file)
2. run thunderbird
Actual Results:
Thunderbird hangs due to a crash in libldap60.so

This bug shows with ubuntu 9.04 and 9.10, not showing with CentOS 5.3

It looks like libldap60.so provided with thunderbird overrides functionality from distribution's ldap library and fails at performing operation requested by libnss.
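
Whether a machine is exposed to this depends on ldap being listed as a source in nsswitch.conf. A small sketch for checking that follows; the function names are hypothetical, and actions such as `[NOTFOUND=return]` are printed as ordinary tokens.

```shell
# Print the sources configured for one database in an nsswitch.conf file.
#   $1 = database name (passwd, group, shadow)
#   $2 = file to read (defaults to /etc/nsswitch.conf)
nss_sources() {
    awk -v db="$1" '
        { sub(/#.*/, "") }    # drop comments before splitting into fields
        $1 == db ":" { for (i = 2; i <= NF; i++) print $i }
    ' "${2:-/etc/nsswitch.conf}"
}

# Succeed if the given database lists "ldap" among its sources.
nss_uses_ldap() {
    nss_sources "$1" "$2" | grep -qx ldap
}
```

For example, `nss_uses_ldap passwd && echo "passwd lookups go through LDAP"`.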

this is basically an incompatibility between libnss_ldap.so and libldap60.so, they have the same symbol (ldap_set_option), but libnss_ldap.so probably wanted to call its version, not ours.

iirc this is more a fault on their side than on our side. please review other bugs on this subject.

Are you using builds from the Ubuntu repository, Thunderbird downloaded from mozilla.org, or a self-compiled build?

Same problem here. Running 32bit Ubuntu 9.10 *without* NFS homedir but with LDAP NSS enabled (my user logs in on a network).

Thunderbird 3.0 RC2 (and beta 4 too) refuses to start. Under gdb, only a "Program received signal SIGSEGV, Segmentation fault." is visible, but the program doesn't die. Even strace doesn't give an answer.

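Workaround which I found on an OpenSuse website and which seems to work (the append has to run as root, so the output is piped through sudo tee rather than redirected):
getent passwd ldap_user_name | sudo tee -a /etc/passwd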

Thunderbird can now start.
I downloaded Thunderbird from mozilla.org website, so no package or something :)

Confirmed here using Thunderbird 3.0 released source, building on a 64-bit Kubuntu (9.10) system. Exactly the same symptom - the crash is in ldap_get_errno, when a ptr with a value of 0x2 gets passed to free().

Building same source with --disable-ldap produces a working executable - of course, with no LDAP support.

Seems to me that libldap60 should check if its functions already exist in libnss_ldap, and if so, defer to libnss_ldap since that is what the OS provides.

I have not seen this problem in any of the Thunderbird 2.x releases.

Sorry, that should have been ld_set_errno

henry: we can't do that. so's don't work that way, nor can we use the system libraries which aren't tested with our product and which don't have the same feature set as our library.

Well, I just built from the comm-central trunk (so it's 3.1a1pre), and it works - but only if I don't build a static binary. It looks at a quick inspection like ./mozilla/memory/jemalloc/jemalloc.c has been completely rewritten - it looks like a lot of the reserve memory stuff has been cut out - but if I build with --enable-static, I still get the same crash. So, I built without --enable-static, and then commented out the bit in installer/Makefile that checks for static libs, and everything is fine - Thunderbird just uses the system nss_ldap stuff instead of its own. Anyway, it works for me...


bl8n8r (bl8n8r-gmail) wrote :

Something is definitely funky with LDAP users and thunderbird (tested versions <= 2.0.0.23). I can run thunderbird as a non-LDAP user just fine, but as soon as I try it with an LDAP user it segfaults.

I've tried 3.6 all the way down to 2.0.0.23 and it's the same thing for each version.

dmesg
------------------------------------------------------------------
[597021.472581] thunderbird-bin[16356]: segfault at 912fb8bc ip 08091fe6 sp b6a32668 error 4 in thunderbird-bin[8048000+c4b000]
[597024.625942] thunderbird-bin[16366]: segfault at 90dfb8bc ip 08091fe6 sp b6932668 error 4 in thunderbird-bin[8048000+c4b000]
[597030.588103] thunderbird-bin[16375]: segfault at 917f18bc ip 08091fe6 sp b6b30668 error 4 in thunderbird-bin[8048000+c4b000]
[597167.664977] thunderbird-bin[16402]: segfault at 90dfb8bc ip 08091fe6 sp b6932668 error 4 in thunderbird-bin[8048000+c4b000]
[597171.815210] thunderbird-bin[16411]: segfault at 917f18bc ip 08091fe6 sp b6b30668 error 4 in thunderbird-bin[8048000+c4b000]

my nsswitch.conf
------------------------------------------------------------------
  # grep ldap /etc/nsswitch.conf
  passwd: files ldap
  group: files ldap
  shadow: files ldap

strace shows barfology after looking for ldap.conf
------------------------------------------------------------------
$ export LD_LIBRARY_PATH=.
$ strace ./thunderbird-bin
...
open("/home/user/ldaprc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/home/user/.ldaprc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("ldaprc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
brk(0xaa1d000) = 0xaa1d000
stat64("/etc/ldap.conf", {st_mode=S_IFREG|0644, st_size=776, ...}) = 0
geteuid32() = 9418
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

Louis-Marie, did you try comment 5?

Same problem here with Ubuntu 9.10 and 10.04. I tested daily builds from ppa:ubuntu-mozilla-daily, and the included version in 10.04.
Is there any workaround?

Christian (c-pradelli) wrote :

This is due to a known bug in Mozilla Thunderbird:

https://bugzilla.mozilla.org/show_bug.cgi?id=532601

I think that it should be fixed before final Lucid release or give an alternative to install thunderbird 2

John Vivirito (gnomefreak) wrote :

Marked as triaged since there is enough info at this point. Please see upstream bug for more info. Adding upstream bug to this one for tracking

Changed in thunderbird (Ubuntu):
status: New → Confirmed
Micah Gersten (micahg) wrote :

Could someone please check with the daily build of Thunderbird 3.0 to see if this is still occurring?
https://launchpad.net/~ubuntu-mozilla-daily/+archive/ppa/

Changed in thunderbird (Ubuntu):
status: Confirmed → Incomplete
Micah Gersten (micahg) wrote :

Note, this is the thunderbird package in that PPA.

Christian (c-pradelli) wrote :

Yes, I checked with lucid repositories package, and 3.0 and 3.1 package from https://launchpad.net/~ubuntu-mozilla-daily/+archive/ppa/ the bug exists with all of them

Micah Gersten (micahg) wrote :

Marking as Triaged -> High, Please report any other bugs you may find.

Changed in thunderbird (Ubuntu):
importance: Undecided → High
status: Incomplete → Triaged

Using Ubuntu 9.10 with TB 3.0.3, NFS mounted home directory, and ldap/PAM authentication. It seems to hang forever. On debug, it shows a segfault after reading /etc/ldap.conf.

TB 2.0.0.23 was working perfectly.
--------
Work Around: Remove Directory lookup (network address book) and softlink libldap60.so with /usr/lib/libldap-2.4.so.2
--------
TB 3 is incomplete without address network lookup working.
Is there any plan of fixing this ?

Micah, isn't the issue due to packaging?

(In reply to comment #12)

> Is this a dupe of bug 292127?
And yes they look alike.

(In reply to comment #14)
> Micah isn't the issue due to packaging ?(In reply to comment #12)

No, it's a general incompatibility between applications using mozldap while doing "system user-related stuff" (what almost every app does e.g. working with local files) while the system's base is configured to work with ldap integration based on openldap.

> > Is this a dupe of bug 292127?
> And yes they look alike.

Exactly.

People seem to be in agreement that it is a dupe of bug 292127.

*** This bug has been marked as a duplicate of bug 292127 ***

*** Bug 532601 has been marked as a duplicate of this bug. ***

*** Bug 558862 has been marked as a duplicate of this bug. ***

Here's a workaround:
Add yourself as a local user in passwd & shadow as well as ldap.

For me, running nscd as suggested in Bug #532128 gets rid of the segfault but thunderbird still won't start: Bug #561323

Changed in thunderbird:
status: Unknown → Invalid
Bruce Edge (bruce-edge) wrote :

Here's a workaround.
Add yourself as a local user in /etc/passwd & /etc/shadow and it'll work.

To confirm, run
       getent passwd | grep <your uid>

and you should see 2 entries, the local one and the LDAP one. If they are identical, then you did the right thing.

bedge:x:1077:2222:Bruce Edge:/users/bedge:/bin/zsh
bedge:x:1077:2222:Bruce Edge:/users/bedge:/bin/zsh

-Bruce
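The check above can also be scripted. The following is only a minimal sketch, assuming the text of `getent passwd` has already been captured into a string; the function name `check_passwd_entries` and its return labels are made up for illustration and are not part of any tool mentioned in this report.

```python
def check_passwd_entries(getent_output: str, username: str) -> str:
    """Classify the passwd entries for one user in `getent passwd` output.

    Hypothetical helper: returns "missing", "single-source",
    "consistent" or "mismatch".
    """
    # Keep only the lines whose first colon-separated field is the user name.
    entries = [
        line for line in getent_output.splitlines()
        if line.split(":", 1)[0] == username
    ]
    if not entries:
        return "missing"
    if len(entries) == 1:
        return "single-source"  # only one of local/LDAP returned the user
    # Both sources answered: the workaround expects identical lines.
    return "consistent" if len(set(entries)) == 1 else "mismatch"


if __name__ == "__main__":
    sample = (
        "bedge:x:1077:2222:Bruce Edge:/users/bedge:/bin/zsh\n"
        "bedge:x:1077:2222:Bruce Edge:/users/bedge:/bin/zsh\n"
    )
    print(check_passwd_entries(sample, "bedge"))
```

Matching on the whole first field instead of using a plain grep avoids false hits on other users whose name, uid or GECOS field happens to contain the same text.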

Bruce Edge (bruce-edge) wrote :

Just curious, how can this be Importance: High and unassigned with an LTS release just around the corner?

LDAP appears to be a second class citizen to Canonical. See bug 442498 ( https://bugs.launchpad.net/ubuntu/+source/openldap/+bug/442498 )

zoolook (nbensa) wrote :

surprised?

https://bugs.launchpad.net/ubuntu/+source/phpldapadmin/+bug/551269

of course the part of the virtual machine is a joke; i installed phpldapadmin 1.2.0.5 in /usr/local until i find the time to migrate to a different distro

@Bruce Edge:
it is sufficient to have the users in passwd - adding them to shadow is not needed.

I'd like to say that as of now I can get Thunderbird to start successfully when I have the name service cache daemon (NSCD) running (properly configured).

Can anyone else confirm this?

Fabián Rodríguez (magicfab) wrote :

Upstream bug was a duplicate of upstream#292127 - I updated to reflect this.

Changed in thunderbird:
status: Invalid → Unknown
Changed in thunderbird:
status: Unknown → In Progress
urusha (urusha) wrote :

Confirmed, it works with nscd enabled.

rabbit83 (mail-to-me) wrote :

Works with nscd enabled here, too.

Even with the workarounds (adding a local passwd entry and running nscd), only TB itself runs; Lightning does not. I can run TB+Lightning as a local-only user, but not as an LDAP user, on the same machine with the same pam_ldap config.

Forget my comments. My ldap user has its homedir mounted on a NFS volume mounted with 'noexec' flag. Removing this makes it work.

I can also confirm that installing nscd fixes the problem.

jesse (jesse) wrote :

Just confirming this is indeed still a problem with an up to date fresh lucid install, and nscd does seem to fix it.

Aksel Filipović (aksel-ads) wrote :

I've got the same problem today (I also have an LDAP configuration) and couldn't find a proper fix, but I found a workaround that worked very well.
Start thunderbird -> Close the "Create Account Wizard".
"File" -> "Work Offline".
"Create New Account" -> Write your full name and email address -> "Continue"
Now you will get a message saying: "Thunderbird failed to find setting for your email account." - that's OK
Now fill out the needed information "AND DO NOT RE-TEST-CONFIGURATION".
Click on "Manual Setup" - which then creates a configuration for your email account.
Now you can configure your account precisely as you like.
When finished, confirm the new configuration with OK and take your thunderbird online.

That's it.

In this case you tell your client not to be smart and try to configure itself while online. You do your configuration offline and force thunderbird to try your configuration when online.

This problem obviously exists only if you let thunderbird discover the settings. I hope this will be resolved soon.
Btw. if you make a mistake before clicking the "Manual Setup" button, you can delete the misconfigured account by choosing "Account Action" -> "Remove Account".

- Aksel

bjorn (bl-ubuntu) wrote :

Running nscd did not fix it for me. I also had to remove the libnss-ldap package (in addition to putting a local entry in /etc/passwd and /etc/shadow, as mentioned earlier). With the local entries, I don't have to run nscd. But now I can't login as other users unless I put in local entries for those, too.

bjorn (bl-ubuntu) wrote :

Update to the previous entry (#89):

Using the package libnss-ldapd instead of libnss-ldap fixed the problem.
So I no longer need local entries in /etc/passwd and /etc/shadow.

Changed in thunderbird:
importance: Unknown → Critical
Ro (robert-markula) wrote :

Bug is still present in Lucid with all the latest updates installed.

Although the workaround by installing libnss-ldapd or nscd seems to work - according to above comments -, this is not an acceptable solution as it interferes too deeply with the system (changing fundamental system internals just to get a specific client program working).

This problem first appeared with Lucid; in Karmic Thunderbird worked fine with exactly the same system configuration.

Ralph Janke (txwikinger) on 2010-10-07
description: updated
description: updated

Wayne Mery (vn) <email address hidden> wrote:
> I wrote:
> > unless Thunderbird gains useful LDAP support for reading and writing
> > address books, there is no way to place Thunderbird onto the corporate
> > desktop, although there are other limiting factors around as well.
> a_geek, are you in a corporate environment?

First off, please quote properly, and keep it here.

To answer the question: A large part of my work is as a consultant to corporations with typically several hundred users, and my mind set is tweaked towards the requirements of such organisations. But I have a hard time seeing TB even in the SME area, as they also at least want shared address books and calendaring throughout the company, and will not accept an out-of-band management requirement for their address books (this is a large part of what LDAP access is about).

I would prefer to discuss this out of band, because it's not related to the bug, but my interest in contacting you was to determine your level of interest, which of the ldap bugs you think are most important, and to what extent you would be able to help moving some of them forward (testing, etc)?

And, as implied by my earlier question, there could be more progress if we sought additional users/enterprises who were interested in helping sort through the issues.

Hi, I run a corporate network with approx. 90 users at 3 different sites, plus roaming users. Our only mail client is TB with Lightning, using IMAP mailboxes and SOGo as the calendar server. I am interested in any progress/enhancement in either of those two. I can help with testing.

Dennis (dennisuk) wrote :

Just wanted to add that this is still a problem in Lucid 10.04.

The workaround of moving the shared library bundled with Thunderbird (/usr/lib/thunderbird-3.1.7/libldap60.so) and linking the system wide libldap (/usr/lib/libldap-2.4.so.2) into its place fixed the problem. Though I haven't done much testing beyond that yet.

Can anyone point out problems that still exist with this workaround?

Phil M (unmobile+ubuntu) wrote :

I just reproduced the segfault at startup as an LDAP user with thunderbird version 3.1.7+build3+nobinonly-0ubuntu0.10.04.1 from lucid-update/lucid-security.

I can confirm that installing and starting nscd lets Thunderbird start, and purging it restores the segfault.

Hi, today I realized that this bug is affecting us as well. We have just decided to move from Evolution to TB. If nscd is not installed on the system, TB does not even start; it segfaults and crashes. If nscd is present, TB starts, but you can't configure accounts or access some menus such as the add-ons menu. I've tried adding users to /etc/passwd, but it still crashes.

Is there any workaround for this?

Ubuntu: 10.10
TB: 3.1.7
nscd 2.12.1-0ubuntu10.2
libldap 2.4-2
libnss-ldap 264-2ubuntu2

The last lines of the strace output are:

read(53, "#\n# LDAP Defaults\n#\n\n# See ldap."..., 4096) = 198
read(53, "", 4096) = 0
close(53) = 0
munmap(0xb7704000, 4096) = 0
geteuid32() = 25004
getuid32() = 25004
open("/home/user/ldaprc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/home/user/.ldaprc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("ldaprc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64("/etc/ldap.conf", {st_mode=S_IFREG|0644, st_size=712, ...}) = 0
geteuid32() = 25004
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
unlink("/home/user/.thunderbird/3nkktvbg.default/lock") = 0
rt_sigaction(SIGSEGV, {SIG_DFL, [], 0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [SEGV], NULL, 8) = 0
tgkill(18919, 18943, SIGSEGV) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---

You really need a stack trace for the crash to make progress, I think.

(In reply to comment #58)
> You really need a stack trace for the crash to make progress, I think.

Oh, sorry, I see Howard has a patch in progress.

Comment on attachment 334117
Add ldap_create_filter

switching review to standard8

Mike, since this might be a serious issue on Ubuntu, you might want to look into driving this patch forward, if it's still applicable.

(In reply to comment #57)
> Is there any workaround for this?

Carlos,

reportedly replacing libnss-ldap by libnss-ldapd is a workaround for this. See also the Ubuntu report on this bug,

https://bugs.launchpad.net/ubuntu/+source/thunderbird/+bug/507089

btw.
as far as I have tested - Mandriva 2010.2 is not affected anymore (thunderbird 3.1.7)

Bug in Mandriva 2010.2 (Thunderbird 3.1.7) seems fixed - maybe an internal
patch - or bug was based on a library which was replaced.

Could not reproduce in Debian Squeeze using Thunderbird 3.1.7.

This also works fine on Ubuntu 8.04 and TB 3.1.7

As this bug was not present in Ubuntu 9.10 Karmic Koala (thunderbird 2.0.0.24+build1+nobinonly-0ubuntu0.9.10.3), it must have been introduced sometime in between karmic and lucid.

I've reproduced this bug in both Ubuntu Maverick and Natty, using Thunderbird 3.1.7.

I'll dig a bit deeper, and keep you all posted.

Comment on attachment 334117
Add ldap_create_filter

I'm not convinced by this solution.

If I understand it correctly, then this is trying to make our API the same as OpenLDAP's version. So depending on the set-up of the (Linux) system, we could be using either the OpenLDAP library, or our own. We don't know what is in OpenLDAP's library, nor will we have done extensive testing with it. If we get crashes or strange results, we may not even realise that we're using OpenLDAP's library. This would make support very difficult. I think this is what Mark was saying in comment 9.

Given that we ship this library in Thunderbird, intending that Thunderbird is going to use this library, then maybe we should consider re-naming the library when we ship it within Thunderbird. This idea is from a similar approach Firefox took with SQLite in bug 513747.

So for instance, we could ship libmozldap60.so etc where we build LDAP as part of Thunderbird. Hence, changing the name should resolve the conflicts we're seeing, and ensure that Thunderbird runs with what we intended.

The LDAP c-sdk could still default to libldap60.so, and if building with the system LDAP c-sdk, then we could still use libldap60.so. If Linux distributions want to use the system LDAP for shipping Thunderbird, then I would expect them to verify/handle bugs with LDAP, especially if it isn't the LDAP c-sdk that we're shipping with Thunderbird.

Obviously we may still want to move the two sets of LDAP APIs closer together, but I'm not convinced doing it as a result of this bug is the right thing to do. For example, it really does feel like ldap_create_filter should be in the c-sdk, and therefore maybe it needs adding to OpenLDAP's version, not removing from ours.

If I've misunderstood things, then please correct me.

What is the status on this bug after 7 years?

From what I understand (correct me if I am wrong) the solution is to install either nscd or libnss-ldapd. While both of these seem to work, they are not acceptable solutions because it affects the rest of the system.
And why should thunderbird even care what the controlling backend auth module is in the first place?

(In reply to comment #70)
> What is the status on this bug after 7 years?
>
> From what I understand (correct me if I am wrong) the solution is to install
> either nscd or libnss-ldapd. While both of these seem to work, they are not
> acceptable solutions because it affects the rest of the system.
> And why should thunderbird even care what the controlling backend auth module
> is in the first place?

You're right that Thunderbird or some other app *shouldn't* ever need to care about this, but the fact is that the old nss-ldap design causes these types of problems, and libnss-ldapd corrects the design flaw.

Dan Woodard (dan-e-woodard) wrote :

I have a similar problem. I'm adding what may be more details to aid in solving. I use LDAP for authentication and I have nscd installed. Thunderbird has been working without issue until yesterday when I changed hosts in nsswitch.conf to use ldap as shown below;

Changed:
hosts: files dns [NOTFOUND=return]
to the following;
hosts: files ldap dns [NOTFOUND=return]

This change resulted in seg faults with no other warning. I was able to create an alternate user profile and that would work, but when trying to get back to the current user, the seg fault would return. Once I changed nsswitch.conf back to using dns, the app starts without issue.
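To see at a glance which name-service databases would route lookups through the LDAP NSS module, the configuration can be scanned. This is only a rough sketch, assuming the usual `database: source ... [STATUS=action]` layout of nsswitch.conf; the function name `databases_using` is invented for this example.

```python
import re
from typing import List


def databases_using(nsswitch_text: str, source: str = "ldap") -> List[str]:
    """Return the nsswitch.conf databases whose source list contains `source`."""
    found = []
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        database, _, rest = line.partition(":")
        # Bracketed items such as [NOTFOUND=return] are actions, not sources.
        rest = re.sub(r"\[[^\]]*\]", " ", rest)
        if source in rest.split():
            found.append(database.strip())
    return found


if __name__ == "__main__":
    with open("/etc/nsswitch.conf") as handle:
        print(databases_using(handle.read()))
```

On the configuration from this comment, the function would list `hosts` after the change and nothing for that line before it.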

My release;
Linux bari.iqanalog.com 2.6.32-33-generic #72-Ubuntu SMP Fri Jul 29 21:07:13 UTC 2011 x86_64 GNU/Linux
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.3 LTS"
Thunderbird 3.1.13

still a problem in 12.04 amd64 desktop and the default thunderbird provided.

airtonix (airtonix-gmail) wrote :

although it's still a problem in ubuntu 12.04 64bit desktop and the default thunderbird (not sure if any thunderbird), applying the workaround in #7 (https://bugs.launchpad.net/ubuntu/+source/thunderbird/+bug/507089/comments/7) works.

we use 10 ubuntu 12.04 64bit workstations here at work, that require ldap login from a ubuntu 10.04 server managed by zentyal.

I apply the zentyal-desktop.deb to each of the workstations.

*** Bug 756782 has been marked as a duplicate of this bug. ***

Created attachment 628378
rename libldap60.so to libmozldap60.so

While that change makes sense in general I'm wondering what it's supposed to fix? I'm pretty sure that the filename is not the issue.

Comment on attachment 628378
rename libldap60.so to libmozldap60.so

From discussions about this sort of thing previously (which admittedly were a while ago), I believe that changing the library name wouldn't actually resolve all the problems.

Additionally, I don't think it is really right to change the library name unless the developers of the Mozilla LDAP c-sdk really want to, as it would impact on all the users of it, and potentially the use of libraries on existing systems.

I think that we should really go for changing the LDAP c-sdk that we use, and possibly replacing it with OpenLDAP as Howard was intending (or something else). To this effect I've put a proposal to tb-planning about this change:

http://groups.google.com/group/tb-planning/browse_thread/thread/342164ae0db9b21a
(https://wiki.mozilla.org/Thunderbird/tb-planning)

Comment on attachment 334117
Add ldap_create_filter

I'm rescinding my previous feedback- on this. Per previous comment on this bug, discussions have moved on, and we're considering moving away from the LDAP c-sdk, so this patch may therefore be heading in the right direction. Obviously, it would need to be updated and re-tested etc, but see the tb-planning discussion first.

The problem is still present in Thunderbird 15! I get a backtrace like in https://bugzilla.mozilla.org/show_bug.cgi?id=433530:

(gdb) bt
#0 strtok_r () at ../sysdeps/x86_64/strtok.S:190
#1 0x00007ffff6ad3b3a in ldap_str2charray (str=0x7fffe3781ced "ldap://localhost/", brkstr=0x7fffe3781a4b ", ")
    at /usr/src/debug/mail-client/thunderbird-15.0.1/comm-release/ldap/sdks/c-sdk/ldap/libraries/libldap/charray.c:218
#2 0x00007fffe376c216 in ldap_url_parselist_int (ludlist=0x7fffe398be80, url=<optimized out>, sep=<optimized out>, flags=11) at url.c:1293
#3 0x00007fffe376da8b in ldap_int_initialize_global_options (gopts=0x7fffe398bdc0, dbglvl=<optimized out>) at init.c:537
#4 0x00007fffe376dc0d in ldap_int_initialize (gopts=0x7fffe398bdc0, dbglvl=<optimized out>) at init.c:653
#5 0x00007fffe3753309 in ldap_create (ldp=0x7fffffff9cb8) at open.c:108

By looking at
(gdb) info sharedlibrary
0x00007ffff6ad2040 0x00007ffff6af6558 Yes /usr/lib64/thunderbird/libldap60.so
0x00007fffe3752fd0 0x00007fffe377e0a8 Yes /usr/lib64/libldap-2.4.so.2

you can see that the openldap routine is jumping into a mozilla routine, causing a segfault by applying strtok to "ldap://localhost/", which is a built in string in the openldap lib. A solution would be nice, because currently I can't use Thunderbird at all.
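One way to see how large the overlap between the two libraries is would be to compare their exported function names. The sketch below assumes the text output of `nm -D --defined-only` for each library has been saved beforehand; the helper names and the sample addresses are illustrative only, not taken from a real system.

```python
def exported_symbols(nm_output: str) -> set:
    """Collect exported function names from `nm -D --defined-only` output."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Expected layout: <address> <type> <name>; T/W are (weak) functions.
        if len(fields) == 3 and fields[1].upper() in ("T", "W"):
            # Drop a symbol-version suffix such as name@@VERSION, if present.
            symbols.add(fields[2].split("@", 1)[0])
    return symbols


def clashing_symbols(nm_output_a: str, nm_output_b: str) -> list:
    """Names exported by both libraries, sorted for stable display."""
    return sorted(exported_symbols(nm_output_a) & exported_symbols(nm_output_b))


if __name__ == "__main__":
    # Made-up sample lines standing in for libldap60.so and libldap-2.4.so.2.
    mozldap = "0000000000012040 T ldap_str2charray\n0000000000013000 T ldap_create_filter\n"
    openldap = "0000000000038100 T ldap_str2charray\n0000000000018a81 T ldap_initialize\n"
    print(clashing_symbols(mozldap, openldap))
```

Any name in the result can end up bound to whichever library the dynamic linker finds first, which matches the backtrace above where an OpenLDAP routine lands in the Mozilla copy of `ldap_str2charray`.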

The problem is also in Thunderbird 16. It's a clash of symbols from libldap-2.4.so and libldap60.so.

(gdb) bt
#0 0x00007fffe708a100 in ldap_str2charray () from /usr/lib64/libldap-2.4.so.2
#1 0x00007fffe70816c6 in ldap_url_parselist_int () from /usr/lib64/libldap-2.4.so.2
#2 0x00007fffe7082f1b in ldap_int_initialize_global_options () from /usr/lib64/libldap-2.4.so.2
#3 0x00007fffe7083016 in ldap_int_initialize () from /usr/lib64/libldap-2.4.so.2
#4 0x00007fffe706a6ab in ldap_create () from /usr/lib64/libldap-2.4.so.2
#5 0x00007fffe706aa81 in ldap_initialize () from /usr/lib64/libldap-2.4.so.2
#6 0x00007fffe72a79c0 in do_init () from /lib64/libnss_ldap.so.2
#7 0x00007fffe72a9d1c in _nss_ldap_search_s () from /lib64/libnss_ldap.so.2
#8 0x00007fffe72ab580 in _nss_ldap_getbyname () from /lib64/libnss_ldap.so.2
#9 0x00007fffe72abd07 in _nss_ldap_getpwnam_r () from /lib64/libnss_ldap.so.2
#10 0x00007ffff70c5685 in getpwnam_r () from /lib64/libc.so.6

Removing/renaming libldap60.so caused some errors in finding the library, so this does not seem to be a solution:
  XPCOMGlueLoad error for file /usr/lib64/thunderbird/libxpcom.so:
  libxul.so: cannot open shared object file: No such file or directory
  Couldn't load XPCOM.

We brute-forced renaming the symbol via
   sed -e 's:ldap_str2charray:ldap_str2xharray:' /usr/lib64/thunderbird/libldap60.so
in order to make it work.
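For anyone repeating this hack: the replacement name has to be exactly as long as the original, otherwise every offset in the binary shifts. A minimal sketch of that same-length substitution follows, assuming the library bytes are read into memory and written to a separate copy; `rename_symbol` is a hypothetical helper, it does not touch the ELF hash tables, and it is a blunt workaround rather than a fix.

```python
def rename_symbol(blob: bytes, old: bytes, new: bytes) -> bytes:
    """Replace every occurrence of `old` with the equal-length `new`."""
    if len(old) != len(new):
        # A different length would shift all following offsets in the file.
        raise ValueError("replacement must have the same length as the original")
    return blob.replace(old, new)


if __name__ == "__main__":
    import sys

    # Usage (paths are up to the caller): script.py <input .so> <output copy>
    source_path, target_path = sys.argv[1], sys.argv[2]
    with open(source_path, "rb") as handle:
        data = handle.read()
    patched = rename_symbol(data, b"ldap_str2charray", b"ldap_str2xharray")
    with open(target_path, "wb") as handle:
        handle.write(patched)
```

Writing to a copy keeps the original library available if the patched one misbehaves. Note also that `sed` without `-i` prints its result to standard output instead of changing the file in place.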

The workaround in post #7 no longer seems to work for my Ubuntu 12.04 x86_64 system, as of thunderbird package version 11.0.1+build1-0ubuntu2. After the upgrade, thunderbird immediately jumps to a Mozilla bug reporting screen on start. I've run this command as suggested:

"cp /usr/lib/x86_64-linux-gnu/libldap-2.4.so.2 /usr/lib/thunderbird/libldap60.so"

And while this does change the issue somewhat (I no longer get a bug reporting screen now, Thunderbird just exits silently), it doesn't resolve the issue. Please advise on other things I might try, or other information I could gather that will be useful for people who could resolve this.

I've tried all the proposed workarounds without success. This is affecting all of our users in our LDAP-based network.

I'm running Ubuntu 12.04.1 x86_64 with thunderbird package version 17.0+build2-0ubuntu0.12.04.1.

Previously, the workaround was to run the "nscd" service -- however that is no longer an option due to this bug:

https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1085957

I do not see how bug #1085957 is in any way related to this bug… running nscd to cache LDAP responses is itself a crucial service and mozilla should not allow its internal LDAP library to export symbols overlapping with openldap’s ABI unless it is binary-compatible with openldap.

Bug #1085957 is clearly a different bug that is not related to #506089 in any other way than it invalidates (one of) the proposed workarounds for #506089.

s/#506089/#507089

*** Bug 708222 has been marked as a duplicate of this bug. ***

I'm getting an error in jemalloc.c on Ubuntu 12.04.1 (this error is reproducible.) Any pointers on how to work around this problem are much welcome (NSCD is not an option due to bug #507089.)

Program received signal SIGSEGV, Segmentation fault.
arena_dalloc (ptr=0x7fffead2f190, offset=<optimized out>) at /build/buildd/thunderbird-17.0.2+build1/mozilla/memory/mozjemalloc/jemalloc.c:4626
4626 /build/buildd/thunderbird-17.0.2+build1/mozilla/memory/mozjemalloc/jemalloc.c: No such file or directory.
(gdb) bt
#0 arena_dalloc (ptr=0x7fffead2f190, offset=<optimized out>) at /build/buildd/thunderbird-17.0.2+build1/mozilla/memory/mozjemalloc/jemalloc.c:4626
#1 0x00007ffff5a1eb5f in ldap_ld_free (ld=0x7ffff6cab5e0, serverctrls=0x0, clientctrls=<optimized out>, close=<optimized out>)
    at /build/buildd/thunderbird-17.0.2+build1/./ldap/sdks/c-sdk/ldap/libraries/libldap/unbind.c:158
#2 0x00007fffead2e955 in ?? () from /lib/x86_64-linux-gnu/libnss_ldap.so.2
#3 0x00007fffead2fb5b in ?? () from /lib/x86_64-linux-gnu/libnss_ldap.so.2
#4 0x00007fffead31192 in ?? () from /lib/x86_64-linux-gnu/libnss_ldap.so.2
#5 0x00007fffead32819 in ?? () from /lib/x86_64-linux-gnu/libnss_ldap.so.2
#6 0x00007fffead32e09 in _nss_ldap_getpwuid_r () from /lib/x86_64-linux-gnu/libnss_ldap.so.2
#7 0x00007ffff70bcbfd in __getpwuid_r (uid=537, resbuf=0x7ffff73b9380, buffer=0x7ffff6c09000 "+", buflen=1024, result=0x7fffffffc6e0) at ../nss/getXXbyYY_r.c:256
#8 0x00007ffff70bc4f3 in getpwuid (uid=537) at ../nss/getXXbyYY.c:117
#9 0x00007ffff1afe0ef in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#10 0x00007ffff1afe99d in g_get_home_dir () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x00007ffff027cb67 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#12 0x00007ffff0280f9f in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#13 0x00007ffff0230b2a in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#14 0x00007ffff1addfa0 in g_option_context_parse () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#15 0x00007ffff0231120 in gtk_parse_args () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#16 0x00007ffff3280fb4 in XREMain::XRE_mainStartup (this=0x7fffffffce30, aExitFlag=0x7fffffffcdff) at /build/buildd/thunderbird-17.0.2+build1/mozilla/toolkit/xre/nsAppRunner.cpp:3247
#17 0x00007ffff3283a19 in XREMain::XRE_main (this=0x7fffffffce30, argc=<optimized out>, argv=0x7fffffffe228, aAppData=0x7ffff6c26680)
    at /build/buildd/thunderbird-17.0.2+build1/mozilla/toolkit/xre/nsAppRunner.cpp:3871
#18 0x00007ffff3283c8d in XRE_main (argc=1, argv=0x7fffffffe228, aAppData=0x7ffff6c26680, aFlags=<optimized out>)
    at /build/buildd/thunderbird-17.0.2+build1/mozilla/toolkit/xre/nsAppRunner.cpp:3965
#19 0x000000000040225b in do_main (argv=0x7fffffffe228, argc=1, exePath=0x7fffffffd108 "/usr/lib/thunderbird/") at /build/buildd/thunderbird-17.0.2+build1/mail/app/nsMailApp.cpp:111
#20 main (argc=1, argv=0x7fffffffe228) at /build/buildd/thunderbird-17.0.2+build1/mail/app/nsMailApp.cpp:200

Ro (robert-markula) wrote :

If you have the possibility, try SSSD [1]. It works fine with Thunderbird and does away with all the cruft that nscd and co. brought along. One single and easy to understand config file to configure it.
It's here in production with Ubuntu 12.04.* and never had problems again - which is in sharp contrast to ns(l)cd. Packages are available for Ubuntu in standard repositories.
A small howto is at [2].

[1] https://fedorahosted.org/sssd/
[2] http://labs.opinsys.com/blog/2010/03/26/user-management-with-sssd-on-shared-laptops/

*** Bug 874029 has been marked as a duplicate of this bug. ***

The problem exists in Thunderbird 17 too. Crash reports relevant to this issue, from all versions of Thunderbird, can be found here: https://crash-stats.mozilla.com/report/list?product=Thunderbird&query_search=signature&query_type=contains&reason_type=contains&date=2013-07-26&range_value=28&range_unit=days&hang_type=any&process_type=any&signature=arena_dalloc+|+ldap_x_free+|+ldap_set_lderrno

After upgrading to Thunderbird 22, the error is reproducible too, but signature is changed from:
arena_dalloc | ldap_x_free | ldap_set_lderrno
to
arena_dalloc | ld-2.15.so@0x214e4

- is this the same error or some other problem?

Howard is no longer working on this

(In reply to Murz from comment #83)
> After upgrading to Thunderbird 22, the error is reproducible too, but
> signature is changed from:
> arena_dalloc | ldap_x_free | ldap_set_lderrno
> to
> arena_dalloc | ld-2.15.so@0x214e4
>
> - is this the same error or some other problem?

seems likely.
https://crash-stats.mozilla.com/query/?product=Thunderbird&version=ALL%3AALL&range_value=4&range_unit=weeks&date=08%2F06%2F2013+17%3A00%3A00&query_search=signature&query_type=contains&query=arena_dalloc+|+ld&reason=&release_channels=&build_id=&process_type=any&hang_type=any
 arena_dalloc | ldap_x_free | ldap_set_lderrno
arena_dalloc | ldap_ld_free | libnss_ldap-2.13.so@0x3955
arena_dalloc | ldap_set_lderrno
arena_dalloc | ld-2.15.so@0x214e4
arena_dalloc | ld-2.15.so@0xe774

Changed in thunderbird:
status: In Progress → Confirmed

The bug is present in Thunderbird 24.2.0 running on Kubuntu 12.04.4. Running nscd appears to work around the issue, but I haven't tested it thoroughly for side effects.

I find it somewhat ironic that a nearly nine year old bug of this magnitude has status: NEW.

Software versions (all from Ubuntu repos):
$ aptitude show thunderbird | grep Version
Version: 1:24.2.0+build1-0ubuntu0.12.04.1
$ aptitude show libldap-2.4-2 | grep Version
Version: 2.4.28-1.1ubuntu4.4
$ uname -a
Linux tiny 3.2.0-58-generic #88-Ubuntu SMP Tue Dec 3 17:37:58 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

(In reply to Maciej Puzio from comment #85)
> I find it somewhat ironic that a nearly nine year old bug of this magnitude
> has status: NEW.

Actually, a better label would be CONFIRMED rather than NEW. That's what NEW really means, it does not refer to the bug's age.

(In reply to Tony Mechelynck [:tonymec] from comment #86)
> Actually, a better label would be CONFIRMED rather than NEW. That's what NEW
> really means, it does not refer to the bug's age.

I am very well aware of that; my point was to draw attention to an unacceptable quality control, record-breaking in the length of bug fix cycle. Anyway, my further testing revealed several more issues with libldap, libpam-ldap and libnss-ldap, and I decided that this software as a whole does not meet my quality requirements. Instead I am deploying sssd as LDAP client for PAM and NSS, and this is my recommendation for readers of this page.

nslcd /nss-pam-ldapd would be the best choice, the code is quite mature since the basic LDAP functionality is ported from the old PADL code and well proven. It's also quite compact, it does just LDAP and nothing else. SSSD is unproven, and quite overloaded featurewise. For security/authentication software, complexity is the enemy of reliability. I shouldn't have to roll out that lecture again...

Ro (robert-markula) wrote :

nslcd/nss-pam-ldapd has its own share of problems. I would say that calling SSSD unproven is unjustified. It has existed for quite some time, is actively developed and solves many problems that are still present - partly by design - with nslcd/nss-pam-ldapd. Finally, configuration with SSSD is much easier and much less error prone than the old nslcd/nss-pam-ldapd combo.

Maciej Puzio and Howard Chu - thanks for the info, moving to ldapd or sssd solves this problem.

Chiming in with the info that I first encountered this bug in Mint 13 (Ubuntu Precise), and it still applies in Mint 17 (Ubuntu Trusty). And while I can understand all the issues involved with deciding the "right way to go", I am somewhat miffed to find that a decade-old bug still expresses itself as a SIGSEGV. Expecting the user to strace / google / eventually find this bug entry if he's lucky? Is it really that difficult to check for the condition and at least give a meaningful message (perhaps including a workaround recommendation) before exiting gracefully?

(In reply to Martin Baute from comment #90)
> Chiming in with the info that I first encountered this bug in Mint 13
> (Ubuntu Precise), and it still applies in Mint 17 (Ubuntu Trusty). And while
> I can understand all the issues involved with deciding the "right way to
> go", I am somewhat miffed to find that a decade-old bug still expresses
> itself as a SIGSEGV. Expecting the user to strace / google / eventually find
> this bug entry if he's lucky? Is it really that difficult to check for the
> condition and at least give a meaningful message (perhaps including a
> workaround recommendation) before exiting gracefully?

It is a constant of Electronic Data Processing that no program is bug-free before it is obsolete. Even once a bug is identified, fixing it is not always easy. Complaining that "after so many years, no fix has been found" doesn't push the bug any nearer to be fixed, while it adds to the lot of useless rubbish (please excuse my language) that developers must wade through in order to find what the problem really is.

Another constant of EDP is that there are never enough coding hands to do all that needs doing, even when, as at Mozilla, a lot of volunteers selflessly donate part of their time to help the people whose paid job it is to try and fix these bugs. Any help is always welcome, and the code is anyone's to look into.

Do you know how to fix the bug? Good! Write a patch, ASSIGN the bug to yourself, find an appropriate reviewer by browsing https://wiki.mozilla.org/Modules and off you go. Once you get a positive review, set the checkin-needed flag, and someone will push your patch into the permanent source.

You mean you don't know how to fix the patch? Ah, too bad. Neither do I. So let us wait patiently, even years if that's what it takes, until someone comes around who does, and in the meantime let's have a look at the "rules of the house", https://bugzilla.mozilla.org/page.cgi?id=etiquette.html

(In reply to Tony Mechelynck [:tonymec] from comment #91)
> ...lots of the usual deleted...

So your answer to a bug that's been confirmed, and after nine years still expresses itself as SIGSEGV, is basically, "go fix it yourself"?

You think *that* is a useful contribution to this bug report?

Sometimes I'm really ashamed of my peers in the trade. And no, I won't wade through Thunderbird sources, because I've got other projects. I am a Thunderbird *user*, not a *maintainer*, so...

...go fix it yourself.

Confirming this bug for 31.1.1 Linux (Xubuntu 14.04): User accounts through ldap authentication make Thunderbird crash when trying to print. Installing nscd makes that go away.

Yo (yleduc) on 2015-04-10
Changed in thunderbird (Ubuntu):
status: Triaged → Fix Released
Steve Kowalik (stevenk) on 2015-04-14
Changed in thunderbird (Ubuntu):
status: Fix Released → Triaged
psnizek (psnizek-i) wrote :

Bug #507089
"My (jcollins) home directory is sshfs mounted to a remote server on my network using pam_mount. This is what happens when I try to run thunderbird:
jcollins@joewks:~$ thunderbird
Segmentation fault
jcollins@joewks:~$

When I log in as a user (jcollins-local) without a sshfs mounted home directory, thunderbird runs fine."

>> I ran into the exact same symptoms as described by jcollins today after replacing the mainboard in my PC. Before that, Thunderbird had been working reliably and was in use every day since 2013 on this machine. When I log in to my network, where /home is on an NFS share, Thunderbird quits with "segmentation fault" during program start. When started locally (logged in as a local user), Thunderbird starts successfully. Uninstalling and reinstalling didn't help. NSCD is at the newest version.
We use LDAP for network user authentication (not for email).

psnizek (psnizek) wrote :

Update on #145:

I just noticed that the nscd daemon is crashing during login. After restarting the service firefox starts and functions again. I don't know the cause of the nscd crash, but I believe it is off topic here. I don't mind this post and #145 being removed by the admin.
