[upstream] ReferenceError: processLDAPValues is not defined
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Mozilla Thunderbird |
Fix Released
|
Medium
|
|||
thunderbird (Ubuntu) |
Fix Released
|
High
|
Olivier Tilloy | ||
Bionic |
Fix Released
|
High
|
Olivier Tilloy |
Bug Description
The latest thunderbird (1:68.2.
Error messages:
Netscape.
Error: getLDAPAttibutes failed: ReferenceError: processLDAPValues is not defined
Netscape.
Error: authconfig.js failed failed: TypeError: userInfo.mail is undefined
Ref: https:/
I have downgraded thunderbird and confirmed that this happen when upgrading to 1:68.2.1+build1
My system:
Description: Ubuntu 18.04.3 LTS
Release: 18.04
apt policy thunderbird
thunderbird:
Installed: 1:52.7.
Candidate: 1:68.2.
Version table:
1:
500 http://
500 http://
*** 1:52.7.
500 http://
100 /var/lib/
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #14 |
> You can look for "Alvarez" in the LDAP address book ...
I got that wrong, that doesn't work either. Maybe Adams don't do SSL?
EDIT: Apparently they do: https:/
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #15 |
Yes, TB 60.8 works with Adams + SSL for both address book search and auto-complete.
In Mozilla Bugzilla #1576364, Alice0775-t (alice0775-t) wrote : | #16 |
Strange thing I found.
On Daily70.0a1 20190824095533 Windows10,
A. Restart TB > Open Compose Window > Type "Alvarez" in "To:" field > (GOOD)
B. Restart TB > Open Compose Window > Type "Alvarez" in "To:" field > (GOOD) > Search "Alvarez" in Contacts Sidebar(F9) > (BAD)
C. Restart TB > Open Compose Window > Search "Alvarez" in Contacts Sidebar(F9) > (GOOD) > Type "Alvarez" in "To:" field > (BAD)
Regression window:
https:/
https:/
Unfortunately, the taskcluster builds are not available. (it expired) :(
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #17 |
Thank you so much, Alice. So a regression from May 2018 in the 62 cycle. No one noticed in over a year :-(
So at the beginning of your comment #3 you're saying that LDAP lookup works in Daily with SSL enabled? At least the first time?
When I try it in a debug build, I crash:
```
xul.dll!
xul.dll!
xul.dll!
xul.dll!
xul.dll!
xul.dll!
xul.dll!
xul.dll!
xul.dll!
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #18 |
Oh, I got it, the first search with SSL works, in the AB, in the contacts side panel or auto-complete. The second search fails.
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #19 |
Bug 1456489 added this code here: https:/
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #20 |
Yes, but 1456489 is the reason for the LDAP/SSL failures.
Most SSL servers will use a certificiate with OCSP information. OCSP querying (checking for revoked certificates) is enabled by default.
However, in bug 1456489, the maintainers of the Firefox network engine have decided that such requests MUST NOT be done on the primary thread.
I see Thunderbird use a synchronous query to the LDAP server, directly from the main thread. This causes all subsequent I/O to happen on the main thread too, which the Firefox core code refuses to execute, and tells the caller that it's a bad certificate - which causes the LDAP/SSL connection to fail.
The solution is to move LDAP querying to a background thread, and report the results to the main thread.
Call stack subset from the main thread:
#30 0x00007ffff7a7835a in nsldapi_
at /home/user/
#31 0x00007ffff7a76fe9 in nsldapi_
#32 0x00007ffff7a85003 in simple_bind_nolock (ld=0x7fffd452e000, dn=0x7fffe6fb7636 <gNullChar> "", passwd=
#33 0x00007ffff7a84625 in ldap_simple_bind (ld=0x7fffd452e000, dn=0x7fffe6fb7636 <gNullChar> "", passwd=
#34 0x00007fffea41c9ed in nsLDAPOperation
#35 0x00007fffea48f7f4 in nsAbLDAPListene
#36 0x00007fffea410466 in nsLDAPConnectio
#37 0x00007fffeafe1a4b in mozilla:
Thunderbird should have automated tests for LDAP/SSL querying, to ensure we don't regress again in the future. I've filed bug 1576901 to track that.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #21 |
Insecure workaround: Disable OCSP (preferences / advanced / certificates, disable "query OCSP".
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #22 |
*** Bug 1574248 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #23 |
*** Bug 1583775 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #1576364, Mkmelin+mozilla (mkmelin+mozilla) wrote : | #24 |
For looking up LDAP cards, I think it starts off here:
https:/
-> nsAbLDAPDirecto
-> mDirectoryQuery
-> nsAbLDAPDirecto
I suppose we should make that query use a Runnable?
E.g. like https:/
Ben, something you could look into?
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #25 |
No patch yet, but here are my thoughts so far:
I've now gone through the code all the way from `nsAbLDAPDirect
The `nsLDAPConnection` already has a thread for handling blocking ldap operations, however a lot of the logic still happens out in the main thread (in `nsLDAPOperation`) before handing off to the worker thread. Including the bind operation, which is where our problem happens (the bind immediately goes off and tries to sort out the SSL cert, which fails because we're still on the main thread).
My plan is to run more of the ldap operation code into the worker thread, which should be simple enough (although the logic is split up over a bunch of different objects which obfuscates things a bit). Hopefully have a patch to try out tomorrow.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #26 |
Quick update:
It's not that the SSL handshake must be performed off the main thread, it's that it must be done specifically on the SocketTransport
I've now got the bind operation (which ends up performing the SSL handshake) doing that, and it no longer asserts. But it now fails to connect to the server (sighs). So I'm picking my way through the various layers to figure out why that is, and how to fix it.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #27 |
The more I dig into this, the more I think I should probably be working on Bug 826662 instead...
LDAP over SSL (LDAPS) seems to be deprecated in favour of the StartTLS extension.
(but see also this [blog post](https:/
But really, we should be supporting both.
Anyway.
Getting the handshake running on the STS thread is easy enough, but I'm still trying to figure out why my connections are failing.
(debugging is tricky - there are loads of layers to step through, and on a real network connection it'll time out anyway while you're inspecting it!)
Just to help me mentally organise my findings so far (I don't expect any of this to make much sense, but I find trying to write it down helps!):
- The C++ LDAP code uses [`libprldap`](https:/
- When in SSL mode, some of those IO functions are replaced.
- The replacement SSL connect function first calls the original plaintext connect function, then uses the TLS socket provider (nsTLSSocketPro
- when an LDAP operation is performed, the first data write triggers the TLS/SSL negotiation.
One of the really confusing things was that it uses the "starttls" socketprovider to set up the encryption. But this is totally unrelated to the 'startTLS' extension in LDAPv3. Apparently some LDAP servers don't support SSLv3, and "starttls" provides a way to make sure SSLv2 is used.
From the C++ side of things:
- `nsLDAPConnection` defers setting up the LDAP stuff until the host name is resolved ([`OnLookupComp
- The c-api LDAP is set up, [still in `OnLookupComple
- If we're using SSL, then [`nsLDAPInstall
- At some point, a new `nsLDAPOperation` will be created and invoked.
- The operation will call a C API ldap function which will cause the first data to be sent, which will in turn trigger the TLS handshaking.
- the operation will be passed to [`nsLDAPConnect
I'm feeling a bit fried on this one. But I've still got lots of avenues to explore. Weekend now. Fresh look at it on Monday.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #28 |
Created attachment 9102037
LDAP_SSL_
This is my hacky little patch I've been using to shuffle the SSL negotiation off onto the STS thread.
I implemented a special nsIRunnable for the SimpleBind method, because I erroneously thought that was a special case which triggered the SSL negotiation (it's not - it's just whatever operation first runs over the connection). But it's enough to work on a proof of concept. A proper fix would generalise it (and likely clarify a lot of the logic currently smeared across nsLDAPConnection and nsLDAPOperation).
Needless to say, it still doesn't work: the connection fails.
I've got the ldap.adams.edu LDAP server configured, and I open the addressbook and start entering a name to look up.
ldap.adams.edu is the only public LDAP server I've found which listens on both plain (port 389) and SSL (port 636) LDAP ports.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #29 |
Just a few explanations regarding the use of the terms SSL and TLS in our code, as it can be confusing.
nsTLSSocketProvider is misnamed. It should have been called nsStartTLSSocke
That class does a socket with "initial plaintext, after some protocol specific handshake, agree to raise the socket into speaking SSL/TLS".
nsSSLSocketProvider is ambiguous. It should have been called nsSSLorTLSSocke
This is a socket with "speak the SSL/TLS protocol immediately, wrap the inner protocol, and the inner protocol doesn't have to know anything about it".
The versions for SSL and TLS went from:
- SSL 2
- SSL 3
- TLS 1.0
- TLS 1.1
- TLS 1.2
- TLS 1.3
So SSL and TLS are just old and newer versions of each other, and at some time some people decided to rename it, very confusing.
At the time StartTLS was introduced, the most recent available version had already arrived at TLS, which is why it wasn't called StartSSL, but StartTLS.
I'm not sure why you talked about SSL v2. That's very old, and nobody should be using it any more. Even SSL v3 shouldn't be used anymore.
If you had seen some code mentioning SSL v2, it might have been because of another detail.
If a client and a server start to speak with each other about SSL/TLS (either immediately with plain SSL/TLS, or at the time they upgrade to plaintext socket to SSL/TLS in the StartTLS scenario), they exchange an initial handshake message.
That handshake message can either be "modern" with a list of flexible extensions.
Or, that handshake can be "SSL v2 compatible". Some servers simply gave strange errors when talking to them with the initial modern handshake, so a handshake that's backwards comaptible was tried in some scenarios.
Most of what I wrote here shouldn't matter much for the scope of this bug, but might help you understand better what's happening.
However, depending on the server port you're connecting to, you get one code path or the other.
If you connect to a server port that expects to speak SSL/TLS immediately (LDAPS), with an outer SSL/TLS layer, and an inner plaintext LDAP protocol layer, then you should see that the nsSSLSocketProvider code is reached. In this scenario, I assume the SSL/TLS handshake is executed on the thread that initiates the socket connection.
If you connect to a plaintext server port, that optionally can upgrade to SSL/TLS by using StartTLS, then the SSL/TLS handshake will happen later than the initial connect. That SSL/TLS handshake happening at StartTLS upgrade time is probably executed on the thread that handles the data flow on the socket. Depending on the implementation, that might not be the same thread that trigger the initial socket connection.
For the scope of this bug, I'd focus on the code that worked in the past. If so far we had only supported SSL/TLS, I'd try to get that work first, and worry about StartTLS support later.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #30 |
Looking at nsLDAPSSLConnect in nsLDAPSecurityG
sps->
I would have expected that code to do ...
sps->
... given that it doesn't claim to support StartTLS, but only support SSL/TLS.
However, what they do here, they immediately call ->StartTLS() on that socket. They don't wait for some plaintext back and forth talk to negotiate when the time has come to do that, they do it immediately.
So you could say, they do the functional equivalent of a SSL/TLS socket provider, and they don't need the ability to delay the upgrade of the socket to a later time.
That LDAP glue code also has a very old comment, that explains the motivation for doing that:
// If possible we want to avoid using SSLv2, as this can confuse
// some directory servers (notably the netscape 4.1 ds). The only
// way that PSM provides for us to do this is to use a socket that can
// be used for the STARTTLS protocol, because the STARTTLS protocol disallows
// the use of SSLv2.
This means, that code was written at a time when SSL v2 was still widely used. But they considered it insecure and didn't want to use it for LDAP. They didn't find another mechanism from within the LDAP code to disable SSL v2. That's why they abused that alternative TLSSocketProvider == StartTLSSocketP
Nowadays we should no longer need that "trick". That code here should be changed to call
sps->
and the call to ->StartTLS should be removed, as it's unnecessary.
This might not be helpful for this bug here, but it would be a correctness cleanup.
Later, once we get to fix bug 826662, we'd introduce an alternative code path that calls GetSocketProvid
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #31 |
Thanks very much Kai - much appreciated! I figured the "starttls" usage was probably a workaround, but was never sure it wasn't just me missing something important.
Is there still any reason for the LDAP SSL (ie port 636) path to open a plain socket, then upgrade it to SSL/TLS, rather than just asking the "ssl" socket provider to open a SSL/TLS connection in the first place?
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #32 |
(In reply to Ben Campbell from comment #18)
> Is there still any reason for the LDAP SSL (ie port 636) path to open a plain socket, then upgrade it to SSL/TLS, rather than just asking the "ssl" socket provider to open a SSL/TLS connection in the first place?
If we ever want to support StartTLS, you'll need to use the approach "create socket, do plaintext negotiation, then upgrade to SSL/TLS".
That means we'll eventually need a code path that is similar to the one today. But for now, if it means simplification, it seems fine to open the SSL/TLS socket directly.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #33 |
Created attachment 9102870
1576364-
This is getting really infuriating. This patch ditches the startTLS-
I'm uploading this patch now to see if there's anything I'm obviously doing wrong.
Just to confirm, the adams LDAP server (which I'm using to test against) definitely supports LDAP-over-SSL:
```
$ ldapsearch -v -x -H "ldaps:
```
Seems to work fine for me, over port 636.
More details on the patch:
I'm not so happy with the way it's still opening a plain TCP connection then attempting to upgrade it via nsISocketProvid
We use libprldap to provide the IO functions (and DNS hooks and mutex primitives) to interface ldap with the NSPR sockets, but libprldap doesn't know about nsISocketProvider. So we've got helpers in `ldap/xpcom/
I think it'd be much cleaner to ditch libprldap entirely, and provide the interfacing on the C++ side (ie in nsLDAPSecurityG
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #34 |
I suggest to trace where it's stuck, might be quicker than following through the debugger.
export NSPR_LOG_
Compare what's happening without and with your change, so you could see which events are missing.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #35 |
I didn't expect that fixing this bug would require digging deeply into the C/C++ network code. And I wasn't aware how special our ldap network code is.
My explanations from comment 16 and comment 17 weren't a suggestion how to fix this bug. I only wanted to give you more background information, because you were wondering about some of this details in your comment before it.
Your latest patch does very little, and only changes that socket provider detail. That will be insufficient to fix this bug.
You noticed how the socket code uses layers of functions. Yes, it makes use of NSPR file descriptor layering. It allows to implement a wrapped connection, where an outer protocol and an innner protocol is spoken. One layer represents the plain socket that is accessed by the application protocol code (here: LDAP). When using SSL/TLS, this isn't the usual plain implementation of a socket. Instead, it uses a socket implementation from mozilla/
That application level layer glues the I/O functions to the raw underlying implementation of SSL/TLS, which is provided by the NSS library.
If the application code wants to connect/read/write, the call is sent to a function from nsNSSIOLayer (e.g. PSMSend), which in turn redirects to ssl_* functions. Ldap code -> PSM code -> NSS code.
The NSS socket layer keep tracks of the additional handshaking and other stuff that needs to be exchanged on the raw socket, in order to make the socket secure.
The first part of that is to initiate a handshake between the server and the client. That is, the first time you read or write to the socket, the NSS layer will detect that. It will start communicating with the other side to exchange a few message in both directions, until both sides are happy with the connection. Only after this has happened, the NSS layer will accept the raw application data (LDAP) and encapsulate it into encrypted packages for secure transport.
The first read/write activity by the application should trigger the handshake. Because that initial handshake can take some time, the usual approach is to use async network code. The LDAP application should use async reading and writing on the socket, until it succeeds.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #36 |
(In reply to Ben Campbell from comment #13)
> It's not that the SSL handshake must be performed off the main thread, it's that it must be done specifically on the SocketTransport
Why?
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #37 |
Created attachment 9102983
ldap-stack.txt
just for reference, the full stack of the LDAP call and the failing assert.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #38 |
I found that the existing code, which creates a plain socket, then adds the StartTLS provider, then calls StartTLS, works around another problem.
Usually, the "connect" call on the socket would be done late, only after having layering SSL/TLS into it.
However, prldap functions prldap_connect -> prldap_
- get socket fd
- set socket options (such as blocking or nonblocking)
- call PR_Connect on that socket (which doesn't have the SSL/TLS layer yet)
This means, unless we want to rewrite the above prldap functions, we'd have to stick with the "immediate StartTLS" approach, as that offers a mechanism to reset the socket, and trigger the handshake (it calls SSL_ResetHandshake, which does a similar SSL/TLS socket setup than the call to PR_Connect).
In other words, I've found out the second patch is useless right now, and actually counterproductive, unless we rewrite parts of prldap.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #39 |
(In reply to Kai Engert (:kaie:) from comment #24)
> just for reference, the full stack of the LDAP call and the failing assert.
Ahh, now I see. I was getting a different assert (checking specifically for the STS thread rather than anything-
https:/
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #40 |
(In reply to Kai Engert (:kaie:) from comment #25)
> I found that the existing code, which creates a plain socket, then adds the StartTLS provider, then calls StartTLS, works around another problem.
> Usually, the "connect" call on the socket would be done late, only after having layering SSL/TLS into it.
Doh. I'd missed the fact that the SSL/TLS AddToSocket() layering doesn't work upon an already-connected socket. Too much StartTLS on the brain :-)
> However, prldap functions prldap_connect -> prldap_
> - get socket fd
> - set socket options (such as blocking or nonblocking)
> - call PR_Connect on that socket (which doesn't have the SSL/TLS layer yet)
prldap_connect() also does a bunch of host lookup stuff, which makes it trickier for the nsLDAPSSLSecuri
> In other words, I've found out the second patch is useless right now, and actually counterproductive, unless we rewrite parts of prldap.
Agreed. I actually think it'd tidy things up a lot if I essentially merged prldap and nsLDAPSSLSecuri
1) plain text: calls PR_OpenTCPSocket()
2) LDAPS: calls NewSocket() on the SSL/TLS socket provider
3) startTLS: calls NewSocket() on the startTLS socket provider (upgrading it via startTLS(), in sync with sending the appropriate LDAP protocol request to switch to TLS).
On the LDAP side, anything that can block (ie prettymuch anything involving sockets :- ) needs moving off the main thread. It's kind of half way there - it kicks requests off on the main thread, then dispatches to a separate thread to await the result.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #41 |
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #42 |
FYI, I did some experiments and tracing.
Using your first patch, I see that the handshake starts, but eventually the code in NSS stops with error code PR_WOULD_
(In ssl_SecureSend(), inside the call to ssl_Do1stHandsh
It's probably better if we change our ldap implementation to use nonblocking sockets (async i/o). As a quick experiment I changed prldap to request a nonblocking socket (see below), but that fails earlier. We'd need to change our ldap code to expect the typical "would block" scenario from sockets, and try again.
```
diff --git a/ldap/
--- a/ldap/
+++ b/ldap/
@@ -103,6 +103,8 @@ extern "C" int LDAP_CALLBACK nsLDAPSSLCo
NS_ASSERTION
options &= ~LDAP_X_
+
+ options |= LDAP_X_
// Retrieve session info. so we can store a pointer to our session info.
// in our socket info. later.
```
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #43 |
(In reply to Kai Engert (:kaie:) from comment #29)
> It's probably better if we change our ldap implementation to use nonblocking sockets (async i/o).
I think this is probably the Right Thing (tm) to do anyway: the libldap API seems to support async IO pretty comprehensively. A little bugzilla archeology suggests that libldap had troubles with async IO (on Linux, particularly) up to about 2002, but it _looks_ like it's been addressed (fingers crossed!).
Annoyingly, the C++ LDAP code calls only the blocking versions of the libldap APIs. But usually with a zero timeout, so it can poll repeatedly and use them in a kind-of async manner.
> As a quick experiment I changed prldap to request a nonblocking socket (see below), but that fails earlier. We'd need to change our ldap code to expect the typical "would block" scenario from sockets, and try again.
Yes it needs to be handled higher up - setting the libldap session to async, and calling the async variants of the API.
The `LDAP_X_
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #44 |
(In reply to Ben Campbell from comment #30)
> Annoyingly, the C++ LDAP code calls only the blocking versions of the libldap APIs. But usually with a zero timeout, so it can poll repeatedly and use them in a kind-of async manner.
I take that back. It does indeed already use the asynchronous libldap API calls.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #45 |
I think I've figured out a major problem with async mode in ldap: the PR_ERROR_
I've written it up and posted a fix (Bug 1590976).
It's not the complete solution - there's still some extra async setup to do on the C++ `nsLDAPConnection` side of things, which I'll sort out a patch for tomorrow (today my LDAP code is thoroughly riddled with experimental hacks and printfs, so there's cleanup to do first :- )
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #46 |
So, I've got my socket set to `PR_SockOpt_
It's still doing the silly open-a-
But when during the handshake, I always get the not-on-
`Assertion failure: onSTSThread, at /fast/ben/
(https:/
What's going wrong?
Should I be fetching the SocketTransport
I feel like I've missed something.
Anything about this jump out at you, Kai?
My stacks traces look like this:
```
#0 0x00007ffff00196c8 in mozilla:
#1 0x00007ffff0019aac in mozilla:
#2 0x00007ffff7508fe6 in ssl3_AuthCertif
#3 0x00007ffff7508f73 in ssl3_CompleteHa
at /fast/ben/
#4 0x00007ffff750d259 in ssl3_HandleCert
#5 0x00007ffff750d259 in ssl3_HandlePost
at /fast/ben/
#6 0x00007ffff750a5c7 in ssl3_HandleHand
at /fast/ben/
#7 0x00007ffff750ed79 in ssl3_HandleHand
#8 0x00007ffff750ed79 in ssl3_HandleNonA
at /fast/ben/
#9 0x00007ffff750fd17 in ssl3_HandleRecord (ss=0x7fffd709b000, cText=0x7fffd73
#10 0x00007ffff7518798 in ssl3_GatherComp
#11 0x00007ffff751a903 in ssl_GatherRecor
#12 0x00007ffff...
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #47 |
(In reply to Ben Campbell from comment #33)
> But when during the handshake, I always get the not-on-
>
> `Assertion failure: onSTSThread, at /fast/ben/
> (https:/
On which thread does this happen? Is it a thread different from the main thread?
(If it isn't on the main thread: I'm curious if it works if you temporarily remove the assertion.)
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #48 |
Dana, AuthCertificate
Is it necessary to be that strict? Would it be OK to run it on some other background thread?
See comment 33.
In Mozilla Bugzilla #1576364, Dkeeler (dkeeler) wrote : | #49 |
Yes, it is necessary. The certificate verification implementation relies on the fact that it's blocking the socket thread in some places.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #50 |
(In reply to Kai Engert (:kaie:) from comment #34)
> On which thread does this happen? Is it a thread different from the main thread?
> (If it isn't on the main thread: I'm curious if it works if you temporarily remove the assertion.)
Yes, I'm running it on a background thread (our LDAPConnection creates one to dispatch IO to. It used to start operations on the main thread and use the background thread to handle responses, but I'm moving all the IO onto the background thread).
Haven't tried removing the assert yet, but will give it a go when I get a chance.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #51 |
(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #36)
> Yes, it is necessary. The certificate verification implementation relies on the fact that it's blocking the socket thread in some places.
So, from the outside:
as someone just trying to open a secure socket connection (using the SSL/TLS or StartTLS socketproviders), on my own background thread, what am I doing wrong that causes me to hit this assertion?
Should I be dispatching all my IO on the SocketTransport
In Mozilla Bugzilla #1576364, Dkeeler (dkeeler) wrote : | #52 |
What network APIs are you using? `nsIChannel.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #53 |
(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #39)
> What network APIs are you using? `nsIChannel.
It's currently opening `PRFileDesc`-based sockets using `PR_OpenTCPSocket` for the plain connection. Which works fine.
TLS/SSL used to work, using PR_OpenTCPSocket() followed by an immediate `nsTLSSocketPro
(I'd also like to implement proper startTLS protocol (there's an LDAP operation to tell the server to switch), but that's a separate bug, and if I can get this TLS/SSL connection working then startTLS should work fine too).
I was under the impression that `nsIChannel` was for dealing with protocol-based connections (ie implemented in a nsIProtocolHand
Anyway. the LDAP stuff is old code, with some... idiosyncratic aspects. So if the preferred way of doing network IO has changed over the years, then I imagine this code hasn't yet heard the news :-)
In Mozilla Bugzilla #1576364, Dkeeler (dkeeler) wrote : | #54 |
I see. In that case, my understanding is all of that socket work should be done on the socket thread.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #55 |
(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #41)
> I see. In that case, my understanding is all of that socket work should be done on the socket thread.
Ahh, that's what I needed to know - I've now got it working.
Thanks for all your information Dana and Kai - it was very much appreciated!
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #56 |
Created attachment 9105105
1576364-
This patch gets LDAP up and running again over SSL/TLS connections (ie with the "SSL" checkbox ticked in the LDAP setup). It:
1. ensures libldap is placed into non-blocking mode
2. dispatches the ldap_simple_bind operation on the socket thread (rather than mainthread)
3. listens for LDAP responses in the socket thread (rather than on a custom thread)
4. moves a bunch of replicated error handling out of `nsLDAPOperation` and into `nsLDAPConnecti
Item 2 is important, as this is where the SSL/TLS handshaking is done. The SSL/TLS socket layer defers this security negotiation until the first send occurs, and for this code that means in ldap_simple_bind.
The intention is to move all the other operations to dispatch on the socket thread too, but for practical purposes this fixes the bug. (The other operations will issue on the main thread, without blocking, but their responses will be handled on the socket thread).
Moving the rest of the operations over can be done in a follow-up patch - there's enough else going on here already.
(and better still would be to get rid of `nsLDAPOperation` altogether, but that I will write up in another bug).
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #57 |
Comment on attachment 9105105
1576364-
Thanks for the patch, I'm happy to see this fixed so we can finally ship it in TB 68.2.x.
Looks like Kai is more familiar with the "plumbing" here, so I'm adding him as reviewer, but I'll take a look myself.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #58 |
The patch doesn't seem to work for me on Linux. I tried on both comm-central and comm-esr68 (merged). With the patch applied, I never see a search result displayed. Ben, on which platform have you tested? Did you use the Adams LDAP server for testing, or something else?
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #59 |
Comment on attachment 9105105
1576364-
Hmm, I'm afraid that Kai is right. No SSL still works, but SSL doesn't. In the AB search nothing comes up and using auto-complete, I crash on MOZ_ASSERT(
xul.dll!
xul.dll!
nss3.dll!
nss3.dll!
nss3.dll!
[Inline Frame] nss3.dll!
nss3.dll!
nss3.dll!
nss3.dll!
nss3.dll!
nss3.dll!
nss3.dll!
nss3.dll!
xul.dll!
prldap60.
ldap60.
ldap60.
ldap60.
ldap60.
ldap60.
ldap60.
xul.dll!
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #60 |
P.S.: Yes, using ldap.adams.edu (Base DN: ou=people,
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #61 |
Gah. Will get back into it.
Yes, I was using ldap.adams.edu.
I _think_ it might be because I only moved the `SimpleBind` operation to dispatch to STS. That's the first operation which performs a IO Send, and it's the first Send which kicks off the handshaking. I left the others because I figured it'd be better land the least-disruptive fix possible first.
But Jorg's trace has the handshaking happening during the `SearchExt` operation. I suspect it's either a timing thing, or depends on how you initiate the lookup (maybe some paths don't bother performing the simplebind if the username/password are empty).
Either way, the other operations should be moved onto the STS thread anyway, so I'll start on that.
In Mozilla Bugzilla #1592922, Mojmir21 (mojmir21) wrote : | #1 |
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0
Steps to reproduce:
Mozilla.cfg:
getLDAPAttribut
Actual results:
Error: getLDAPAttibutes fail: ReferenceError: processLDAPValues is not defined
Expected results:
It should work with "getLDAPAttribu
In Mozilla Bugzilla #1592922, Jorgk-bmo (jorgk-bmo) wrote : | #2 |
Mike, looks like a typo here in Mozilla core code:
https:/
I'll do a patch and you can find the right component for it.
In Mozilla Bugzilla #1592922, Jorgk-bmo (jorgk-bmo) wrote : | #3 |
Created attachment 9105527
Bug 1592922 - Fix typo in autoconfig's prefcalls.js. r=mkaply DONTBUILD
In Mozilla Bugzilla #1592922, Jorgk-bmo (jorgk-bmo) wrote : | #4 |
So what's the issue here? That there is a typo in the error message or that the function doesn't work?
In Mozilla Bugzilla #1592922, Mojmir21 (mojmir21) wrote : | #5 |
(In reply to Jorg K (GMT+2) from comment #3)
> So what's the issue here? That there is a typo in the error message or that the function doesn't work?
The function doesn't work.
In Mozilla Bugzilla #1592922, Pulsebot (pulsebot) wrote : | #6 |
Pushed by <email address hidden>:
https:/
Fix typo in autoconfig's prefcalls.js. r=mkaply DONTBUILD
In Mozilla Bugzilla #1592922, Jorgk-bmo (jorgk-bmo) wrote : | #7 |
Mike, thanks for the review. I fully agree about the DONTBUILD, although I've see M-C developers push comment changes of a few lines causing an entire build.
Now that we fixed the typo, what can be said about the malfunction? What is that LDAP stuff residing in M-C? Does that call into TB's LDAP handling? It does, incredible:
https:/
https:/
https:/
Note that TB's LDAP handling with SSL is broken in TB 68, see bug 1576364.
Ben, more LDAP chagrin.
In Mozilla Bugzilla #1592922, Benc-q (benc-q) wrote : | #8 |
Hmm.
1. The offending function, `processLDAPValues` doesn't appear to exist anywhere:
https:/
Comment just above where it's called says "user supplied method".
I had to look it what the autoconfig module actually is - it's [MCD, Mission Control Desktop, AKA AutoConfig](https:/
2. I really don't know what the status of the "autoconfig" module is. Even supplying a `processLDAPValues` function, I suspect there might be issues...
3. If it's supposed to work under both TB and Firefox, presumably its LDAP support should be disabled under Firefox, given that the LDAP support won't even be available? If it's TB-specific, then presumably the whole module should be moved out into comm-central.
In Mozilla Bugzilla #1592922, Shindli (shindli) wrote : | #9 |
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #62 |
Created attachment 9105715
1576364-
Not a 100% completed patch yet, but hitting the weekend here, so I'm uploading what I've got - it works for me as is (addressbook lookup via SSL connection to the Adams server) so I'm hoping it'll work for someone else too!
It moves the bulk of the LDAP operations onto the socket thread.
(still got to move the two-part BindSASL/StepSASL over, but I don't think it'll stop anything working right now if you're just using the adams ldap server to test against with no fancy login credentials).
I also need to tidy it up and review it thoroughly myself - there's a lot of churn, so lots of scope for screwing something up with a typo.
In Mozilla Bugzilla #1576364, Mkmelin+mozilla (mkmelin+mozilla) wrote : | #63 |
I can confirm this patch seems to work.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #64 |
Only a subset of operations works for me.
In address book, if the left side stays with the default "all address books", then no search results are shown.
However, if I explicitly select the configured LDAP directory, then I get search results.
The same can be seen in composer, clicking F9 to open the search pane.
I get the same successes and failures regardless of the OCSP enabled/disabled setting, that's an improvement.
In Mozilla Bugzilla #1576364, Mkmelin+mozilla (mkmelin+mozilla) wrote : | #65 |
I think that has been flakey earlier too. Do you see it without SSL?
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #66 |
After additional testing, the behavior I reported in comment 51 looks like a separate regression after ESR68.
Using TB 68, with SSL disabled, the lookup in addressbook works in both scenarios, either having selected "all address books" or "Adams" specifically. Same behaviour in composer F9.
Using latest c-c (TB 72), without Ben's patch, with SSL disabled, only selecting "Adams" gives me results, and nothing for "all".
I think we can handle that as a separate regression, I'll file a bug.
Once we're ready to backport Ben's patch to TB 68, we'll have to test if it works for the "all address books" scenarios, too.
In any case, the latest patch is indeed an improvement, it works for me with both SSL and OCSP enabled, even with OCSP required.
I understand that Ben wants to do cleanup prior to requesting review, so I'll wait for him to request it.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #67 |
Created attachment 9106096
1576364-
This one has the last couple of operations shifted off the main thread, and it all still seems to fix the bug for me.
I can't claim it does much for code readability... but ideally it'll all get swept away by bug 1592500 one day!
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #68 |
Comment on attachment 9106096
1576364-
Review of attachment 9106096:
-------
Just minor comments, but please (at least) move the call to ldap_set_option, r=kaie with that.
Postponing the additional work into the separate bugs you filed seems ok.
::: ldap/xpcom/
@@ +518,5 @@
> //
> mConnectionHandle =
> ldap_init(
> mPort == -1 ? (mSSL ? LDAPS_PORT : LDAP_PORT) : mPort);
> + ldap_set_
should be moved into the else branch below (don't execute if handle is null)
::: ldap/xpcom/
@@ +317,5 @@
> + LDAP *ld = LDAPHandle();
> + int32_t msgID = ldap_simple_
> +
> + if (msgID == -1) {
> + NotifyLDAPError();
Maybe keep TranslateLDAPEr
@@ +506,5 @@
> + }
> +
> + SetID(msgID);
> + // Register the operation to pick up responses.
> + Conn()-
Could Conn() ever be null?
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #69 |
Any news here? I'd like to ship this in the next beta asap. so we can get it into TB 68 soon.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #70 |
(In reply to Kai Engert (:kaie:) from comment #55)
> Maybe keep TranslateLDAPEr
> (multiple places), so it will be available when working on bug 1592449.
Looks like my suggestion is unnecessary, because NotifyLDAPError() has access to lderrno which can be used to obtain the failure.
> > + Conn()-
>
> Could Conn() ever be null?
The previous code didn't check for null at this place, so we can skip that for now.
I'll make the single change that I consider necessary, and attach an updated revision, so Jörg can proceed.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #71 |
Created attachment 9106832
1576364-
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #72 |
Thanks, Kai!
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #73 |
Sorry, this doesn't work for my in SSL mode. No Alvarez found. This doesn't look good either:
[15920, Socket Thread] ###!!! ASSERTION: nsLDAPSSLConnect(): called for blocking connection: 'options & LDAP_X_
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #74 |
Jörg, you tested on Windows, correct?
I've only tested on Linux, which worked for me.
Could this be a platform specific race? I suggest we make a try build, so we have something for reliably testing across platforms.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #75 |
Jörg, also, did you consider bug 1593347 while testing? You must select the LDAP directory in the address book when testing with c-c.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #76 |
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #77 |
Since bug 1593347 is fixed, I tested LDAP directory search, "All Addressbooks" search and address auto-complete. Nothing worked on Windows with SSL enabled.
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #78 |
Ok. I tested with a fresh local build on Linux, and it works for me, all scenarios. I'll test Windows once the above try build is ready.
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #79 |
Nice try run ;-) - Of course I used a local debug build. Your 32bit Windows build on Windows is ready, but it doesn't work either. Port 636, right, the one I get when switching on SSL?
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #80 |
Yes, everything you said is correct.
I confirm, both win32 and win64 builds don't work. Nothing in error console.
OSX works, Linux works.
I'll start a local Windows debug build. If nobody is quicker than me, I could try to debug it tomorrow.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #81 |
(In reply to Jorg K (GMT+2) from comment #60)
> Sorry, this doesn't work for my in SSL mode. No Alvarez found. This doesn't look good either:
>
> [15920, Socket Thread] ###!!! ASSERTION: nsLDAPSSLConnect(): called for blocking connection: 'options & LDAP_X_
This assert means that libldap isn't asking the socket layer for a non-blocking socket... Which will cause _everything_ to screw up.
Maybe libldap does something different on windows. Looking into it.
In Mozilla Bugzilla #1576364, Benc-q (benc-q) wrote : | #82 |
Ahh yeah, this'll probably do it:
https:/
```
/*
* Async IO. Use a non blocking implementation of connect() and
* dns functions
*/
#if !defined(
# if !defined(_WINDOWS) && !defined(macintosh)
# define LDAP_ASYNC_IO
# endif /* _WINDOWS */
#endif
```
It looks like the default is to not compile in async IO support on windows (And mac. But I'm not sure that `macintosh` is actually `#defined`...)
So, hacking in a `#define LDAP_ASYNC_IO` might get it working.
If `LDAP_ASYNC_IO` is undefined then `ldap_set_
https:/
I _think_ libldap should support async IO just fine on Windows. We don't really use the built-in libldap socket stuff anyway - we replace all the IO with an unholy combination of `libprldap` and [`nsLDAPSecurit
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #83 |
Looks like ```macintosh``` was used for old Mac OS 9, which explains why it worked in my OSX testing.
(according to https:/
I can try to remove ```if !defined(_WINDOWS) && !defined(
In Mozilla Bugzilla #1576364, Kai Engert (kaie) wrote : | #84 |
Created attachment 9107132
1576364-
I confirm this change fixes it on Windows.
We might want to report this change to upstream per https:/
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #85 |
Comment on attachment 9107132
1576364-
Thanks, WFM.
In Mozilla Bugzilla #1576364, Pulsebot (pulsebot) wrote : | #86 |
Pushed by <email address hidden>:
https:/
Dispatch LDAP operations on socket thread. r=kaie
https:/
Enable LDAP async IO on Windows. r=jorgk DONTBUILD
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #88 |
Created attachment 9108662
ldap.patch - ESR68
Hi Ben,
your patch doesn't apply to comm-esr68, but this one applies **if** we also uplift
https:/
https:/
(the former has one hunk in ldap/xpcom/
So the choices are: Rebase heavily or uplift these two. What do you think?
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #89 |
Comment on attachment 9106832
1576364-
Sorry, doesn't apply, huge clashes with bug 1562157 in nsLDAPOperation
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #90 |
Comment on attachment 9108662
ldap.patch - ESR68
Sorry, the original patch does apply if I uplift some cosmetic tweaks first and of course bug 1562157.
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #91 |
TB 68.3 ESR:
https:/
https:/
This applied cleanly after uplifting bug 1562157. I didn't want to run the risk of a special TB 68 backport.
In Mozilla Bugzilla #1576364, Jorgk-bmo (jorgk-bmo) wrote : | #92 |
Working in my TB 68 build, Leslie is there.
summary: |
- Typo in thunderbird code prevents central configuration + [upstream] Typo in thunderbird code prevents central configuration |
Changed in thunderbird (Ubuntu): | |
importance: | Undecided → High |
tags: | added: regression-update |
Changed in thunderbird: | |
importance: | Unknown → Medium |
status: | Unknown → New |
Launchpad Janitor (janitor) wrote : Re: [upstream] Typo in thunderbird code prevents central configuration | #10 |
Status changed to 'Confirmed' because the bug affects multiple users.
Changed in thunderbird (Ubuntu): | |
status: | New → Confirmed |
Ingar Smedstad (ingsme) wrote : | #11 |
Any ETA for this? We use this to configure thunderbird for more than 100 users…
Changed in thunderbird (Ubuntu): | |
assignee: | nobody → Olivier Tilloy (osomon) |
tags: | added: rls-bb-incoming |
Olivier Tilloy (osomon) wrote : | #12 |
The typo is in the error message only, it's not what's causing the error itself. This is more likely https:/
summary: |
- [upstream] Typo in thunderbird code prevents central configuration + [upstream] ReferenceError: processLDAPValues is not defined |
Changed in thunderbird: | |
importance: | Medium → Unknown |
status: | New → Unknown |
Changed in thunderbird: | |
importance: | Unknown → Medium |
status: | Unknown → Fix Released |
Ingar Smedstad (ingsme) wrote : | #93 |
Ah. I see. That makes sense. I'll confess I did not study the code and the fix since I thought I had found the culprit.
Anyway, someone is on the case and a solution is forthcoming. Can't ask for more then that.
Thank you.
Changed in thunderbird (Ubuntu Bionic): | |
assignee: | nobody → Olivier Tilloy (osomon) |
importance: | Undecided → High |
status: | New → Confirmed |
tags: | removed: rls-bb-incoming |
Olivier Tilloy (osomon) wrote : | #94 |
This should now be fixed, as bionic and eoan have thunderbird 68.4.1, while focal has 68.5.0.
Feel free to re-open with a comment detailing the version you're running if this is still a problem.
Changed in thunderbird (Ubuntu Bionic): | |
status: | Confirmed → Triaged |
status: | Triaged → Fix Released |
Changed in thunderbird (Ubuntu): | |
status: | Confirmed → Fix Released |
In Mozilla Bugzilla #1576364, Mozilla-kaply (mozilla-kaply) wrote : | #95 |
This broke LDAP autoconfig.
Set up an LDAP server as described in bug 1574712 comment #0. Tick the SSL box.
You can look for "Alvarez" in the LDAP address book, but recipient auto-complete gives Query.doQuery] 2 nsAbLDAPAutoCom pleteSearch. js:261
Component returned failure code: 0x80590051 [nsIAbDirectory
Alice, can you check for us when that stopped working. Same regression range as bug 1574712? Or did that stop working earlier?