2021-03-26 10:55:57 |
Rex Goldsmith |
bug |
|
|
added bug |
2021-03-26 10:59:47 |
Rex Goldsmith |
attachment added |
|
apport file attached. https://bugs.launchpad.net/ubuntu/+bug/1921494/+attachment/5481105/+files/apport.sssd.h805vgu_.apport |
|
2021-03-26 12:31:55 |
Ubuntu Foundations Team Bug Bot |
tags |
|
bot-comment |
|
2021-03-26 13:54:29 |
Brian Murray |
affects |
ubuntu |
sssd (Ubuntu) |
|
2021-03-29 13:45:11 |
Rex Goldsmith |
description |
New sssd.conf variable ad_use_ldaps not working. On starting sssd it errors with "sssd[be[13765]: Could not start TLS encryption. (unknown error code)"
# lsb_release -rd
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Note: problem also seen with Ubuntu 20.04.2
# apt-cache policy sssd | grep Installed
Installed: 1.16.1-1ubuntu1.7
Expectation
Adding ad_use_ldaps to a working AD-integrated /etc/sssd/sssd.conf, to use port 636 instead of port 389 due to ADV190023. Reference: https://bugs.launchpad.net/ubuntu/focal/+source/sssd/+bug/1868703/
Problem
Added a working public root CA certificate to the common CA certificate store (/etc/ssl/ca-certificates), and /etc/ldap/ldap.conf has the following set:
TLS_CACERT /etc/ssl/certs/ca-certificates.crt
An ldapsearch using the above certificate bundle against LDAPS is successful:
# openssl s_client -connect company-ad-server.company.com:636
CONNECTED(00000005)
# ldapsearch -v -H ldaps://company-ad-server.company.com:636 -b "dc=company,dc=com" "(sAMAccountName=superduperuser)"
ldap_initialize( ldaps://company-ad-server.company.com:636/??base )
SASL/GSSAPI authentication started
SASL username: superduperuser@COMPANY.COM
SASL SSF: 0
filter: (sAMAccountName=superduperuser)
requesting: All userApplication attributes
<snip>
# Duperuser\2C Super ADM, Users, Admin, company.com
dn: CN=Duperuser\, Super ADM,OU=Internal,OU=Users,OU=Admin,DC=company,DC=com
<snip>
sssd.conf is configured with:
[sssd]
domains = company.com
config_file_version = 2
services = nss, pam
[domain/company.com]
ad_domain = company.com
krb5_realm = company.com
realmd_tags = manages-system joined-with-adcli
cache_credentials = True
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
use_fully_qualified_names = True
fallback_homedir = /home/%u@%d
ldap_id_mapping = True
ad_use_ldaps = True
ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
auth_provider = ad
access_provider = simple
simple_allow_groups = linux-admins
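A quick way to confirm that the two LDAPS-related options are really set in the domain section is sketched below in Python. This is only an illustration: it uses Python's configparser as an approximation of sssd's own INI parsing, the function name ldaps_settings is made up for this example, and SAMPLE is a shortened copy of the config above.

```python
import configparser

# Shortened copy of the sssd.conf shown above (illustrative only).
SAMPLE = """\
[sssd]
domains = company.com
config_file_version = 2
services = nss, pam

[domain/company.com]
id_provider = ad
fallback_homedir = /home/%u@%d
ad_use_ldaps = True
ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
"""


def ldaps_settings(text, domain):
    """Return (ad_use_ldaps, ldap_tls_cacert) for one domain section.

    Hypothetical helper; sssd itself does not use Python's configparser.
    """
    # interpolation=None keeps '%u' and '%d' in fallback_homedir literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["domain/" + domain]
    use_ldaps = section.getboolean("ad_use_ldaps", fallback=False)
    cacert = section.get("ldap_tls_cacert", fallback=None)
    return use_ldaps, cacert


if __name__ == "__main__":
    print(ldaps_settings(SAMPLE, "company.com"))
```

To check a real system, the contents of /etc/sssd/sssd.conf could be passed in place of SAMPLE (reading that file normally requires root).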
Stopping sssd, clearing the sssd cache, and starting sssd returns the following error:
sssd[be[13765]: Could not start TLS encryption. (unknown error code)
Setting debug_level = 4 (or higher) returns the following around this unknown error:
[set_server_common_status] (0x0100): Marking server 'ad-server.company.com' as 'name resolved'
[be_resolve_server_process] (0x0200): Found address for server ad-server.company.com: [y.y.y.y] TTL 3600
[ad_resolve_callback] (0x0100): Constructed uri 'ldaps://ad-server.company.com'
[ad_resolve_callback] (0x0100): Constructed GC uri 'ldaps://ad-server.company.com'
[sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting
[sss_ldap_init_sys_connect_done] (0x0020): ldap_install_tls failed: [Connect error] [(unknown error code)]
[sss_ldap_init_state_destructor] (0x0400): calling ldap_unbind_ext for ldap:[0x55d1149ef6e0] sd:[18]
[sss_ldap_init_state_destructor] (0x0400): closing socket [18]
[sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed: [5]: Input/output error.
[fo_set_port_status] (0x0100): Marking port 389 of server 'ad-server.company.com' as 'not working'
[fo_set_port_status] (0x0400): Marking port 389 of duplicate server 'ad-server.company.com' as 'not working' |
New sssd.conf variable ad_use_ldaps not working. On starting sssd it errors with "sssd[be[13765]: Could not start TLS encryption. (unknown error code)"
# lsb_release -rd
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Note: problem also seen with Ubuntu 20.04.2
# apt-cache policy sssd | grep Installed
Installed: 1.16.1-1ubuntu1.7
Expectation
Adding ad_use_ldaps to a working AD-integrated /etc/sssd/sssd.conf, to use port 636 instead of port 389 due to ADV190023. Reference: https://bugs.launchpad.net/ubuntu/focal/+source/sssd/+bug/1868703/
Problem
Added a working public root CA certificate to the common CA certificate store (/etc/ssl/ca-certificates), and /etc/ldap/ldap.conf has the following set:
TLS_CACERT /etc/ssl/certs/ca-certificates.crt
An ldapsearch using the above certificate bundle against LDAPS is successful:
# openssl s_client -connect company-ad-server.company.com:636
CONNECTED(00000005)
# ldapsearch -v -H ldaps://company-ad-server.company.com:636 -b "dc=company,dc=com" "(sAMAccountName=superduperuser)"
ldap_initialize( ldaps://company-ad-server.company.com:636/??base )
SASL/GSSAPI authentication started
SASL username: superduperuser@COMPANY.COM
SASL SSF: 0
filter: (sAMAccountName=superduperuser)
requesting: All userApplication attributes
<snip>
# Duperuser\2C Super ADM, Users, Admin, company.com
dn: CN=Duperuser\, Super ADM,OU=Internal,OU=Users,OU=Admin,DC=company,DC=com
<snip>
sssd.conf is configured with:
[sssd]
domains = company.com
config_file_version = 2
services = nss, pam
[domain/company.com]
ad_domain = company.com
krb5_realm = company.com
realmd_tags = manages-system joined-with-adcli
cache_credentials = True
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
use_fully_qualified_names = True
fallback_homedir = /home/%u@%d
ldap_id_mapping = True
ad_use_ldaps = True
ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
auth_provider = ad
access_provider = simple
simple_allow_groups = linux-admins
Stopping sssd, clearing the sssd cache, and starting sssd returns the following error:
sssd[be[13765]: Could not start TLS encryption. (unknown error code)
Setting debug_level = 4 (or higher) returns the following around this unknown error:
[set_server_common_status] (0x0100): Marking server 'ad-server.company.com' as 'name resolved'
[be_resolve_server_process] (0x0200): Found address for server ad-server.company.com: [y.y.y.y] TTL 3600
[ad_resolve_callback] (0x0100): Constructed uri 'ldaps://ad-server.company.com'
[ad_resolve_callback] (0x0100): Constructed GC uri 'ldaps://ad-server.company.com'
[sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting
[sss_ldap_init_sys_connect_done] (0x0020): ldap_install_tls failed: [Connect error] [(unknown error code)]
[sss_ldap_init_state_destructor] (0x0400): calling ldap_unbind_ext for ldap:[0x55d1149ef6e0] sd:[18]
[sss_ldap_init_state_destructor] (0x0400): closing socket [18]
[sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed: [5]: Input/output error.
[fo_set_port_status] (0x0100): Marking port 389 of server 'ad-server.company.com' as 'not working'
[fo_set_port_status] (0x0400): Marking port 389 of duplicate server 'ad-server.company.com' as 'not working' |
|
2021-10-05 11:58:28 |
Launchpad Janitor |
sssd (Ubuntu): status |
New |
Confirmed |
|
2021-10-06 19:58:45 |
Athos Ribeiro |
sssd (Ubuntu): status |
Confirmed |
Incomplete |
|
2021-10-06 20:29:52 |
Matthew Ruffell |
bug |
|
|
added subscriber Matthew Ruffell |
2021-10-06 23:24:14 |
Matthew Ruffell |
bug watch added |
|
https://github.com/SSSD/sssd/issues/5531 |
|
2021-10-07 12:36:10 |
Matthias Winkler |
attachment added |
|
ldaps.JPG https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1921494/+attachment/5531294/+files/ldaps.JPG |
|
2021-10-07 14:14:45 |
Matthias Winkler |
attachment added |
|
sssd_ldap_server.log https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1921494/+attachment/5531319/+files/sssd_ldap_server.log |
|
2021-10-10 22:33:15 |
Matthew Ruffell |
nominated for series |
|
Ubuntu Focal |
|
2021-10-10 22:33:15 |
Matthew Ruffell |
bug task added |
|
sssd (Ubuntu Focal) |
|
2021-10-10 22:33:15 |
Matthew Ruffell |
nominated for series |
|
Ubuntu Impish |
|
2021-10-10 22:33:15 |
Matthew Ruffell |
bug task added |
|
sssd (Ubuntu Impish) |
|
2021-10-10 22:33:15 |
Matthew Ruffell |
nominated for series |
|
Ubuntu Bionic |
|
2021-10-10 22:33:15 |
Matthew Ruffell |
bug task added |
|
sssd (Ubuntu Bionic) |
|
2021-10-10 22:33:15 |
Matthew Ruffell |
nominated for series |
|
Ubuntu Hirsute |
|
2021-10-10 22:33:15 |
Matthew Ruffell |
bug task added |
|
sssd (Ubuntu Hirsute) |
|
2021-10-10 22:34:07 |
Matthew Ruffell |
sssd (Ubuntu Bionic): status |
New |
In Progress |
|
2021-10-10 22:34:10 |
Matthew Ruffell |
sssd (Ubuntu Focal): status |
New |
In Progress |
|
2021-10-10 22:34:14 |
Matthew Ruffell |
sssd (Ubuntu Hirsute): status |
New |
In Progress |
|
2021-10-10 22:34:17 |
Matthew Ruffell |
sssd (Ubuntu Impish): status |
Incomplete |
In Progress |
|
2021-10-10 22:34:20 |
Matthew Ruffell |
sssd (Ubuntu Bionic): importance |
Undecided |
Medium |
|
2021-10-10 22:34:23 |
Matthew Ruffell |
sssd (Ubuntu Focal): importance |
Undecided |
Medium |
|
2021-10-10 22:34:26 |
Matthew Ruffell |
sssd (Ubuntu Hirsute): importance |
Undecided |
Medium |
|
2021-10-10 22:34:30 |
Matthew Ruffell |
sssd (Ubuntu Impish): importance |
Undecided |
Medium |
|
2021-10-10 22:34:34 |
Matthew Ruffell |
sssd (Ubuntu Bionic): assignee |
|
Matthew Ruffell (mruffell) |
|
2021-10-10 22:34:36 |
Matthew Ruffell |
sssd (Ubuntu Focal): assignee |
|
Matthew Ruffell (mruffell) |
|
2021-10-10 22:34:38 |
Matthew Ruffell |
sssd (Ubuntu Hirsute): assignee |
|
Matthew Ruffell (mruffell) |
|
2021-10-10 22:34:41 |
Matthew Ruffell |
sssd (Ubuntu Impish): assignee |
|
Matthew Ruffell (mruffell) |
|
2021-10-10 22:34:56 |
Matthew Ruffell |
tags |
bot-comment |
bot-comment seg |
|
2021-10-10 23:53:11 |
Matthew Ruffell |
summary |
ad_use_ldaps error could not start tls encryption |
ldap_install_tls occasionally fails due to watchdog timeout when using ad_use_ldaps with tls |
|
2021-10-10 23:57:19 |
Matthew Ruffell |
description |
New sssd.conf variable ad_use_ldaps not working. On starting sssd it errors with "sssd[be[13765]: Could not start TLS encryption. (unknown error code)"
# lsb_release -rd
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Note: problem also seen with Ubuntu 20.04.2
# apt-cache policy sssd | grep Installed
Installed: 1.16.1-1ubuntu1.7
Expectation
Adding ad_use_ldaps to a working AD-integrated /etc/sssd/sssd.conf, to use port 636 instead of port 389 due to ADV190023. Reference: https://bugs.launchpad.net/ubuntu/focal/+source/sssd/+bug/1868703/
Problem
Added a working public root CA certificate to the common CA certificate store (/etc/ssl/ca-certificates), and /etc/ldap/ldap.conf has the following set:
TLS_CACERT /etc/ssl/certs/ca-certificates.crt
An ldapsearch using the above certificate bundle against LDAPS is successful:
# openssl s_client -connect company-ad-server.company.com:636
CONNECTED(00000005)
# ldapsearch -v -H ldaps://company-ad-server.company.com:636 -b "dc=company,dc=com" "(sAMAccountName=superduperuser)"
ldap_initialize( ldaps://company-ad-server.company.com:636/??base )
SASL/GSSAPI authentication started
SASL username: superduperuser@COMPANY.COM
SASL SSF: 0
filter: (sAMAccountName=superduperuser)
requesting: All userApplication attributes
<snip>
# Duperuser\2C Super ADM, Users, Admin, company.com
dn: CN=Duperuser\, Super ADM,OU=Internal,OU=Users,OU=Admin,DC=company,DC=com
<snip>
sssd.conf is configured with:
[sssd]
domains = company.com
config_file_version = 2
services = nss, pam
[domain/company.com]
ad_domain = company.com
krb5_realm = company.com
realmd_tags = manages-system joined-with-adcli
cache_credentials = True
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
use_fully_qualified_names = True
fallback_homedir = /home/%u@%d
ldap_id_mapping = True
ad_use_ldaps = True
ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
auth_provider = ad
access_provider = simple
simple_allow_groups = linux-admins
Stopping sssd, clearing the sssd cache, and starting sssd returns the following error:
sssd[be[13765]: Could not start TLS encryption. (unknown error code)
Setting debug_level = 4 (or higher) returns the following around this unknown error:
[set_server_common_status] (0x0100): Marking server 'ad-server.company.com' as 'name resolved'
[be_resolve_server_process] (0x0200): Found address for server ad-server.company.com: [y.y.y.y] TTL 3600
[ad_resolve_callback] (0x0100): Constructed uri 'ldaps://ad-server.company.com'
[ad_resolve_callback] (0x0100): Constructed GC uri 'ldaps://ad-server.company.com'
[sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting
[sss_ldap_init_sys_connect_done] (0x0020): ldap_install_tls failed: [Connect error] [(unknown error code)]
[sss_ldap_init_state_destructor] (0x0400): calling ldap_unbind_ext for ldap:[0x55d1149ef6e0] sd:[18]
[sss_ldap_init_state_destructor] (0x0400): closing socket [18]
[sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed: [5]: Input/output error.
[fo_set_port_status] (0x0100): Marking port 389 of server 'ad-server.company.com' as 'not working'
[fo_set_port_status] (0x0400): Marking port 389 of duplicate server 'ad-server.company.com' as 'not working' |
[Impact]
If you enable ad_use_ldaps in your sssd config and have sssd configured to use TLS instead of the regular GSS-SPNEGO or GSSAPI encryption, then on a slow AD server or a busy network the watchdog could time out the call to ldap_install_tls() before it completes. The TLS handshake then fails, and you won't be able to connect to the AD server.
If you set debug_level to 4 or higher, you will see the following in sssd_ldap_server.log:
[set_server_common_status] (0x0100): Marking server 'ad-server.company.com' as 'name resolved'
[be_resolve_server_process] (0x0200): Found address for server ad-server.company.com: [y.y.y.y] TTL 3600
[ad_resolve_callback] (0x0100): Constructed uri 'ldaps://ad-server.company.com'
[ad_resolve_callback] (0x0100): Constructed GC uri 'ldaps://ad-server.company.com'
[sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting
[sss_ldap_init_sys_connect_done] (0x0020): ldap_install_tls failed: [Connect error] [(unknown error code)]
[sss_ldap_init_state_destructor] (0x0400): calling ldap_unbind_ext for ldap:[0x55d1149ef6e0] sd:[18]
[sss_ldap_init_state_destructor] (0x0400): closing socket [18]
[sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed: [5]: Input/output error.
[fo_set_port_status] (0x0100): Marking port 389 of server 'ad-server.company.com' as 'not working'
[fo_set_port_status] (0x0400): Marking port 389 of duplicate server 'ad-server.company.com' as 'not working'
ldapsearch with ldaps will work correctly in the same environment:
# openssl s_client -connect company-ad-server.company.com:636
CONNECTED(00000005)
# ldapsearch -v -H ldaps://company-ad-server.company.com:636 -b "dc=company,dc=com" "(sAMAccountName=superduperuser)"
ldap_initialize( ldaps://company-ad-server.company.com:636/??base )
SASL/GSSAPI authentication started
SASL username: superduperuser@COMPANY.COM
SASL SSF: 0
filter: (sAMAccountName=superduperuser)
requesting: All userApplication attributes
<snip>
# Duperuser\2C Super ADM, Users, Admin, company.com
dn: CN=Duperuser\, Super ADM,OU=Internal,OU=Users,OU=Admin,DC=company,DC=com
<snip>
A workaround is to simply try again: since this is a race condition, you might beat the watchdog on subsequent retries. Otherwise, disable ad_use_ldaps until a fix is available.
[Testcase]
You will need a Windows Server 2019 machine with Active Directory installed and configured, with some users created in Active Directory.
On the Ubuntu client, join the AD server using realm. You will need to import the AD certificate too.
When importing the TLS certificate, you can add it to /etc/ssl/ca-certificates, and edit /etc/ldap/ldap.conf and set:
TLS_CACERT /etc/ssl/certs/ca-certificates.crt
Edit /etc/sssd/sssd.conf and ensure that ldap_tls_cacert is set correctly to "ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt", and enable "ad_use_ldaps = True".
Then restart sssd with:
$ sudo systemctl restart sssd.service
If you have a slow server or busy network, the watchdog will kill the call to ldap_install_tls() before it completes, and sssd will fail to start. You may need several attempts to reproduce. Just keep restarting sssd.service.
[Where problems could occur]
The changes only affect users who enable ad_use_ldaps, and only those who use TLS. Those using GSS-SPNEGO with ad_use_ldaps would not be affected, nor would those not using ad_use_ldaps.
The patch checks for failure of the TLS handshake with the AD server and adds a retry if the failure was caused by the watchdog killing the call to ldap_install_tls(). This happens very early in sssd service startup, so if a regression were to occur, a system administrator would notice almost immediately and could downgrade the package.
If a regression were to occur, a workaround is to 1) change from TLS to GSS-SPNEGO, or 2) disable ad_use_ldaps.
[Other info]
This is reported upstream in:
https://github.com/SSSD/sssd/issues/5531
The commit which fixes the issue is:
commit da55e3e69707de416b7949d08c165c950090bbb6
From: Iker Pedrosa <ipedrosa@redhat.com>
Date: Wed, 3 Mar 2021 15:34:49 +0100
Subject: ldap: retry ldap_install_tls() when watchdog interruption
Link: https://github.com/SSSD/sssd/commit/da55e3e69707de416b7949d08c165c950090bbb6
This landed in sssd 2.5.0, so Bionic, Focal, Hirsute and Impish all require fixing. The commit is a cherry pick to Focal, Hirsute and Impish, while Bionic requires a backport for minor context adjustments. |
|
2021-10-11 00:09:07 |
Matthew Ruffell |
description |
[Impact]
If you enable ad_use_ldaps in your sssd config and have sssd configured to use TLS instead of the regular GSS-SPNEGO or GSSAPI encryption, then on a slow AD server or a busy network the watchdog could time out the call to ldap_install_tls() before it completes. The TLS handshake then fails, and you won't be able to connect to the AD server.
If you set debug_level to 4 or higher, you will see the following in sssd_ldap_server.log:
[set_server_common_status] (0x0100): Marking server 'ad-server.company.com' as 'name resolved'
[be_resolve_server_process] (0x0200): Found address for server ad-server.company.com: [y.y.y.y] TTL 3600
[ad_resolve_callback] (0x0100): Constructed uri 'ldaps://ad-server.company.com'
[ad_resolve_callback] (0x0100): Constructed GC uri 'ldaps://ad-server.company.com'
[sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting
[sss_ldap_init_sys_connect_done] (0x0020): ldap_install_tls failed: [Connect error] [(unknown error code)]
[sss_ldap_init_state_destructor] (0x0400): calling ldap_unbind_ext for ldap:[0x55d1149ef6e0] sd:[18]
[sss_ldap_init_state_destructor] (0x0400): closing socket [18]
[sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed: [5]: Input/output error.
[fo_set_port_status] (0x0100): Marking port 389 of server 'ad-server.company.com' as 'not working'
[fo_set_port_status] (0x0400): Marking port 389 of duplicate server 'ad-server.company.com' as 'not working'
ldapsearch with ldaps will work correctly in the same environment:
# openssl s_client -connect company-ad-server.company.com:636
CONNECTED(00000005)
# ldapsearch -v -H ldaps://company-ad-server.company.com:636 -b "dc=company,dc=com" "(sAMAccountName=superduperuser)"
ldap_initialize( ldaps://company-ad-server.company.com:636/??base )
SASL/GSSAPI authentication started
SASL username: superduperuser@COMPANY.COM
SASL SSF: 0
filter: (sAMAccountName=superduperuser)
requesting: All userApplication attributes
<snip>
# Duperuser\2C Super ADM, Users, Admin, company.com
dn: CN=Duperuser\, Super ADM,OU=Internal,OU=Users,OU=Admin,DC=company,DC=com
<snip>
A workaround is to simply try again: since this is a race condition, you might beat the watchdog on subsequent retries. Otherwise, disable ad_use_ldaps until a fix is available.
[Testcase]
You will need a Windows Server 2019 machine with Active Directory installed and configured, with some users created in Active Directory.
On the Ubuntu client, join the AD server using realm. You will need to import the AD certificate too.
When importing the TLS certificate, you can add it to /etc/ssl/ca-certificates, and edit /etc/ldap/ldap.conf and set:
TLS_CACERT /etc/ssl/certs/ca-certificates.crt
Edit /etc/sssd/sssd.conf and ensure that ldap_tls_cacert is set correctly to "ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt", and enable "ad_use_ldaps = True".
Then restart sssd with:
$ sudo systemctl restart sssd.service
If you have a slow server or busy network, the watchdog will kill the call to ldap_install_tls() before it completes, and sssd will fail to start. You may need several attempts to reproduce. Just keep restarting sssd.service.
[Where problems could occur]
The changes only affect users who enable ad_use_ldaps, and only those who use TLS. Those using GSS-SPNEGO with ad_use_ldaps would not be affected, nor would those not using ad_use_ldaps.
The patch checks for failure of the TLS handshake with the AD server and adds a retry if the failure was caused by the watchdog killing the call to ldap_install_tls(). This happens very early in sssd service startup, so if a regression were to occur, a system administrator would notice almost immediately and could downgrade the package.
If a regression were to occur, a workaround is to 1) change from TLS to GSS-SPNEGO, or 2) disable ad_use_ldaps.
[Other info]
This is reported upstream in:
https://github.com/SSSD/sssd/issues/5531
The commit which fixes the issue is:
commit da55e3e69707de416b7949d08c165c950090bbb6
From: Iker Pedrosa <ipedrosa@redhat.com>
Date: Wed, 3 Mar 2021 15:34:49 +0100
Subject: ldap: retry ldap_install_tls() when watchdog interruption
Link: https://github.com/SSSD/sssd/commit/da55e3e69707de416b7949d08c165c950090bbb6
This landed in sssd 2.5.0, so Bionic, Focal, Hirsute and Impish all require fixing. The commit is a cherry pick to Focal, Hirsute and Impish, while Bionic requires a backport for minor context adjustments. |
[Impact]
If you enable ad_use_ldaps in your sssd config and have sssd configured to use TLS instead of the regular GSS-SPNEGO or GSSAPI encryption, then on a slow AD server or a busy network the watchdog could time out the call to ldap_install_tls() before it completes. The TLS handshake then fails, and you won't be able to connect to the AD server.
If you set debug_level to 4 or higher, you will see the following in sssd_ldap_server.log:
[set_server_common_status] (0x0100): Marking server 'ad-server.company.com' as 'name resolved'
[be_resolve_server_process] (0x0200): Found address for server ad-server.company.com: [y.y.y.y] TTL 3600
[ad_resolve_callback] (0x0100): Constructed uri 'ldaps://ad-server.company.com'
[ad_resolve_callback] (0x0100): Constructed GC uri 'ldaps://ad-server.company.com'
[sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting
[sss_ldap_init_sys_connect_done] (0x0020): ldap_install_tls failed: [Connect error] [(unknown error code)]
[sss_ldap_init_state_destructor] (0x0400): calling ldap_unbind_ext for ldap:[0x55d1149ef6e0] sd:[18]
[sss_ldap_init_state_destructor] (0x0400): closing socket [18]
[sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed: [5]: Input/output error.
[fo_set_port_status] (0x0100): Marking port 389 of server 'ad-server.company.com' as 'not working'
[fo_set_port_status] (0x0400): Marking port 389 of duplicate server 'ad-server.company.com' as 'not working'
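To pick this signature out of a longer log, something like the following Python sketch could be used. It assumes log lines in the trimmed "[function] (0xLEVEL): message" form shown above (real sssd log lines may carry timestamps and other prefixes that would need stripping first), and the name find_tls_failures is made up for this example.

```python
import re

# Matches the trimmed form "[function] (0xNNNN): message" used in the excerpt.
LINE_RE = re.compile(
    r"^\[(?P<func>\w+)\] \((?P<level>0x[0-9a-fA-F]{4})\): (?P<msg>.*)$"
)


def find_tls_failures(lines):
    """Return the messages of lines that report a failed ldap_install_tls()."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not in the expected format, skip it
        if (match.group("func") == "sss_ldap_init_sys_connect_done"
                and "ldap_install_tls failed" in match.group("msg")):
            hits.append(match.group("msg"))
    return hits


# A few lines copied from the excerpt above.
EXCERPT = [
    "[ad_resolve_callback] (0x0100): Constructed uri 'ldaps://ad-server.company.com'",
    "[sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting",
    "[sss_ldap_init_sys_connect_done] (0x0020): ldap_install_tls failed: [Connect error] [(unknown error code)]",
    "[sss_ldap_init_state_destructor] (0x0400): closing socket [18]",
]

if __name__ == "__main__":
    for message in find_tls_failures(EXCERPT):
        print(message)
```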
ldapsearch with ldaps will work correctly in the same environment:
# openssl s_client -connect company-ad-server.company.com:636
CONNECTED(00000005)
# ldapsearch -v -H ldaps://company-ad-server.company.com:636 -b "dc=company,dc=com" "(sAMAccountName=superduperuser)"
ldap_initialize( ldaps://company-ad-server.company.com:636/??base )
SASL/GSSAPI authentication started
SASL username: superduperuser@COMPANY.COM
SASL SSF: 0
filter: (sAMAccountName=superduperuser)
requesting: All userApplication attributes
<snip>
# Duperuser\2C Super ADM, Users, Admin, company.com
dn: CN=Duperuser\, Super ADM,OU=Internal,OU=Users,OU=Admin,DC=company,DC=com
<snip>
A workaround is to simply try again: since this is a race condition, you might beat the watchdog on subsequent retries. Otherwise, disable ad_use_ldaps until a fix is available.
[Testcase]
You will need a Windows Server 2019 machine with Active Directory installed and configured, with some users created in Active Directory.
On the Ubuntu client, join the AD server using realm. You will need to import the AD certificate too.
When importing the TLS certificate, you can add it to /etc/ssl/ca-certificates, and edit /etc/ldap/ldap.conf and set:
TLS_CACERT /etc/ssl/certs/ca-certificates.crt
Edit /etc/sssd/sssd.conf and ensure that ldap_tls_cacert is set correctly to "ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt", and enable "ad_use_ldaps = True".
Then restart sssd with:
$ sudo systemctl restart sssd.service
If you have a slow server or busy network, the watchdog will kill the call to ldap_install_tls() before it completes, and sssd will fail to start. You may need several attempts to reproduce. Just keep restarting sssd.service.
Test packages are available in the below ppa:
https://launchpad.net/~mruffell/+archive/ubuntu/lp1921494-test
When using the test packages, sssd should start reliably every time.
[Where problems could occur]
The changes only affect users who enable ad_use_ldaps, and only those who use TLS. Those using GSS-SPNEGO with ad_use_ldaps would not be affected, nor would those not using ad_use_ldaps.
The patch checks for failure of the TLS handshake with the AD server and adds a retry if the failure was caused by the watchdog killing the call to ldap_install_tls(). This happens very early in sssd service startup, so if a regression were to occur, a system administrator would notice almost immediately and could downgrade the package.
If a regression were to occur, a workaround is to 1) change from TLS to GSS-SPNEGO, or 2) disable ad_use_ldaps.
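For readers unfamiliar with the pattern, the general idea of "retry only when the call was interrupted" can be sketched in Python as below. This is not the upstream patch (which is C code inside sssd). It assumes the interruption surfaces as an EINTR error, which this report does not state, and the names call_with_retry and make_flaky_handshake, as well as the retry limit of 5, are made up for this illustration.

```python
import errno


def call_with_retry(operation, max_retries=5):
    """Call operation(); retry only when it fails with EINTR.

    Illustrative sketch only. The max_retries value is an arbitrary choice.
    """
    attempts = 0
    while True:
        try:
            return operation()
        except OSError as exc:
            # Anything other than an interruption is treated as a real
            # failure and re-raised, as is running out of retries.
            if exc.errno != errno.EINTR or attempts >= max_retries:
                raise
            attempts += 1


def make_flaky_handshake(interruptions):
    """Build a stand-in for a handshake that is interrupted a few times."""
    state = {"left": interruptions, "calls": 0}

    def handshake():
        state["calls"] += 1
        if state["left"] > 0:
            state["left"] -= 1
            raise OSError(errno.EINTR, "interrupted by signal")
        return "handshake complete"

    return handshake, state


if __name__ == "__main__":
    handshake, state = make_flaky_handshake(2)
    print(call_with_retry(handshake), "after", state["calls"], "calls")
```

The point of checking the error code before retrying is that genuine failures (bad certificate, refused connection) are still reported immediately instead of being retried.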
[Other info]
This is reported upstream in:
https://github.com/SSSD/sssd/issues/5531
The commit which fixes the issue is:
commit da55e3e69707de416b7949d08c165c950090bbb6
From: Iker Pedrosa <ipedrosa@redhat.com>
Date: Wed, 3 Mar 2021 15:34:49 +0100
Subject: ldap: retry ldap_install_tls() when watchdog interruption
Link: https://github.com/SSSD/sssd/commit/da55e3e69707de416b7949d08c165c950090bbb6
This landed in sssd 2.5.0, so Bionic, Focal, Hirsute and Impish all require fixing. The commit is a cherry pick to Focal, Hirsute and Impish, while Bionic requires a backport for minor context adjustments. |
|
2021-10-11 08:38:26 |
Matthias Winkler |
attachment added |
|
sssd_domain.log https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1921494/+attachment/5531887/+files/sssd_domain.log |
|
2021-10-11 11:52:28 |
Rex Goldsmith |
attachment added |
|
sssd with lp1921494-test patches applied https://bugs.launchpad.net/ubuntu/hirsute/+source/sssd/+bug/1921494/+attachment/5531905/+files/sssd_company.com.log |
|
2021-10-12 10:37:56 |
Matthias Winkler |
bug |
|
|
added subscriber Snakekick |
2021-10-19 01:28:31 |
Dominique Poulain |
bug |
|
|
added subscriber Dominique Poulain |
2021-10-20 10:44:12 |
Matthias Winkler |
attachment added |
|
ldap_search.JPG https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1921494/+attachment/5534569/+files/ldap_search.JPG |
|
2021-10-21 08:26:21 |
Matthias Winkler |
attachment added |
|
sssd_xxx.xx.de.log https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1921494/+attachment/5534803/+files/sssd_xxx.xx.de.log |
|
2022-01-26 21:58:03 |
Brian Murray |
sssd (Ubuntu Hirsute): status |
In Progress |
Won't Fix |
|
2022-07-18 22:58:41 |
Brian Murray |
sssd (Ubuntu Impish): status |
In Progress |
Won't Fix |
|