Activity log for bug #1723350

Date Who What changed Old value New value Message
2017-10-13 08:11:20 Marian Rainer-Harbach bug added bug
2017-10-13 08:11:20 Marian Rainer-Harbach attachment added sssd.conf https://bugs.launchpad.net/bugs/1723350/+attachment/4969606/+files/sssd.conf
2017-10-13 13:01:57 renbag attachment added sssd.conf https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1723350/+attachment/4969945/+files/sssd.conf
2017-10-13 13:02:31 renbag bug added subscriber Renzo Bagnati
2017-10-13 13:08:04 Launchpad Janitor sssd (Ubuntu): status New Confirmed
2017-10-13 21:45:39 Andreas Hasenack sssd (Ubuntu): status Confirmed Triaged
2017-10-13 21:45:46 Andreas Hasenack sssd (Ubuntu): importance Undecided Medium
2017-10-13 21:46:54 Andreas Hasenack bug added subscriber Ubuntu Server Team
2017-10-13 21:46:59 Andreas Hasenack bug added subscriber Andreas Hasenack
2017-12-22 16:18:26 cbvjohn bug added subscriber cbvjohn
2018-03-20 23:07:04 Simon Elmir summary sssd offline on boot, stays offline forever (artful) sssd offline on boot, stays offline forever (artful, bionic)
2018-03-20 23:08:52 Simon Elmir bug added subscriber Simon Elmir
2018-05-07 20:53:31 Andreas Hasenack sssd (Ubuntu): status Triaged Incomplete
2018-05-08 13:41:08 Andreas Hasenack nominated for series Ubuntu Artful
2018-05-08 13:41:08 Andreas Hasenack bug task added sssd (Ubuntu Artful)
2018-05-08 13:41:37 Andreas Hasenack sssd (Ubuntu): status Incomplete Fix Released
2018-05-08 13:41:57 Andreas Hasenack summary sssd offline on boot, stays offline forever (artful, bionic) sssd offline on boot, stays offline forever
2018-05-08 13:42:06 Andreas Hasenack sssd (Ubuntu Artful): status New Triaged
2018-05-08 13:42:11 Andreas Hasenack sssd (Ubuntu Artful): importance Undecided Medium
2018-05-09 10:53:05 renbag attachment added systemd-analyze_logs__artful.tgz https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1723350/+attachment/5136654/+files/systemd-analyze_logs__artful.tgz
2018-05-09 10:54:22 renbag attachment added var_log_sssd__artful.tgz https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1723350/+attachment/5136655/+files/var_log_sssd__artful.tgz
2018-06-10 01:09:38 OliFre bug added subscriber OliFre
2018-06-11 10:09:21 OliFre bug added subscriber Peter Wienemann
2018-07-11 04:27:56 Andrew Conway bug added subscriber Andrew Conway
2018-09-10 19:42:48 Mark Foster bug added subscriber Mark Foster
2018-09-10 22:56:19 Andreas Hasenack nominated for series Ubuntu Bionic
2018-09-10 22:56:19 Andreas Hasenack bug task added sssd (Ubuntu Bionic)
2018-09-10 22:56:28 Andreas Hasenack sssd (Ubuntu Artful): status Triaged Won't Fix
2018-09-18 16:55:47 Launchpad Janitor sssd (Ubuntu Bionic): status New Confirmed
2019-09-13 15:19:02 Orion-cora bug added subscriber Orion-cora
2019-10-04 11:50:26 meskaya bug added subscriber meskaya
2019-11-01 17:18:14 Andreas Hasenack sssd (Ubuntu Bionic): importance Undecided High
2019-11-06 17:33:24 Andreas Hasenack tags server-next
2020-05-11 17:38:31 Andreas Hasenack sssd (Ubuntu Bionic): assignee Andreas Hasenack (ahasenack)
2020-05-11 17:38:34 Andreas Hasenack sssd (Ubuntu Bionic): status Confirmed In Progress
2020-05-12 19:32:36 Andreas Hasenack nominated for series Ubuntu Eoan
2020-05-12 19:32:36 Andreas Hasenack bug task added sssd (Ubuntu Eoan)
2020-05-13 19:15:58 Andreas Hasenack sssd (Ubuntu Eoan): assignee Andreas Hasenack (ahasenack)
2020-05-13 19:16:00 Andreas Hasenack sssd (Ubuntu Eoan): status New In Progress
2020-05-13 21:17:56 Andreas Hasenack description SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. 
There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd ldap-utils dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active 
servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] * discussion of how regressions are most likely to manifest as a result of this change. * It is assumed that any SRU candidate patch is well-tested before upload and has a low overall risk of regression, but it's important to make the effort to think about what ''could'' happen in the event of a regression. * This both shows the SRU team that the risks have been considered, and provides guidance to testers in regression-testing the SRU. [Other Info] * Anything else you think is useful to include * Anticipate questions from users, SRU, +1 maintenance, security teams and the Technical Board * and address these questions in advance [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target
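Note on the [Impact] text in the entry above: the failure depends on the difference between a symlink existing and its target existing. The sketch below shows only that filesystem behaviour, in a scratch directory. It does not involve sssd, and the `scratch` path and the `192.0.2.1` placeholder address are assumptions made for illustration.

```shell
#!/bin/sh
# Illustrative sketch only: reproduces the "broken symlink" state from the
# test case in a throwaway directory instead of /etc.
set -eu

scratch=$(mktemp -d)    # hypothetical scratch location, not from the report

# Create a symlink whose target does not exist yet, as happens to
# /etc/resolv.conf early during boot.
ln -s "$scratch/resolv.conf.target" "$scratch/resolv.conf"

# -L tests the link itself; -e follows the link and fails while the
# target is missing.
if [ -L "$scratch/resolv.conf" ] && [ ! -e "$scratch/resolv.conf" ]; then
    before=broken
else
    before=ok
fi

# "Unbreak" the symlink by creating the target. The link itself is not
# modified by this step; only the target appears.
echo "nameserver 192.0.2.1" > "$scratch/resolv.conf.target"

if [ -e "$scratch/resolv.conf" ]; then
    after=resolves
else
    after=broken
fi

echo "before: $before, after: $after"
```

The second step changes nothing about the link itself. That fits the description's explanation that monitoring which ignores the target misses the moment the file becomes usable.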
2020-05-13 21:22:38 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. 
Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd ldap-utils dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] * discussion of how regressions are most likely to manifest as a result of this change. 
* It is assumed that any SRU candidate patch is well-tested before upload and has a low overall risk of regression, but it's important to make the effort to think about what ''could'' happen in the event of a regression. * This both shows the SRU team that the risks have been considered, and provides guidance to testers in regression-testing the SRU. [Other Info] * Anything else you think is useful to include * Anticipate questions from users, SRU, +1 maintenance, security teams and the Technical Board * and address these questions in advance [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. 
In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd ldap-utils dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 
/etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target
2020-05-18 14:58:09 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. 
Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd ldap-utils dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] TBD [Other Info] Not at this time. 
[Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. 
There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd ldap-utils dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active 
servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target
2020-05-18 17:07:09 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. 
Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd ldap-utils dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = 
LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. 
Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # 
sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. 
A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents:
[Unit]
Requires=network-online.target
After=network-online.target
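The workaround from the original description can be sketched as a small shell helper. This is only an illustration, not part of the SRU: the function name write_sssd_dropin and its optional directory argument are assumptions introduced here so the snippet can be tried in a scratch directory before touching /etc.

```shell
#!/bin/sh
# Sketch: write the drop-in described in the original report.
# write_sssd_dropin and its optional directory argument are hypothetical
# names used for illustration; only the file contents come from the report.
write_sssd_dropin() {
    dropin_dir="${1:-/etc/systemd/system/sssd.service.d}"
    mkdir -p "$dropin_dir" || return 1
    # Quoted delimiter: the unit text is written verbatim, with no expansion.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Unit]
Requires=network-online.target
After=network-online.target
EOF
}
```

On a real system this would be run as root with no argument, followed by "systemctl daemon-reload" and "systemctl restart sssd" so that systemd picks up the new drop-in.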
2020-05-18 17:17:54 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or ID backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. That symlink is expected to be broken for a while during boot. It turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. 
Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP 
#debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. 
Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep 
resolv Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... 
SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents:
[Unit]
Requires=network-online.target
After=network-online.target
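The test case in the description asks the tester to repeat the sssctl call by hand and expects the domain to come online within a few seconds (5s at most) once the symlink target exists. A minimal sketch of that check as a loop follows. The helper name wait_for_online is hypothetical, and the loop only counts one-second sleeps, so the real elapsed time is slightly longer than the given limit.

```shell
#!/bin/sh
# Sketch: poll a status command until it reports "Online status: Online".
# wait_for_online is a hypothetical helper, not part of sssd or sssctl.
# Usage: wait_for_online SECONDS COMMAND [ARGS...]
wait_for_online() {
    timeout="$1"
    shift
    elapsed=0
    while :; do
        # The string matched here is the one shown in the sssctl output above.
        if "$@" 2>/dev/null | grep -q 'Online status: Online'; then
            return 0
        fi
        # Give up once the number of one-second waits reaches the limit.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

For the polling test this would be called as "wait_for_online 5 sssctl domain-status LDAP" right after copying /etc/resolv.conf.good to /etc/resolv.conf.target; a non-zero exit status means the domain was still offline when the limit was reached.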
2020-05-18 17:19:46 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or ID backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. That symlink is expected to be broken for a while during boot. It turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. 
Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep resolv Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: 
# cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. 
Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep 
resolv Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com For the above steps, the log file being tailed will show this for the startup of sssd: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 And this for when the symlink is fixed: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). 
sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target
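The test case relies on /etc/resolv.conf being a broken symlink before sssd is started, and confirms it by reading the "ll" listing. A small sketch of a scripted check follows, assuming a POSIX shell. The function name is_broken_symlink is hypothetical and introduced here only for illustration.

```shell
#!/bin/sh
# Sketch: succeed only if PATH is a symlink whose target does not exist.
# is_broken_symlink is a hypothetical helper name.
is_broken_symlink() {
    # -L tests the link itself; -e follows the link and fails if the
    # target is missing.
    [ -L "$1" ] && [ ! -e "$1" ]
}
```

In the test case it would be used as "is_broken_symlink /etc/resolv.conf && echo broken" before restarting sssd, and it should stop succeeding once /etc/resolv.conf.target has been created.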
2020-05-18 17:21:10 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or ID backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. That symlink is expected to be broken for a while during boot. It turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. 
Preparation steps: $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep resolv Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com For the above steps, the log file being tailed will show this for the startup of sssd: (Mon May 
18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 And this for when the symlink is fixed: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. 
A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. 
Preparation steps: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep resolv Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com For the above steps, the log file being tailed will 
show this for the startup of sssd: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 And this for when the symlink is fixed: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. 
A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents:
[Unit]
Requires=network-online.target
After=network-online.target
2020-05-18 17:36:51 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or ID backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. That symlink is expected to be broken for a while during boot. It turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. 
Preparation steps: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep resolv Start sssd # systemctl restart sssd Repeat the sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com For the above steps, the log file being tailed will 
show this for the startup of sssd: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 And this for when the symlink is fixed: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. 
A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected for that symlink to be broken for a while during boot. It turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. 
Preparation steps: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the 
symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). 
sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target
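The inotify test in the entry above depends on /etc/resolv.conf being a dangling symlink when sssd starts and a resolvable one after the cp step. A small check like the following can confirm which state the file is in before looking at the sssd log. This is only a sketch: the function name symlink_state and the labels it prints are hypothetical and do not come from sssd or the original test case.

```shell
# Hypothetical helper: report whether a path is a broken symlink, a working
# symlink, a regular (non-symlink) entry, or missing altogether.
symlink_state() {
    path="$1"
    if [ -L "$path" ]; then
        # -e follows the link, so it is true only when the target exists
        if [ -e "$path" ]; then
            echo "symlink-ok"
        else
            echo "symlink-broken"
        fi
    elif [ -e "$path" ]; then
        echo "regular"
    else
        echo "missing"
    fi
}
```

Calling "symlink_state /etc/resolv.conf" right after the ln -s step, and again after copying /etc/resolv.conf.good to /etc/resolv.conf.target, shows the transition that the fixed packages are expected to notice.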
2020-05-18 17:49:48 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected for that symlink to be broken for a while during boot. It turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. 
Preparation steps: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the 
symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). 
sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected for that symlink to be broken for a while during boot. It turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. Preparation steps. 
When prompted for an openldap/slapd password, choose any password you want. It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: 
LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). 
sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target
2020-05-18 18:06:27 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected for that symlink to be broken for a while during boot. It turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. Preparation steps. When prompted for an openldap/slapd password, choose any password you want. 
It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" 
the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP #debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF After unbreaking the symbolic link, in a few seconds (5s at most), sssctl should show the service as being online, if using the fixed packages. [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). 
sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected for that symlink to be broken for a while during boot. It turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. Preparation steps. 
When prompted for an openldap/slapd password, choose any password you want. It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active 
servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch it for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. 
Notice how it says it will try again, and multiple times if you keep watching it (once every 10s). "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. And sssctl will show the now online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target
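The workaround quoted at the end of the description above can be sketched as a small shell helper. This is only an illustration: the function name write_sssd_override and its optional alternate-root argument are assumptions made here so the snippet can be tried without touching /etc; the file path and the three lines of content are the ones given in the report.

```shell
#!/bin/sh
# Hypothetical helper (name and argument are not from the bug report).
# $1 is an optional alternate root directory; leave it empty to write
# under the real /etc (requires root).
write_sssd_override() {
    root="${1:-}"
    dir="$root/etc/systemd/system/sssd.service.d"
    mkdir -p "$dir" || return 1
    # Drop-in content exactly as quoted in the original description.
    cat > "$dir/override.conf" <<'EOF'
[Unit]
Requires=network-online.target
After=network-online.target
EOF
}

# On a real system, as root, one would then typically run:
#   write_sssd_override && systemctl daemon-reload && systemctl restart sssd
```

The drop-in only delays sssd until the network is reported online; it does not change how sssd watches /etc/resolv.conf.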
2020-05-18 18:13:48 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot, so the symlink is expected to be broken for a while. It turns out that the monitoring sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It ignored that case completely and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. Preparation steps. When prompted for an openldap/slapd password, choose any password you want.
It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com 
"Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch it for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. Notice how it says it will try again, and multiple times if you keep watching it (once every 10s). 
"Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. And sssctl will show the now online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. 
In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot, so the symlink is expected to be broken for a while. It turns out that the monitoring sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It ignored that case completely and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. Preparation steps. When prompted for an openldap/slapd password, choose any password you want. It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your IP: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real IP: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF Restart dnsmasq: # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf Create a good resolv.conf: # echo "nameserver $ip" >
/etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. 
And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Then follow the other steps of test case (a). Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch it for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. Notice how it says it will try again, and multiple times if you keep watching it (once every 10s). "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. And sssctl will show the now online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] TBD [Other Info] Not at this time. 
[Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target
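The test case above relies on /etc/resolv.conf being a dangling symlink before sssd starts, and checks this by eye with a directory listing. A minimal sketch of the same check as a shell function follows; the name is_broken_symlink is made up for this illustration and is not part of sssd or of the test case.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) only when $1 is a symlink
# whose target does not currently exist.
is_broken_symlink() {
    # -L tests the link itself; -e follows the link and fails if the
    # target is missing.
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Example use while following the test case:
#   is_broken_symlink /etc/resolv.conf && echo "still broken"
```

After the "unbreak" step (copying resolv.conf.good to resolv.conf.target) the same call should return a non-zero status.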
2020-05-18 18:34:02 Launchpad Janitor merge proposal linked https://code.launchpad.net/~ahasenack/ubuntu/+source/sssd/+git/sssd/+merge/384137
2020-05-18 18:36:49 Launchpad Janitor merge proposal linked https://code.launchpad.net/~ahasenack/ubuntu/+source/sssd/+git/sssd/+merge/384138
2020-05-18 18:39:51 Andreas Hasenack sssd (Ubuntu Eoan): importance Undecided High
2020-05-18 18:46:35 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot, so the symlink is expected to be broken for a while. It turns out that the monitoring sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It ignored that case completely and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. Preparation steps. When prompted for an openldap/slapd password, choose any password you want.
It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com 
"Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Then follow the other steps of test case (a). Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch it for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. 
Notice how it says it will try again, and multiple times if you keep watching it (once every 10s). "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. And sssctl will show the now online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] TBD [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. 
In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot, so the symlink is expected to be broken for a while. It turns out that the monitoring sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It ignored that case completely and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. Preparation steps. When prompted for an openldap/slapd password, choose any password you want. It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your IP: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real IP: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF Restart dnsmasq: # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf Create a good resolv.conf: # echo "nameserver $ip" >
/etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. 
And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Then follow the other steps of test case (a). Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch it for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. Notice how it says it will try again, and multiple times if you keep watching it (once every 10s). "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. 
And sssctl will now show the online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] Breaking sssd can prevent logins of remote network users on a system, but local users (and root) should still be able to log in. Another possible source of regressions, unrelated to the code changes in this update: pre-existing incorrect changes to sssd.conf might prevent the service from restarting when this update is applied. [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target
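The test case asks the tester to repeat the sssctl call and read the "Online status" field by eye. The sketch below pulls that field out of the command output so it can be used in a loop. It assumes the real output puts "Online status: ..." at the start of its own line, which is inferred from the flattened output quoted in the description; the function name domain_online_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `sssctl domain-status <domain>` output on stdin
# and prints the value of the first "Online status:" line (e.g. Online, Offline).
# Assumption: the field starts at the beginning of a line.
domain_online_status() {
    sed -n 's/^Online status: *//p' | head -n 1
}

# Example use on a test system (polls every 2 seconds until sssd is online):
#   until [ "$(sssctl domain-status LDAP | domain_online_status)" = "Online" ]; do
#       sleep 2
#   done
```

With the unfixed packages the loop in the usage comment would be expected to spin indefinitely after the symlink is repaired, which is the behaviour this bug describes.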
2020-05-18 19:51:43 Mark Foster (ExtraHop) removed subscriber Mark Foster (ExtraHop)
2020-05-28 18:59:03 Leif M bug added subscriber Leif M
2020-05-29 14:31:57 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot, so the symlink is expected to be broken for a while. It turns out that the monitoring sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. It ignored that case completely and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. Preparation steps. When prompted for an openldap/slapd password, choose any password you want.
It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -f /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com 
"Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Then follow the other steps of test case (a). Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch it for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. 
Notice how it says it will try again, and multiple times if you keep watching it (once every 10s). "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. And sssctl will show the now online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] Breaking sssd can mean preventing logins of network remote users on a system, but should still allow local users (and root) to login. Another possible source of regressions, but not related to the code changes in this update, if there are pre-existing incorrect changes to sssd.conf, they might prevent the service from restarting when this update is applied. [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. 
A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. Preparation steps. When prompted for an openldap/slapd password, chose any password you want. 
It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -F /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com 
"Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Then follow the other steps of test case (a). Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch it for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. 
Notice how it says it will try again, and multiple times if you keep watching it (once every 10s). "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. And sssctl will show the now online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] Breaking sssd can mean preventing logins of network remote users on a system, but should still allow local users (and root) to login. Another possible source of regressions, but not related to the code changes in this update, if there are pre-existing incorrect changes to sssd.conf, they might prevent the service from restarting when this update is applied. [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. 
A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents
[Unit]
Requires=network-online.target
After=network-online.target
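The [Impact] text in the entry above says fix (a) makes sssd monitor the target of the /etc/resolv.conf symlink rather than the symlink itself. As a rough illustration only (this is not sssd's code; the helper name watch_path is hypothetical and GNU readlink is assumed), the path such a watcher would need can be computed even while the symlink is still broken:

```shell
# Hypothetical helper, not part of sssd: print the path a watcher should monitor.
# Assumes GNU coreutils: "readlink -m" resolves symlinks even when the final
# target does not exist yet, which is the broken-symlink case from the test plan.
watch_path() {
    readlink -m -- "$1"
}
```

In the test setup from the description, where /etc/resolv.conf points at /etc/resolv.conf.target, calling watch_path /etc/resolv.conf would name the target file, which matches the path shown in the fixed package's "Added a watch for" log line.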
2020-05-29 15:15:34 Robie Basak sssd (Ubuntu Eoan): status In Progress Fix Committed
2020-05-29 15:15:36 Robie Basak bug added subscriber Ubuntu Stable Release Updates Team
2020-05-29 15:15:37 Robie Basak bug added subscriber SRU Verification
2020-05-29 15:15:42 Robie Basak tags server-next server-next verification-needed verification-needed-eoan
2020-05-29 15:15:58 Robie Basak sssd (Ubuntu Bionic): status In Progress Fix Committed
2020-05-29 15:16:04 Robie Basak tags server-next verification-needed verification-needed-eoan server-next verification-needed verification-needed-bionic verification-needed-eoan
2020-06-02 14:23:12 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. Preparation steps. When prompted for an openldap/slapd password, chose any password you want. 
It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -F /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not connected Discovered LDAP servers: - ldap01.example.com 
"Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Then follow the other steps of test case (a). Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch it for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. 
Notice how it says it will try again, and multiple times if you keep watching it (once every 10s). "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. And sssctl will show the now online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] Breaking sssd can mean preventing logins of network remote users on a system, but should still allow local users (and root) to login. Another possible source of regressions, but not related to the code changes in this update, if there are pre-existing incorrect changes to sssd.conf, they might prevent the service from restarting when this update is applied. [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. 
A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. Preparation steps. When prompted for an openldap/slapd password, chose any password you want. 
It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip interface=$interface except-interface=lo bind-interface EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -F /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not 
connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Then follow the other steps of test case (a). Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch it for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. 
Notice how it says it will try again, and multiple times if you keep watching it (once every 10s). "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. And sssctl will show the now online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] Breaking sssd can mean preventing logins of network remote users on a system, but should still allow local users (and root) to login. Another possible source of regressions, but not related to the code changes in this update, if there are pre-existing incorrect changes to sssd.conf, they might prevent the service from restarting when this update is applied. [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. 
A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents
[Unit]
Requires=network-online.target
After=network-online.target
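Fix (b) in the description above changes the fallback polling code to keep trying when the file is missing instead of giving up right away. A minimal sketch of that retry behaviour follows, assuming a hypothetical helper poll_for_file; the interval and retry cap are parameters added only so the example terminates, and the message text is illustrative, not sssd's log output. The description mentions a retry about once every 10s.

```shell
# Hypothetical helper, not part of sssd: poll until a file becomes readable,
# retrying on each miss instead of giving up after the first failure.
#   $1 = path to check (may be a symlink, possibly broken)
#   $2 = seconds to sleep between attempts
#   $3 = maximum number of attempts (illustrative cap so the loop ends)
poll_for_file() {
    file=$1
    interval=$2
    max_tries=$3
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # "-e" follows symlinks, so a broken symlink counts as missing
        if [ -e "$file" ]; then
            return 0
        fi
        echo "$file still missing, will retry" >&2
        tries=$((tries + 1))
        sleep "$interval"
    done
    return 1
}
```

With a broken /etc/resolv.conf symlink, a call such as poll_for_file /etc/resolv.conf 10 6 would keep checking until the target appears or the cap is reached, which is the point at which a resolver reload could be triggered.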
2020-06-02 14:24:17 Andreas Hasenack tags server-next verification-needed verification-needed-bionic verification-needed-eoan server-next verification-done-eoan verification-needed verification-needed-bionic
2020-06-02 14:38:09 Andreas Hasenack description [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In ubuntu that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. It's expected that symlink to be broken for a while during boot. Turns out that the monitoring that sssd was doing on /etc/resolv.conf didn't take into consideration that what could change was the *target* of the symlink. it completely ignored that fact, and didn't notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in the offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in a lxd container, or a vm. Preparation steps. When prompted for an openldap/slapd password, chose any password you want. 
It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip interface=$interface except-interface=lo bind-interface EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -F /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not 
connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd didn't pick up that resolv.conf changed, via the target of the symlink. Run sssctl again, and the online status will remain offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Showing that it's monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again, it should almost immediately switch to online: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Then follow the other steps of test case (a). Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch it for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. 
Notice how it says it will try again, and multiple times if you keep watching it (once every 10s). "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. And sssctl will show the now online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] Breaking sssd can mean preventing logins of network remote users on a system, but should still allow local users (and root) to login. Another possible source of regressions, but not related to the code changes in this update, if there are pre-existing incorrect changes to sssd.conf, they might prevent the service from restarting when this update is applied. [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2. 
A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target [Impact] sssd can switch to an offline mode of operation when it cannot reach the authentication or id backend. It uses several methods to assess the situation, and one of them is monitoring the /etc/resolv.conf file for changes. In Ubuntu, that file is a symlink to /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at all times during boot. The symlink is expected to be broken for a while during boot. It turns out that sssd's monitoring of /etc/resolv.conf did not take into account that what could change was the *target* of the symlink. It completely ignored that fact and did not notice when the resolv.conf contents actually changed in this scenario, which resulted in sssd staying in offline mode when it shouldn't. There are two fixes being pulled in for this SRU: a) fix the monitoring of the target of the /etc/resolv.conf symlink b) change the fallback polling code to keep trying, instead of giving up right away [Test Case] It's recommended to test this in an LXD container or a VM. Preparation steps. When prompted for an openldap/slapd password, choose any password you want.
It won't be needed again: $ sudo apt update && sudo apt dist-upgrade $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq Become root: $ sudo su - Detect your ip: # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,') # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1) Confirm the $ip variable is correct for your case: # echo $ip Create /etc/dnsmasq.d/sssd-test.conf using your real ip: # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF host-record=ldap01.example.com,$ip listen-address=$ip interface=$interface except-interface=lo bind-interfaces EOF restart dnsmasq # systemctl restart dnsmasq a) inotify test Create /etc/sssd/sssd.conf: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF # chmod 0600 /etc/sssd/sssd.conf # rm /etc/resolv.conf # ln -s /etc/resolv.conf.target /etc/resolv.conf create good resolv.conf: # echo "nameserver $ip" > /etc/resolv.conf.good Confirm /etc/resolv.conf is a broken symlink: # ll /etc/resolv.conf* lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good Open another terminal/screen and tail the sssd logs with a grep: # tail -F /var/log/sssd/sssd.log | grep -i resolv Start sssd # systemctl restart sssd The tail output from that other terminal should say sssd is monitoring /etc/resolv.conf (that's the broken symlink): (Mon May 18 17:32:34 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 Repeat this sssctl call until it shows the offline mode persistently: # sssctl domain-status LDAP Online status: Offline Active servers: LDAP: not 
connected Discovered LDAP servers: - ldap01.example.com "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target The log should now say: (Mon May 18 17:33:30 2020) [sssd] [process_dir_event] (0x0400): Not interested in resolv.conf.target This shows that sssd did not notice that resolv.conf changed via the target of the symlink. Run sssctl again, and the status will still be offline. With the fixed packages, the sssd startup log will say: (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0 This shows that it is monitoring the symlink target. And after fixing the broken symlink, it will say: (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc Run sssctl again; it should switch to online almost immediately: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com b) polling test Repeat the previous test, but with "try_inotify = false" in sssd.conf, like this: # cat > /etc/sssd/sssd.conf <<EOF [sssd] config_file_version = 2 services = nss, pam, ifp domains = LDAP debug_level = 6 try_inotify = false [domain/LDAP] id_provider = ldap ldap_uri = ldap://ldap01.example.com cache_credentials = True ldap_search_base = dc=example,dc=com EOF Then follow the other steps of test case (a). Upon startup, the sssd log will show: (Mon May 18 17:57:04 2020) [sssd] [monitor_config_file_fallback] (0x0080): file [/etc/resolv.conf] is missing. Will not update online status based on watching the file Note how it says it will not watch the file for updates. With the fixed package, the log will show this instead: (Mon May 18 18:02:56 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later.
Notice how it says it will try again; if you keep watching, the message repeats once every 10s. "Unbreak" the symlink: # cp /etc/resolv.conf.good /etc/resolv.conf.target And the log output will eventually show this: (Mon May 18 18:04:06 2020) [sssd] [monitor_config_file_fallback] (0x0020): file [/etc/resolv.conf] is missing. Will try again later. (Mon May 18 18:04:17 2020) [sssd] [signal_res_init] (0x0040): Reloading Resolv.conf. And sssctl will now show the online status: # sssctl domain-status LDAP Online status: Online Active servers: LDAP: ldap01.example.com Discovered LDAP servers: - ldap01.example.com [Regression Potential] Breaking sssd can prevent logins of remote (network) users on a system, but local users (and root) should still be able to log in. Another possible source of regressions, unrelated to the code changes in this update: pre-existing incorrect changes to sssd.conf might prevent the service from restarting when this update is applied. [Other Info] Not at this time. [Original Description] SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were also affected) is offline on boot and seems to stay offline forever (I waited over 20 minutes). sssd_nss.log: (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline] ... SSSD immediately returns to normal operation after restarting it or after sending SIGUSR2.
A workaround for the problem is creating the file /etc/systemd/system/sssd.service.d/override.conf with contents [Unit] Requires=network-online.target After=network-online.target
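The test case depends on /etc/resolv.conf being a broken (dangling) symlink when sssd starts. A minimal sketch for checking that condition from a shell, assuming a POSIX sh; the helper name is_dangling_symlink is hypothetical and not part of sssd or any Ubuntu package:

```shell
#!/bin/sh
# Hypothetical helper (not part of sssd): succeeds when the given path is
# a symlink whose target does not exist.
# -L tests the link itself; -e follows the link and tests the target.
is_dangling_symlink() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Check the path given as the first argument, or /etc/resolv.conf by default.
path="${1:-/etc/resolv.conf}"
if is_dangling_symlink "$path"; then
    echo "$path is a dangling symlink"
else
    echo "$path is not a dangling symlink"
fi
```

In the test case, this should report a dangling symlink before /etc/resolv.conf.good is copied to /etc/resolv.conf.target, and not afterwards.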
2020-06-02 14:47:09 Andreas Hasenack tags server-next verification-done-eoan verification-needed verification-needed-bionic server-next verification-done-bionic verification-done-eoan verification-needed
2020-06-09 13:01:36 Robie Basak removed subscriber Ubuntu Stable Release Updates Team
2020-06-09 13:01:35 Launchpad Janitor sssd (Ubuntu Eoan): status Fix Committed Fix Released
2020-06-09 13:01:53 Launchpad Janitor sssd (Ubuntu Bionic): status Fix Committed Fix Released