autofs races with sssd on startup

Bug #1566508 reported by Maciej Puzio on 2016-04-05
This bug affects 4 people
Affects          Status  Importance  Assigned to   Milestone
sssd (Ubuntu)            Medium      Victor Tapia
Declined for Zesty by Brian Murray
Trusty                   Medium      Victor Tapia
Xenial                   Medium      Victor Tapia
Yakkety                  Medium      Victor Tapia

Bug Description

[Impact]

 * SSSD is set as started before its responders are active.
 * Depending services (e.g. autofs) start before those responders are working, and a manual restart is required to make them work after boot
 * This happens with upstart and, less frequently, with systemd

[Test Case]

 * Configure an LDAP server containing the direct mappings for autofs
 * Configure SSSD to read from that LDAP server
 * Add ‘automount: sss’ to /etc/nsswitch.conf
 * Reboot the machine, login, and try reading the contents of a direct mapping.
 * If the environment has booted correctly the mapping will be available. Otherwise, it will not.

[Regression Potential]

 * This is a cherry-pick from an upstream fix
 * sssd might fail to start automatically

[Other Info]

 * Upstream commit: https://git.fedorahosted.org/cgit/sssd.git/commit/?id=d4063e9a21a4e203bee7e0a0144fa8cabb14cc46

[Original Description]

This report concerns a configuration where autofs and sssd are both installed, sssd is configured to provide automount maps, and nsswitch.conf directs autofs to use sssd. In such a configuration autofs often fails on boot complaining "no mounts in table". This is because autofs may be started before sssd, or after sssd is started but before its autofs support is ready. If this happens, one can restart autofs and it will work fine.

This bug affects other users:
* Bug 40189 "autofs needs to be restarted to pick up some shares" - a very old bug with invalid status, but see last comment #46, complaining about Ubuntu trusty: https://bugs.launchpad.net/ubuntu/+source/autofs/+bug/40189/comments/46
* Link to SSSD-users mailing list, also complaining about Ubuntu trusty: https://lists.fedorahosted.org/pipermail/sssd-users/2015-July/003166.html

$ lsb_release -rd
Description: Ubuntu 14.04.4 LTS
Release: 14.04

$ apt-cache policy autofs sssd
autofs:
  Installed: 5.0.7-3ubuntu3.2
sssd:
  Installed: 1.11.5-1ubuntu3

$ uname -a
Linux **** 3.13.0-83-generic #127-Ubuntu SMP Fri Mar 11 00:25:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Workaround:
Edit /etc/init/autofs as shown in attached files (both full file and diff provided). Unfortunately, this modification will work only in this particular configuration, thus it is not a good candidate for a patch.
Explanation:
We have to deal with two problems here:
1. Autofs starts on runlevel [2345], and in effect its startup order in relation to sssd is random. We fix this by changing start stanza to "start on started sssd".
2. Unfortunately this is not enough, because sssd emits started event too early, before its autofs support is ready. To work around this, we add a loop to pre-start script that waits for sssd to start listening on /var/lib/sss/pipes/autofs.
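
The attached pre-start script is not reproduced in this text. As a rough sketch of the waiting loop described in point 2, assuming a POSIX shell: the helper name wait_for_socket, the default of 30 tries, and the one-second poll interval are illustrative assumptions, not taken from the attachment. The presence of the socket file is also only an approximation of "sssd is listening".

```shell
#!/bin/sh
# Hypothetical helper: poll until PATH passes a file test, or give up.
# Usage: wait_for_socket PATH [TRIES] [TEST_FLAG]
#   TRIES     number of one-second polls (assumed default: 30)
#   TEST_FLAG test(1) operator to apply (default: -S, "exists and is a socket")
wait_for_socket() {
    path=$1
    tries=${2:-30}
    flag=${3:--S}
    while [ "$tries" -gt 0 ]; do
        if test "$flag" "$path"; then
            return 0    # the path is there, the caller may proceed
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1            # timed out
}
```

In a pre-start script this could be called as wait_for_socket /var/lib/sss/pipes/autofs, with the caller deciding what to do when the helper returns non-zero.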

Maciej Puzio (maciej-puzio) wrote :
Maciej Puzio (maciej-puzio) wrote :
Victor Tapia (vtapia) on 2016-06-22
tags: added: sts
Victor Tapia (vtapia) wrote :

As mentioned in the description, Autofs starts before SSSD's responders are running, even though SSSD is set to start before Autofs, both in upstart and systemd. The problem lies in SSSD's boot process: upstart/systemd consider the job started right after sssd forks, when in reality it is still starting the providers and responders that other services (e.g. autofs) might require.

Regarding the order in which AutoFS needs to start: if AutoFS starts before SSSD, it will have an empty mapping table. Once SSSD is running, AutoFS will populate the table, but only the indirect maps will be applied. According to its man page: Direct maps require a HUP signal be sent to the daemon to refresh their contents as does the master map.
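
For illustration, refreshing the maps by hand once SSSD has finished starting might look like the sketch below. The function name reload_automount_maps is hypothetical, and looking the daemon up with pidof is an assumption about the local setup.

```shell
#!/bin/sh
# Hypothetical helper: send SIGHUP to automount so that it re-reads its maps.
# Usage: reload_automount_maps [PID]
# Without an argument the PID is looked up with pidof (assumption: pidof is
# installed and the daemon process is called "automount").
reload_automount_maps() {
    pids=${1:-$(pidof automount)}
    if [ -z "$pids" ]; then
        echo "automount is not running" >&2
        return 1
    fi
    # $pids is intentionally unquoted: pidof may print several PIDs
    kill -HUP $pids
}
```

Whether the daemon then picks up direct maps as the man page describes has to be checked on the system in question.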

In previous releases, SSSD delayed the creation of its pidfile to address this issue (https://git.fedorahosted.org/cgit/sssd.git/commit/?id=d19c4785215305e6eb5f2fa2fc503a2ba50d3f10). I'll try to bring this back and experiment with the results.

I'm going to provide a different workaround based on sssd.

Victor Tapia (vtapia) wrote :
Victor Tapia (vtapia) wrote :
Timo Aaltonen (tjaalton) wrote :

so this is an issue with systemd too?

Victor Tapia (vtapia) wrote :

Yes. Systemd takes longer to mark sssd as started, which narrows the time window for the race condition, but the race can still appear when the machine boots fast enough (VMs, for instance, show this issue consistently).

Maciej Puzio (maciej-puzio) wrote :

I finally had a chance to test this issue in Ubuntu 16.04. In the setup described in the bug description above, autofs still won't start correctly, and this is the result of a whole string of problems. Most of them appear to be separate bugs, but I will list them here for completeness:

1. ifup returns before interface is up.
2. ifup@.service finishes before interface is up. This is for two reasons: because the service calls ifup, and because the service has the wrong type (simple instead of oneshot). As a result, network-online.target is reached at the wrong time.
3. sssd.service does not wait for network-online.target.
4. autofs reverted in Xenial to a SysV-style script and does not have a systemd-style service file. Thus it is difficult to request that it is started after sssd. However, fixing #2 causes systemd to run SysV scripts much later, providing a relief for problems #4 and #5.
5. The issue that is the subject of this bug report is very likely still present, though I was unable to reproduce it exactly. Unfixed issue #2 caused autofs to fail with a different error message ("setautomntent: lookup(sss): setautomntent: No such file or directory"), while fixed issue #2 hid the bug. The workaround involving waiting for sssd to start listening on /var/lib/sss/pipes/autofs can still be used for extra safety. I will test this further.

I will work more on these problems and submit bug reports for them if they haven't been reported yet.

Jakub Hrozek (jakub-hrozek) wrote :

As long as there are any maps in the cache, these fixes should help:
https://<email address hidden>/message/QKU5H4VUCIZ43LBJTRPPK3XWL6CTQNQ4/

(but upstream didn't merge them yet)

Maciej Puzio (maciej-puzio) wrote :

Jakub, thanks for the information.
Launchpad mangles the URL that you posted in an attempt to hide email addresses from public view, here it is in an equivalent form that does not include '@':
https://lists.fedorahosted.org/archives/list/sssd-devel%40lists.fedorahosted.org/message/QKU5H4VUCIZ43LBJTRPPK3XWL6CTQNQ4/

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in autofs (Ubuntu):
status: New → Confirmed
Changed in sssd (Ubuntu):
status: New → Confirmed
Victor Tapia (vtapia) wrote :

>5. The issue that is the subject of this bug report is very likely still present, though I was unable to
>reproduce it exactly. Unfixed issue #2 caused autofs to fail with a different error message
>("setautomntent: lookup(sss): setautomntent: No such file or directory"), while fixed issue #2 hid
>the bug. The workaround involving waiting for sssd to start listening on /var/lib/sss/pipes/autofs
>can still be used for extra safety.

Just for completeness, there are two different bugs going on:

- SSSD set as started before the responders are up, having AutoFS trying to connect to them ("setautomntent: lookup(sss): setautomntent: No such file or directory"). This has been reported at https://fedorahosted.org/sssd/ticket/3080

- SSSD trying to connect to its providers (LDAP, for instance) before the network is ready. This is the case you have detailed here, and is covered by Jakub's patches.

In my tests Xenial/Systemd can hit both of them, while Trusty/Upstart hits just the first one.

Maciej Puzio (maciej-puzio) wrote :

Victor, you are right saying that we are dealing with more than one bug here. It appears to me that we have 3 or 4 separate bugs all conspiring to prevent autofs from starting in the configuration as described above. I think that the "setautomntent" error message is a symptom of a different bug than the one causing the "no mounts in table" error. I experienced both problems in Trusty/upstart, but so far only the "setautomntent" bug in Xenial/Systemd/autofs-with-SysV. According to my tests, the "setautomntent" bug is caused by sssd starting before network is ready, while "no mounts in table" results from sssd reaching ready state before its responders are up. However, Xenial startup is a total mess, and I am still trying to figure out what is really happening, so I may be wrong in the details.

Jakub Hrozek (jakub-hrozek) wrote :

The message "setautomntent: lookup(sss): setautomntent: No such file or directory" is not indicative of any bug whatsoever. It just means there are no more entries in this map.

But in general, I think the most systematic way forward for this kind of startup races would be to socket-activate the responders and switch to let systemd manage all the responders. sssd should not try to be a service manager.

Maciej Puzio (maciej-puzio) wrote :

Jakub, the "setautomntent" message occurs at the same time as autofs fails to mount shares on startup. So, while it is not itself a bug, and nobody here suggests that the message should be silenced, it is nonetheless one of the symptoms of the problem, which is most likely caused by a number of individual bugs, none of them actually in autofs code. Similarly, a fever is not an illness, but it is usually a symptom of one.

Maciej Puzio (maciej-puzio) wrote :

In order to test whether the bug which is the subject of this bug report (i.e. "no mounts in table", not "setautomntent") occurs in Xenial with systemd, I created autofs.service as attached below. This way I could dispense with /etc/init.d/autofs and directly control when autofs is started. I can report that I was unable to reproduce this bug. When autofs was configured to start after sssd, and sssd started after network was up, autofs mounted all shares as expected and no error messages were logged.

On the other hand, if both sssd and autofs started before network was up, shares would not be mounted, and syslog would contain an error message "setautomntent: lookup(sss): setautomntent: No such file or directory".
If autofs was started before sssd, the error message would be "setautomntent: lookup(sss): setautomntent: Connection refused".
Ensuring that network interfaces were up before systemd reached network.target and network-online.target was a pain in the neck and an exercise in frustration. But this is material for another bug report.

Maciej Puzio (maciej-puzio) wrote :

Here is the autofs.service file mentioned above.
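
The attachment itself is not reproduced in this text. The following is only a hedged sketch of what a minimal unit with the ordering described above could look like, wrapped in a shell function so that it can be written to a scratch location and reviewed first. The function name write_autofs_unit, every directive in the unit, and the PID file path are assumptions for illustration, not the contents of the attached file.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal autofs unit file to the path given as $1.
# Every directive below is an assumption for illustration, not the attached file.
write_autofs_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=Automounts filesystems on demand
# Start only after sssd, and after the network is reported online
Wants=network-online.target
After=network-online.target sssd.service

[Service]
Type=forking
PIDFile=/run/autofs.pid
ExecStart=/usr/sbin/automount --pid-file /run/autofs.pid
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
EOF
}
```

Calling write_autofs_unit /tmp/autofs.service allows the result to be inspected before anything is copied under /etc/systemd/system/. The After= ordering only helps if network-online.target is itself reached at the right time, which is the subject of the ifupdown problems discussed in this report.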

Maciej Puzio (maciej-puzio) wrote :

Issues related to ifup and service files included in ifupdown package are reported as bug 1588915.

Maciej Puzio (maciej-puzio) wrote :

Here is my attempt to sort out how many bugs are related to autofs failing on boot.

This bug (1566508)
As noted earlier, these are actually two issues:
1. autofs starts before sssd.
2. autofs starts after sssd, but before sssd responders are up.
On Trusty, I observed #1 and #2, and on Xenial I observed neither, while Victor Tapia reported only observing #2 on Xenial (comment #13).
Regarding the apparent fix for #1 in Xenial, I believe that it is accidental, and the proper fix is the subject of bug 1614248. Regarding #2 in Xenial, I can only say that I tried very hard to reproduce it, and I could not.

In this bug we also discuss another issue that was not subject of the original bug report:
3. sssd trying to connect to its providers (e.g. LDAP) before network is ready.
On Trusty, I observed it only very rarely, but in Xenial this issue occurred on every boot. Victor Tapia reports observing this problem on both Trusty and Xenial.

I do not think that creating a new autofs or sssd bug for this issue would be a good idea, because this problem is not caused directly by autofs or sssd, but rather is a result of ifupdown bug 1588915 (at least this is so on Xenial).

However, this bug would be no cause for concern if autofs were able to recover from an initial map read failure. There are two issues here, both covered by bug 1614282:
4. autofs does not appear to read cached automount maps from sssd.
5. autofs does not mount shares even when it is finally able to read maps from sssd, after network is up and sssd has been able to receive maps from its provider (e.g. LDAP).

Regarding #4, the problem may lie with sssd rather than autofs. I am not sure if this issue is related to problem mentioned by Jakub Hrozek in comment #9. However, Jakub's patch will fix #4 (possibly), but not #5; we also want autofs to mount shares, when possible, even when there were no maps in cache on startup.
As to Victor's comment #3, I observed the problem with indirect maps. All maps, including the master map, come from the LDAP server, so perhaps it's the master map that is not updated automatically, rendering everything unusable. Unfortunately, my tests with sending HUP to automount daemon did not produce results as described in the manpage (yet another bug?). In any case, the logic that manpage describes as "change on the fly" should not apply to the situation when there is no master map whatsoever due to an earlier error (as is our case).

Links to related bugs:
Bug 1588915 ifup does not block for interface to be up with static addresses
Bug 1614248 Package autofs does not include autofs.service file
Bug 1614282 autofs does not recover from failure to read maps on startup
Bug 1584818 autofs fails to read sss [duplicate, but difficult to say what of; most likely referring to issue #3]

Malcolm-whsg (malcolm-whsg) wrote :

We are experimenting with using a Debian based distro
and this bug seems to be in all of them. We currently use
OpenSUSE 13.2/Leap 42.1 and this works perfectly. Is it a
daft question to ask why it works for them and not you ?

Ta

M

Maciej Puzio (maciej-puzio) wrote :

> Is it a daft question to ask why it works for them and not you ?

This is a very good question. My take on it is as follows.

There are several issues here, and the majority are related to startup configuration. Speaking of Ubuntu 16.04, the problem is that Ubuntu first switched from SysV startup to Upstart, and then (quite recently) to systemd. Switch to Upstart was difficult and there were many problems similar to this one, but before they were all ironed out, another switch to systemd occurred. The result is a total mess. As an example, systemd service unit is currently missing from autofs package, and so while for sssd the switch was from Upstart to systemd, for autofs it was actually Upstart -> SysV. And this is not the only issue.

Ubuntu 14.04, on the other hand, was released before the Upstart -> systemd switch. Here the problem is that Upstart's event-driven model is fundamentally flawed, and unable to express startup dependencies in a general case (as opposed to installation-specific setup). This is the reason why Upstart-related problems were not entirely fixed before the switch to systemd: because they could not be fixed within Upstart framework.

I have never used OpenSUSE, but I believe that it did not switch its init system to Upstart, but rather directly from SysV to systemd. And it did so much earlier than Ubuntu. Because of that it had a head start and much less mess to clean up, having to deal with only one switch. No wonder then that its startup works better.

mush (f-mutshe) wrote :

I wasn't able to fix the issue with the solutions proposed above,
so I found another workaround to make autofs operational on startup.
The only difference in my setup is that I'm using unbound to resolve DNS.
Just add this line under EnvironmentFile in /etc/systemd/system/multi-user.target.wants/sssd.service:
  EnvironmentFile=-/etc/sysconfig/sssd
+ ExecStartPre=/bin/sleep 10

Then create a script, /etc/init.d/restartautofs, that sleeps for 10 seconds, restarts unbound, sleeps for another 10 seconds, and then restarts autofs.
After creating the script as in the attachment I posted, just run:
update-rc.d restartautofs defaults
To be sure the script starts last, rename 01restartautofs to S99restartautofs
(in each runlevel directory: /etc/rc2.d/ /etc/rc3.d/ /etc/rc4.d/ /etc/rc5.d/).
With this workaround, sssd and the networking services are up before autofs starts, so it maps correctly.

Malcolm-whsg (malcolm-whsg) wrote :

#23 works for me - fantastic

Thanks

Mal

Victor Tapia (vtapia) wrote :

A patch has been submitted upstream for review. As soon as it lands, I'll work on an SRU.

Regards,

Victor

Maciej Puzio (maciej-puzio) wrote :

Victor, would you be so kind as to post a link to your patch, and also tell us which of the numerous issues discussed in this bug report (comment #20) are solved by the patch?

Victor Tapia (vtapia) wrote :

Hi Maciej,

I'm attaching the suggested patch to this bug, but bear in mind it's adapted to upstream's code.

Originally, SSSD creates the pidfile at an early stage (once it forks) before the providers have started. This patch moves the pidfile creation to a point where the providers are up and ready, allowing us to monitor this file creation in the boot scripts for upstart/systemd. With a few tweaks in the boot scripts, this would fix points #1 and #2 in the list.

Regarding #3, SSSD monitors the status of the network interfaces using libnl, triggering an internal restart of the services when a change happens. I don't think this is a problem by itself, but rather a bad combination with #4.

Maciej Puzio (maciej-puzio) wrote :

Victor, thanks a lot for the explanation. I apologize for not volunteering to test your patch, but it fixes an issue (#2) that I am unable to reproduce in Xenial. But I really appreciate your efforts.

Victor Tapia (vtapia) wrote :

The patch was merged [1], so I'll start working on the SRU as soon as I can.

[1] https://git.fedorahosted.org/cgit/sssd.git/commit/?id=d4063e9a21a4e203bee7e0a0144fa8cabb14cc46

Victor Tapia (vtapia) on 2017-01-24
Changed in autofs (Ubuntu):
status: Confirmed → Invalid
Changed in sssd (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → Victor Tapia (vtapia)
Victor Tapia (vtapia) on 2017-01-24
description: updated
no longer affects: autofs (Ubuntu)
Changed in sssd (Ubuntu):
importance: Undecided → Critical
importance: Critical → Medium
tags: added: sts-sponsor sts-sru
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package sssd - 1.15.0-3ubuntu1

---------------
sssd (1.15.0-3ubuntu1) zesty; urgency=medium

  * Build without the secrets service as libhttp-parser2.1 is in universe. An
    MIR is pending in LP 1638957; when this is complete, we can revert this.
    - Configure with --without-secrets.
    - Drop build depends on libhttp-parser-dev and libjansson-dev. These are
      only needed for the "secrets service".
    - Remove secrets service -related files from d/sssd-common.install and in
      d/rules.

 -- Robie Basak <email address hidden> Tue, 07 Feb 2017 19:37:45 +0000

Changed in sssd (Ubuntu):
status: In Progress → Fix Released
quess (quess) wrote :

Hi

Sorry for my lack of knowledge, but regarding comment #32, I'm waiting for a fix in the xenial repo.
It seems to be present on the server (http://fr.archive.ubuntu.com/ubuntu/pool/main/s/sssd/sssd_1.13.4-1ubuntu1.2_amd64.deb), but probably not in the index of the mirror...

When will it be available on a regular update process?

Best regards,

quess (quess) wrote :

For Ubuntu dummies like me :)
Activate xenial-proposed via https://wiki.ubuntu.com/Testing/EnableProposed to install sssd_1.13.4-1ubuntu1.2 on xenial.

Victor Tapia (vtapia) wrote :
Victor Tapia (vtapia) wrote :
Victor Tapia (vtapia) wrote :
Victor Tapia (vtapia) on 2017-03-09
Changed in sssd (Ubuntu Trusty):
assignee: nobody → Victor Tapia (vtapia)
Changed in sssd (Ubuntu Xenial):
assignee: nobody → Victor Tapia (vtapia)
Changed in sssd (Ubuntu Yakkety):
assignee: nobody → Victor Tapia (vtapia)
importance: Undecided → Medium
Changed in sssd (Ubuntu Xenial):
importance: Undecided → Medium
Changed in sssd (Ubuntu Trusty):
importance: Undecided → Medium

Hello Maciej, or anyone else affected,

Accepted sssd into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/sssd/1.13.4-3ubuntu0.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in sssd (Ubuntu Yakkety):
status: New → Fix Committed
tags: added: verification-needed
Changed in sssd (Ubuntu Xenial):
status: New → Fix Committed
Timo Aaltonen (tjaalton) wrote :

Hello Maciej, or anyone else affected,

Accepted sssd into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/sssd/1.13.4-1ubuntu1.3 in a few hours, and then in the -proposed repository.

Maciej Puzio (maciej-puzio) wrote :

I did brief testing of sssd-1.13.4-1ubuntu1.3 on Xenial. Autofs still fails to properly start on boot, but I am not sure which of the numerous issues described in this bug report contributes to the failure now. I will do more testing after the weekend.

Maciej Puzio (maciej-puzio) wrote :

Here are the results of my tests of package sssd-1.13.4-1ubuntu1.3 on Xenial.

As I noted in comment #20, this bug report involves two separate issues, both contributing to failure of autofs to properly start on boot:
1. autofs starts before sssd.
2. autofs starts after sssd, but before sssd responders are up.

On Trusty I observed both issues, and on Xenial only issue #1 (contrary to my previous comment #20, I did observe this issue in further tests). As I understand, the new version of the package deals with issue #2. Since I never experienced this issue on Xenial, despite numerous tests intended to reproduce it, and issue #1 remains unfixed, I can only say that the changes introduced in sssd-1.13.4-1ubuntu1.3 have no effect in the configuration that I used for my Xenial test system. However, I am wary of changing the tag to verification-failed, as a fix to "what ain't broke" is neither a success nor a failure.

This does not mean that the fixes are not valid, and in fact I think they may have positive effect in Trusty. As I understand, the fixed package for Trusty has not yet been released.

I did not test the package in Yakkety, as I do not use non-LTS Ubuntu versions. I would be grateful if Yakkety users could test the package.

Louis Bouchard (louis) on 2017-03-22
Changed in sssd (Ubuntu Trusty):
status: New → In Progress
tags: removed: sts-sponsor
Victor Tapia (vtapia) wrote :

Hi Maciej,

After a verification run, I can confirm that it still fails for . I'll update my fixes and upload them again.

tags: added: verification-failed
removed: verification-needed
tags: added: sts-sru-needed
removed: sts-sru
Timo Aaltonen (tjaalton) wrote :

Hello Maciej, or anyone else affected,

Accepted sssd into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/sssd/1.13.4-1ubuntu1.4 in a few hours, and then in the -proposed repository.

tags: removed: verification-failed
tags: added: verification-needed
Timo Aaltonen (tjaalton) wrote :

Hello Maciej, or anyone else affected,

Accepted sssd into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/sssd/1.13.4-3ubuntu0.3 in a few hours, and then in the -proposed repository.

Victor Tapia (vtapia) wrote :

# VERIFICATION FOR XENIAL

I prepared a reproducer based on the description details (LDAP + NFS) using an entry_cache_timeout of 88000 in sssd.conf to ensure the cache was valid during the validation run. From a remote machine, I ran this script:

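#!/bin/bash
# Reboot the test VM in a loop and stop at the first boot where the
# direct mapping is not available.
OK=0
KO=0
while true; do
    #date
    nova reboot vtapia-xenial
    sleep 60
    # Wait another 30 seconds if port 22 on host "cases" does not answer yet
    nc -z cases 22 || sleep 30
    # Print the relevant boot log lines, then check the file in the direct map
    ssh -o PreferredAuthentications=publickey -o PubkeyAuthentication=yes ubuntu@vtapia-xenial "tail /var/log/syslog -n200 | grep -Ei 'DHCP|autofs|automount|sssd' | grep -v apparmor && ls /direct/ok"
    if [ $? -ne 0 ]; then
        KO=$((KO + 1))
        break
    else
        OK=$((OK + 1))
    fi
    # Running totals: successful boots, failed boots
    echo "$OK $KO"
done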

This script reboots the machine using sssd and tries to access a direct mapping (/direct/). If the boot order is successful (i.e. autofs starts after sssd and its responders have started), the mapping will be available and the file in it ("ok") will be too. If the test passes, the machine is restarted and checked again.

This is an excerpt of the output:

Warning: Permanently added 'vtapia-xenial,10.5.1.88' (ECDSA) to the list of known hosts.
Mar 30 13:31:41 vtapia-xenial dhclient[780]: DHCPDISCOVER on ens3 to 255.255.255.255 port 67 interval 3 (xid=0xe1d78e6a)
Mar 30 13:31:41 vtapia-xenial dhclient[780]: DHCPREQUEST of 10.5.1.88 on ens3 to 255.255.255.255 port 67 (xid=0x6a8ed7e1)
Mar 30 13:31:41 vtapia-xenial sh[767]: DHCPREQUEST of 10.5.1.88 on ens3 to 255.255.255.255 port 67 (xid=0x6a8ed7e1)
Mar 30 13:31:41 vtapia-xenial sh[767]: DHCPOFFER of 10.5.1.88 from 10.5.1.102
Mar 30 13:31:41 vtapia-xenial dhclient[780]: DHCPOFFER of 10.5.1.88 from 10.5.1.102
Mar 30 13:31:41 vtapia-xenial dhclient[780]: DHCPACK of 10.5.1.88 from 10.5.1.102
Mar 30 13:31:41 vtapia-xenial sh[767]: DHCPACK of 10.5.1.88 from 10.5.1.102
Mar 30 13:31:41 vtapia-xenial root: /etc/dhcp/dhclient-enter-hooks.d/samba returned non-zero exit status 1
Mar 30 13:31:41 vtapia-xenial sssd: Starting up
Mar 30 13:31:41 vtapia-xenial sssd[be[openstacklocal]]: Starting up
Mar 30 13:31:41 vtapia-xenial sssd[autofs]: Starting up
Mar 30 13:31:41 vtapia-xenial sssd[nss]: Starting up
Mar 30 13:31:41 vtapia-xenial sssd[pam]: Starting up
Mar 30 13:31:41 vtapia-xenial systemd[1]: Starting LSB: Automounts filesystems on demand...
Mar 30 13:31:41 vtapia-xenial autofs[1143]: * Starting automount...
Mar 30 13:31:41 vtapia-xenial automount[1171]: Starting automounter version 5.1.1, master map /etc/auto.master
Mar 30 13:31:41 vtapia-xenial automount[1171]: using kernel protocol version 5.02
Mar 30 13:31:42 vtapia-xenial automount[1171]: mounted direct on /wololo with timeout 300, freq 75 seconds
Mar 30 13:31:42 vtapia-xenial automount[1171]: mounted direct on /direct with timeout 300, freq 75 seconds
Mar 30 13:31:42 vtapia-xenial automount[1171]: mounted indirect on /home with timeout 300, freq 75 seconds
Mar 30 13:31:42 vtapia-xenial autofs[1143]: ...done.
Mar 30 13:31:42 vtapia-xenial systemd[1]: Started LSB: Automounts filesystems on demand.
Mar 30 13:32:56 vtapia-xenial automount[1171]: attempting to mount entry /home/ubuntu
Mar 30 13:32:56 vtapia-xenial automount[1171]: mounted /home/ubuntu
/direct/ok
1815 0

The log shows that the machine has restarted 1815 tim...

tags: added: verification-done-xenial
Maciej Puzio (maciej-puzio) wrote :

I tested sssd-1.13.4-1ubuntu1.4 on xenial on a bare metal machine equipped with Intel X550T, a network controller notorious for long and somewhat random startup initialization time. I am happy to report that autofs boot problems still occur, but at a much diminished rate. While previously autofs was failing at nearly every boot, now I observed only one failure in 50 tries. I believe that this improvement comes from "autofs.service" being added to "Before" line in sssd.service file (a change introduced in sssd-1.13.4-1ubuntu1.4). When I removed this modification, autofs tended to start before sssd, and failed much more often with an error "setautomntent: lookup(sss): setautomntent: Connection refused". On the other hand, with the modification in place, the very occasional failures were accompanied with an error "setautomntent: lookup(sss): setautomntent: No such file or directory", which results from network interface coming online after sssd and autofs have already started. However, the topic of this bug is a race between sssd and autofs, and not the race between sssd and autofs on one side, and network initialization on the other (bug 1588915), even though effects are similar. Thus I would vote to consider this bug fixed.
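
Whether the ordering change is present on a given machine can be checked directly. The sketch below scans a unit file for a Before= line that lists a given unit. The function name unit_orders_before is hypothetical, and the location of sssd.service (commonly /lib/systemd/system/sssd.service on Ubuntu) is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: succeed if UNIT_FILE has a Before= line listing TARGET.
# Usage: unit_orders_before UNIT_FILE TARGET
unit_orders_before() {
    # Strip the key, split the value into one unit per line, match TARGET exactly
    grep '^Before=' "$1" | sed 's/^Before=//' | tr -s ' ' '\n' | grep -qxF "$2"
}
```

For example, unit_orders_before /lib/systemd/system/sssd.service autofs.service returns 0 when the line is present. On a running system, systemctl show -p Before sssd.service reports the effective ordering instead of the contents of a single file.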

Victor Tapia (vtapia) wrote :

# VERIFICATION FOR YAKKETY

Using the same script for Yakkety, I can confirm that the fix works as expected. This is the log after 1300 successful reboots:

Warning: Permanently added 'vtapia-yakkety,10.5.1.90' (ECDSA) to the list of known hosts.
Mar 31 08:40:31 vtapia-yakkety systemd[1]: Started LSB: automatic crash report generation.
Mar 31 08:40:32 vtapia-yakkety sssd: Starting up
Mar 31 08:40:32 vtapia-yakkety sssd[be[openstacklocal]]: Starting up
Mar 31 08:40:33 vtapia-yakkety sssd[pam]: Starting up
Mar 31 08:40:33 vtapia-yakkety sssd[autofs]: Starting up
Mar 31 08:40:33 vtapia-yakkety sssd[nss]: Starting up
Mar 31 08:40:33 vtapia-yakkety autofs[1714]: * Starting automount...
Mar 31 08:40:33 vtapia-yakkety automount[1799]: Starting automounter version 5.1.1, master map /etc/auto.master
Mar 31 08:40:33 vtapia-yakkety automount[1799]: using kernel protocol version 5.02
Mar 31 08:40:34 vtapia-yakkety automount[1799]: mounted direct on /wololo with timeout 300, freq 75 seconds
Mar 31 08:40:34 vtapia-yakkety automount[1799]: mounted direct on /direct with timeout 300, freq 75 seconds
Mar 31 08:40:34 vtapia-yakkety automount[1799]: mounted indirect on /home with timeout 300, freq 75 seconds
Mar 31 08:40:34 vtapia-yakkety autofs[1714]: ...done.
/direct/ok
1300 0

tags: added: verification-done-yakkety
mush (f-mutshe) wrote :

I tested sssd-1.13.4-1ubuntu1.4, and after several reboots on xenial I can confirm that the bug is fixed.

An upload of sssd to trusty-proposed has been rejected from the upload queue for the following reason: "This upload adds a new patch and renames existing patches.".

Victor Tapia (vtapia) on 2017-04-12
tags: added: verification-failed
removed: verification-done-xenial verification-done-yakkety verification-needed
Victor Tapia (vtapia) on 2017-04-12
tags: added: verification-failed-xenial verification-failed-yakkety
removed: verification-failed
Łukasz Zemczak (sil2100) wrote :

I will be accepting the latest upload to -proposed but please be sure to include *all* changes mentioned in the debian/changelog. One of the previous uploads was missing the mention of the sssd-common.sssd.upstart.in file being changed - this should be mentioned along with the patch name.

tags: added: verification-needed

Hello Maciej, or anyone else affected,

Accepted sssd into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/sssd/1.13.4-3ubuntu0.4 in a few hours, and then in the -proposed repository.

Łukasz Zemczak (sil2100) wrote :

Hello Maciej, or anyone else affected,

Accepted sssd into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/sssd/1.13.4-1ubuntu1.5 in a few hours, and then in the -proposed repository.
