autofs races network interfaces, ends up not working

Bug #733914 reported by Kees Cook on 2011-03-12
This bug affects 9 people
Affects             Importance   Assigned to
autofs5 (Ubuntu)    Medium       Canonical Server Team
Lucid               Undecided    Unassigned
Maverick            Undecided    Unassigned
Natty               Medium       Canonical Server Team

Bug Description

Binary package hint: autofs5

When autofs starts, the network may not be up yet. "started net-device-up IFACE!=lo" does not handle multi-homed machines, bridging, etc. autofs needs to wait until all configured networking has finished coming up before starting.

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: autofs5 5.0.5-0ubuntu4
ProcVersionSignature: Ubuntu 2.6.38-6.34-generic 2.6.38-rc7
Uname: Linux 2.6.38-6-generic x86_64
Architecture: amd64
Date: Sat Mar 12 08:58:22 2011
ProcEnviron:
 LANGUAGE=en_US:en
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: autofs5
UpgradeStatus: Upgraded to natty on 2006-11-27 (1565 days ago)


Kees Cook (kees) wrote :
tags: added: regression-release
Changed in autofs5 (Ubuntu Natty):
assignee: nobody → Canonical Server Team (canonical-server)
milestone: none → ubuntu-11.04-beta-1
importance: Undecided → Medium

I've been thinking a lot about how to handle bugs like these, which seem
very common.

The issue seems to be that there is no event one can point to as "all
configured networking has finished coming up".

Or is there? "started networking" means ifup -a has returned. From
that, we can infer that all interfaces in /etc/network/interfaces with
an 'auto' stanza have been configured in some way. For dhcp, it means
dhclient has been spawned. For static, it means the configuration is
done.

So this would imply most cases of

net-device-up IFACE!=lo

can be replaced with

started networking

However, this ignores that dhcp and network-manager owned interfaces
will be missed in this case. Also, there are conceivably situations where
a pre or post script in the ifup configuration blocks and waits for
other things to start. I think this is better handled in the ifup script
hooks though... this seems rather backwards and maybe should be well
documented as a Bad Idea.

There's also a suggestion to use IP_FREEBIND. This is a Linux-specific
sockopt that lets a socket bind to an IP even if that IP isn't actually
available on the machine yet. Once the IP is available, great, the daemon
will respond on it, but until then this is just a dead listening daemon. We
would have to patch a lot of daemons to rely on this.
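
As a rough illustration of the IP_FREEBIND idea, here is a minimal Python sketch. It is not taken from autofs or from any daemon discussed in this bug. The helper name listen_freebind is hypothetical, and the fallback value 15 (the Linux value of IP_FREEBIND) is assumed for Python builds whose socket module does not export the constant.

```python
import socket

# IP_FREEBIND is Linux-specific. Assumption: fall back to its numeric value
# from <linux/in.h> (15) if this Python build does not export the constant.
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def listen_freebind(address, port=0):
    """Return a TCP socket listening on `address`, even if no interface has it yet.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Without this option, bind() to an unconfigured address normally
        # fails with EADDRNOTAVAIL.
        sock.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
        sock.bind((address, port))
        sock.listen(5)
    except OSError:
        sock.close()
        raise
    return sock
```

Calling listen_freebind("192.0.2.1") on a Linux machine that does not own that documentation address should succeed, but nothing can connect until the address is actually configured, which is the "dead listening daemon" state described above.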

I think what's needed first is the network-services abstraction I
proposed in bug #701576. This will allow us to solve this problem in
one place, rather than in the 10 or 15+ (and growing) packages that
define their own start on criteria.

Once that's done, I'd suggest that we make it clear what sorts of
configurations are supported as default, and make sure that works no
matter what. In the case of autofs, changing it to 'started networking'
will at least guarantee that it won't start until all static
interfaces have been configured. However, it also means that if something
in /etc/network/if-*.d depends on files located on autofs managed
mounts, the boot may stall in a circular dependency loop where autofs
cannot start because networking has not started, and networking cannot
finish starting because it is waiting on autofs. Other than a timeout, I
cannot see a good way to guard against that danger.
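
To make the circular dependency concrete, here is a small Python sketch that models the boot as a "waits on" graph and looks for a loop. This is only an illustration of the reasoning, not how upstart itself resolves events. The function find_cycle and the node name "if-up.d hook" are hypothetical.

```python
def find_cycle(waits_on):
    """Return a list of jobs forming a dependency cycle, or None if there is none."""
    visiting, done = [], set()

    def visit(job):
        if job in done:
            return None
        if job in visiting:
            # We came back to a job we are still waiting on: report the loop.
            return visiting[visiting.index(job):] + [job]
        visiting.append(job)
        for dep in waits_on.get(job, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(job)
        return None

    for job in waits_on:
        cycle = visit(job)
        if cycle:
            return cycle
    return None


boot = {
    "autofs": ["networking"],        # autofs: start on started networking
    "networking": ["if-up.d hook"],  # ifup -a runs the hook scripts
    "if-up.d hook": ["autofs"],      # hypothetical hook reading an automounted path
}
print(find_cycle(boot))
```

Removing the hook's dependency on autofs makes find_cycle return None, which corresponds to the configuration that boots cleanly.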


Dave Walker (davewalker) on 2011-03-18
tags: added: server-nrs
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package autofs5 - 5.0.5-0ubuntu5

---------------
autofs5 (5.0.5-0ubuntu5) natty; urgency=low

  * Improve autofs.conf upstart script. Prevent race
    when trying to start networking. (LP: #733914)
  * debian/autofs5-ldap.install: Install schema in the right place.
    (LP: #699855)
  * Suggest smbfs if you want to use cifs. (LP: #579857)
  * Dropped 13ldap_module_linkage.dpatch no longer needed.
  * Refresh with missing upstream patches.
 -- Chuck Short <email address hidden> Sat, 02 Apr 2011 22:25:34 -0400

Changed in autofs5 (Ubuntu Natty):
status: New → Fix Released
Clint Byrum (clint-fewbar) wrote :

The fix for this:

--- autofs5-5.0.5/debian/autofs5.autofs.upstart
+++ autofs5-5.0.5/debian/autofs5.autofs.upstart
@@ -4,56 +4,15 @@
 start on (filesystem
- and net-device-up IFACE!=lo)
+ and net-device-up
+ and mounting TYPE=nfs)
 stop on runlevel [!2345]

Causes mountall to fail to mount any NFS mounts in /etc/fstab.

The reason is that the mounting TYPE=nfs event is emitted every time an NFS mount is *attempted*, even when the attempt will fail.

To test this:

Add this to /etc/fstab (replacing ip/dir with a valid NFS mount of course):

192.168.122.1:/home/clint /mnt nfs ro,nolock 0 0

With the previous version of autofs, this would work fine on reboot. Now install 5.0.5-0ubuntu5; on reboot, /mnt will not be mounted.
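
For anyone repeating this test, the result after reboot can be checked with a short Python sketch like the one below. The function name nfs_mounted is hypothetical, and the sample text simply reuses the fstab line above as if it were a /proc/mounts entry.

```python
def nfs_mounted(mount_point, mounts_text):
    """Return True if `mount_point` is listed as an NFS mount in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 3 and fields[1] == mount_point and fields[2] in ("nfs", "nfs4"):
            return True
    return False


# Sample text for illustration; on a live system pass open("/proc/mounts").read().
sample = "192.168.122.1:/home/clint /mnt nfs ro,nolock 0 0\n"
print(nfs_mounted("/mnt", sample))
```

On an affected system, running this against the real /proc/mounts after reboot would be expected to report that /mnt is not mounted.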

Also, no explanation is given for why we are now ignoring /etc/default/autofs, which is a bug, since upgrades from Maverick will break.

Dan Bishop (danbishop) wrote :

Autofs still fails to start on a clean natty install today. I've checked the package version and it's 5.0.5-0ubuntu5...

Jungle Boy (mowgli80) wrote :

I am seeing the same problem with all updates ... have to start autofs manually

Excerpts from Jungle Boy's message of Sun Apr 10 09:27:04 UTC 2011:
> I am seeing the same problem with all updates ... have to start autofs
> manually

Hi Jungle Boy,

Which version didn't work for you? 5.0.5-0ubuntu6 was just uploaded,
which should fix the automatic start.


Vince Marsters (vincemarsters) wrote :

5.0.5-0ubuntu6 has resolved the problem for me

Dan Bishop (danbishop) wrote :

5.0.5-0ubuntu6 working perfectly for me too :)

Jungle Boy (mowgli80) wrote :

Hey Clint, it's working now with 5.0.5-0ubuntu6 :)

Clint Byrum (clint-fewbar) wrote :

Glad to hear that this is working better for people, thanks for the feedback!

I'm a little torn as to whether this would be appropriate for SRU to lucid.

On one hand I think it works better for most people.

On the other hand, it changes the boot behavior in a way that *might* break people's systems if they are somehow depending on autofs for bringing up services which support network interfaces beyond the "first" one.

Still, the bug is in lucid, so I've opened a task for it, and maverick.

Changed in autofs5 (Ubuntu Lucid):
status: New → Confirmed
Changed in autofs5 (Ubuntu Maverick):
status: New → Confirmed
James (james-jamesgao) wrote :

Hi, I'm still having problems with autofs starting correctly when I boot. My network has an LDAP server that hands out the maps, so I have autofs5-ldap installed. Startup worked fine in Lucid and Maverick, but Natty refuses to mount on a boot without restarting autofs. This is what's in the logs on an incorrect start:

May 13 12:05:40 nitrous automount[858]: Starting automounter version 5.0.5, master map /etc/auto.master
May 13 12:05:40 nitrous automount[858]: using kernel protocol version 5.02
May 13 12:05:40 nitrous automount[858]: lookup_nss_read_master: reading master file /etc/auto.master
May 13 12:05:40 nitrous automount[858]: parse_init: parse(sun): init gathered global options: (null)
May 13 12:05:40 nitrous automount[858]: lookup_read_master: lookup(file): read entry +auto.master
May 13 12:05:40 nitrous automount[858]: lookup_nss_read_master: reading master ldap auto.master
May 13 12:05:40 nitrous automount[858]: parse_init: parse(sun): init gathered global options: (null)
May 13 12:05:40 nitrous automount[858]: lookup(file): failed to read included master map auto.master
May 13 12:05:40 nitrous automount[858]: no mounts in table

This is what autofs returns when I restart:

May 13 12:09:34 nitrous automount[1300]: Starting automounter version 5.0.5, master map /etc/auto.master
May 13 12:09:34 nitrous automount[1300]: using kernel protocol version 5.02
May 13 12:09:34 nitrous automount[1300]: lookup_nss_read_master: reading master file /etc/auto.master
May 13 12:09:34 nitrous automount[1300]: parse_init: parse(sun): init gathered global options: (null)
May 13 12:09:34 nitrous automount[1300]: lookup_read_master: lookup(file): read entry +auto.master
May 13 12:09:34 nitrous automount[1300]: lookup_nss_read_master: reading master ldap auto.master
May 13 12:09:34 nitrous automount[1300]: parse_init: parse(sun): init gathered global options: (null)
May 13 12:09:34 nitrous automount[1300]: master_do_mount: mounting /auto
May 13 12:09:34 nitrous automount[1300]: automount_path_to_fifo: fifo name /var/run/autofs.fifo-auto
May 13 12:09:34 nitrous automount[1300]: lookup_nss_read_map: reading map ldap ldap:ou=auto.home,ou=autofs,dc=***,dc=***
May 13 12:09:34 nitrous automount[1300]: parse_init: parse(sun): init gathered global options: (null)
May 13 12:09:34 nitrous automount[1300]: mounted indirect on /auto with timeout 300, freq 75 seconds
May 13 12:09:34 nitrous automount[1300]: st_ready: st_ready(): state = 0 path /auto

Any suggestions?

Clint Byrum (clint-fewbar) wrote :

Shane, I've unmarked this bug as a duplicate. Can you provide justification why you think they are the same problem?

Rolf Leggewie (r0lf) wrote :

Maverick has reached the end of its life and is no longer receiving any updates. Marking the Maverick task for this ticket as "Won't Fix".

Changed in autofs5 (Ubuntu Maverick):
status: Confirmed → Won't Fix
Rolf Leggewie (r0lf) wrote :

Lucid has reached the end of its life and is no longer receiving any updates. Marking the Lucid task for this ticket as "Won't Fix".

Changed in autofs5 (Ubuntu Lucid):
status: Confirmed → Won't Fix