Avahi needs to be restarted after boot to broadcast netatalk services

Bug #624043 reported by wch
This bug affects 10 people
Affects: avahi (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

When I boot my server, it doesn't broadcast netatalk services. I have to manually restart avahi with:
  restart avahi-daemon

Once I do this, the server shows up instantly in the Finder of my Mac.

The problem is pretty much exactly the same as reported by the person here:
  http://ubuntuforums.org/showthread.php?t=1482573
The only difference is that the person here fixed it by installing packages from Debian Testing, which is something I'm wary of.

I added a file /etc/avahi/services/afpd.service, with the following contents:
<?xml version="1.0" standalone='no'?><!--*-nxml-*-->
<!DOCTYPE service-group SYSTEM "avahi-service.dtd">
<service-group>
<name replace-wildcards="yes">%h</name>
<service>
<type>_afpovertcp._tcp</type>
<port>548</port>
</service>
<service>
<type>_device-info._tcp</type>
<port>0</port>
<txt-record>model=Xserve</txt-record>
</service>
</service-group>

And there is no .local domain on my network that is causing problems with avahi starting.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: avahi-daemon 0.6.25-1ubuntu6
ProcVersionSignature: Ubuntu 2.6.32-24.41-generic 2.6.32.15+drm33.5
Uname: Linux 2.6.32-24-generic x86_64
Architecture: amd64
Date: Wed Aug 25 09:42:25 2010
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: avahi

wch (winston-stdout) wrote :
Mikael Bergqvist (mikaelb) wrote :

I suggest marking this as a duplicate of bug #327362.
/Mikael

wch (winston-stdout) wrote :

This should not be marked as a duplicate. Bug 327362 involves a .local domain, but the problem here does not. There is no mention of a .local domain in the logs or when I restart avahi from the command line.

I think the problem might have to do with netatalk starting before avahi-daemon during the boot process.

Mikael Bergqvist (mikaelb) wrote :

I suspect that it really is the same mechanism. If you start avahi from the command line, you will not use the script in /etc/init.d that contains the check for a local. domain.

If you do as suggested in bug 327362 and issue the command:

host -t SOA local.

and see an answer like "Host local. not found: 3(NXDOMAIN)", then this is not the same problem.
If you instead see an SOA record, such as the one I get from my ISP ("local has SOA record localhost. backbone.telia.net. 1 3600 900 3600000 3600"; yours will name your own ISP), then it is indeed the same problem.
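
The two outcomes of this check can be told apart mechanically. The following is only a minimal sketch: classify_local_soa is a hypothetical helper name (not part of avahi or the host utility), and it assumes the output wording matches the two examples quoted in this comment.

```shell
#!/bin/sh
# classify_local_soa: hypothetical helper, not part of avahi or bind.
# Takes the text printed by `host -t SOA local.` as its only argument and
# prints which of the two cases described above it looks like.
classify_local_soa() {
    case "$1" in
        *NXDOMAIN*)         echo "no-unicast-local" ;;  # not the bug 327362 problem
        *"has SOA record"*) echo "unicast-local" ;;     # same problem as bug 327362
        *)                  echo "unknown" ;;           # anything else: inspect by hand
    esac
}

# Example use on a live system (not run automatically here):
#   classify_local_soa "$(host -t SOA local. 2>&1)"
```

Anything that matches neither pattern is reported as "unknown" rather than guessed at.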

/Mikael

wch (winston-stdout) wrote :

Here's the output of the command:

$ host -t SOA local.
Host local. not found: 3(NXDOMAIN)

So it looks like it doesn't have to do with a local. domain.

FWIW, I always use the script in /etc/init.d to restart the avahi daemon. I have to run it every time I reboot, so that the netatalk services will be advertised properly.

Mikael Bergqvist (mikaelb) wrote :

Do you have other avahi services enabled?
If not, try to enable other services, like ssh-login or sftp

Then, when you have rebooted the system that acts as the netatalk server and logged in, what do you see if you issue the following on the server:
avahi-discover -all or avahi-browse -a

Do you see the other avahi services? But not the netatalk? For me it is all or nothing.

/Mikael

wch (winston-stdout) wrote :

I do have other services enabled. Here is the output of avahi-browse immediately after a reboot, and then after restarting avahi-daemon. (I removed some extraneous information from the output.)

It looks like three things get added after restarting avahi-daemon: "Microsoft Windows Network", "Apple File Sharing", and "_device-info._tcp".

wch@userve:~$ avahi-browse -a
+ eth1 IPv4 userve Web Site local
+ eth1 IPv4 userve _rsp._tcp local
+ eth1 IPv4 userve iTunes Audio Access local
+ eth1 IPv4 wch's remote desktop on userve VNC Remote Access local
+ eth1 IPv4 Virtualization Host userve _libvirt._tcp local
+ eth1 IPv4 userve [8e:50:4a:2c:06:be] Workstation local

wch@userve:~$ sudo /etc/init.d/avahi-daemon restart
Rather than invoking init scripts through /etc/init.d, use the service(8)
utility, e.g. service avahi-daemon restart

Since the script you are attempting to invoke has been converted to an
Upstart job, you may also use the restart(8) utility, e.g. restart avahi-daemon
avahi-daemon start/running, process 2174

wch@userve:~$ avahi-browse -a
+ eth1 IPv4 userve Web Site local
+ eth1 IPv4 userve _rsp._tcp local
+ eth1 IPv4 userve iTunes Audio Access local
+ eth1 IPv4 wch's remote desktop on userve VNC Remote Access local
+ eth1 IPv4 Virtualization Host userve _libvirt._tcp local
+ eth1 IPv4 userve [8e:50:4a:2c:06:be] Workstation local
+ eth1 IPv4 userve Microsoft Windows Network local
+ eth1 IPv4 userve Apple File Sharing local
+ eth1 IPv4 userve _device-info._tcp local
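
The difference between the two listings above can also be extracted mechanically. This is a minimal sketch, assuming each listing was saved to a file (for example with `avahi-browse -at`, where -t makes avahi-browse exit after dumping the current list). added_services is a hypothetical helper name.

```shell
#!/bin/sh
# added_services: hypothetical helper.
# $1 = file with the listing taken right after boot
# $2 = file with the listing taken after restarting avahi-daemon
# Prints the lines that appear only in the second listing.
added_services() {
    before=$1
    after=$2
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    grep -Fxv -f "$before" "$after"
}

# Example use (not run automatically here):
#   avahi-browse -at > before.txt
#   sudo service avahi-daemon restart
#   avahi-browse -at > after.txt
#   added_services before.txt after.txt
```

Because the comparison is line-based, a change in the order of the entries between the two runs does not matter.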

Lorenz Bort (lbort) wrote :

Hi there!

I definitely think this is not a duplicate of bug #327362, because this one is only a matter of which daemon is started when, i.e. avahi has to be started/restarted *after* netatalk in order to advertise these services properly. There is a thread in a German forum which describes the problem and a possible workaround involving a pre-start script. Somebody commented that the problem occurs only if samba sharing is activated, but I think this is not a necessary condition.

http://forum.ubuntuusers.de/topic/samba-und-netatalk-avahi-kommen-sich-ins-gehe/#post-2298101

The workaround is to add the following lines to /etc/init/avahi-daemon.conf. It fixes the problem at boot time but raises others, because it prevents avahi from being (re)started if netatalk is already running.

pre-start script
    /etc/init.d/netatalk start
end script

Solution 1: Add two lines of code to the pre-start script that check whether netatalk is running when avahi starts, and execute "/etc/init.d/netatalk start" only if needed. This should be easy but is a little dirty, I think, because it means that avahi, not the boot process, triggers the start of netatalk.

Solution 2: Dig a bit deeper into upstart, which I do not know at all, and move the start of avahi towards the end of the boot process. This is the better solution, I think, but I still start and stop my daemons with /etc/init.d/.... start/stop and ignore the "this has been converted to an upstart job" message, so I have no clue how upstart works or how much work this might be...
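
The guard described in Solution 1 could look roughly like the following. This is an untested sketch, not a confirmed fix: start_netatalk_if_needed, NETATALK_CHECK and NETATALK_START are hypothetical names, and it assumes that a running afpd process (found with pgrep) means netatalk is up.

```shell
#!/bin/sh
# Hypothetical, overridable commands (assumptions, not taken from any package):
#   NETATALK_CHECK  succeeds when netatalk is already running
#   NETATALK_START  starts netatalk (the init script quoted above)
: "${NETATALK_CHECK:=pgrep -x afpd}"
: "${NETATALK_START:=/etc/init.d/netatalk start}"

# start_netatalk_if_needed: hypothetical helper for the pre-start idea.
# Starts netatalk only when the check command reports it is not running.
start_netatalk_if_needed() {
    # Variables are left unquoted on purpose so that they split into
    # a command and its arguments.
    if $NETATALK_CHECK >/dev/null 2>&1; then
        echo "netatalk already running"
    else
        $NETATALK_START
    fi
}

# Inside the upstart pre-start stanza one would call:
#   start_netatalk_if_needed
```

Keeping the check and start commands in variables makes it possible to try the logic without touching the real daemons.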

Lorenz

PS: is it possible to "un-duplicate" this Bug, since it isn't a problem with the .local domains?

michael woodruff (michael-j-w) wrote :

I too have been affected by this bug. No issues with .local multicasting for me either. Running 10.04.1, Intel E3300, 2 GB RAM, G43 chipset, SATA HDDs, Realtek onboard Ethernet.

michael@server:~$ host -t SOA local
Host local not found: 3(NXDOMAIN)

after reboot:

michael@server:~$ ps -A |grep avahi
  872 ? 00:00:00 avahi-daemon
  873 ? 00:00:00 avahi-daemon
michael@server:~$ avahi-browse -a
+ eth0 IPv4 server [40:61:86:37:**:**] Workstation local
+ eth0 IPv4 Michael-NB Apple File Sharing local
+ eth0 IPv4 Michael-NB Microsoft Windows Network local

Network discovery from my Mac does not work up to this point. Then I restart the avahi-daemon and all is good with the world.

michael@server:~$ sudo service avahi-daemon restart
[sudo] password for michael:
avahi-daemon start/running, process 2051
michael@server:~$ avahi-browse -a
+ eth0 IPv4 server [40:61:86:37:**:**] Workstation local
+ eth0 IPv4 server Apple File Sharing local
+ eth0 IPv4 server _device-info._tcp local
+ eth0 IPv4 Michael-NB Apple File Sharing local
+ eth0 IPv4 Michael-NB Microsoft Windows Network local

At this point my server appears in OS X's Finder and my shares are accessible.

Here is the output from cat /var/log/syslog | grep avahi for the reboot and then the restart of the service:

Oct 24 23:05:21 server avahi-daemon[872]: Found user 'avahi' (UID 105) and group 'avahi' (GID 111).
Oct 24 23:05:21 server avahi-daemon[872]: Successfully dropped root privileges.
Oct 24 23:05:21 server avahi-daemon[872]: avahi-daemon 0.6.25 starting up.
Oct 24 23:05:21 server avahi-daemon[872]: Successfully called chroot().
Oct 24 23:05:21 server avahi-daemon[872]: Successfully dropped remaining capabilities.
Oct 24 23:05:21 server avahi-daemon[872]: Loading service file /services/afpd.service.
Oct 24 23:05:21 server avahi-daemon[872]: Network interface enumeration completed.
Oct 24 23:05:21 server avahi-daemon[872]: Registering new address record for fe80::4261:86ff:fe37:5e63 on eth0.*.
Oct 24 23:05:21 server avahi-daemon[872]: Server startup complete. Host name is server.local. Local service cookie is 2268463974.
Oct 24 23:05:21 server avahi-daemon[872]: Service "server" (/services/afpd.service) successfully established.
Oct 24 23:05:21 server avahi-daemon[872]: Registering HINFO record with values 'I686'/'LINUX'.
Oct 24 23:05:26 server avahi-daemon[872]: Joining mDNS multicast group on interface eth0.IPv4 with address 192.168.1.2.
Oct 24 23:05:26 server avahi-daemon[872]: New relevant interface eth0.IPv4 for mDNS.
Oct 24 23:05:26 server avahi-daemon[872]: Registering new address record for 192.168.1.2 on eth0.IPv4.
Oct 24 23:09:27 server avahi-daemon[872]: Got SIGTERM, quitting.
Oct 24 23:09:27 server avahi-daemon[872]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.1.2.
Oct 24 23:09:27 server init: avahi-daemon main process (872) terminated with status 255
Oct 24 23:09:27 se...


Trent Lloyd (lathiat) wrote :

Hi All, Trent Lloyd here - one of the authors of the Avahi project

The cause of this situation is relatively simple: in all cases I have seen, it is caused by a faulty network driver. You will find that after you restart Avahi, services work for a few minutes and then fail again. Here is the cause:

 => When Avahi starts, it announces to the network that the services exist
 => Machines discover the service and add it to the list of available services
 => Once it nears the expiry time, a query is sent out for the services
 => No response is received, because multicast functionality in the network card driver is broken; multicast queries never reach the avahi daemon
 => Avahi detects this as a "Passive observation of failure" and removes the services from the service list (Mac OS X Bonjour does the same)

The fix is to get a different network card whose driver has working multicast support, or to fix the network card driver. It would be useful to post your "lspci" output to this issue to identify what kind of network card you have, and to mention which one it is (ethernet or wireless).

This is most common on wireless but does occur with some ethernet drivers as well.

Lorenz Bort (lbort) wrote :

Hi all, hi Trent

thanks for your quick response, but I have to disagree with you. If I restart avahi, everything works until the next reboot, which takes a month sometimes... And the workaround with the pre-start script implies that it is really a matter of which service is started first.

Perhaps I have to clarify: it is not a client-server issue, it's server-only:

I am using a Linux box (Mythbuntu 10.04 == server) with netatalk and avahi installed to broadcast the shares to my Mac clients (a MacBook Pro with Leopard, and a MacBook running Tiger). If I boot the server normally, the netatalk shares are missing in avahi, exactly as Michael has described. I can access them manually, by connecting to the IP, but this is not what I want...
My impression is that this happens because avahi is started before netatalk in the boot process. If avahi is started after netatalk (either by restarting it after booting, or by starting netatalk in the pre-start script of avahi), the shares are visible to avahi and correctly advertised. (See the first lines of Michael's post: he is running avahi-browse on the server where avahi and netatalk are running, *not* on one of his clients. As soon as avahi sees and broadcasts the shares, all the Mac clients see the shares and everything works as it should, until the system is rebooted.)

The communication between clients and the server has nothing to do with the problem, at least I haven't seen any correlations. Your explanation may be true for a problem where clients can't see shares avahi is broadcasting, but our problem is that avahi itself doesn't see the shares at startup and doesn't broadcast them at all.

Hope now the problem is clear.

Lorenz

Trent Lloyd (lathiat) wrote :

Lorenz,

Thank-you for the extra information and apologies for misunderstanding the bug and not reading it fully.

I guess that nullifies my theory. I am curious; I guess I will have to try this out, as it doesn't make much sense.

Thanks,
Trent

wch (winston-stdout) wrote :

Lorenz and I are experiencing the same problem - the problem is not a network driver. Once avahi is restarted (after netatalk starts), everything works fine, until a reboot. I have used it for weeks at a time without any problems.

Lorenz Bort (lbort) wrote :

wow, so much activity here, nice!

@trent: I know nothing at all about upstart and how the sequence is controlled there, but if you do, try loading avahi before/after netatalk and see what happens. If loading avahi after netatalk fixes the problem, which I think it will, then changing the default order should be enough to fix it. Maybe (I don't know, but the users of the German forum linked above say so) the problem only occurs if samba is installed and active, too.

If you need more information regarding samba-shares and their (sometimes strange) behavior on mac-clients, let me know, but this is something not directly related to this bug...

michael woodruff (michael-j-w) wrote :

Avahi seems to work forever until a reboot for me as well, but here is the lspci info requested:

00:00.0 Host bridge: Intel Corporation 4 Series Chipset DRAM Controller (rev 03)
00:02.0 VGA compatible controller: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03)
00:1b.0 Audio device: Intel Corporation N10/ICH 7 Family High Definition Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 1 (rev 01)
00:1c.1 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 2 (rev 01)
00:1d.0 USB Controller: Intel Corporation N10/ICH7 Family USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation N10/ICH 7 Family USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
00:1f.2 IDE interface: Intel Corporation N10/ICH7 Family SATA IDE Controller (rev 01)
00:1f.3 SMBus: Intel Corporation N10/ICH 7 Family SMBus Controller (rev 01)
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8101E/RTL8102E PCI Express Fast Ethernet controller (rev 02)

Also, I think the order may not matter either. As you can see, if I stop netatalk and start avahi, it still advertises the AFP service:

michael@server:~$ sudo stop avahi-daemon
avahi-daemon stop/waiting
michael@server:~$ sudo service netatalk stop
Stopping Netatalk Daemons: afpd cnid_metad papd timelord atalkd.
michael@server:~$ sudo start avahi-daemon
avahi-daemon start/running, process 3409
michael@server:~$ avahi-browse -a
+ eth0 IPv4 server Apple File Sharing local
+ eth0 IPv4 server _device-info._tcp local
+ eth0 IPv4 server SFTP File Transfer local
+ eth0 IPv4 server [40:61:86:37:5e:63] Workstation local
+ eth0 IPv4 Michael-NB Apple File Sharing local
+ eth0 IPv4 Michael-NB Microsoft Windows Network local
^CGot SIGINT, quitting.

Sorry to shoot down the two main ideas. Oh, and about 20% of the time avahi works after a reboot. I discovered this while troubleshooting, with about 15 reboots in 2 hours, checking each time.

michael woodruff (michael-j-w) wrote :

My file server has been rebooted several times and is still displaying the problem. avahi-daemon needs to be restarted manually after boot to get it to advertise any services except the workstation one. Could someone post any sort of fix? Maybe just a way to automate the restart once boot is complete? Thanks.

Michael

Rasmus Malthe Jørgensen (malthe-banan) wrote :

Adding the line:

service avahi-daemon start

to the file /etc/rc.local

seems to have worked for me - this should start avahi-daemon at the end of bootup (just before login) - though I'm not an expert so I might be wrong.
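
A variation on this idea is to restart avahi-daemon from rc.local only when the service is actually missing. The following is a rough sketch under stated assumptions: service_advertised is a hypothetical helper name, the match on host and service name is a simple text heuristic on the `avahi-browse` listing, and the 10-second delay is an arbitrary guess.

```shell
#!/bin/sh
# service_advertised: hypothetical helper.
# Reads an avahi-browse listing on stdin and succeeds when some line contains
# both the service name ($2) and the host name ($1, matched as a whole word).
service_advertised() {
    grep -F -- "$2" | grep -qwF -- "$1"
}

# Possible use in /etc/rc.local (not run automatically here; -t makes
# avahi-browse exit after dumping the current list):
#   sleep 10
#   avahi-browse -at | service_advertised "$(hostname)" "Apple File Sharing" \
#       || service avahi-daemon restart
```

Matching on the host name as well avoids being fooled by another machine on the network that advertises the same service type.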

Malthe

wch (winston-stdout) wrote :

This problem seems to have disappeared for me. I'm still running 10.04 (Lucid). I'm not sure exactly when it changed, but it might have been one or two months ago.

Ralemy (reza.alemy) wrote :

The workaround in #17 solved it for me too, and I agree with Malthe that this way it is like restarting the daemon after login. I hope the bug gets fixed.
Cheers,
Rex
Lucid 10.04 64bit server,
Avahi and netatalk from apt-get install
OS X snow leopard

floid (jkanowitz) wrote :

This may or may not relate to this bug, and may or may not be the workaround, but it recently [and only recently] became a recurring issue after accepting some updates on 9.10 machines.

Musing on the source for that mysterious SIGTERM revealed duplicate up/down scripts for dhclient:

in /etc/dhcp3/dhclient-enter-hooks.d/:
-rwxr-xr-x 1 root root 1037 2008-07-27 14:35 avahi-autoipd
-rwxr-xr-x 1 root root 1037 2011-03-04 15:51 avahi-autoipd.dpkg-new

in /etc/dhcp3/dhclient-exit-hooks.d/:
-rwxr-xr-x 1 root root 1039 2008-07-27 14:35 zzz_avahi-autoipd
-rwxr-xr-x 1 root root 1039 2011-03-04 15:51 zzz_avahi-autoipd.dpkg-new

and equivalent mess in /etc/network/if-up.d/ and if-down.d/.

All files diffed identically, so assuming dpkg is at risk to keep making a mess, I removed the earlier copies and kept *.dpkg-new. Things now appear to be working and surviving a "networking restart" and various rude pokings of the dhclient.
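
Duplicates of this kind can be listed before deleting anything. A minimal sketch, assuming the leftover copies all carry the .dpkg-new suffix as in the listings above; list_identical_dpkg_new is a hypothetical helper name, and the function only reports files, it removes nothing.

```shell
#!/bin/sh
# list_identical_dpkg_new: hypothetical helper.
# Prints every FILE.dpkg-new in directory $1 whose content is byte-identical
# to FILE in the same directory.
list_identical_dpkg_new() {
    dir=$1
    for new in "$dir"/*.dpkg-new; do
        [ -e "$new" ] || continue       # the glob matched nothing
        orig=${new%.dpkg-new}           # strip the suffix to get the original name
        if [ -f "$orig" ] && cmp -s "$orig" "$new"; then
            printf '%s\n' "$new"
        fi
    done
}

# Example use (not run automatically here):
#   list_identical_dpkg_new /etc/dhcp3/dhclient-enter-hooks.d
#   list_identical_dpkg_new /etc/dhcp3/dhclient-exit-hooks.d
```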

Hope this helps someone!

floid (jkanowitz) wrote :

[To clarify *my* issue and discovery: random loss of avahi / mDNS *after boot* was determined to be due to the duplicate scripts competing in absurd ways. This may not have anything to do with a separate race with netatalk, but I found this bug based on the SIGTERM logged in https://bugs.launchpad.net/ubuntu/+source/avahi/+bug/624043/comments/9 which could be due to the same SNAFU.]

Jeffrey Paxton (aeiv) wrote :

Hi Guys,

I run Arch Linux on a Thinkpad and two different desktops. I mention this because I only have this problem on one desktop so it is logical to assume the problem may be hardware / driver related. At any rate I found a workaround similar to the one mentioned in post #17.

Just adding /etc/rc.d/avahi-daemon start to rc.local didn't work. Nor did /etc/rc.d/avahi-daemon restart. With or without avahi-daemon declared in rc.conf, it still wasn't working. I made sure that avahi-daemon started after sshd in rc.conf (since I'm trying to file-share over ssh) and still no dice. This is what finally worked for me:

in rc.conf add avahi-daemon just before gdm

in rc.local add

sleep 10
/etc/rc.d/avahi-daemon stop
/etc/rc.d/avahi-daemon start
/etc/rc.d/avahi-daemon restart

Probably overkill. I don't know which part of this solution is actually working around the issue, but it works.

Thanks guys and I hope this helps someone.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in avahi (Ubuntu):
status: New → Confirmed
Rory S (glide3) wrote :

Hi. I might be way off here, but I had the same problem with Fedora 17 (most recent at time of comment posting), RHEL 6.2, and CentOS 6.2.

I have to restart "avahi-daemon.service" and/or "avahi-daemon.socket" with systemctl or service every couple of minutes in order for iMac machines on the network to discover the AFP server automatically in their Finder window. If they restart their machines or come out of standby or sleep, or if I restart the server, then we have to restart this process on the server for them to be able to find the server for a short while again.

My solution was to do all of the following. The first probably helped the most.

1: Very over the top port opening for avahi, netatalk protocols
---------------------------------------------------------------
Notice the UDP and TCP lines.

548 = AFP over TCP
5353 = Multicast DNS
5354 = Multicast DNS Responder IPC

# AFP / Netatalk / Apple Talk Protocol for Trilogy
#
-A INPUT -p tcp -m state -m tcp --dport 548 --state NEW -j ACCEPT
-A INPUT -p udp -m state -m udp --dport 548 --state NEW -j ACCEPT
#
-A INPUT -p tcp -m state -m tcp --dport 5353 --state NEW -j ACCEPT
-A INPUT -p udp -m state -m udp --dport 5353 --state NEW -j ACCEPT
#
-A INPUT -p tcp -m state -m tcp --dport 5354 --state NEW -j ACCEPT
-A INPUT -p udp -m state -m udp --dport 5354 --state NEW -j ACCEPT

2: Change the following line in the /etc/nsswitch.conf
------------------------------------------------------

#hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4 mdns myhostname
hosts: files mdns4_minimal dns mdns4 mdns

3. Stopped the samba and nmb services
-------------------------------------

4. Add the below to rc.local (which needs to be created on some distros)
------------------------------------------------------------------------
sleep 15
/etc/rc.d/avahi-daemon stop
/etc/rc.d/avahi-daemon start

Hope this helps someone, drove me almost crazy.

Regards
Rory

Jussi Sainio (jussi-sainio) wrote :

Ubuntu 12.04, a DHCP-configured local network, and the same problem. Putting lines into rc.local was not a clean enough solution for me, so I investigated this a little further. I had the following symptoms:

/var/log/boot.log:
 * Starting mDNS/DNS-SD daemon [OK]
 * Stopping mDNS/DNS-SD daemon [OK]
[...]
 * Starting configure network device [OK]

/var/log/syslog:
Dec 17 14:01:03 riesling avahi: Avahi detected that your currently configured local DNS server serves a domain .local. This is inherently incompatible with Avahi and thus Avahi disabled itself. If you want to use Avahi in this network, please contact your administrator and convince him to use a different DNS domain, since .local should be used exclusively for Zeroconf technology. For more information, see http://avahi.org/wiki/AvahiAndUnicastDotLocal
Dec 17 14:01:03 riesling dhclient: bound to 192.168.3.130 -- renewal in 40304 seconds.

/var/log/upstart/avahi-daemon.log:
Process 857 died: No such process; trying to remove PID file. (/var/run/avahi-daemon//pid)

(Side note: I hate the lack of timestamps in the two log files. Does anybody know a solution to this?)

While this can be mitigated by disabling the "detect .local" check in /etc/default/avahi-daemon ...:

AVAHI_DAEMON_DETECT_LOCAL=0

...this is a "wrong" (but working) fix, since my network DNS does not serve a .local domain as far as I know. Note that in the logs, the network device is brought up after the attempt to start Avahi. This means that the "detect .local" check is not even done on the actual LAN device, maybe just on the loopback device or on no network devices at all. If we want this "detect .local" feature to work (sounds like a good idea on e.g. mobile zeroconf daemons such as laptops), the Avahi daemon must be started after bringing up the network devices.

A slightly more kosher way to accomplish this is to patch /etc/init/avahi-daemon.conf:

start on (filesystem
          and started dbus and net-device-up IFACE!=lo)

Now, the avahi-daemon starts only after the first non-loopback network device has been brought up and the "detect .local" check should be done there correctly without failing every time.

(Note: I do not know how avahi-daemon works in a changing network scenario, e.g. if it runs on a laptop and the laptop changes network without rebooting. Maybe somebody can test this scenario and make avahi-daemon restart on each net-device-up/down event, if avahi-daemon does not handle this internally. My setup is a server on a home network.)
