Daemon won't start at boot up (18LTS fully patched)

Bug #1774788 reported by Bill Gradwohl
This bug affects 1 person
Affects          Status      Importance   Assigned to   Milestone
rsync            Unknown     Unknown
rsync (Ubuntu)   Won't Fix   Low          Unassigned

Bug Description

After adding the 'address=' option to /etc/rsyncd.conf, the daemon fails to start at boot.

Once the NIC(s) are up, it starts fine when run manually via systemctl start rsync.

● rsync.service - fast remote file copy program daemon
   Loaded: loaded (/lib/systemd/system/rsync.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sat 2018-06-02 08:01:31 CST; 52min ago
  Process: 851 ExecStart=/usr/bin/rsync --daemon --no-detach (code=exited, status=10)
 Main PID: 851 (code=exited, status=10)

Jun 02 08:01:31 billlaptop.private.ycc systemd[1]: Started fast remote file copy program daemon.
Jun 02 08:01:31 billlaptop.private.ycc rsyncd[851]: rsyncd version 3.1.2 starting, listening on port 873
Jun 02 08:01:31 billlaptop.private.ycc rsyncd[851]: bind() failed: Cannot assign requested address (address-family 2)
Jun 02 08:01:31 billlaptop.private.ycc systemd[1]: rsync.service: Main process exited, code=exited, status=10/n/a
Jun 02 08:01:31 billlaptop.private.ycc rsyncd[851]: unable to bind any inbound sockets on port 873
Jun 02 08:01:31 billlaptop.private.ycc systemd[1]: rsync.service: Failed with result 'exit-code'.
Jun 02 08:01:31 billlaptop.private.ycc rsyncd[851]: rsync error: error in socket IO (code 10) at socket.c(555) [Receiver=3.1.2]

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: rsync 3.1.2-2.1ubuntu1
ProcVersionSignature: Ubuntu 4.15.0-22.24-generic 4.15.17
Uname: Linux 4.15.0-22-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.1
Architecture: amd64
CurrentDesktop: GNOME
Date: Sat Jun 2 08:48:15 2018
InstallationDate: Installed on 2018-06-01 (0 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Release amd64 (20180426)
SourcePackage: rsync
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Bill Gradwohl (0cs935-bill) wrote :
Andreas Hasenack (ahasenack) wrote :

Thanks for filing this bug in Ubuntu.

The scenario is confirmed: when specifying an address to bind to, rsync will fail if that address is not available:
/var/log/syslog:Jun 4 21:12:27 bionic-rsync-1774788 rsyncd[269]: rsyncd version 3.1.2 starting, listening on port 873
/var/log/syslog:Jun 4 21:12:27 bionic-rsync-1774788 rsyncd[269]: bind() failed: Cannot assign requested address (address-family 2)
/var/log/syslog:Jun 4 21:12:27 bionic-rsync-1774788 rsyncd[269]: unable to bind any inbound sockets on port 873
/var/log/syslog:Jun 4 21:12:27 bionic-rsync-1774788 rsyncd[269]: rsync error: error in socket IO (code 10) at socket.c(555) [Receiver=3.1.2]

The solution seems to be for rsync to adopt the socket option IP_FREEBIND.

From the ip(7) manpage:
IP_FREEBIND (since Linux 2.4)
If enabled, this boolean option allows binding to an IP
address that is nonlocal or does not (yet) exist. This permits
listening on a socket, without requiring the underlying
network interface or the specified dynamic IP address to be up
at the time that the application is trying to bind to it.
This option is the per-socket equivalent of the ip_nonlocal_bind
/proc interface described below.

Until then, one workaround would be to configure the service file for rsyncd to wait for the network to be online.

If you run:
sudo systemctl edit rsync.service

It will open an editor. Put these lines in:
[Unit]
After=network.target network-online.target

Then save. That will create /etc/systemd/system/rsync.service.d/override.conf with the two lines above. Alternatively you can just create the file above directly with the specified content without going through "systemctl edit".

You can then reboot and see if that helps. Note that the job will still fail, even after the network-online target has been reached, if you specify an IP address that doesn't exist at that stage.
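For anyone who prefers to create the override file directly, the step can be sketched as a small shell helper. This is a minimal sketch, not a tested fix: write_rsync_override is a hypothetical name, and the Wants= line is an addition that is not in the comment above. It follows the advice on the systemd NetworkTarget page to pair After= with Wants= so that network-online.target is actually pulled in.

```shell
#!/bin/sh
# write_rsync_override DIR  (hypothetical helper, not part of rsync or systemd)
# Writes the drop-in into DIR, normally /etc/systemd/system/rsync.service.d.
write_rsync_override() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    # After= only orders rsync behind the targets. Wants= is an addition
    # (an assumption, see above) that also pulls network-online.target
    # into the boot transaction.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Unit]
After=network.target network-online.target
Wants=network-online.target
EOF
}

# Typical use, as root:
#   write_rsync_override /etc/systemd/system/rsync.service.d
#   systemctl daemon-reload
```

The daemon-reload step is needed only when the file is written by hand; "systemctl edit" reloads the unit on its own.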

Another option, which I saw in https://unix.stackexchange.com/questions/442181/sshd-failed-due-to-network-not-yet-available but haven't tested, is to make a system-wide change in /proc.
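Assuming the system-wide change in question is the ip_nonlocal_bind setting named in the ip(7) excerpt above, a rough, untested sketch of inspecting it and making it persistent could look like this. Both function names and the file name 90-nonlocal-bind.conf are hypothetical. Note that the setting affects every IPv4 socket on the machine, not just rsync.

```shell
#!/bin/sh
# Print the current value: 0 = bind() to a missing address fails,
# 1 = such binds are allowed. The optional argument only exists so the
# function can be pointed at a different file.
show_nonlocal_bind() {
    cat "${1:-/proc/sys/net/ipv4/ip_nonlocal_bind}"
}

# write_nonlocal_bind_conf DIR  (hypothetical helper)
# Writes a sysctl.d fragment into DIR so the setting survives reboots.
# The file name is an arbitrary choice.
write_nonlocal_bind_conf() {
    conf_dir=$1
    mkdir -p "$conf_dir" || return 1
    printf 'net.ipv4.ip_nonlocal_bind = 1\n' > "$conf_dir/90-nonlocal-bind.conf"
}

# Typical use, as root:
#   write_nonlocal_bind_conf /etc/sysctl.d
#   sysctl --system    # reload all sysctl configuration files
```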

Changed in rsync (Ubuntu):
status: New → Triaged
importance: Undecided → Low
Bill Gradwohl (0cs935-bill) wrote :

I'm impressed!

Bill Gradwohl (0cs935-bill) wrote :

I'm a recent convert to Ubuntu, having been on Fedora from FC2 through F26; quite a long time. Over the last few years, their support became nonexistent. Bugs reported would not even be acknowledged, much less worked on. To have someone actually respond to a bug report is so unusual in my experience that I'm pleasantly shocked. Thank You. Thank You.

Shouldn't services (sshd, rsync, etc) that depend on a NIC being available ALL wait till that's true? Shouldn't the fix presented for this bug become SOP for all the control files for services that require a NIC to be up?

Andreas Hasenack (ahasenack) wrote :

The default behaviour of these services is to bind to "0.0.0.0", which essentially means "any IP". That means they will work even when the NIC isn't there yet, because localhost is. It also means that once a new NIC comes up, they will listen on all its addresses too, without any other changes in configuration or restarts. The same applies when an address disappears. It's definitely a good default for network services.

The moment you tell a service to bind to a specific IP, however, if that IP is not available then it will fail. That's where IP_FREEBIND comes in. It requires code changes, though.

This systemd upstream article talks about the pros and cons of depending on network-online.target: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/. In general, it should be avoided if possible.

Athos Ribeiro (athos-ribeiro) wrote :

Upstream seems to disagree with the IP_FREEBIND implementation, as per https://bugzilla.samba.org/show_bug.cgi?id=13463. The reasoning is that the daemon should fail for invalid IP addresses.

Adding After=network-online.target is frowned upon by systemd upstream.

Following up with upstream to exit with a non-zero code seems to be the path forward here.

tags: added: server-todo
Robie Basak (racb)
tags: added: network-online-ordering
Andreas Hasenack (ahasenack) wrote :

This is a class[1] of bugs for which we cannot come up with a general solution that will safely and sanely apply to all scenarios. For such cases, local configuration changes should be made to accommodate the intended behavior in each case.

We believe that, in this particular case, since the configuration was explicitly changed to use a specific IP, you should continue with the changes and adjust the systemd unit file for rsync to cope with that, either by ordering it after the network-online target or by something else that explicitly waits for that very interface to come up. systemd offers mechanisms for such overrides, and they are described in more detail in comment #2.
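As one possible shape for "something else that explicitly waits", here is a hedged sketch of a polling helper that blocks until a given IPv4 address is assigned to some interface. The function name wait_for_addr, the 30-second default, and the address 192.0.2.10 used in the comments are all placeholders, and the approach has not been tested against this bug.

```shell
#!/bin/sh
# wait_for_addr ADDRESS [TIMEOUT_SECONDS]  (hypothetical helper)
# Polls once per second until ADDRESS is assigned to some interface.
# Returns 0 when the address is found, 1 if the timeout (default 30)
# runs out first.
wait_for_addr() {
    addr=$1
    remaining=${2:-30}
    # "ip -o -4 addr show" prints one line per IPv4 address, such as
    # "2: eth0    inet 192.0.2.10/24 ...", so match "inet ADDRESS/" literally.
    until ip -o -4 addr show 2>/dev/null | grep -qF "inet $addr/"; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Example: wait up to 30 seconds for the placeholder address 192.0.2.10.
#   wait_for_addr 192.0.2.10 30
```

Saved as a script that ends with wait_for_addr "$@" (for example under the hypothetical path /usr/local/bin/wait-for-addr), it could be hooked into a drop-in with an ExecStartPre= line in the [Service] section. Time spent waiting there counts against the unit's start timeout.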

Regarding the "systemctl start rsync" exit status, that's the way it works with Type=simple systemd services. From the systemd.service manpage:

"""
If set to simple (the default if ExecStart= is specified but neither Type= nor BusName= are) the service manager will consider the unit started immediately after the main service process has been forked off. (...)
Note that this means systemctl start command lines for simple services will report success even if the service's binary cannot be invoked successfully
"""

I tried Type=exec, but it still behaved in the same way (as the error happens after rsync starts up, i.e., the binary was executed).

With Type=forking I got a bit further, but the timeout needs tuning:

root@j1-rsyncd:~# time systemctl start rsync
Job for rsync.service failed because a timeout was exceeded.
See "systemctl status rsync.service" and "journalctl -xeu rsync.service" for details.

real 1m30.246s

With TimeoutStartSec=5 in the unit file it's better:

root@j1-rsyncd:~# time systemctl start rsync
Job for rsync.service failed because a timeout was exceeded.
See "systemctl status rsync.service" and "journalctl -xeu rsync.service" for details.

real 0m5.287s

I think the most reliable way would be Type=notify, but that requires rsync code changes to support systemd's notify mechanism.

In summary, for the specific case of this bug, we believe that systemd overrides are the best answer for now. To detect startup errors immediately, I'm willing to file a separate bug.

1. https://bugs.launchpad.net/ubuntu/+bugs?field.tag=network-online-ordering

Changed in rsync (Ubuntu):
status: Triaged → Won't Fix
Simon Déziel (sdeziel) wrote :

When rsyncd cannot find the address it was told to bind to, it exits with rc=10 and systemd doesn't even attempt a restart.

To make it restart in that case, the systemd unit should have `Restart=on-failure` added. Restarting on failure is what systemd recommends for long-running daemons, so I proposed this in https://github.com/WayneD/rsync/pull/302.

Alternatively, using `RestartForceExitStatus=10` would also work.
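Until the packaged unit carries such a setting, the same effect can be approximated locally with a drop-in. The sketch below is an illustration only: write_rsync_restart_dropin and the file name restart.conf are hypothetical, and RestartSec=5 is an assumption that is not part of the proposal above. It is there because, with systemd's default restart delay, a daemon that fails immediately may use up the start rate limit before the address appears.

```shell
#!/bin/sh
# write_rsync_restart_dropin DIR  (hypothetical helper)
# Writes the drop-in into DIR, normally /etc/systemd/system/rsync.service.d.
write_rsync_restart_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    # Restart=on-failure covers the rc=10 exit described above.
    # RestartSec=5 is an assumption: it spaces the retries out so that
    # repeated quick failures are less likely to hit the start rate limit.
    cat > "$dropin_dir/restart.conf" <<'EOF'
[Service]
Restart=on-failure
RestartSec=5
EOF
}

# Typical use, as root:
#   write_rsync_restart_dropin /etc/systemd/system/rsync.service.d
#   systemctl daemon-reload
```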
