ntpd not started when using ntpdate

Bug #1577596 reported by Paul Donohue
This bug affects 35 people
Affects                       Status     Importance  Assigned to  Milestone
init-system-helpers (Ubuntu)  Confirmed  Undecided   Unassigned
ntp (Ubuntu)                  Won't Fix  High        Unassigned

Bug Description

After updating from 14.04 to 16.04 on a number of my systems, ntpd no longer starts at boot on any of those systems.

`systemctl status ntp` shows:
   ntp.service - LSB: Start NTP daemon
   Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:systemd-sysv-generator(8)
May 02 19:10:14 host systemd[1]: Stopped LSB: Start NTP daemon.
May 02 19:10:17 host systemd[1]: Stopped LSB: Start NTP daemon.

Manually starting it using `systemctl start ntp` works fine. However, systemd does not seem to want to start it automatically at boot time.

As best as I can tell based on trial and error, there is something special about the combination of the service being named "ntp.service" and the service depending on network.target. However, I haven't been able to identify exactly what is causing this.

If I copy the init script to any other name, everything works fine:
cp /etc/init.d/ntp /etc/init.d/ntpd
Edit /etc/init.d/ntpd and change "Provides: ntp" to "Provides: ntpd"
systemctl enable ntpd
# After a reboot, ntpd.service is started, but ntp.service is not.

If I remove "$network" from the "# Required-Start: $network $remote_fs $syslog" line in /etc/init.d/ntp, then systemd starts it automatically ... But of course it is started before the network comes up, so it fails.

If I replace /etc/init.d/ntp with a file containing only the following, systemd won't try to start it automatically at boot:
#!/bin/sh
### BEGIN INIT INFO
# Provides: ntp
# Required-Start: $network
# Required-Stop: $network
# Default-Start: 2 3 4 5
# Default-Stop: 1
# Short-Description: Start NTP daemon
### END INIT INFO
echo "script was run" >> /ntp.log

If I rename that same dummy script to /etc/init.d/ntp2, it is started automatically at boot.

However, grepping the systemd source code and my systemd config files for ntp doesn't seem to find anything that might cause this behavior:
/etc/systemd# grep -iR ntp *
timesyncd.conf:#NTP=
timesyncd.conf:#FallbackNTP=ntp.ubuntu.com
/lib/systemd# grep -R ntp *
system/systemd-timesyncd.service.d/disable-with-time-daemon.conf:ConditionFileIsExecutable=!/usr/sbin/ntpd
system/systemd-timesyncd.service.d/disable-with-time-daemon.conf:ConditionFileIsExecutable=!/usr/sbin/openntpd
Binary file systemd-networkd matches
Binary file systemd-timedated matches
Binary file systemd-timesyncd matches

What else can I do to debug this further?

Tags: patch sts
Paul Donohue (s-launchpad-paulsd-com) wrote :

I came across a box that was running Ubuntu 15.10 with ntpd. systemd was automatically starting ntpd on boot on that system:
# systemctl status ntp
   ntp.service - LSB: Start NTP daemon
   Loaded: loaded (/etc/init.d/ntp)
   Active: active (running) since Sun 2016-04-24 14:48:26 EDT; 1 weeks 1 days ago
     Docs: man:systemd-sysv-generator(8)
   CGroup: /system.slice/ntp.service
           └─849 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 106:115
Apr 24 14:44:41 host systemd[1]: Starting LSB: Start NTP daemon...
Apr 24 14:44:41 host ntp[894]: * Starting NTP server ntpd
Apr 24 14:48:26 host ntp[894]: lockfile creation failed: exceeded maximum number of lock attempts
Apr 24 14:48:26 host ntp[894]: ...done.
Apr 24 14:48:26 host systemd[1]: Started LSB: Start NTP daemon.

After updating to 16.04 (ntp:amd64 1:4.2.6.p5+dfsg-3ubuntu8.2 -> 1:4.2.8p4+dfsg-3ubuntu5, systemd:amd64 225-1ubuntu9.1 -> 229-4ubuntu4, systemd-sysv:amd64 225-1ubuntu9.1 -> 229-4ubuntu4), systemd no longer starts ntpd on boot:
# systemctl status ntp
   ntp.service - LSB: Start NTP daemon
   Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:systemd-sysv-generator(8)
May 02 23:22:37 host systemd[1]: Stopped LSB: Start NTP daemon.
May 02 23:22:37 host systemd[1]: Stopped LSB: Start NTP daemon.

Robie Basak (racb)
Changed in ntp (Ubuntu):
importance: Undecided → High
Martin Pitt (pitti) wrote :

I cannot immediately reproduce that by installing ntp on 16.04, that works fine and starts both after package install and after reboot. I suppose that this causes a dependency cycle somewhere. Please do "sudo journalctl -b > /tmp/journal.txt" after a clean boot and attach /tmp/journal.txt.
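
For anyone following along, a saved journal can be searched for the messages systemd logs when it breaks a dependency cycle. This is only a sketch: `find_cycle_hints` is a made-up helper name, and the "ordering cycle" wording is an assumption about what this systemd version logs.

```shell
#!/bin/sh
# Hypothetical helper: print lines from a saved journal that mention an
# ordering cycle or the ntp unit. The message wording is an assumption.
find_cycle_hints() {
    # $1: path to a journal dump, e.g. /tmp/journal.txt
    grep -iE 'ordering cycle|ntp\.service' "$1"
}
```

It would be called as `find_cycle_hints /tmp/journal.txt`; no output simply means neither pattern appeared in that boot's log.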

Changed in systemd (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ntp (Ubuntu):
status: New → Confirmed
Paul Donohue (s-launchpad-paulsd-com) wrote :

Attached.

Also attached the output of `systemctl list-dependencies` in case that helps any.

$ systemctl status ntp
   ntp.service - LSB: Start NTP daemon
   Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:systemd-sysv-generator(8)

May 03 19:28:45 Fusion systemd[1]: Stopped LSB: Start NTP daemon.
May 03 19:28:48 Fusion systemd[1]: Stopped LSB: Start NTP daemon.

$ systemctl is-enabled ntp
ntp.service is not a native service, redirecting to systemd-sysv-install
Executing /lib/systemd/systemd-sysv-install is-enabled ntp
enabled

Paul Donohue (s-launchpad-paulsd-com) wrote :
Mike Wescott (wescott) wrote :

This problem occurs when the ntpdate package is also installed. Remove ntpdate and ntp starts correctly.

This smells like there is some expectation of these being mutually exclusive.

Mike Wescott (wescott) wrote :

On second thought, it may be that ntp can't open port 123 since it's already opened by ntpdate.

Martin Pitt (pitti)
summary: - ntpd not started by systemd
+ ntpd not started when using ntpdate
no longer affects: systemd (Ubuntu)
Paul Donohue (s-launchpad-paulsd-com) wrote :

Aha! I've found the issue.

/etc/network/if-up.d/ntpdate is called when each network interface comes up. This happens before network.target is reached, so it happens before ntp.service would normally be automatically started by systemd.

/etc/network/if-up.d/ntpdate calls `invoke-rc.d ntp stop`, then it runs ntpdate, then it calls `invoke-rc.d ntp start`.

`invoke-rc.d ntp stop` runs `systemctl stop ntp.service`, which causes systemd to cancel the ntp.service start task that was automatically scheduled to happen after network.target.

`invoke-rc.d ntp start` calls `/sbin/runlevel` to determine the current runlevel so that it can verify the existence of a /etc/rc?.d/S??ntp symlink for the current runlevel. However, `/sbin/runlevel` returns "unknown" because systemd has not reached multi-user.target yet. Therefore, invoke-rc.d determines that the appropriate /etc/rc?.d/S??ntp symlink does not exist, so it does not call `systemctl start ntp.service` to start ntp.

Changing /etc/network/if-up.d/ntpdate so that it calls `systemctl start ntp.service` instead of `invoke-rc.d ntp start` fixes the problem.

However, I think I would consider this to be a bug in invoke-rc.d and not ntpdate, since invoke-rc.d simply does not work properly when systemd is being used and invoke-rc.d is called at boot time. At the very least, I would think invoke-rc.d should document that this is unsupported, and it should detect and report this condition if invoke-rc.d is called at boot time (rather than just silently failing).
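
The failure mode described above can be modelled in a few lines of shell. This is a simplified sketch of the decision, not the real invoke-rc.d code: `would_start`, its arguments and the throwaway rc directory tree are illustrative assumptions.

```shell
#!/bin/sh
# Simplified model (NOT the real invoke-rc.d source): a service is started
# only if an S??<name> link exists in the rc directory of the given runlevel.
would_start() {
    # $1: current runlevel (e.g. "2"), or "unknown" early in boot
    # $2: service name
    # $3: directory holding the rc?.d directories (/etc on a real system)
    for link in "$3/rc$1.d"/S[0-9][0-9]"$2"; do
        [ -e "$link" ] && return 0
    done
    return 1
}

# Demo against a temporary tree instead of the real /etc.
root=$(mktemp -d)
mkdir "$root/rc2.d"
touch "$root/rc2.d/S02ntp"

would_start 2 ntp "$root"       && echo "runlevel 2: start"
would_start unknown ntp "$root" || echo "runlevel unknown: skipped silently"
rm -rf "$root"
```

With the runlevel reported as "unknown" there is no rcunknown.d directory to look in, so the start request is dropped without any error, which matches the symptom.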

Robie Basak (racb) wrote :

This reminds me of bug 1575572. invoke-rc.d's behaviour is adjusted there. I wonder if that fix is related to this problem?

Robie Basak (racb) wrote :

Paul, that's a really handy analysis - thank you.

Martin Pitt (pitti) wrote :

Indeed this is a duplicate of bug 1575572, so marking accordingly. Thanks Paul for the analysis!

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in init-system-helpers (Ubuntu):
status: New → Confirmed
Lars Kollstedt (lk-x) wrote :

Hi Martin,

the fix for Bug #1575572 released yesterday turns the situation from bad to worse, in my experience. Previously I saw this only on i686; since installing the init-system-helpers update today, I am seeing it on amd64 as well.

| yellow Other updates (2): apt-get install init init-system-helpers
| init (1.29ubuntu1 1.29ubuntu2)
| init-system-helpers (1.29ubuntu1 1.29ubuntu2)

From my experience this is something that happens when ntpdate isn't ready when systemd tries to start ntpd and ntpd wasn't started before /etc/network/if-up.d/ntpdate is started.

As far as I can see this is because /etc/network/if-up.d/ntpdate from ntpdate and /etc/init.d/ntp from ntp try to handle each other's locks but don't do that properly. The init.d script simply breaks the locks from ntpdate, even if ntpdate is still running.

My workaround was to let the ifup script wait for ntpd. I had already used it on Ubuntu 12.04 LTS (precise); it was unnecessary on Ubuntu 14.04 LTS (trusty), and now it's needed again on 16.04 LTS (xenial).

But that's possibly not the best solution. ;-)

As to the question of why we use ntpdate *and* ntpd: we do this to sync the clocks of physical test servers (which are not always on, to save energy) immediately at boot, whereas during normal operation the gradual adjustment by ntpd is the desired behavior.

So it possibly would be a much better solution (for me) to remove /etc/network/if-up.d/ntpdate, and start the ntpdate before ntpd from /etc/init.d/ntp if it's present, and of course wait until it's ready before starting ntpd.

Kind regards,

Lars Kollstedt (lk-x) wrote :

Sent too fast. :-( Tab, Enter wasn't a good idea in the bug tracker's web GUI. :-( ;-)

The differences between the versions and platforms are probably just timing. But this happens deterministically on xenial; it might be that the time difference was large enough.

Kind regards,
   Lars

John Sopko (sopko) wrote :

I am seeing this on some but not all systems. I can only manually start after first doing a stop:

root@tophat:~# ps -ef | grep ntp
root 2380 2365 0 11:51 pts/8 00:00:00 grep --color=auto ntp
root@tophat:~# systemctl start ntp
root@tophat:~# ps -ef | grep ntp
root 2384 2365 0 11:51 pts/8 00:00:00 grep --color=auto ntp
root@tophat:~# systemctl stop ntp
root@tophat:~# systemctl start ntp
root@tophat:~# ps -ef | grep ntp
ntp 2414 1 0 11:51 ? 00:00:00 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 111:116
root 2416 2365 0 11:51 pts/8 00:00:00 grep --color=auto ntp

John Sopko (sopko) wrote :

Another observation: the ntpdate command is really slow on Ubuntu 14.04 and 16.04. On average it takes about 6.1 seconds to run the ntpdate command; I am running ntpdate after boot. Our Red Hat 6.8 machines take about 0.1 seconds. We manage 300+ Ubuntu 14.04 and 16.04 systems and I checked ntpdate on several of them. I still can't figure out why most machines work but some consistently fail to start ntpd. Even on the ones that do start ntpd, ntpdate still takes 6+ seconds to run. And we have our own in-house stratum 1 time servers that are on the same internal network.

Paul Crawford (psc-sat) wrote :

I don't think this is a duplicate of bug #1575572
I am seeing this problem on an Ubuntu 16.04 machine that has the update "This bug was fixed in the package init-system-helpers - 1.29ubuntu2" mentioned as being a fix for that bug but ntpd is still broken on booting. I am seeing this:

Jul 12 14:14:43 metop ntpd[1933]: unable to bind to wildcard address :: - another process may be running - EXITING
Jul 12 14:14:50 metop ntpdate[1758]: step time server 134.36.22.27 offset 0.264278 sec

So basically the systemd dependency arrangement is broken: ntpdate is now slow (as per comment #16 above) and does not exit before ntpd is started, so ntpd then fails because ntpdate is still running.

A later manual start of ntpd works fine, but that is not an acceptable situation for machines that are unattended and/or used by people that don't have administrative rights (or knowledge) to manually start ntpd later.
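
The race described above could in principle be avoided by waiting for any running ntpdate to exit before ntpd is started. A minimal sketch, assuming pgrep is available; `wait_for_exit` and the 30-second limit in the usage line are illustrative and not taken from either package's scripts.

```shell
#!/bin/sh
# Hypothetical helper: wait until no process with the given name is running,
# polling once per second. Returns 1 if the limit is reached first.
wait_for_exit() {
    # $1: process name (e.g. ntpdate)   $2: maximum seconds to wait
    waited=0
    while pgrep -x "$1" >/dev/null 2>&1; do
        [ "$waited" -ge "$2" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A start script could then guard the daemon with something like `wait_for_exit ntpdate 30 && systemctl start ntp`.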

Alex Bligh (ubuntu-alex-org) wrote :

I'm also seeing this still fail with init-system-helpers 1.29ubuntu2, also suggesting this is not a duplicate of #1575572.

It does not seem to fail reliably, but for what it's worth this was a clean install of 16.04 server on a fast-ish real hardware machine.

Robie Basak (racb) wrote :

OK, unduping for now.

Alex Bligh (ubuntu-alex-org) wrote :

Hmm, I also found this line in my log (this time it booted OK):

Jul 12 18:41:47 redacted.example.org ntpdate[2184]: name server cannot be used: Temporary failure in name resolution (-3)

I'm wondering whether it's either

a) failing to start ntp because ntpdate isn't running because it couldn't resolve names because bind9 had not yet started, and the drift was too large (sometimes)

b) (the opposite) only working when ntpdate times out if it starts its query too early.

Seems to me like ntpdate should be waiting to start until name resolution works - probably ntpd too.

Trent Lloyd (lathiat)
tags: added: sts
Rick Frey (gribnut) wrote :

I'm seeing same behavior as Alex where failure of ntpdate is preventing ntpd from starting. ntpdate is logging same message indicating failure due to name resolution (Bind not running yet at boot):

Sep 16 10:40:09 scorpion ntpdate[1033]: name server cannot be used: Temporary failure in name resolution (-3)

The script /etc/network/if-up.d/ntpdate does attempt to stop any running ntp service prior to running ntpdate but then exits without restarting ntp if ntpdate-debian fails.

I modified /etc/network/if-up.d/ntpdate to only log a message and exit as a workaround and ntpd now starts at boot.

I'm still a bit puzzled why ntpd doesn't attempt to start after /etc/network/if-up.d/ntpdate fails (multiple times in my case due to multiple interfaces on host). When /etc/network/if-up.d/ntpdate runs/fails, there is no log message from ntp indicating it tried to start (neither before nor after ntpdate fails). However, after applying my workaround, the log message for ntp service is one second after last run of /etc/network/if-up.d/ntpdate. I would have thought that the failure of ntpdate would not prevent the later startup of ntp. I'm guessing that /etc/network/if-up.d/ntpdate initiates a stop in parallel to systemd start of ntpd before it logs anything to syslog.
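
If the hook really does leave ntp stopped after a failed ntpdate run, one defensive pattern is to restart the service regardless of how the sync ended. The sketch below is hypothetical and is not the packaged script; the three commands are passed in as arguments so the logic can be tried with harmless stand-ins.

```shell
#!/bin/sh
# Hypothetical wrapper: stop the service, attempt a sync, and always start
# the service again, reporting the sync's exit status to the caller.
sync_then_restart() {
    # $1: stop command   $2: sync command   $3: start command
    $1 || true
    rc=0
    $2 || rc=$?
    $3 || true          # runs even when the sync failed
    return "$rc"
}

# Dry run with stand-ins: "false" plays a failing ntpdate.
sync_then_restart "echo stop ntp" "false" "echo start ntp" \
    || echo "sync failed with status $?"
```

The dry run prints the stop and start lines and then reports status 1, showing that the start step is reached even though the sync failed.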

Robie Basak (racb) wrote :

I think that this belongs to the class of bugs we discussed on ubuntu-devel recently[1].

The summary is:

1. ntpdate is deprecated
2. Various patches got polished, tested and provided to Debian but not yet picked up
3. We are unwilling to add more delta to Debian for a deprecated binary

So if they are merged in Debian we will pick them up in the next release, but we are unlikely to do an SRU for them.

Therefore, I'm marking this bug as Won't Fix for the ntp package in Ubuntu, and encourage users to use systemd-timesyncd instead.

If there is a fix that doesn't involve adding a delta to Ubuntu for the ntpdate package, then we can do that, however.

Robie Basak (racb) wrote :
Changed in ntp (Ubuntu):
status: Confirmed → Won't Fix
Rick Frey (gribnut) wrote :

Thanks for the update Robie. I was not aware of ntpdate being deprecated (appears to have been deprecated years ago).

For those like myself that require ntpd (the suggested alternative systemd-timesyncd uses sntp which may not suffice in all use cases), I think the best fix/workaround is to merely remove the ntpdate package. I didn't really use ntpdate anymore and suspect it was still installed from an earlier version of Ubuntu in my case (I've performed a fair number of inline upgrades on system).

Historically, ntpdate was run prior to starting ntpd in case the clock was too far off for ntpd to sync. In looking at the ntp package further, I see that /etc/default/ntp includes the '-g' option which allows ntpd to perform a one time sync that would accommodate a clock with any delta. This in itself makes ntpdate unneeded for those running ntp service. Additionally, ntpd can also be run with arguments to simulate behavior of ntpdate if needed.
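
A quick way to confirm that '-g' is actually in effect is to read the defaults file. A sketch, assuming the options live in a shell variable named NTPD_OPTS (an assumption about the layout of /etc/default/ntp); `has_panic_override` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the defaults file passes -g to ntpd.
# Assumes the file is a shell fragment that sets NTPD_OPTS.
has_panic_override() {
    # $1: path to the defaults file, e.g. /etc/default/ntp
    (
        NTPD_OPTS=
        . "$1" || exit 2
        for opt in $NTPD_OPTS; do
            case $opt in
                -g) exit 0 ;;
            esac
        done
        exit 1
    )
}
```

For example, `has_panic_override /etc/default/ntp && echo "-g is set"`. For a one-off step of the clock without ntpdate, `ntpd -gq` sets the time once and exits.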

So, if I understand correctly, the ntp package really has no bug (at least related to starting at boot). Issue was really due to bug with deprecated ntpdate package which should be removed if running ntp anyway.

Robie Basak (racb) wrote : Re: [Bug 1577596] Re: ntpd not started when using ntpdate

Hi Rick,

On Mon, Sep 19, 2016 at 02:17:58PM -0000, Rick Frey wrote:
> So, if I understand correctly, the ntp package really has no bug (at
> least related to starting at boot). Issue was really due to bug with
> deprecated ntpdate package which should be removed if running ntp
> anyway.

That's right. In Trusty, you may not be permitted to remove the ntpdate
package without removing a task metapackage (generally undesirable).
IIRC, I noted in the ML thread some options to disable ntpdate instead.

Since Xenial, it makes sense just to not have ntpdate installed at all,
IMHO. Though I still welcome further discussion if there are use cases
this breaks.

Steven (stevenbrs) wrote :

/etc/network/if-up.d/ntpdate executes its code between ()& starting on line 25.
This is to be able to wait until /usr is mounted, when it isn't already.
If that check is disabled, and the run-in-background ampersand is commented out (line 46), the ntp daemon starts just fine.

No matter to further explain my view on this thing, since ntpdate is deprecated ;-)
But maybe it's an easy and quick fix to remove /etc/network/if-up.d/ntpdate from the repo ?
Without that script, there is no problem...
Maybe it can be replaced by a normal startup-script as there was in earlier times ?

Paul Donohue (s-launchpad-paulsd-com) wrote :

My original issue is definitely a duplicate of Bug #1575572 and the fix for that bug solves the specific problem described in Comment #8.

The issues described by Lars Kollstedt and others are a separate issue ... My original issue was that systemd would not start ntp.service if /etc/network/if-up.d/ntpdate was called before systemd started ntp.service on its own ... This new issue is that ntpd will not start if /etc/network/if-up.d/ntpdate is called multiple times (due to multiple network interfaces being brought up), and an ntpdate command is still running when ntpd is started.

I believe this new issue is caused by the fact that /etc/network/if-up.d/ntpdate uses /run/lock/ntpdate as its lock file, but /etc/init.d/ntp uses /var/lock/ntpdate as its lock file. I believe that using the same lock file in both of those scripts should fix this issue. Or, as was mentioned in several comments, removing ntpdate or disabling /etc/network/if-up.d/ntpdate (and if necessary using `ntpd -q` instead of ntpdate to step the clock) should also fix it.
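
One quick way to test this hypothesis on a given system is to check whether the two lock paths resolve to the same file once symlinks are followed. A minimal sketch, assuming GNU `readlink -f`; `same_lock` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if both paths resolve to the same file.
# Returns 2 if either path cannot be resolved at all.
same_lock() {
    a=$(readlink -f "$1") || return 2
    b=$(readlink -f "$2") || return 2
    [ "$a" = "$b" ]
}
```

Calling `same_lock /run/lock/ntpdate /var/lock/ntpdate && echo "same lock file"` shows whether the two scripts are in fact contending for one lock or for two different ones.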

In my case, the reason to have ntpdate installed is for testing/troubleshooting purposes. As far as I can tell, ntpd does NOT have any options that are equivalent to `ntpdate -qu <server>` or `ntpdate -du <server>`. Since ntpdate is being deprecated, should I file a feature request against ntpd to have equivalent options added to ntpd?

Robie Basak (racb) wrote :

According to http://support.ntp.org/bin/view/Dev/DeprecatingNtpdate, it seems that they expect you to use the sntp client for that functionality. Is this sufficient?

But yes - given that ntpdate is deprecated upstream, in general it would make sense to communicate upstream for any lost use cases. In this case, perhaps they have already communicated the answer?

Heinrich Hartl (hkh) wrote :

I am affected too.
Manually starting ntpd using `systemctl start ntp` doesn't work for me.
ps -ef | grep ntpd # shows no started ntpd.
systemctl status ntp # does not change from active (exited); only another last line is added.

root@ILS-AP2:~# systemctl start ntp
root@ILS-AP2:~# ps -ef | grep ntp
root 10186 10166 0 14:22 pts/6 00:00:00 grep --color=auto ntp
root@ILS-AP2:~# systemctl status ntp
● ntp.service - LSB: Start NTP daemon
   Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)
   Active: active (exited) since Mi 2016-11-02 23:50:59 CET; 1 day 14h ago
     Docs: man:systemd-sysv-generator(8)
  Process: 1227 ExecStop=/etc/init.d/ntp stop (code=exited, status=0/SUCCESS)
  Process: 1367 ExecStart=/etc/init.d/ntp start (code=exited, status=0/SUCCESS)

Nov 02 23:50:59 ILS-AP2 systemd[1]: Starting LSB: Start NTP daemon...
Nov 02 23:50:59 ILS-AP2 ntp[1367]: * Starting NTP server ntpd
Nov 02 23:50:59 ILS-AP2 ntp[1367]: ...done.
Nov 02 23:50:59 ILS-AP2 ntpd[1381]: proto: precision = 0.129 usec (-23)
Nov 02 23:50:59 ILS-AP2 systemd[1]: Started LSB: Start NTP daemon.
Nov 02 23:51:05 ILS-AP2 systemd[1]: Started LSB: Start NTP daemon.
Nov 04 12:36:22 ILS-AP2 systemd[1]: Started LSB: Start NTP daemon.
Nov 04 12:38:45 ILS-AP2 systemd[1]: Started LSB: Start NTP daemon.
Nov 04 14:22:21 ILS-AP2 systemd[1]: Started LSB: Start NTP daemon.
root@ILS-AP2:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.1 LTS
Release: 16.04
Codename: xenial

Heinrich Hartl (hkh) wrote :

systemctl restart-failed ntp # was no help
# However after
systemctl stop ntp # changes status active (exited) to inactive (dead)
systemctl start ntp # succeeds.

Paul Donohue (s-launchpad-paulsd-com) wrote :

Yes, it looks to me like sntp is a sufficient replacement for ntpdate.

However, sntp is not currently packaged for Debian or Ubuntu: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=793837

In addition, prior to that bug being introduced, sntp was included in the 'ntp' package, so sntp could not be installed without installing ntpd. It would be useful to be able to install sntp so it can be used for troubleshooting even when using systemd-timesyncd or chrony or another ntp daemon.

Could you add sntp back to the build and package it in a separate 'sntp' package?

Christian Ehrhardt  (paelzer) wrote :

Heinrich,
to check if that is the same or a different bug - is ntpdate installed for you?
If not that is a "different" issue than what was discussed here, so please confirm or if not please open a new bug to be discussed separately.

Christian Ehrhardt  (paelzer) wrote :

Paul,
yeah I recently realized as well that sntp is missing in later releases.

Long Term I really look forward to see how it ends up if ntpsec will replace ntp (that should hopefully fix most open issues by being almost a full rewrite/cleanup, but as everything new surely needs some test/fixup).

We surely can take a look at adding sntp, not sure on the priority though and if eventually that qualifies for a SRU, but I'll mark the bug as triaged as at least for the "missing sntp" part it would be rather clear what to do about it.

Christian Ehrhardt  (paelzer) wrote :

I knew I remembered something about it - so missing sntp is https://bugs.launchpad.net/ubuntu/+source/ntp/+bug/1604010

Keep this bug here for the issue starting ntpd when using ntpdate that has to be viewed under the constraints outlined by rbasak in comment #22 / #23

Heinrich Hartl (hkh) wrote :

hhl@ILS-AP2:~$ which ntpdate
/usr/sbin/ntpdate
hhl@ILS-AP2:~$ dpkg -l | grep ntpdate
ii ntpdate 1:4.2.8p4+dfsg-3ubuntu5.3 i386 client for setting system time from NTP servers
hhl@ILS-AP2:~$

Heinrich Hartl (hkh) wrote :

I hope my comment #35 answered the question raised in comment #32.
I do not fully understand the discussion and I am a newbie in that knowledge domain; however, I would like to have the problem solved.
Please let me know if you think the following attempt makes sense.

Grep'ing through /etc I found /etc/network/if-up.d/ntpdate
which is at least one place where ntpdate is used. However there I find also a suggestion that we could skip that action!
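So I suggest:
# Run as root. Keep a backup of the original hook:
mv /etc/network/if-up.d/ntpdate /etc/network/if-up.d/ntpdate_bak
# Replace the hook with a script that does nothing:
cat > /etc/network/if-up.d/ntpdate <<'EOF'
#!/bin/sh
exit 0
EOF
chmod 755 /etc/network/if-up.d/ntpdate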

# now try a reboot and see if the problem (no ntpd running) persists.

Heinrich Hartl (hkh) wrote :

The solution suggested in comment #36 works for me, ntpd is running after reboot.
Actually I did not create ntpdate_bak in the directory, I simply inserted the exit 0 statement in the script.
========= To resolve the bug
========= I suggest if-up.d/ntpdate is removed from the system.
========= Moreover I suggest ntpd to be started with option -g.
========= Moreover I suggest a surveillance task (cronjob) for ntpd and hwclock update.
And here are my arguments for the suggested solution:

I tried to find a real world situation where if-up.d/ntpdate makes any sense. I could not find one. Apparently it is difficult to pin point the precise cause of the ntpdate/ntpd interference or at least it is difficult to decide how to correct it. This script seems to be the only one using deprecated ntpdate. If script removal is considered too aggressive just disable it by inserting the exit 0 statement.

I assume that upon shutdown the hwclock is set to the latest state of the system clock. Upon (re)boot the time value kept in the hwclock is copied back to the system clock. This should guarantee that the system clock is pretty close to the correct value even very early in the boot sequence. There is no need to leap the time gap as is done by the deprecated ntpdate. During normal operation there is no need either: ntpdate never does a better job than ntpd when called on a system that runs ntpd. Ironically, if-up.d/ntpdate would bring up ntpd, which is not appropriate on a system that was configured to use some other method to maintain a proper clock setting.

The only situation where I can imagine a leap in setting the time being desirable is when the hwclock fails. I had this bad situation on a machine that had been shut down for a long time and whose hwclock battery had stopped providing power. Unless the -g option is used, ntpd dies because the time difference is larger than the panic threshold. The -g option does no harm in normal situations but allows for one big step in time to cope with that situation.

If, for example, ntpd is configured with just one time source and that source suddenly changes by more than the panic threshold, then ntpd would die. To cope with such a situation a surveillance mechanism (e.g. a cron job) should be in place, including a restart mechanism. A cron job could also update the hwclock on a regular (monthly) basis. Some systems really are up for years, and then the hwclock might be off by more than the panic threshold (1000 s) if there is an irregular shutdown (loss of power).
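
The surveillance idea above might look like the following when run from cron. A rough sketch only: `ensure_running` is a hypothetical helper, and the check and restart commands are passed in as arguments so it can be exercised without touching the real service.

```shell
#!/bin/sh
# Hypothetical watchdog step: run the restart command only if the check
# command reports that the daemon is not running.
ensure_running() {
    # $1: command that succeeds while the daemon is alive
    # $2: command that restarts it
    if $1 >/dev/null 2>&1; then
        echo "running"
    else
        echo "restarting"
        $2
    fi
}
```

From root's crontab this could be called as `ensure_running "pgrep -x ntpd" "systemctl restart ntp"`, followed by `hwclock --systohc` to copy the system time to the hardware clock.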

Rob X (robx) wrote :

Hi,

there is a much easier solution.
"/etc/network/if-up.d/ntpdate" uses "/usr/sbin/ntpdate-debian", which reads the config file "/etc/default/ntpdate" (see man ntpdate-debian). If you set NTPOPTIONS="-u" in /etc/default/ntpdate, then "/usr/sbin/ntpdate-debian" uses an unprivileged port (>1023) when ifup runs and doesn't block port udp/123 when ntpd starts.

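For reference, the change above can be scripted. A sketch, assuming GNU sed; `set_unprivileged_port` is a hypothetical helper, and it replaces any existing NTPOPTIONS value, so merge by hand if other options are already set.

```shell
#!/bin/sh
# Hypothetical helper: make sure a defaults file carries NTPOPTIONS="-u".
# Overwrites an existing NTPOPTIONS line; try it on a copy of the file first.
set_unprivileged_port() {
    # $1: path to the defaults file, e.g. a copy of /etc/default/ntpdate
    if grep -q '^NTPOPTIONS=' "$1"; then
        sed -i 's/^NTPOPTIONS=.*/NTPOPTIONS="-u"/' "$1"
    else
        printf 'NTPOPTIONS="-u"\n' >> "$1"
    fi
}
```

Running it against a copy, for example `cp /etc/default/ntpdate /tmp/ntpdate.conf && set_unprivileged_port /tmp/ntpdate.conf`, lets the result be reviewed with diff before it is installed.
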
Heinrich Hartl (hkh) wrote :

I tested the suggestion from comment #38 by robx. ======== It works for me.

Deviating from his proposal I only did the following change:
hhl@ILS-AP2:~$ diff ntpdate /etc/network/if-up.d/ntpdate
9a10
> # exit 0 # hh_161120 - ntpdate prevents start ntpd Ubuntu Bug #1577596
42c43,44
< flock -n /run/lock/ntpdate /usr/sbin/ntpdate-debian -s $OPTS 2>/dev/null || :
---
> # avoid conflict with ntpd: option -u have ntpdate-debian use a non priveledged source port
> flock -n /run/lock/ntpdate /usr/sbin/ntpdate-debian -su $OPTS 2>/dev/null || :
hhl@ILS-AP2:~$

If -u is specified for ntpdate-debian it might also be possible to bypass the stop/start of ntpd, since ntpdate and ntpd are no longer in conflict for port UDP/123. I hope there is no race for other resources (hwclock?). I did disable stop/start and it works for me.
hhl@ILS-AP2:~$ diff ntpdate /etc/network/if-up.d/ntpdate # disabling ntpd stop/start
...
39c40,41
< invoke-rc.d --quiet $service stop >/dev/null 2>&1 || true
---
> # stopping service ntp is no loger required if ntpdate-debian is not using source port UDP/123
> # invoke-rc.d --quiet $service stop >/dev/null 2>&1 || true
...
44c47
< invoke-rc.d --quiet $service start >/dev/null 2>&1 || true
---
> # invoke-rc.d --quiet $service start >/dev/null 2>&1 || true
hhl@ILS-AP2:~$

I still think package ntpdate should not be in the system unless explicitly requested.

I suggest package ntpdate to be modified to include option -u when invoking ntpdate-debian.

The thorough analysis of what went wrong was done by others - see comments. I am confident a simple change solves the bug and that it works in general and not only for me!

Paul Donohue (s-launchpad-paulsd-com) wrote :

I don't think -u would be necessary if /etc/network/if-up.d/ntpdate were using the correct lock file (/var/lock/ntpdate vs /run/lock/ntpdate, as I pointed out in comment #27).

But I do agree that either adding -u or fixing the lock file path would be a simple solution to this problem.

Heinrich Hartl (hkh) wrote :

The attached version of ntpdate solves the problem for me and hopefully for everybody.
I do not mind if comments are removed.
The attached version avoids the conflict between ntpd and ntpdate and also makes the script simpler.

Heinrich Hartl (hkh) wrote :

Please do not blame me too much if I provided a "patch" in the wrong format - but the script does the job for me. ntpd is started and continues to run; ntpdate seems to be invoked twice. The first invocation is too early and fails. The next one does its job but doesn't make a very impressive correction to the system time.

And here is an excerpt from my reboot log:
$ grep -i ntp syslog_1743
Nov 28 17:33:49 localhost kernel: [ 0.076562] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
Nov 28 17:33:49 localhost ntpdate[781]: no servers can be used, exiting
Nov 28 17:34:01 localhost systemd[1]: Starting LSB: Start NTP daemon...
Nov 28 17:34:02 localhost ntp[1285]: * Starting NTP server ntpd
Nov 28 17:34:02 localhost ntpd[1306]: ntpd 4.2.8p4@1.3265-o Wed Oct 5 12:34:48 UTC 2016 (1): Starting
Nov 28 17:34:02 localhost ntpd[1306]: Command line: /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 116:125
Nov 28 17:34:02 localhost ntp[1285]: ...done.
Nov 28 17:34:02 localhost systemd[1]: Started LSB: Start NTP daemon.
Nov 28 17:34:02 localhost ntpd[1309]: proto: precision = 0.129 usec (-23)
Nov 28 17:34:02 localhost ntpd[1309]: restrict 0.0.0.0: KOD does nothing without LIMITED.
Nov 28 17:34:02 localhost ntpd[1309]: restrict ::: KOD does nothing without LIMITED.
Nov 28 17:34:02 localhost ntpd[1309]: Listen and drop on 0 v6wildcard [::]:123
Nov 28 17:34:02 localhost ntpd[1309]: Listen and drop on 1 v4wildcard 0.0.0.0:123
Nov 28 17:34:02 localhost ntpd[1309]: Listen normally on 2 lo 127.0.0.1:123
Nov 28 17:34:02 localhost ntpd[1309]: Listen normally on 3 eth0 192.168.0.12:123
Nov 28 17:34:02 localhost ntpd[1309]: Listen normally on 4 lo [::1]:123
Nov 28 17:34:02 localhost ntpd[1309]: Listen normally on 5 eth0 [fe80::221:9bff:fe4c:d684%2]:123
Nov 28 17:34:02 localhost ntpd[1309]: Listening on routing socket on fd #22 for interface updates
Nov 28 17:34:05 localhost ntpdate[1154]: adjust time server 172.26.195.1 offset -0.000028 sec

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "ntpdate_HH is a revised version of /etc/network/if-up.d/ntpdate" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Christian Ehrhardt  (paelzer) wrote :

Hi Heinrich,
thank you for picking up working on this.
I must beg your pardon as I come by this so late (as I'm currently cleaning bugs too long dormant).

I like the suggestion of a fix by adapting the scripts, but following comments #22 and #23 that would have to be accepted in Debian instead. If you wouldn't mind could you file a bug there and link it here?
Although the reason why we are somewhat reluctant to take the changes (details in the comments that I mentioned) also implies that the chance this is accepted is rather low.

Christian Ehrhardt  (paelzer) wrote :

Hi Everybody,
I looked into this again as a friend asked me. And it turns out there is an update probably worth it.

Two things to mention:
1. the locking issue of /run/lock/ntpdate vs /var/lock/ntpdate (Thanks Paul for finding that).
   While I agree it is an issue by reading the code, I found that it is not that much of an issue
   effectively. The reason is that the base directories are linked.
   # ll /var/lock
   lrwxrwxrwx 1 root root 9 Apr 25 09:51 /var/lock -> /run/lock/

2. The collision of the stop/starting of ntpd due to the ntpdate hook.
   That is a real issue and actually causing more than just your symptom here.
   That part of it will be addressed in bug 1593907 which is about to be SRUed soon I hope.

Now what does this imply for this bug here, I'd say there is quite some hope that the issue resolves once the SRU of bug 1593907 is complete. I couldn't yet reproduce so I can't verify yet.

But I want you involved and get the automatic updates of the SRU process so that you can check any effect on your issue when it is available.
What I'll do is mark this bug as a duplicate of 1593907. Once that one is in proposed, please update this bug over here (not the one I dup it to).
If it fixes the issue - and I hope/expect it does - great; if not, we can un-dup this one again.
