systemd starts two copies of ntp at boot time

Bug #1632575 reported by Marius Gedminas
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
ntp (Ubuntu)
Expired
Low
Unassigned

Bug Description

My system monitoring check noticed a stale /var/run/ntpd.pid pidfile on a xenial machine that rebooted at 2 AM after unattended upgrades. ntpd itself was running, but with a different pid.

Digging through journalctl I can see

    Oct 12 02:00:11 xenial64 ntp[681]: * Starting NTP server ntpd
    Oct 12 02:00:11 xenial64 systemd[1]: Started Getty on tty1.
    Oct 12 02:00:11 xenial64 systemd[1]: Reached target Login Prompts.
    Oct 12 02:00:11 xenial64 systemd[1]: Stopped LSB: Start NTP daemon.
    Oct 12 02:00:11 xenial64 systemd[1]: Stopped LSB: Start NTP daemon.
    Oct 12 02:00:11 xenial64 run-parts[760]: run-parts: executing /etc/redis/redis-server.post-up.d/00_example
    Oct 12 02:00:11 xenial64 systemd[1]: Starting LSB: Start NTP daemon...
    Oct 12 02:00:11 xenial64 ntpd[719]: ntpd 4.2.8p4@1.3265-o Wed Oct 5 12:34:45 UTC 2016 (1): Starting
    Oct 12 02:00:11 xenial64 ntpd[719]: Command line: /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 109:114
    Oct 12 02:00:11 xenial64 ntp[770]: * Starting NTP server ntpd
    Oct 12 02:00:11 xenial64 ntpd[778]: ntpd 4.2.8p4@1.3265-o Wed Oct 5 12:34:45 UTC 2016 (1): Starting
    Oct 12 02:00:11 xenial64 ntpd[778]: Command line: /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 109:114
    Oct 12 02:00:11 xenial64 ntp[770]: ...done.
    Oct 12 02:00:11 xenial64 systemd[1]: Started LSB: Start NTP daemon.
    Oct 12 02:00:11 xenial64 systemd[1]: Started Advanced key-value store.
    Oct 12 02:00:11 xenial64 ntpd[773]: proto: precision = 0.077 usec (-24)
    Oct 12 02:00:11 xenial64 ntpd[780]: proto: precision = 0.076 usec (-24)
    Oct 12 02:00:11 xenial64 ntpd[780]: unable to bind to wildcard address :: - another process may be running - EXITING
    Oct 12 02:00:11 xenial64 ntpd[773]: Listen and drop on 0 v6wildcard [::]:123
    Oct 12 02:00:11 xenial64 ntpd[773]: Listen and drop on 1 v4wildcard 0.0.0.0:123

As you can see, for some reason systemd starts ntpd twice, so there's a race to take over the socket, and a race to take over the pidfile. ntpd[773] gets the socket, ntpd[780] gets the pidfile.

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: ntp 1:4.2.8p4+dfsg-3ubuntu5.3
ProcVersionSignature: Ubuntu 4.4.0-42.62-generic 4.4.21
Uname: Linux 4.4.0-42-generic x86_64
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: amd64
Date: Wed Oct 12 06:18:29 2016
JournalErrors:
 Error: command ['journalctl', '-b', '--priority=warning', '--lines=1000'] failed with exit code 1: Hint: You are currently not seeing messages from other users and the system.
       Users in the 'systemd-journal' group can see all messages. Pass -q to
       turn off this notice.
 No journal files were opened due to insufficient permissions.
ProcEnviron:
 LC_CTYPE=lt_LT.UTF-8
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: ntp
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Marius Gedminas (mgedmin) wrote :
Revision history for this message
Nish Aravamudan (nacc) wrote :

Thank you for filing this bug, it does look like a race.

We probably will need to see the systemd service files for ntp, at least. I wonder if any of the `systemd-analyze` subcommands might be helpful here to determine why ntpd is started twice.

Changed in ntp (Ubuntu):
status: New → Triaged
importance: Undecided → Low
Revision history for this message
Marius Gedminas (mgedmin) wrote :

Unfortunately there were more unattended upgrades that require a reboot since I filed this bug, so systemd-analyze can't see what happened any more.

As for systemd service files, I believe ntp has none. systemctl status ntp says

   Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)

which AFAIU means it's still using legacy sysvinit scripts.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Yeah this is just using a generated sytsemd unit for the sysv init file.
Never the less I wasn't able to trigger that throughout some ntp upgrades in containers.

The bug was already inactive for very long, but I think the reason is currently that it isn#t really actionable.

@Marius - was that a one time issue or did you get that to be reproducible and debuggable back then?

Changed in ntp (Ubuntu):
status: Triaged → Incomplete
Revision history for this message
Marius Gedminas (mgedmin) wrote :

No, I never saw this again.

The very next day after filing this bug I had another monitoring warning about a stale ntpd pidfile, but this time there was no second ntpd process running, and journalctl showed a simple conflict with ntpdate during startup.

I apt-get removed ntpdate and never saw any more ntpd problems again on this machine.

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for ntp (Ubuntu) because there has been no activity for 60 days.]

Changed in ntp (Ubuntu):
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.