check_ntpmon fails when the host machine runs an LXC container that uses a different NTP service

Bug #1933525 reported by Zachary Zehring
This bug affects 3 people
Affects     Status   Importance   Assigned to   Milestone
NTP Charm   New      Undecided    Unassigned

Bug Description

Machine: Bionic (running chronyd)
LXC: Xenial (running ntpd)

After upgrading the NTP charm to the latest revision (rev 47), I received an alert: "Unknown error: NRPE: Unable to read output". Running the check manually, I saw that it was trying to run an "ntpq" command on a machine running chrony. Looking at the code, I found that ntpmon-ntp-charm/process.py determines which NTP service is running by iterating through the process list with psutil. This is flawed because the host's process list also includes processes running inside containers. In this case ntpd came first in the list, so the check incorrectly used ntpq.

Example:
$ ps aux | egrep 'chronyd|ntpd'
100112 1547742 0.0 0.0 110616 5348 ? Ssl 2020 62:49 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 112:116
_chrony 1701357 0.0 0.0 105588 2876 ? S 15:57 0:00 /usr/sbin/chronyd

As a workaround, I edited the NTPProcess.names list and removed ntpd entirely, so that the check only looks for chronyd (which we know is running).

To fix this, NTPProcess should be reworked to select the correct NTP implementation more reliably.

Drew Freiberger (afreiberger) wrote :

This was found on a focal deployment with cs:ntp deployed as a subordinate to the metal charm, and ceph-fs deployed in an LXD container on top of that metal. Which of the two processes comes up first in the process table, ntpd in the container or chronyd on the metal, is effectively random.

For a very simple fix, since we typically would not run cs:ntp inside a container, you could ignore containerized ntpd processes by requiring PPID of the process to be 1 as well as matching the process name to the list of options.

Suggest changing process.py line 180 from:
                if name in self.names:
to:
                ppid = proc.ppid() if self.PSUTIL2 else proc.ppid
                if ppid == 1 and name in self.names:
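
To illustrate the effect of that change, here is a minimal, self-contained sketch of the PPID filter. It deliberately does not use psutil so it can run anywhere; Proc, NTP_NAMES and find_host_ntp are hypothetical names, not part of process.py, and the parent PID given for the containerized ntpd is made up for illustration.

```python
from collections import namedtuple

# Hypothetical record type; in the charm these values would come from psutil.
Proc = namedtuple("Proc", "pid ppid name")

# Assumed candidate list; the real NTPProcess.names may differ.
NTP_NAMES = ("chronyd", "ntpd")


def find_host_ntp(procs, names=NTP_NAMES):
    """Return the first process named in `names` whose parent is PID 1.

    Seen from the host, a daemon inside a container is a child of the
    container's init, which is not PID 1 on the host, so it is skipped.
    """
    for proc in procs:
        if proc.ppid == 1 and proc.name in names:
            return proc
    return None


# Process table modelled on the ps output in the description.
procs = [
    Proc(1547742, 1540000, "ntpd"),  # in the container; ppid is invented
    Proc(1701357, 1, "chronyd"),     # on the host, started by init
]

match = find_host_ntp(procs)
print(match.name if match else "no host NTP daemon found")
```

Without the ppid condition the loop would return ntpd here, since it is listed first. Note that this relies on the host daemon being a direct child of PID 1, which may not hold for every init setup.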

Ultimately, checking which package(s) are installed on the current system would be a much more reliable way to determine which self.names should be used.
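
A rough sketch of what the package-based approach could look like, assuming a Debian/Ubuntu host where dpkg-query is available and where the "chrony" package provides chronyd and the "ntp" package provides ntpd. The function names and the package-to-process mapping are hypothetical and are not taken from the charm.

```python
import subprocess


def package_installed(package, run=subprocess.run):
    """Return True if dpkg reports `package` as fully installed."""
    try:
        result = run(
            ["dpkg-query", "-W", "-f=${Status}", package],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except OSError:
        # dpkg-query is missing, e.g. on a non-Debian system.
        return False
    return (result.returncode == 0
            and result.stdout.strip() == "install ok installed")


def candidate_names(is_installed=package_installed):
    """Return process names only for NTP packages installed on this system."""
    # Assumed mapping of package name -> daemon process name.
    mapping = (("chrony", "chronyd"), ("ntp", "ntpd"))
    return [proc for pkg, proc in mapping if is_installed(pkg)]


# Demo with a stub in place of dpkg, modelling a host with only chrony.
print(candidate_names(lambda pkg: pkg == "chrony"))
```

The resulting list could then replace the hard-coded names before the process scan, so that a container's ntpd is never considered on a host that only has chrony installed.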
