The default requirement to have 4 or more NTP servers is too strict

Bug #1934876 reported by Nobuto Murata
This bug affects 1 person
Affects: NTP Charm
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

At the moment, the ntp charm has the following Nagios checks:
https://jaas.ai/ntp/47#charm-config-nagios_ntpmon_checks
"offset peers reach sync proc vars"

If I'm not mistaken, "peers" checks the number of available NTP servers and gives a warning/error if there are fewer than 3.

In a typical firewalled environment, it's pretty common to have two AD servers, two DNS servers, and two NTP servers in a local network. So we would like to check if "at least 2 ntp servers are available" out of the box instead.

Drew Freiberger (afreiberger) wrote :

Thanks, Nobuto. I agree with the sentiment that this should be configurable.

To that end, I've opened a request upstream for configurable metric alerting thresholds in check_ntpmon.py.

https://github.com/paulgear/ntpmon/issues/14

In the meantime, you can disable the check by removing "peers" from nagios_ntpmon_checks in your specific deployment; the reach check should still give you enough feedback about NTP server connectivity.
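As a rough illustration of that workaround, the sketch below derives the check list without "peers" from the default value quoted in the bug description and prints a matching `juju config` command. The application name `ntp` is an assumption, and `without_check` is a hypothetical helper, not part of the charm.

```python
# Default value of nagios_ntpmon_checks as quoted in the bug description.
DEFAULT_CHECKS = "offset peers reach sync proc vars"


def without_check(checks: str, name: str) -> str:
    """Return the space-separated check list with `name` removed."""
    return " ".join(c for c in checks.split() if c != name)


new_value = without_check(DEFAULT_CHECKS, "peers")

# Assumption: the charm is deployed under the application name "ntp".
print(f'juju config ntp nagios_ntpmon_checks="{new_value}"')
```

The remaining checks keep their original order, so only the peers count stops being evaluated.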

The upstream recommendation is to switch to tracking NTP via the telegraf plugin and to configure alerting thresholds within Alertmanager.

As a philosophical side note, NTP keeps an agreed time across a platform better with more sources than with fewer. Given this, we should encourage adding NTP sources in environments where only two servers are available.

Changed in ntp-charm:
status: New → Confirmed
Nobuto Murata (nobuto) wrote :

It looks like 4 is actually the minimum number of peers needed for an OK status:

https://github.com/paulgear/ntpmon
> peers:
> Are there more than the minimum number of peers active? The NTP
> algorithms require a minimum of 3 peers for accurate clock management; to
> allow for failure or maintenance of one peer at all times, NTPmon returns
> OK for 4 or more configured peers, CRITICAL for 1 or 0, and WARNING for
> 2-3.
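The thresholds quoted above, and the configurable variant this bug asks for, can be sketched as follows. This is not ntpmon's actual code; the function name and the `warn_below` and `crit_below` parameters are hypothetical and only mirror the quoted rule (OK for 4 or more, WARNING for 2-3, CRITICAL for 1 or 0).

```python
def classify_peers(peers: int, warn_below: int = 4, crit_below: int = 2) -> str:
    """Map a peer count to a Nagios-style status.

    The defaults reproduce the rule quoted from the ntpmon README;
    the two threshold parameters are illustrative assumptions.
    """
    if peers < crit_below:
        return "CRITICAL"
    if peers < warn_below:
        return "WARNING"
    return "OK"


# Quoted defaults: 3 peers is only a WARNING.
print(classify_peers(3))
# A site with two local NTP servers could relax the thresholds.
print(classify_peers(2, warn_below=2, crit_below=1))
```

With the relaxed thresholds, two reachable servers report OK and a single remaining server reports WARNING, which matches the "at least 2 ntp servers are available" behaviour requested in the description.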

summary: - The default requirement to have 3 or more NTP servers is too strict
+ The default requirement to have 4 or more NTP servers is too strict
Nobuto Murata (nobuto) wrote :

Just for the record, here is the output from an actual environment.

[peers=20 by using the default pools]

$ juju run-action --wait nrpe/0 run-nrpe-check name=check-ntpmon
unit-nrpe-0:
  UnitId: nrpe/0
  id: "12"
  results:
    check-output: 'OK: offset is 0.000289 | frequency=0.234000 offset=0.000289 peers=20
      reach=96.875000 result=0 rootdelay=0.001704 rootdisp=0.000376 runtime=17650
      stratum=2 sync=1.000000 sysjitter= sysoffset=-0.000064300 tracehosts= traceloops=
      tracetime='
  status: completed
  timing:
    completed: 2022-03-15 12:07:43 +0000 UTC
    enqueued: 2022-03-15 12:07:42 +0000 UTC
    started: 2022-03-15 12:07:43 +0000 UTC

[peers=3]

$ juju run-action --wait nrpe/0 run-nrpe-check name=check-ntpmon
unit-nrpe-0:
  UnitId: nrpe/0
  id: "18"
  results:
    check-output: 'WARNING: Number of peers is too low (3) - should be greater than
      3 | frequency=0.234000 offset=0.000573 peers=3 reach=62.500000 result=0 rootdelay=0.226154
      rootdisp=0.031747 runtime=114 stratum=3 sync=1.000000 sysjitter= sysoffset=-0.000342362
      tracehosts= traceloops= tracetime='
  status: completed
  timing:
    completed: 2022-03-15 12:11:29 +0000 UTC
    enqueued: 2022-03-15 12:11:29 +0000 UTC
    started: 2022-03-15 12:11:29 +0000 UTC
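For comparing runs like the two above, a minimal sketch of splitting the check-output string into status, message, and metrics may help. It assumes the usual Nagios plugin layout of "STATUS: message | key=value ..." seen in the output; `parse_check_output` is a hypothetical helper, and the sample line is an abridged, re-joined copy of the peers=3 result.

```python
def parse_check_output(line: str):
    """Split 'STATUS: message | k=v k=v ...' into its three parts."""
    text, _, perf = line.partition("|")
    status, _, message = text.partition(":")
    metrics = {}
    for token in perf.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # skip anything that is not key=value
        try:
            metrics[key] = float(value)
        except ValueError:
            metrics[key] = None  # empty values such as "sysjitter="
    return status.strip(), message.strip(), metrics


# Abridged from the peers=3 run above, with the YAML line wrapping undone.
sample = (
    "WARNING: Number of peers is too low (3) - should be greater than 3 | "
    "frequency=0.234000 offset=0.000573 peers=3 reach=62.500000 result=0 "
    "stratum=3 sync=1.000000 sysjitter="
)

status, message, metrics = parse_check_output(sample)
print(status, metrics["peers"], metrics["reach"])
```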
