MAAS does not use NTP servers specified in DHCPD options

Bug #1257082 reported by Brian Rzycki
This bug report is a duplicate of: Bug #427775: ntpdate.dhcp always ignored.
This bug affects 4 people
Affects              Status       Importance  Assigned to  Milestone
MAAS                 Invalid      Undecided   Unassigned
isc-dhcp (Ubuntu)    Invalid      Undecided   Unassigned
  Precise            New          Undecided   Unassigned
  Trusty             New          Undecided   Unassigned
ntp (Debian)         New          Unknown
ntp (Ubuntu)         Confirmed    Medium      Unassigned
  Precise            In Progress  Undecided   Unassigned
  Trusty             In Progress  Undecided   Unassigned

Bug Description

[Impact]

MAAS-deployed systems *that do not have persistent RTCs* (unusual) have difficulty with time and authentication, generally making these nodes unusable.

[Workaround]

See comment #8.

[Original Description]

I have tried setting up NTP servers as DHCP options to MAAS nodes because I am behind a proxy here at work that cannot contact ntp.ubuntu.com. Here is the top of the dhcpd.conf file on my MAAS head node, maas01:

root@maas01:/etc/dhcp# less dhcpd.conf
default-lease-time 600;
max-lease-time 7200;

subnet 192.168.0.0 netmask 255.255.0.0 {
  option domain-name "mgmt";
  option domain-name-servers 192.168.255.254;
  option routers 192.168.255.254;

  pool {
    range 192.168.0.1 192.168.255.253;
    deny unknown-clients;
  }
}

subnet 10.255.0.0 netmask 255.255.0.0 {
  option domain-name "maas";
  option domain-name-servers 10.255.0.1;
  option routers 10.255.0.1;
  option ntp-servers 172.31.22.1, 172.31.23.1, 172.31.20.104;
  next-server 10.255.0.1;

  pool {
    range 10.255.1.0 10.255.255.254;
    deny unknown-clients;
  }
}

I have also verified the parameter is being sent to a client’s DHCP lease:

ubuntu@sled204n0:/var$ cat /var/lib/dhcp/dhclient.eth0.leases
lease {
  interface "eth0";
  fixed-address 10.255.4.44;
  option subnet-mask 255.255.0.0;
  option routers 10.255.0.1;
  option dhcp-lease-time 600;
  option dhcp-message-type 5;
  option domain-name-servers 10.255.0.1;
  option dhcp-server-identifier 10.255.0.1;
  option ntp-servers 172.31.22.1,172.31.23.1,172.31.20.104;
  option domain-name "maas";
  renew 4 2000/01/06 19:40:51;
  rebind 4 2000/01/06 19:40:51;
  expire 4 2000/01/06 19:40:51;
}
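
The same check can be scripted. The following is only a sketch, assuming the lease file format shown above and assuming that dhclient appends newer leases at the end of the file; the function name lease_ntp_servers is made up for illustration.

```sh
#!/bin/sh
# Print the NTP servers from the last lease in a dhclient lease file,
# one address per line. lease_ntp_servers is a hypothetical helper.
lease_ntp_servers() {
    lease_file=$1
    # Take the value of every "option ntp-servers ...;" line, keep only the
    # last one (assumed to be the newest lease), then split on commas.
    sed -n 's/^[[:space:]]*option ntp-servers[[:space:]]*\(.*\);[[:space:]]*$/\1/p' "$lease_file" \
        | tail -n 1 \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}
```

Usage on the node from this report would be something like: lease_ntp_servers /var/lib/dhcp/dhclient.eth0.leases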

Even so, the date on the target node is still incorrect.

ubuntu@sled204n0:/etc$ date
Thu Jan 6 19:52:29 UTC 2000

The ntpdate defaults are the following (unchanged from MAAS defaults):

ubuntu@sled204n0:/etc$ cat /etc/default/ntpdate
# The settings in this file are used by the program ntpdate-debian, but not
# by the upstream program ntpdate.

# Set to "yes" to take the server list from /etc/ntp.conf, from package ntp,
# so you only have to keep it in one place.
NTPDATE_USE_NTP_CONF=yes

# List of NTP servers to use (Separate multiple servers with spaces.)
# Not used if NTPDATE_USE_NTP_CONF is yes.
NTPSERVERS="ntp.ubuntu.com"

# Additional options to pass to ntpdate
NTPOPTIONS=""

And the DHCP generated NTP server file is correct:

ubuntu@sled204n0:/etc$ cat /var/lib/ntpdate/default.dhcp
# NTP server entries received from DHCP server
NTPSERVERS='172.31.22.1 172.31.23.1 172.31.20.104'

The culprit seems to be in how ntpdate-debian is programmed. The logic ignores /var/lib/ntpdate/default.dhcp if /etc/default/ntpdate sets NTPDATE_USE_NTP_CONF=yes (the default).

After examining the script further, my recommendation would be for /etc/dhcp/dhclient-exit-hooks.d/ntpdate to create the file /var/lib/ntp/ntp.conf.dhcp. By doing so, ntpdate-debian will work transparently with /etc/default/ntpdate and the NTP servers advertised by DHCPD.
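
To make the recommendation concrete, here is a rough sketch of what such a hook could do. It is not the packaged hook: write_ntp_conf_dhcp is a hypothetical helper, and it assumes the hook receives the server list as a space-separated string (dhclient hooks usually see it in $new_ntp_servers). Whether ntpdate-debian then reads the resulting file depends on the version of the script.

```sh
#!/bin/sh
# Write one "server <address>" line per NTP server into an ntp.conf-style file.
# write_ntp_conf_dhcp is a hypothetical helper, not part of any package.
write_ntp_conf_dhcp() {
    servers=$1      # space-separated list, e.g. "$new_ntp_servers" in a hook
    out_file=$2     # e.g. /var/lib/ntp/ntp.conf.dhcp

    # Nothing to write when the lease carried no NTP servers.
    [ -n "$servers" ] || return 0

    {
        echo "# NTP server entries received from DHCP server"
        for s in $servers; do
            echo "server $s"
        done
    } > "$out_file"
}
```

Inside a hook this might be called as: write_ntp_conf_dhcp "$new_ntp_servers" /var/lib/ntp/ntp.conf.dhcp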

-------------------------------------------------------------------------
(contents of /var/log/maas/* is 125MB in size, will post data from there if requested)

# dpkg -l '*maas*'|cat
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==================================-======================================-============-==========================================================================
ii maas 1.3+bzr1461+dfsg-0ubuntu2.2+tay.8 all Ubuntu MAAS Server
ii maas-cli 1.3+bzr1461+dfsg-0ubuntu2.2+tay.8 all Ubuntu MAAS Client Tool
ii maas-cluster-controller 1.3+bzr1461+dfsg-0ubuntu2.2+tay.8 all Ubuntu MAAS Cluster Controller
ii maas-common 1.3+bzr1461+dfsg-0ubuntu2.2+tay.8 all Ubuntu MAAS Server
un maas-dhcp <none> (no description available)
un maas-dns <none> (no description available)
ii maas-region-controller 1.3+bzr1461+dfsg-0ubuntu2.2+tay.8 all Ubuntu MAAS Server
ii python-django-maas 1.3+bzr1461+dfsg-0ubuntu2.2+tay.8 all Ubuntu MAAS Server - (django files)
ii python-maas-client 1.3+bzr1461+dfsg-0ubuntu2.2+tay.8 all Ubuntu MAAS API Client - (python files)
ii python-maas-provisioningserver 1.3+bzr1461+dfsg-0ubuntu2.2+tay.8 all Ubuntu MAAS Server

[Test Case]:

1) Configure a MAAS server to pass the following option via DHCP:
    option ntp-servers your-ntp-server-address;
2) Boot a new MAAS node that does not have a persistent RTC.
3) Check that /var/lib/ntpdate/default.dhcp exists after boot and contains the correct ntp-servers value.
4) Check that the output of `date` is correct according to your DHCP-defined ntp-servers.
5) If the date is correct, the problem is fixed.
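
The two checks in the test case could be automated roughly as follows. This is a sketch only: both function names are hypothetical, and the year threshold is an assumption standing in for "the clock is no longer stuck near the RTC reset value".

```sh
#!/bin/sh
# Return 0 if the DHCP-generated file lists the expected NTP server.
# dhcp_file_has_server is a hypothetical helper.
dhcp_file_has_server() {
    dhcp_file=$1    # e.g. /var/lib/ntpdate/default.dhcp
    expected=$2     # e.g. 172.31.23.1
    [ -r "$dhcp_file" ] || return 1
    # Source the file in a subshell so the caller's NTPSERVERS is untouched.
    servers=$(NTPSERVERS=; . "$dhcp_file"; printf '%s' "$NTPSERVERS")
    for s in $servers; do
        if [ "$s" = "$expected" ]; then
            return 0
        fi
    done
    return 1
}

# Return 0 if the system clock's year is at least the given year. A node that
# booted with a reset RTC reports an old year, such as 2000 in this report.
clock_year_at_least() {
    min_year=$1
    [ "$(date -u +%Y)" -ge "$min_year" ]
}
```

On a freshly booted node one might run: dhcp_file_has_server /var/lib/ntpdate/default.dhcp 172.31.23.1 && clock_year_at_least 2013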

Regression:

None expected: if NTPDATE_USE_NTP_CONF is set to yes and one of the default ntp.conf files is found, that file will still be used.

Tags: cts
Brian Rzycki (b-rzycki) wrote :

I should also mention that Juju deployment of charms fails when the time on the Juju bootstrap node is too old. This bug had the surprising consequence of not being able to deploy any charms to other MAAS nodes.

Julian Edwards (julian-edwards) wrote :

I'm retargeting to isc-dhcp, since it looks like dhclient is not honouring the ntp-servers option.

Changed in maas:
status: New → Invalid
Scott Moser (smoser) wrote :

I'll try to look at this some more tomorrow.

Relevant files are:
  [ntpdate] /etc/dhcp/dhclient-exit-hooks.d/ntpdate
  [ntp] /etc/dhcp/dhclient-exit-hooks.d/ntp
  [ntp] /etc/init.d/ntp /etc/ntp.conf

It certainly looks to me like this is supposed to work.

Robie Basak (racb) wrote :

> The culprit seems to be in how ntpdate-debian is programmed. the logic ignores /var/lib/ntpdate/default.dhcp if /etc/default/ntpdate sets NTPDATE_USE_NTP_CONF=yes (the default).

This matches my understanding when I looked at ntpdate-debian with a similar issue, though I haven't tried to verify this for certain.

I wonder what the intention of ntpdate-debian is in this case. Does it do what it does because we don't want NTP to be set from DHCP by default?

Is the reason you're facing this problem because your hardware doesn't have an RTC, or because the system doesn't use the RTC at boot time and is stuck at epoch? Fixing NTP from DHCP should certainly fix the issue, but why isn't the RTC being set correctly at install time and read correctly at boot time? I recently had a case where I fixed this in the kernel, and then the NTP DHCP situation didn't matter.

Brian Rzycki (b-rzycki) wrote :

Julian: I disagree that this is dhclient's fault. The lease file clearly shows the NTP servers were received and the contents of /var/lib/ntpdate/default.dhcp are correctly set to the 3 NTP servers, showing the dhclient exit hook worked too. At this point it is up to the NTP client to use these servers.

Scott: there is no /etc/ntp.conf file. It is not auto-generated by any script. I would expect the ntp client to honor the DHCP NTP servers if /etc/ntp.conf does not work.

Robie: The ARM device we have uses a capacitor to store the state of the RTC. If the device is without power for a few days the RTC completely resets. We absolutely need NTP working to keep these nodes functioning properly in a MAAS cluster. I too wonder why ntpdate-debian completely ignores the DHCP generated file /var/lib/ntpdate/default.dhcp when /etc/ntp.conf doesn't exist.

Robie Basak (racb) wrote :

If you set NTPDATE_USE_NTP_CONF=no (e.g. from a preseed late_command), does this fix the issue?

Sounds to me that we either need to have MAAS do that, or consult with Debian about the intention of this setting and its interaction with DHCP in the default case to see if it's a bug or it's intentional.

I understand your use case and think that MAAS should support it, but I wonder whether just fixing the DHCP hook will work in all cases, or some other work is required here. My doubt is just that DHCP needs to run much earlier than usual in the cloud-init case, and so I don't know if the hook will always run, and therefore just want to make sure. Scott?

Brian Rzycki (b-rzycki) wrote :

Robie: Yes, changing NTPDATE_USE_NTP_CONF=no makes everything work, at least when I change it after the node is deployed. I am not aware of how to try this with a preseed late_command. Do you have instructions for me to attempt this? Will this mean all newly-deployed MAAS nodes get this change?

Below is the output from one of the nodes testing NTP_CONF on and off. It shows the script working correctly and using the NTP servers from DHCP when set to =no.
---------------------------------------------
root@sled204n2:~# grep ^NTPDATE_USE_NTP_CONF= /etc/default/ntpdate
NTPDATE_USE_NTP_CONF=yes
root@sled204n2:~# ntpdate-debian
20 Jan 21:37:02 ntpdate[14019]: no server suitable for synchronization found
root@sled204n2:~# date
Thu Jan 20 21:37:05 UTC 2000
root@sled204n2:~# nano /etc/default/ntpdate
root@sled204n2:~# grep ^NTPDATE_USE_NTP_CONF= /etc/default/ntpdate
NTPDATE_USE_NTP_CONF=no
root@sled204n2:~# ntpdate-debian
 3 Dec 19:27:26 ntpdate[14027]: step time server 172.31.23.1 offset 437694593.235280 sec
root@sled204n2:~# date
Tue Dec 3 19:27:31 UTC 2013
root@sled204n2:~#

dann frazier (dannf) wrote : Re: [Bug 1257082] Re: MAAS does not use NTP servers specified in DHCPD options

On Tue, Dec 3, 2013 at 12:35 PM, Brian Rzycki <email address hidden> wrote:
> Robie: Yes, changing NTPDATE_USE_NTP_CONF=no makes everything work, at
> least when I change it after the node is deployed. I am not aware of how
> to try this with a preseed late_command. Do you have instructions for
> me to attempt this?

I think you need to edit /usr/share/maas/preseeds/generic. Find the
preseed/late_command section.
Then, just before the final true, add another command with &&:

in-target sed -i 's/NTPDATE_USE_NTP_CONF=yes/NTPDATE_USE_NTP_CONF=no/' /etc/default/ntpdate

BEFORE:
{{def post_scripts}}
# Executes late command and disables PXE.
d-i preseed/late_command string true && \
    in-target sh -c 'f=$1; shift; echo $0 > $f && chmod 0440 $f $*' 'ubuntu ALL=(ALL) NOPASSWD: ALL' /etc/sudoers.d/maas && \
    in-target wget --no-proxy "{{node_disable_pxe_url|escape.shell}}" --post-data "{{node_disable_pxe_data|escape.shell}}" -O /dev/null && \
    true
{{enddef}}

AFTER:
{{def post_scripts}}
# Executes late command and disables PXE.
d-i preseed/late_command string true && \
    in-target sh -c 'f=$1; shift; echo $0 > $f && chmod 0440 $f $*' 'ubuntu ALL=(ALL) NOPASSWD: ALL' /etc/sudoers.d/maas && \
    in-target wget --no-proxy "{{node_disable_pxe_url|escape.shell}}" --post-data "{{node_disable_pxe_data|escape.shell}}" -O /dev/null && \
    in-target sed -i 's/NTPDATE_USE_NTP_CONF=yes/NTPDATE_USE_NTP_CONF=no/' /etc/default/ntpdate && \
    true
{{enddef}}

> Will this mean all newly-deployed MAAS nodes get
> this change?

Yup.
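
Before editing the preseed, the sed substitution from the late_command above can be tried on a scratch copy of /etc/default/ntpdate. The wrapper below is a hypothetical sketch, not part of MAAS. Unlike the original command, the pattern is anchored to the start of the line, which is an addition made here so that comment lines mentioning the variable are never touched. It relies on GNU sed's -i option, as the original command does.

```sh
#!/bin/sh
# Flip NTPDATE_USE_NTP_CONF from yes to no in the given file, in place.
# disable_ntp_conf_lookup is a hypothetical wrapper around the sed command
# from the late_command; the leading ^ anchor is an addition.
disable_ntp_conf_lookup() {
    sed -i 's/^NTPDATE_USE_NTP_CONF=yes/NTPDATE_USE_NTP_CONF=no/' "$1"
}
```

For example: cp /etc/default/ntpdate /tmp/ntpdate.test && disable_ntp_conf_lookup /tmp/ntpdate.test && grep ^NTPDATE_USE_NTP_CONF= /tmp/ntpdate.test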

Julian Edwards (julian-edwards) wrote :

On Tuesday 03 Dec 2013 15:31:02 you wrote:
> Julian: I disagree that this is dhclient's fault. The lease file clearly
> shows the NTP servers were received and the contents of
> /var/lib/ntpdate/default.dhcp are correctly set to the 3 NTP servers,
> showing the dhclient exit hook worked too. At this point it is up to
> the ntpclient to use these servers.

Agreed, I forgot to update my comment after targeting to other projects.

Robie Basak (racb) wrote :

I have filed a bug in Debian seeking clarification: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=731352

Robie Basak (racb) wrote :

I think the issue is down to ntp's packaging, or possibly MAAS' (need of) configuration of ntp's packaging, and isc-dhcp is doing the right thing here.

Changed in isc-dhcp (Ubuntu):
status: New → Invalid
Changed in ntp (Ubuntu):
status: New → Confirmed
importance: Undecided → High
Brian Rzycki (b-rzycki) wrote :

Robie: I have verified updating /usr/share/maas/preseeds/generic works. I re-deployed a new node and saw that the time is correct. Thank you for the workaround.

I would still like to see what Debian's recommendation is on the upstream bug, whenever they decide to examine it.

Robie Basak (racb)
affects: isc-dhcp (Debian) → ntp (Debian)
Robie Basak (racb) wrote :
Changed in ntp (Debian):
status: Unknown → New
Robie Basak (racb) wrote :

Setting to Medium priority as a workaround is available.

Debian is considering making NTP available by default in http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=731594.

I'm also concerned that although changing NTPDATE_USE_NTP_CONF to default to "no" if /etc/ntp.conf doesn't exist seems to be the right solution right now, it may have a negative impact on non-MAAS users, and so this needs to be considered carefully.

So it seems like we're still figuring out what the right solution should be. Some discussion in Ubuntu at: https://lists.ubuntu.com/archives/ubuntu-devel/2013-December/037895.html

Given that there's a workaround, I think we should probably wait to see what Debian wants to do with ntp by default before we tackle this. If this takes too long, then I suppose we could patch the behaviour above, or even send that to Debian.

description: updated
Changed in ntp (Ubuntu):
importance: High → Medium
James Troup (elmo) wrote :

So I just ran face first into this bug. Unless I'm missing something,
I don't think ntp being installed by default helps at all. MAAS and
Juju (for the MAAS provisioner) both depend on servers having an
accurate clock early in the boot process; ntpdate achieves this
because it's willing to jump time; ntp is not (for valid reasons).

And, even if I'm wrong about that, we still need a fix for trusty and
precise. Could we not just fix/reverse the logic in ntpdate-debian so
that it prefers dhcp provided NTP servers if one is found? Or at the
very least have it fall back to dhcp provided NTP if it can't find an
ntp.conf?
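
As an illustration of the "prefer DHCP, otherwise fall back" ordering suggested here, a selection routine might look like the sketch below. It is not the ntpdate-debian script and not a proposed patch: pick_ntp_servers and its three arguments are hypothetical, and the ntp.conf parsing only handles plain "server <host>" lines.

```sh
#!/bin/sh
# Print the NTP servers to use, preferring the DHCP-provided list.
# pick_ntp_servers is a hypothetical helper, not the ntpdate-debian logic.
pick_ntp_servers() {
    dhcp_file=$1     # e.g. /var/lib/ntpdate/default.dhcp
    ntp_conf=$2      # e.g. /etc/ntp.conf
    fallback=$3      # e.g. ntp.ubuntu.com

    # 1) DHCP-provided servers win when the file exists and is non-empty.
    if [ -r "$dhcp_file" ]; then
        servers=$(NTPSERVERS=; . "$dhcp_file"; printf '%s' "$NTPSERVERS")
        if [ -n "$servers" ]; then
            printf '%s\n' "$servers"
            return 0
        fi
    fi

    # 2) Otherwise collect the host from each "server <host>" line.
    if [ -r "$ntp_conf" ]; then
        servers=$(sed -n 's/^server[[:space:]]\{1,\}\([^[:space:]]\{1,\}\).*/\1/p' "$ntp_conf" | tr '\n' ' ')
        servers=${servers% }
        if [ -n "$servers" ]; then
            printf '%s\n' "$servers"
            return 0
        fi
    fi

    # 3) Last resort: the static default.
    printf '%s\n' "$fallback"
}
```

The trade-off with this ordering is that a DHCP server can override a locally configured ntp.conf, which may not be what every site wants.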

C de-Avillez (hggdh2) wrote :

@James Troup: actually, NTP *can* jump time on start up via the -g parameter. See http://www.eecis.udel.edu/~mills/ntp/html/ntpd.html.

Jorge Niedbalski (niedbalski) wrote :

Hello,

Since there is no consensus on a definitive solution, I am proposing a simple fix that prefers DHCP-provided NTP servers if any are found, as @elmo also suggested here.

Please review the test-case and description for further details.

description: updated
Jorge Niedbalski (niedbalski) wrote :
Changed in ntp (Ubuntu Precise):
status: New → In Progress
Changed in ntp (Ubuntu Trusty):
status: New → In Progress
Robie Basak (racb) wrote :

Thanks Jorge.

The way I imagined a fix going that might also be acceptable to Debian would be to treat NTPDATE_USE_NTP_CONF as no if ntp.conf doesn't exist, even if it says yes. Would this work for you?

Then I imagine addition of something like (untested):

if [ ! -f /etc/ntp.conf ]; then
    NTPDATE_USE_NTP_CONF=no
fi

and then no other changes. I think this would cause the DHCP setting to be used in our failure case, and it shouldn't cause an unexpected behaviour change, since it doesn't make sense to request to use ntp.conf if it doesn't exist.
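
To spell out the effect of that guard, here it is wrapped in a small function so the three cases can be checked in isolation. Like the snippet above, it is only a sketch: effective_use_ntp_conf is a hypothetical name, and the path is passed as a parameter purely to make testing easy.

```sh
#!/bin/sh
# Print the value NTPDATE_USE_NTP_CONF should effectively have: a "yes" is
# downgraded to "no" when the ntp.conf file does not exist.
# effective_use_ntp_conf is a hypothetical helper.
effective_use_ntp_conf() {
    setting=$1      # value from /etc/default/ntpdate
    ntp_conf=$2     # e.g. /etc/ntp.conf
    if [ "$setting" = yes ] && [ ! -f "$ntp_conf" ]; then
        echo no
    else
        echo "$setting"
    fi
}
```

With "yes" and an existing ntp.conf the result stays "yes", with "yes" and no ntp.conf it becomes "no", and an explicit "no" is left alone.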

How does this sound to you? I'm not saying it's definitely the right way. I'm just not sure, so I'd like to propose it as an alternative.

What I think we're missing is a list of use cases, and without that I don't think it's possible to definitely state the right thing to do. I'm just trying to think of the most unobtrusive fix that would be acceptable to all and that minimises the risk of breaking some unknown use case.

Debian did say that they'd consider patches, so I think we should definitely first try to get their review (if that doesn't take too long) to avoid sending Ubuntu down a path that diverges from Debian.

Robie Basak (racb) wrote :

One catch with my approach is that setting NTPDATE_USE_NTP_CONF=yes might mean "use /var/lib/ntp/ntp.conf.dhcp", and that would break. But maybe this file won't exist or is not useful if /etc/ntp.conf doesn't exist?

Robie Basak (racb) wrote :

Oh, one more thing. I get the impression the failure you're experiencing, which is the case that you do have a working RTC, is a slightly different issue with the same symptom. curtin should set the RTC, and so on boot into the installed system you should already have a good enough clock. I think this would be a separate bug in curtin to set the RTC, and then this fix would only really be a workaround for you.

If I'm right about this, then using d-i rather than fast path should work in your case (I'm not sure if that's acceptable to you as a workaround or not though).

tags: added: cts
Maarten (mthibaut-f) wrote :

There has been no movement on the Debian bug report in about 6 months, and they appear to view ntpdate as legacy. So I think we're faced with a decision:

1) Use ntpd -g instead, i.e. stop installing the ntpdate command and use the ntp package instead and fix startup options for NTP to include "-g" in preseed

2) Use the proposed preseed change by Dann.

I feel #2 is better because if Debian is going to remove ntpdate at a later stage, they will include "-g" for ntpd by default, at which time we can simply remove our fix again. If we use #1 we might break things once Debian goes this route.

Robie Basak (racb) wrote :

On Wed, Mar 18, 2015 at 02:45:28PM -0000, Maarten wrote:
> There has been no movement on the Debian bug report in about 6
> months...

Well, nobody has proposed anything to Debian in that time either.

I'm still in favour of my comment #21, which I think Debian might
accept, assuming that it fixes the issue for everyone; nobody has
confirmed this yet.

Iain Lane (laney) wrote :

Unsubscribing ubuntu-sponsors since there is nothing to upload at this point. Please re-add if/when there is.
