deployment failed due to ntpd server can't reach higher stratum
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Fuel for OpenStack | Fix Released | High | Stanislaw Bogatkin | | |
Bug Description
Currently some providers restrict access to port 123 on root NTP servers.
If the master node is placed inside such an environment, deployment fails with the error:
2015-03-23T10:45:20 err: [517] Error running RPC method granular_deploy: Failed to execute hook .
---
priority: 300
fail_on_error: true
type: shell
uids:
- '3'
- '4'
parameters:
retries: 10
cmd: ntpdate -u $(egrep '^server' /etc/ntp.conf | egrep -v '127\.127\
| sed '/^#/d' | awk '{print $2}')
timeout: 180
interval: 1
, trace:
["/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
"/usr/
2015-03-23T10:45:20 info: [517] Casting message to Nailgun: {"method"
eploy. Failed to execute hook .\n\n---\npriority: 300\nfail_on_error: true\ntype: shell\nuids:\n- '3'\n- '4'\nparameters:\n retries: 10\n cmd: ntpdate -u $(egrep '^server' /etc/ntp.conf | egre
p -v '127\\.
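To see which upstream servers the failing hook would try to contact, something like the following can be run on the node. This is only a sketch: list_ntp_servers is a hypothetical helper, not part of Fuel, and it approximates the filtering in the hook's cmd (active "server" lines, local reference clocks excluded) rather than reproducing it.

```shell
# list_ntp_servers is a hypothetical helper, not part of Fuel.
# Prints the host of every active "server" line in an ntp.conf-style file,
# skipping commented-out lines and local reference clocks (127.127.x.x).
list_ntp_servers() {
    conf="${1:-/etc/ntp.conf}"
    awk '$1 == "server" && $2 !~ /^127\.127\./ { print $2 }' "$conf"
}
```

For example, ntpdate -q $(list_ntp_servers) queries those servers without setting the clock.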
As you can see, the master node can't reach the root servers:
ntpq> peers
remote refid st t when poll reach delay offset jitter
=======
kahuna.ruselabs .INIT. 16 u - 256 0 0.000 0.000 0.000
422224.s.dediku .INIT. 16 u - 256 0 0.000 0.000 0.000
ponderosa.piney .INIT. 16 u - 256 0 0.000 0.000 0.000
ntpq> as
ind assid status conf reach auth condition last_event cnt
=======
1 15020 8011 yes no none reject mobilize 1
2 15021 8011 yes no none reject mobilize 1
3 15022 8011 yes no none reject mobilize 1
And if the command ntpdate -u 'fuel_ip' is executed, it produces the error:
23 Mar 12:32:52 ntpdate[17115]: no server suitable for synchronization found
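The "peers" output above shows a reach value of 0 for every upstream server, which is why the master node never becomes a usable time source. A minimal sketch for checking this non-interactively follows. count_reachable_peers is a hypothetical helper, and it assumes the usual ten-column layout of the peers billboard with reach in the seventh column.

```shell
# count_reachable_peers is a hypothetical helper, not part of Fuel or ntp.
# Reads "ntpq -pn" style output on stdin and prints how many peers
# have a non-zero reach register (assumed to be column 7).
count_reachable_peers() {
    awk '$1 != "remote" && $1 !~ /^=+$/ && NF >= 10 && $7 != "0" { n++ }
         END { print n + 0 }'
}
```

Usage: ntpq -pn | count_reachable_peers. A result of 0 means no upstream server has answered yet.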
This can be fixed by adding settings on the Fuel master node that instruct ntpd to advertise itself as a synchronized server:
--- ntp.conf.orig 2015-03-23 13:29:32.847968972 +0000
+++ ntp.conf 2015-03-23 13:13:33.706984063 +0000
@@ -16,6 +16,8 @@
server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
server 2.pool.ntp.org iburst
+server 127.127.1.0
+fudge 127.127.1.0 stratum 10
# Driftfile.
driftfile /var/lib/ntp/drift
After these settings are added, the previous command works fine:
ntpdate -u 10.20.0.2
23 Mar 13:14:01 ntpdate[21021]: adjust time server 10.20.0.2 offset -0.000010 sec
And deployment continues...
cat /etc/fuel/
VERSION:
feature_groups:
- mirantis
production: "docker"
release: "6.1"
api: "1.0"
build_number: "216"
build_id: "2015-03-
nailgun_sha: "51974b50c3961b
python-
astute_sha: "4a117a1ca6bdcc
fuellib_sha: "a636c680e3c7d8
ostf_sha: "b4d284e9364e30
fuelmain_sha: "f52e4442df55a2
Changed in fuel: | |
status: | New → Triaged |
importance: | Undecided → High |
assignee: | nobody → Fuel Library Team (fuel-library) |
milestone: | none → 6.1 |
Changed in fuel: | |
assignee: | Fuel Library Team (fuel-library) → Stanislaw Bogatkin (sbogatkin) |
There is actually nothing that we can do. If we add a local undisciplined clock to our master node, it will lead to unpredictable errors too (and it did: as you can see, we had this code just a month or two ago). Requiring the upstream servers to be available, and not adding a local clock to the master node, is simply the lesser evil.