Fuel for OpenStack

sync_time fails on non-first deployment of controllers

Bug #1452912 reported by Andrew Woodward on 2015-05-07

Affects              Status          Importance   Assigned to       Milestone
Fuel for OpenStack   Fix Committed   High         Andrew Woodward   Fuel for OpenStack 6.1
6.0.x                Invalid         Undecided    Unassigned

Bug Description

If a deployment fails, or the controllers otherwise have to be run a second time, the ntp_sync task fails. Because astute reports errors poorly for tasks that run on multiple nodes at once, it's hard to troubleshoot. I finally sat down and found the root cause.

The command

ntpdate -u $(egrep '^server' /etc/ntp.conf | egrep -v '127\.127\.[0-9]+\.[0-9]+' | sed '/^#/d' | awk '{print $2}')

will return more than one line of results on the controllers:

0.pool.ntp.org
1.pool.ntp.org
2.pool.ntp.org

while non-controllers have a single entry (the controller VIP):

server 192.168.0.2 iburst
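
The difference can be checked without touching a real node by running the same filter against sample files. This is only a minimal sketch: the file contents are assumed from the outputs quoted above (the `127.127.1.0` local-clock line is an added assumption), the `ntp_servers` helper is a hypothetical name, and `grep -E` stands in for `egrep`.

```shell
# Sample configs; contents are assumptions based on the outputs quoted above.
controller_conf=$(mktemp)
compute_conf=$(mktemp)
printf '%s\n' \
  '# upstream servers' \
  'server 0.pool.ntp.org' \
  'server 1.pool.ntp.org' \
  'server 2.pool.ntp.org' \
  'server 127.127.1.0' > "$controller_conf"
printf '%s\n' 'server 192.168.0.2 iburst' > "$compute_conf"

# Same filter as the sync_time command, parameterized on the config path
# (hypothetical helper, not part of fuel-library).
ntp_servers() {
  grep -E '^server' "$1" | grep -E -v '127\.127\.[0-9]+\.[0-9]+' | sed '/^#/d' | awk '{print $2}'
}

# Count the non-empty lines each config produces.
controller_count=$(ntp_servers "$controller_conf" | grep -c .)
compute_count=$(ntp_servers "$compute_conf" | grep -c .)
echo "controller: $controller_count server(s), non-controller: $compute_count server(s)"

rm -f "$controller_conf" "$compute_conf"
```

With these sample files the controller config yields three hostnames and the non-controller config yields one, which matches the behaviour described above.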

So I reproduced the operation in bash:

[root@fuel ~]# for each in {13..17} ; do ssh node-${each} -C <<EOF || break
ntpdate -u $(egrep '^server' /etc/ntp.conf | egrep -v '127\.127\.[0-9]+\.[0-9]+' | sed '/^#/d' | awk '{print $2}')
>
> EOF
> done
Pseudo-terminal will not be allocated because stdin is not a terminal.
Warning: Permanently added 'node-13' (RSA) to the list of known hosts.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-52-generic x86_64)

* Documentation: https://help.ubuntu.com/
You have new mail.
stdin: is not a tty
7 May 20:38:13 ntpdate[31567]: adjust time server 204.9.54.119 offset -0.003217 sec
-bash: line 2: 1.pool.ntp.org: command not found
-bash: line 3: 2.pool.ntp.org: command not found
[root@fuel ~]#

We see that the second and third lines on the controller are interpreted by bash as commands and are likely returning a non-zero exit code to astute, hence its failure.
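
The mechanism can be reproduced locally without ssh. This is a sketch under stated assumptions: `bash -s` stands in for the remote shell, `echo` stands in for `ntpdate -u`, and the server list is hard-coded from the output above. In a here-document with an unquoted delimiter, the shell that builds the here-document performs the expansion, and the newlines in the result are kept as line breaks rather than being split into arguments.

```shell
# Multi-line value, as produced by the filter on a controller (assumed list).
servers=$(printf '%s\n' 0.pool.ntp.org 1.pool.ntp.org 2.pool.ntp.org)

# The unquoted EOF delimiter means $servers is expanded before the text
# reaches the receiving shell; lines 2 and 3 of the body are then read as
# separate commands.
output=$(bash -s 2>&1 <<EOF
echo $servers
EOF
)
status=$?

echo "$output"
echo "exit status: $status"
```

In this local sketch the receiving shell reports "command not found" for the second and third hostnames and exits with the status of the last failed command, which is consistent with the non-zero exit code suspected above. Quoting the delimiter (`<<'EOF'`) would instead leave the expansion to the receiving shell.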

If we change the command to

for each in {13..17} ; do ssh node-${each} -C <<EOF || break
ntpdate -u $(egrep '^server' /etc/ntp.conf | egrep -v '127\.127\.[0-9]+\.[0-9]+' | sed '/^#/d' | awk '{print $2}'| head -1)

EOF
done

then it runs on each node.
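
The effect of the added `head -1` can be checked the same way. Again a minimal sketch: the sample config is assumed, `bash -s` stands in for the remote shell, and `echo` is used in place of actually running `ntpdate`.

```shell
# Sample controller config (assumed contents).
conf=$(mktemp)
printf '%s\n' \
  'server 0.pool.ntp.org' \
  'server 1.pool.ntp.org' \
  'server 2.pool.ntp.org' > "$conf"

# Same filter as above, truncated to the first match.
first_server=$(grep -E '^server' "$conf" | grep -E -v '127\.127\.[0-9]+\.[0-9]+' | sed '/^#/d' | awk '{print $2}' | head -1)

# With a single-line value the here-document body is one command only.
output=$(bash -s 2>&1 <<EOF
echo ntpdate -u $first_server
EOF
)
status=$?

echo "$output"
echo "exit status: $status"
rm -f "$conf"
```

Only the first server is passed on, so the receiving shell sees a single command line. The trade-off is that the remaining servers in `/etc/ntp.conf` are ignored for this one-off sync.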

Full transcript and notes: http://paste.openstack.org/show/216490/

Andrew Woodward (xarses) on 2015-05-07

summary:

- ntp_sync fails on non-first deployment of controllers
+ sync_time fails on non-first deployment of controllers

OpenStack Infra (hudson-openstack) wrote on 2015-05-07: Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/181154

Changed in fuel:
assignee:	Fuel Library Team (fuel-library) → Andrew Woodward (xarses)
status:	Triaged → In Progress

Bogdan Dobrelya (bogdando) wrote on 2015-05-08:

I set the 6.0.x milestone to Invalid, as many of the ntpd changes were made only in the 6.1 release cycle.

OpenStack Infra (hudson-openstack) wrote on 2015-05-08: Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/181154
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=5f287aa25c05a487553cd480da941aba7d08f50e
Submitter: Jenkins
Branch: master

commit 5f287aa25c05a487553cd480da941aba7d08f50e
Author: Andrew Woodward <email address hidden>
Date: Thu May 7 14:25:26 2015 -0700

Limit sync_time to only one node

    As described in # 1452912, If the ntp servers is more than one node
    on a host then it will cause a non-zero exit code to astute which causes
    the task to fail.

Change-Id: I865d1ee9aebfaff3bca3827d3fadd394d2a624e1
Closes-bug: #1452912

Changed in fuel:
status:	In Progress → Fix Committed
