Implement/update time sync logic for CI (system tests)

Bug #1320815 reported by Bogdan Dobrelya
This bug affects 2 people
Affects: Fuel for OpenStack
Status: Fix Released
Importance: Critical
Assigned to: Fuel QA Team

Bug Description

Please ensure that, every time an environment is reverted from a snapshot, the following steps are executed as well:
1) Sync time on the master node host OS
ntpdate -u pool.ntp.org
2) Enter the docker container shell for astute
dockerctl shell astute
3) Use the mco rexec plugin to issue a time sync for all nodes in the managed environments
mco rpc -v execute_shell_command execute cmd="hostname; date; ntpdate -u <master_node_admin_ip>; date" | awk '/\\n/{gsub("\\\\n","\n");print}'

Note: use any similar commands you want; just make sure the time is in sync before starting to deploy anything.

Tags: system-tests
Changed in fuel:
assignee: nobody → Fuel QA Team (fuel-qa)
description: updated
Revision history for this message
Mike Scherbakov (mihgen) wrote :

If the desync is large, MCollective won't be able to run its agents. So I believe a more reliable solution is to ssh from systests onto each node and run those commands.
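
A minimal sketch of the ssh-based approach, assuming the system tests can reach every node as root over ssh with key-based authentication. The function names and the `runner` parameter are hypothetical and are not taken from fuelweb_test; the sketch shells out to the ssh binary only to stay self-contained.

```python
import subprocess


def build_ssh_command(host, remote_cmd, user="root"):
    """Return the argv list for running remote_cmd on host over ssh."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # fail instead of prompting for a password
        "-o", "ConnectTimeout=10",  # do not hang on unreachable nodes
        "{0}@{1}".format(user, host),
        remote_cmd,
    ]


def sync_node(host, ntp_source, runner=subprocess.call):
    """Run ntpdate on host against ntp_source and return the exit code."""
    return runner(build_ssh_command(host, "ntpdate -u {0}".format(ntp_source)))


def sync_all(nodes, master_ip, runner=subprocess.call):
    """Sync every node with the master and return {host: exit_code}."""
    return {host: sync_node(host, master_ip, runner) for host in nodes}
```

Because each node is contacted directly, a large clock offset on a node does not prevent the command from being delivered. The `runner` argument exists only so the logic can be exercised without real hosts.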

Changed in fuel:
status: Confirmed → Triaged
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Please update the implementation:
https://github.com/stackforge/fuel-main/blob/master/fuelweb_test/models/environment.py#L364
https://github.com/stackforge/fuel-main/blob/master/fuelweb_test/models/environment.py#L340
At least, the fuel master node should sync with pool.ntp.org *before* all other nodes start syncing with master.
We could use something like this:
1) ssh to the master and sync it with an NTP server on the Internet (repeat periodically until the return code is 0)
2) ssh to the other nodes and sync them with master || an NTP server on the Internet (repeat periodically until the return code is 0)
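
The two steps above could be sketched roughly as follows. This is an illustration under assumptions, not the code in environment.py: `run(host, cmd)` is a hypothetical caller-supplied function that executes a command on a host (for example over ssh) and returns its exit code, and the attempt count and delay are arbitrary defaults.

```python
import time


def retry_until_ok(action, attempts=5, delay=10.0, sleep=time.sleep):
    """Call action() until it returns 0 or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if action() == 0:
            return True
        if attempt < attempts:
            sleep(delay)  # wait before the next try
    return False


def sync_environment(run, master, nodes, master_ip,
                     internet_ntp="pool.ntp.org", **retry_opts):
    """Sync the master first, then every other node.

    Returns the list of nodes that could not be synced.
    """
    master_cmd = "ntpdate -u {0}".format(internet_ntp)
    if not retry_until_ok(lambda: run(master, master_cmd), **retry_opts):
        raise RuntimeError("time sync failed on master {0}".format(master))

    # Nodes prefer the master and fall back to the Internet source.
    node_cmd = "ntpdate -u {0} || ntpdate -u {1}".format(master_ip, internet_ntp)
    return [node for node in nodes
            if not retry_until_ok(lambda n=node: run(n, node_cmd), **retry_opts)]
```

Raising on a master failure keeps the nodes from syncing against a master whose own clock is still wrong, which is the ordering requirement stated above.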

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Please note that the existing logic would use the wrong NTP server for the Fuel master node:

[root@nailgun] egrep '^server' /etc/ntp.conf | sed '/^#/d' | awk '{print $2}'
0.centos.pool.ntp.org
1.centos.pool.ntp.org
2.centos.pool.ntp.org
127.127.1.0

[root@nailgun fuel]# egrep '^server' /etc/ntp.conf | sed '/^#/d' | awk '{print $2}' | tail -n 1
127.127.1.0

But it should be an NTP server on the Internet instead.
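
One way to avoid picking the local clock entry is to filter it out when reading ntp.conf. The sketch below is only an illustration; the function name is hypothetical, and it relies on the ntpd convention that 127.127.x.x addresses denote reference clock drivers (127.127.1.0 being the local clock) rather than network servers.

```python
def external_ntp_servers(ntp_conf_text):
    """Return server addresses from ntp.conf, skipping reference clocks."""
    servers = []
    for line in ntp_conf_text.splitlines():
        fields = line.split()
        # Only uncommented "server <address> [options]" lines are relevant.
        if len(fields) >= 2 and fields[0] == "server":
            # 127.127.x.x is a pseudo-address for a clock driver, so it is
            # not a usable ntpdate target.
            if not fields[1].startswith("127.127."):
                servers.append(fields[1])
    return servers
```

With the master's configuration shown above, this would yield the three centos.pool.ntp.org entries and leave out 127.127.1.0.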

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-main (master)

Reviewed: https://review.openstack.org/93896
Committed: https://git.openstack.org/cgit/stackforge/fuel-main/commit/?id=49b8c1c0a959094480f1829162baab59fd810c85
Submitter: Jenkins
Branch: master

commit 49b8c1c0a959094480f1829162baab59fd810c85
Author: Aleksandra Fedorova <email address hidden>
Date: Fri May 16 15:45:42 2014 +0400

    Log time sync on admin and slaves

    Change-Id: I2e7ef4a4f7d3b2066ad80ad92b070388b61f86b3
    Related-Bug: #1320815

Mike Scherbakov (mihgen)
Changed in fuel:
importance: High → Critical
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-main (stable/4.1)

Related fix proposed to branch: stable/4.1
Review: https://review.openstack.org/97587

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-main (stable/4.1)

Reviewed: https://review.openstack.org/97587
Committed: https://git.openstack.org/cgit/stackforge/fuel-main/commit/?id=7ed0f85acc0bab4b9157703a618b8cc9fd7de3e1
Submitter: Jenkins
Branch: stable/4.1

commit 7ed0f85acc0bab4b9157703a618b8cc9fd7de3e1
Author: Aleksandra Fedorova <email address hidden>
Date: Fri May 16 15:45:42 2014 +0400

    Log time sync on admin and slaves

    Change-Id: I2e7ef4a4f7d3b2066ad80ad92b070388b61f86b3
    Related-Bug: #1320815

Revision history for this message
Dmitry Borodaenko (angdraug) wrote :

Please reopen if the above commit (Log time sync on admin and slaves) is not sufficient to fix this bug.

Changed in fuel:
status: Triaged → Fix Committed
Revision history for this message
Nastya Urlapova (aurlapova) wrote :

Dmitry, we have implemented only Bogdan's first point.

Changed in fuel:
status: Fix Committed → In Progress
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Nastya, AFAIK you implemented steps 2 and 3 via ssh as well.

Revision history for this message
Aleksandra Fedorova (bookwar) wrote :

Here you can see that this ntpdate doesn't work reliably:

We create the environment with this job:

https://fuel-jenkins.mirantis.com/view/devops/job/create_env/85/

and then run several consecutive tests.

At first this sync works

  https://fuel-jenkins.mirantis.com/job/5_0_fuellib_review_systest_ubuntu/39/console

but then, at this job

  https://fuel-jenkins.mirantis.com/job/5_0_fuellib_review_systest_ubuntu/40/console

we get the exception:

2014-06-26 14:32:14,278 - WARNING environment.py:358 -- Paramiko exception catched while trying to run ntpdate: 0 != 1

And the error on the node looks like:

# ntpdate -u $(egrep '^server' /etc/ntp.conf | sed '/^#/d' | awk '{print $2}')
27 Jun 14:50:20 ntpdate[1731]: no server suitable for synchronization found

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

I believe the implementation is quite reliable, but the NTP sources on the nodes aren't.

E.g.
[root@nailgun ~]# egrep '^server' /etc/ntp.conf | sed '/^#/d' | awk '{print $2}'
0.centos.pool.ntp.org
1.centos.pool.ntp.org
2.centos.pool.ntp.org
127.127.1.0
[root@nailgun ~]# ssh node-1 egrep '^server' /etc/ntp.conf | sed '/^#/d' | awk '{print $2}'
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
10.108.10.2

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Perhaps we could configure additional NTP sources for the CI gates, similar to the rsyslog configuration for :5514 tcp.

Changed in fuel:
status: In Progress → Triaged
Revision history for this message
Nastya Urlapova (aurlapova) wrote :

Okay, I created a new issue about the additional NTP source (https://bugs.launchpad.net/fuel/+bug/1335439); this issue, about time sync in systests, is completely implemented.

Changed in fuel:
status: Triaged → Fix Released
Revision history for this message
Igor Shishkin (teran) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-main (master)

Change abandoned by Igor Shishkin (<email address hidden>) on branch: master
Review: https://review.openstack.org/111533
