Restart chronyd execution is skipped on overcloud-novacompute-0

Bug #1821018 reported by Ronelle Landy
Affects: tripleo
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

This problem was initially raised in https://bugs.launchpad.net/tripleo/+bug/1820580, but that bug was marked fixed once multiple NTP servers were passed. The underlying problem remains: the chronyd service is not restarted after configuration on novacompute nodes.

In an OVB overcloud_deploy.log, RUNNING HANDLER [chrony : Restart chronyd] appears twice. The first time, it runs on the controller nodes after chrony is configured, and the restart executes. The second time, it is supposed to run on the novacompute node after chrony is configured, but it is skipped:

2019-03-19 15:54:10 | RUNNING HANDLER [chrony : Restart chronyd] *************************************
2019-03-19 15:54:10 | Tuesday 19 March 2019 15:53:40 +0000 (0:00:00.327) 0:04:12.296 *********
2019-03-19 15:54:10 | skipping: [overcloud-novacompute-0] => {
2019-03-19 15:54:10 | "changed": false,
2019-03-19 15:54:10 | "skip_reason": "Conditional result was False"

Skipping the restart causes the overcloud deployment on baremetal (an internal system) to fail: when the restart is skipped, the custom servers in the configuration are never picked up, and the default servers are blocked from internal systems.
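To check other deploy logs for the same symptom, a short script can list which hosts reported "skipping" under the restart handler. This is only a sketch: it assumes the log follows the layout of the excerpt above (a RUNNING HANDLER banner followed by per-host result lines), and the function names handler_results and skipped_hosts are hypothetical helpers, not part of tripleo or Ansible.

```python
import re

# Handler banner, as it appears in the log excerpt above.
HANDLER_RE = re.compile(r"RUNNING HANDLER \[(?P<name>[^\]]+)\]")
# Per-host result line, e.g. "skipping: [overcloud-novacompute-0]".
RESULT_RE = re.compile(
    r"\b(?P<status>skipping|changed|ok|fatal): \[(?P<host>[^\]]+)\]"
)
# Any other banner ends the result lines of the current handler.
BANNER_RE = re.compile(r"\b(?:TASK|PLAY) \[|PLAY RECAP")


def handler_results(lines, handler_name):
    """Return (host, status) pairs reported under the named handler."""
    results = []
    in_handler = False
    for line in lines:
        banner = HANDLER_RE.search(line)
        if banner:
            # Track only the handler we were asked about.
            in_handler = banner.group("name").strip() == handler_name
            continue
        if BANNER_RE.search(line):
            in_handler = False
            continue
        if in_handler:
            result = RESULT_RE.search(line)
            if result:
                results.append((result.group("host"), result.group("status")))
    return results


def skipped_hosts(lines, handler_name="chrony : Restart chronyd"):
    """Return hosts on which the named handler was skipped."""
    return [
        host
        for host, status in handler_results(lines, handler_name)
        if status == "skipping"
    ]
```

Passing the lines of a locally saved, decompressed overcloud_deploy.log to skipped_hosts returns the hosts on which the restart was skipped. It does not explain why the conditional evaluated to False.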

See the logs copied below (from OVB - the deploy passes here):

Runs on controller nodes:
https://logs.rdoproject.org/60/636860/5/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/c731cdf/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz#_2019-03-20_05_23_46

Is skipped on novacompute node:
https://logs.rdoproject.org/60/636860/5/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/c731cdf/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz#_2019-03-20_05_24_47

On baremetal, the failure is shown here (note that multiple NTP servers and DNS servers are passed to this overcloud deployment):
https://sf.hosted.upshift.rdu2.redhat.com/logs/25/165325/13/check/periodic-tripleo-ci-centos-7-baremetal-single_nic-3ctlr_1comp-featureset001-master/d299874/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz#_2019-03-19_00_52_24

Since the restart is called via a handler from an included role:

https://github.com/openstack/ansible-role-chrony/blob/master/handlers/main.yml
https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/timesync/chrony-baremetal-ansible.yaml#L121

possibly this is the result of an Ansible bug in the version used:

[zuul@undercloud ~]$ ansible --version
ansible 2.6.14
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/zuul/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Oct 30 2018, 23:45:53) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]

as per the following reports:

https://github.com/ansible/ansible/issues/37512
https://stackoverflow.com/questions/44168529/ansible-service-module-not-starting-supervisor-with-state-restarted

Ronelle Landy (rlandy) wrote :

Unsure what importance to set here: OVB is currently passing, and baremetal, which shows a consistent error, is not yet in a pipeline.

description: updated
Changed in tripleo:
milestone: none → stein-rc1
status: New → Triaged
Changed in tripleo:
status: Triaged → Fix Released