Chronyd sync fails in overcloud deployment
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo | Fix Released | Critical | Unassigned | | |
Bug Description
Chronyd sync fails intermittently in multiple jobs. The most recent failure was in the OVB featureset001 master promotion job:
2019-03-18 08:34:51 |
2019-03-18 08:34:51 | TASK [Ensure system is NTP time synced] *******
2019-03-18 08:34:51 | Monday 18 March 2019 08:31:39 +0000 (0:00:00.842) 0:03:45.959 **********
2019-03-18 08:34:51 | skipping: [overcloud-
2019-03-18 08:34:51 | "changed": false,
2019-03-18 08:34:51 | "skip_reason": "Conditional result was False"
2019-03-18 08:34:51 | }
2019-03-18 08:34:51 | changed: [overcloud-
2019-03-18 08:34:51 | "changed": true,
2019-03-18 08:34:51 | "cmd": [
2019-03-18 08:34:51 | "chronyc",
2019-03-18 08:34:51 | "waitsync",
2019-03-18 08:34:51 | "20"
2019-03-18 08:34:51 | ],
2019-03-18 08:34:51 | "delta": "0:00:10.021030",
2019-03-18 08:34:51 | "end": "2019-03-18 08:31:49.485257",
2019-03-18 08:34:51 | "rc": 0,
2019-03-18 08:34:51 | "start": "2019-03-18 08:31:39.464227"
2019-03-18 08:34:51 | }
2019-03-18 08:34:51 |
2019-03-18 08:34:51 | STDOUT:
2019-03-18 08:34:51 |
2019-03-18 08:34:51 | try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000
2019-03-18 08:34:51 | try: 2, refid: CDCE4602, correction: 0.000001539, skew: 1.234
2019-03-18 08:34:51 | changed: [overcloud-
2019-03-18 08:34:51 | "changed": true,
2019-03-18 08:34:51 | "cmd": [
2019-03-18 08:34:51 | "chronyc",
2019-03-18 08:34:51 | "waitsync",
2019-03-18 08:34:51 | "20"
2019-03-18 08:34:51 | ],
2019-03-18 08:34:51 | "delta": "0:00:10.033575",
2019-03-18 08:34:51 | "end": "2019-03-18 08:31:49.541771",
2019-03-18 08:34:51 | "rc": 0,
2019-03-18 08:34:51 | "start": "2019-03-18 08:31:39.508196"
2019-03-18 08:34:51 | }
2019-03-18 08:34:51 |
2019-03-18 08:34:51 | STDOUT:
2019-03-18 08:34:51 |
2019-03-18 08:34:51 | try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000
2019-03-18 08:34:51 | try: 2, refid: CDCE4602, correction: 0.000001224, skew: 0.375
2019-03-18 08:34:51 | fatal: [overcloud-
2019-03-18 08:34:51 |
2019-03-18 08:34:51 | "changed": true,
2019-03-18 08:34:51 | "cmd": [
2019-03-18 08:34:51 | "chronyc",
2019-03-18 08:34:51 | "waitsync",
2019-03-18 08:34:51 | "20"
2019-03-18 08:34:51 | ],
2019-03-18 08:34:51 | "delta": "0:03:10.205168",
2019-03-18 08:34:51 | "end": "2019-03-18 08:34:49.741343",
2019-03-18 08:34:51 | "rc": 1,
2019-03-18 08:34:51 | "start": "2019-03-18 08:31:39.536175"
2019-03-18 08:34:51 | }
2019-03-18 08:34:51 |
2019-03-18 08:34:51 | STDOUT:
2019-03-18 08:34:51 |
2019-03-18 08:34:51 | try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000
2019-03-18 08:34:51 | try: 2, refid: 00000000, correction: 0.000000000, skew: 0.000
2019-03-18 08:34:51 | try: 3, refid: 00000000, correction: 0.000000000, skew: 0.000
2019-03-18 08:34:51 | try: 4, refid: 00000000, correction: 0.000000000, skew: 0.000
2019-03-18 08:34:51 | try: 5, refid: 00000000, correction: 0.000000001, skew: 0.000
2019-03-18 08:34:51 | try: 6, refid: 00000000, correction: 0.000000001, skew: 0.000
2019-03-18 08:34:51 | try: 7, refid: 00000000, correction: 0.000000001, skew: 0.000
2019-03-18 08:34:51 | try: 8, refid: 00000000, correction: 0.000000001, skew: 0.000
2019-03-18 08:34:51 | try: 9, refid: 00000000, correction: 0.000000001, skew: 0.000
2019-03-18 08:34:51 | try: 10, refid: 00000000, correction: 0.000000001, skew: 0.000
2019-03-18 08:34:51 | try: 11, refid: 00000000, correction: 0.000000002, skew: 0.000
2019-03-18 08:34:51 | try: 12, refid: 00000000, correction: 0.000000002, skew: 0.000
2019-03-18 08:34:51 | try: 13, refid: 00000000, correction: 0.000000002, skew: 0.000
2019-03-18 08:34:51 | try: 14, refid: 00000000, correction: 0.000000002, skew: 0.000
2019-03-18 08:34:56 | try: 15, refid: 00000000, correction: 0Exception occured while running the command
Maybe this happens because chronyd is not reloaded after it is configured during overcloud deployment. It is already installed and running on the hosts, so it is neither reloaded nor restarted when it is reconfigured.
2019-03-18 08:34:56 | Traceback (most recent call last):
2019-03-18 08:34:56 | File "/usr/lib/
2019-03-18 08:34:56 | super(Command, self).run(
2019-03-18 08:34:56 | File "/usr/lib/
2019-03-18 08:34:56 | return super(Command, self).run(
2019-03-18 08:34:56 | File "/usr/lib/
2019-03-18 08:34:56 | return_code = self.take_
2019-03-18 08:34:56 | File "/usr/lib/
2019-03-18 08:34:56 | verbosity=
2019-03-18 08:34:56 | File "/usr/lib/
2019-03-18 08:34:56 | raise exceptions.
2019-03-18 08:34:56 | DeploymentError: Overcloud configuration failed.
2019-03-18 08:34:56 | Overcloud configuration failed.
2019-03-18 08:34:56 | .000000002, skew: 0.000
2019-03-18 08:34:56 | try: 16, refid: 00000000, correction: 0.000000002, skew: 0.000
2019-03-18 08:34:56 | try: 17, refid: 00000000, correction: 0.000000002, skew: 0.000
2019-03-18 08:34:56 | try: 18, refid: 00000000, correction: 0.000000003, skew: 0.000
2019-03-18 08:34:56 | try: 19, refid: 00000000, correction: 0.000000003, skew: 0.000
2019-03-18 08:34:56 | try: 20, refid: 00000000, correction: 0.000000003, skew: 0.000
2019-03-18 08:34:56 |
2019-03-18 08:34:56 |
2019-03-18 08:34:56 | MSG:
2019-03-18 08:34:56 |
2019-03-18 08:34:56 | non-zero return code
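In the log above, the two hosts that pass report a non-zero refid (CDCE4602) on the second try, while the failing host reports refid 00000000 on every try it logs, which suggests chronyd never selected a time source there. A minimal sketch of how that pattern could be checked automatically, assuming only the `try: N, refid: ..., correction: ..., skew: ...` line format seen in this log; the function names are hypothetical and not part of tripleo or chrony:

```python
import re

# Matches lines such as
# "try: 2, refid: CDCE4602, correction: 0.000001539, skew: 1.234"
# (assumed format, taken from the log in this report). Lines broken by
# interleaved output, like try 15 above, simply do not match.
WAITSYNC_RE = re.compile(
    r"try:\s*(\d+),\s*refid:\s*([0-9A-Fa-f]{8}),\s*"
    r"correction:\s*(-?\d+\.\d+),\s*skew:\s*(-?\d+\.\d+)"
)


def parse_waitsync(text):
    """Return a list of (try_number, refid, correction, skew) tuples."""
    return [
        (int(n), refid.upper(), float(corr), float(skew))
        for n, refid, corr, skew in WAITSYNC_RE.findall(text)
    ]


def never_selected_source(text):
    """True if at least one try was parsed and every refid is 00000000."""
    tries = parse_waitsync(text)
    return bool(tries) and all(refid == "00000000" for _, refid, _, _ in tries)
```

Feeding the STDOUT of each host's `chronyc waitsync 20` task to `never_selected_source` would separate the failing host from the two that synced. It does not explain why no source was selected.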
tags: | added: ci |
tags: | added: promotion-blocker |
tags: | added: alert |
removed promotion blocker tag as there is a green run right now in https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master
via sshnaidm|rover just now in #oooq
18:36 < sshnaidm|rover> marios_|ruck, well, now it doesn't fail, seems like random error
18:36 < sshnaidm|rover> marios_|ruck, I mean it doesn't block promotion currently, but should be fixed of course