tripleo-quickstart-extras-gate-newton-delorean-full-minimal fails to install undercloud - Access denied for user 'heat'@'192.168.24.1

Bug #1773445 reported by Ronelle Landy
This bug affects 1 person
Affects: tripleo
Status: Invalid
Importance: Low
Assigned to: Arx Cruz
Milestone: (none listed)

Bug Description

tripleo-quickstart-extras-gate-newton-delorean-full-minimal has been failing to install the undercloud since 05/09. The error is below:

2018-05-25 16:25:02 | 2018-05-25 16:25:02 - Notice: /Stage[main]/Heat::Db::Sync/Exec[heat-dbsync]/returns: ERROR: (pymysql.err.OperationalError) (1045, u"Access denied for user 'heat'@'192.168.24.1' (using password: YES)")
2018-05-25 16:25:02 | 2018-05-25 16:25:02 - Error: /Stage[main]/Heat::Db::Sync/Exec[heat-dbsync]: Failed to call refresh: heat-manage --config-file /etc/heat/heat.conf db_sync returned 1 instead of one of [0]
2018-05-25 16:25:02 | 2018-05-25 16:25:02 - Error: /Stage[main]/Heat::Db::Sync/Exec[heat-dbsync]: heat-manage --config-file /etc/heat/heat.conf db_sync returned 1 instead of one of [0]

....

 + rc=6
2018-05-25 16:25:07 | + set -e
2018-05-25 16:25:07 | + echo 'puppet apply exited with exit code 6'
2018-05-25 16:25:07 | puppet apply exited with exit code 6
2018-05-25 16:25:07 | + '[' 6 '!=' 2 -a 6 '!=' 0 ']'
2018-05-25 16:25:07 | + exit 6
2018-05-25 16:25:07 | [2018-05-25 16:25:07,959] (os-refresh-config) [ERROR] during configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status 1]
2018-05-25 16:25:07 |
2018-05-25 16:25:07 | [2018-05-25 16:25:07,960] (os-refresh-config) [ERROR] Aborting...
2018-05-25 16:25:07 | Traceback (most recent call last):
2018-05-25 16:25:07 | File "<string>", line 1, in <module>
2018-05-25 16:25:07 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1256, in install
2018-05-25 16:25:07 | _run_orc(instack_env)
2018-05-25 16:25:07 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1020, in _run_orc
2018-05-25 16:25:07 | _run_live_command(args, instack_env, 'os-refresh-config')
2018-05-25 16:25:07 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 519, in _run_live_command
2018-05-25 16:25:07 | raise RuntimeError('%s failed. See log for details.' % name)
2018-05-25 16:25:07 | RuntimeError: os-refresh-config failed. See log for details.
2018-05-25 16:25:07 | Command 'instack-install-undercloud' returned non-zero exit status 1

https://ci.centos.org/artifacts/rdo/jenkins-tripleo-quickstart-extras-gate-newton-delorean-full-minimal-5956/undercloud/home/stack/undercloud_install.log.gz

Tags: ci
Ronelle Landy (rlandy)
Changed in tripleo:
status: New → Triaged
importance: Undecided → High
milestone: none → rocky-2
summary: - rean-full-minimal fails to install undercloud - os-refresh-config
+ tripleo-quickstart-extras-gate-newton-delorean-full-minimal fails to
+ install undercloud - os-refresh-config
summary: tripleo-quickstart-extras-gate-newton-delorean-full-minimal fails to
- install undercloud - os-refresh-config
+ install undercloud - Access denied for user 'heat'@'192.168.24.1
tags: added: ci
Ronelle Landy (rlandy) wrote :

This does fail gates - not sure what the priority is on this. It's been broken for weeks and downstream is cherry-picking changes.

description: updated
Alex Schultz (alex-schultz) wrote :

2018-05-25 16:19:45 | 2018-05-25 16:19:45 - Notice: /Stage[main]/Staging/File[/opt/staging]/ensure: created
2018-05-25 16:19:45 | 2018-05-25 16:19:45 - Error: Could not start Service[certmonger]: Execution of '/bin/systemctl start certmonger' returned 1: Job for certmonger.service failed because the control process exited with error code. See "systemctl status certmonger.service" and "journalctl -xe" for details.
2018-05-25 16:19:45 | 2018-05-25 16:19:45 - Error: /Stage[main]/Certmonger/Service[certmonger]/ensure: change from stopped to running failed: Could not start Service[certmonger]: Execution of '/bin/systemctl start certmonger' returned 1: Job for certmonger.service failed because the control process exited with error code. See "systemctl status certmonger.service" and "journalctl -xe" for details.

This looks like the certmonger issue seen after upgrading from 7.4 to 7.5 without a reboot.
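
The heat-dbsync failure in the description (16:25:02) is not the first error in the install log; the certmonger failure above was logged at 16:19:45. A minimal sketch of how one might pull the earliest Puppet error out of such a log, assuming lines shaped like the excerpts in this report. The function name and regex are illustrative only and are not part of any TripleO tooling:

```python
import re

# Matches Puppet error lines shaped like the excerpts in this report,
# e.g. "<timestamp> | <timestamp> - Error: Could not start Service[...]".
# The pattern is an assumption based only on those excerpts.
ERROR_RE = re.compile(r" - Error: (?P<message>.*)$")


def first_puppet_error(lines):
    """Return the message of the earliest 'Error:' line, or None if absent."""
    for line in lines:
        match = ERROR_RE.search(line.rstrip("\n"))
        if match:
            return match.group("message")
    return None
```

It accepts any iterable of strings, so it can be fed an open file handle for the decompressed undercloud_install.log.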

Arx Cruz (arxcruz) wrote :

In the messages:

May 25 16:19:45 undercloud certmonger: 2018-05-25 16:19:45 [343] Unable to set well-known bus name "org.fedorahosted.certmonger": Connection ":1.194" is not allowed to own the service "org.fedorahosted.certmonger" due to security policies in the configuration file(-1).
May 25 16:19:45 undercloud certmonger: Error connecting to D-Bus.
May 25 16:19:45 undercloud systemd: certmonger.service: main process exited, code=exited, status=1/FAILURE
May 25 16:19:45 undercloud systemd: Failed to start Certificate monitoring and PKI enrollment.
May 25 16:19:45 undercloud systemd: Unit certmonger.service entered failed state.
May 25 16:19:45 undercloud systemd: certmonger.service failed.

Arx Cruz (arxcruz) wrote :

I believe this is also a duplicate of https://bugs.launchpad.net/tripleo/+bug/1770944

Arx Cruz (arxcruz) wrote :

So, I dug into this, and here's what I found:

The problem is indeed the dbus upgrade from 7.4 to 7.5, and the official response is to reboot the system, because nobody would do a live upgrade in production.
The workaround I found is to restart dbus before the certmonger service starts. Note that several other services fail as well; we just noticed it in certmonger (dnsmasq has the same problem).
I submitted a PR to the certmonger puppet module [1], since it's not part of the openstack modules.

If the PR is approved, we will need to pin the puppet-certmonger rpm package to the new commit in rdo. According to jpena, that will not be a problem.
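
For illustration only, the ordering of the workaround (restart dbus, then start certmonger) could be sketched as below. This is not the change from the PR, which is made in the Puppet module; the function name and the injectable `run` parameter are hypothetical, and the unit names `dbus` and `certmonger` are assumed to match the services named in the logs above:

```python
import subprocess

# Order matters: dbus is restarted first, presumably so that certmonger can
# then register its "org.fedorahosted.certmonger" bus name when it starts.
WORKAROUND_COMMANDS = [
    ["systemctl", "restart", "dbus"],
    ["systemctl", "start", "certmonger"],
]


def apply_workaround(run=subprocess.check_call):
    """Run the workaround commands in order; stop at the first failure.

    `run` is injectable so the ordering can be checked without touching
    systemd. The default raises CalledProcessError on a non-zero exit.
    """
    for command in WORKAROUND_COMMANDS:
        run(command)
```

Calling `apply_workaround()` with the default runner needs root on the affected node; passing a list's `append` method as `run` just records the commands.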

Ronelle Landy (rlandy)
Changed in tripleo:
assignee: nobody → Arx Cruz (arxcruz)
Changed in tripleo:
milestone: rocky-2 → rocky-3
Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Changed in tripleo:
milestone: stein-1 → stein-2
Arx Cruz (arxcruz) wrote :

Marking as invalid, since this job is no longer run and we don't support newton anymore.

Changed in tripleo:
status: Triaged → Invalid
importance: High → Low