mistral fails to start the ceph container

Bug #1712432 reported by wes hayutin on 2017-08-22
Affects: tripleo
Importance: High
Assigned to: John Trowbridge

Bug Description

Running gate-tripleo-ci-centos-7-scenario001-multinode-oooq-container for promotion with freshly built containers.

This failed the same way twice in a row, while other container scenario jobs passed.
https://review.rdoproject.org/jenkins/job/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset016/17/

https://logs.rdoproject.org/47/475747/108/openstack-manual/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset016/Z0d5cd493202546698d4d3ea112e4c1a6/undercloud/var/log/mistral/executor.log.txt.gz#_2017-08-22_20_13_19_412

: provided hosts list is empty, only localhost is available\nERROR! playbooks must be a list of plays\n\nThe error appears to have been in '/tmp/ansible-mistral-actionPjtjb9/playbook.yaml': line 1, column 1, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n/usr/share/ceph-ansible/site-docker.yml.sample\n^ here\n"
2017-08-22 20:13:19.412 31700 ERROR mistral.executors.default_executor Traceback (most recent call last):
2017-08-22 20:13:19.412 31700 ERROR mistral.executors.default_executor File "/usr/lib/python2.7/site-packages/mistral/executors/default_executor.py", line 109, in run_action
2017-08-22 20:13:19.412 31700 ERROR mistral.executors.default_executor result = action.run(context.ctx())
2017-08-22 20:13:19.412 31700 ERROR mistral.executors.default_executor File "/usr/lib/python2.7/site-packages/tripleo_common/actions/ansible.py", line 409, in run
2017-08-22 20:13:19.412 31700 ERROR mistral.executors.default_executor log_errors=processutils.LogErrors.ALL)
2017-08-22 20:13:19.412 31700 ERROR mistral.executors.default_executor File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 400, in execute
2017-08-22 20:13:19.412 31700 ERROR mistral.executors.default_executor cmd=sanitized_cmd)
2017-08-22 20:13:19.412 31700 ERROR mistral.executors.default_executor ProcessExecutionError: Unexpected error while running command.
2017-08-22 20:13:19.412 31700 ERROR mistral.executors.default_executor Command: ansible-playbook /tmp/ansible-mistral-actionPjtjb9/playbook.yaml --user tripleo-admin --become --become-user root --extra-vars {"monitor_secret": "***", "ceph_conf_overrides": {"global":

wes hayutin (weshayutin) on 2017-08-22
Changed in tripleo:
importance: Critical → High
Giulio Fidente (gfidente) wrote :

The actual error is

  ERROR! playbooks must be a list of plays

which I have seen when the playbook /usr/share/ceph-ansible/site-docker.yml.sample was not installed. I am inclined to think that ceph-ansible wasn't installed on the undercloud, but I am not sure why this would happen only in the promotion job.

The two submissions which took care of installing the package are:

https://review.openstack.org/#/c/478977/
https://review.openstack.org/#/c/478986/

Is there any reason why they would not be effective for the promotion job?
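
One possible reading of the log above: line 1 of the generated playbook.yaml is the playbook path itself, which suggests that the path string was written out as playbook content because no file existed at that path. Ansible then parses a bare string instead of a list of plays. The sketch below models that behaviour; resolve_playbook is a hypothetical helper written for illustration and is an assumption, not the actual tripleo-common code.

```python
import os
import tempfile


def resolve_playbook(playbook, work_dir):
    """Hypothetical model of how a playbook argument could be resolved.

    If `playbook` names an existing file, use it directly. Otherwise treat
    the string as inline playbook content and write it to playbook.yaml.
    """
    if os.path.exists(playbook):
        return playbook
    path = os.path.join(work_dir, "playbook.yaml")
    with open(path, "w") as f:
        f.write(playbook)
    return path


# When ceph-ansible is not installed, the sample playbook path does not
# exist, so the path string itself ends up as the content of playbook.yaml.
work_dir = tempfile.mkdtemp(prefix="ansible-mistral-action")
resolved = resolve_playbook(
    "/usr/share/ceph-ansible/site-docker.yml.sample", work_dir)
with open(resolved) as f:
    print(resolved)
    print(f.read())
```

Under this assumption, installing ceph-ansible makes the path exist, so it is passed to ansible-playbook unchanged and the parse error goes away.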

John Trowbridge (trown) wrote :

Yes, it is because https://review.openstack.org/475747 is not merged, so we are not getting the ceph-ansible patch in tripleo-ci. We can rebase it after the current manual run if it is not merged by then. It would be even better to just merge https://review.openstack.org/475747, though.

Giulio Fidente (gfidente) wrote :

Though I have to admit that the error message comes from deep inside mistral, and we should fail harder and sooner if the package isn't installed. The problem is that we wanted it to be optional, so right now the only thing we have is a warning message in the pre-deployment validations [1]. I am not sure that is visible in the promotion job; we probably don't run validations there.

1. https://review.openstack.org/#/c/483345/
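
As a rough illustration of failing "harder and sooner", a check along these lines could run before ansible-playbook is invoked. This is a minimal sketch only: PlaybookNotFound and require_playbook are hypothetical names, and the package hint in the message is an assumption rather than anything taken from the existing validations.

```python
import os


class PlaybookNotFound(Exception):
    """Raised when a required playbook file is missing (hypothetical)."""


def require_playbook(path, package="ceph-ansible"):
    """Return `path` if it is an existing file, otherwise fail early.

    The error names the package that is expected to ship the playbook, so
    the operator sees the likely cause instead of an Ansible parse error.
    """
    if not os.path.isfile(path):
        raise PlaybookNotFound(
            "Playbook %s not found; is the %s package installed?"
            % (path, package))
    return path
```

A caller would wrap the playbook path, for example require_playbook("/usr/share/ceph-ansible/site-docker.yml.sample"), and let the exception surface before any deployment step starts.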

Fix proposed to branch: master
Review: https://review.openstack.org/496822

Changed in tripleo:
status: Triaged → In Progress

Reviewed: https://review.openstack.org/496822
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart/commit/?id=1ef01f4251cc1b2786a5d60e0f71d16f88012233
Submitter: Jenkins
Branch: master

commit 1ef01f4251cc1b2786a5d60e0f71d16f88012233
Author: John Trowbridge <email address hidden>
Date: Wed Aug 23 12:13:54 2017 -0400

    Add missing ceph-ansible install to appropriate releases

    We need the same change in 7c13e99a84be38fe2bb064d3a28605c7d1231d58
    in our other releases that support ceph-ansible.

    Change-Id: I4de51dc3e3a166dd25e45585c5eb38140650eeff
    Closes-Bug: 1712432

Changed in tripleo:
status: In Progress → Fix Released