Error when deploying ceph in overcloud with containers using ceph-ansible

Bug #1735139 reported by Alfredo Moralejo
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Giulio Fidente

Bug Description

When doing an overcloud deployment with ceph on master, the following error appears in the deployment log [1]:

overcloud.AllNodesDeploySteps.WorkflowTasks_Step2_Execution:
  resource_type: OS::Mistral::ExternalResource
  physical_resource_id: e30e9ca1-9be4-4102-9a51-df5b5186485c
  status: CREATE_FAILED
  status_reason: |
    resources.WorkflowTasks_Step2_Execution: ERROR

In the ceph-install log [2], the following error is reported while deploying ceph:

2017-11-29 02:34:58,363 p=7774 u=mistral | fatal: [192.168.24.7]: FAILED! => {"attempts": 5, "changed": true, "cmd": ["docker", "exec", "ceph-mon-upstream-centos-7-2-node-rdo-cloud-tripleo-53070-22071", "stat", "/var/run/ceph/ceph-mon.upstream-centos-7-2-node-rdo-cloud-tripleo-53070-22071.localdomain.asok"], "delta": "0:00:00.052434", "end": "2017-11-29 02:34:58.336669", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2017-11-29 02:34:58.284235", "stderr": "stat: cannot stat '/var/run/ceph/ceph-mon.upstream-centos-7-2-node-rdo-cloud-tripleo-53070-22071.localdomain.asok': No such file or directory", "stderr_lines": ["stat: cannot stat '/var/run/ceph/ceph-mon.upstream-centos-7-2-node-rdo-cloud-tripleo-53070-22071.localdomain.asok': No such file or directory"], "stdout": "", "stdout_lines": []}
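
For anyone triaging similar failures, the fatal line above can be split into the failing host and the task result. The following is only a minimal sketch: it assumes the text after "FAILED! => " is valid JSON on a single line, as it is here, which is not guaranteed for every Ansible output format. The function name and the shortened SAMPLE line (with the placeholder hostname "node") are hypothetical.

```python
import json

# Shortened stand-in for the log line above; "node" is a placeholder hostname.
SAMPLE = (
    '2017-11-29 02:34:58,363 p=7774 u=mistral | fatal: [192.168.24.7]: FAILED! => '
    '{"attempts": 5, "changed": true, "cmd": ["docker", "exec", "ceph-mon-node", '
    '"stat", "/var/run/ceph/ceph-mon.node.localdomain.asok"], '
    '"msg": "non-zero return code", "rc": 1, '
    '"stderr": "stat: cannot stat \'/var/run/ceph/ceph-mon.node.localdomain.asok\': '
    'No such file or directory"}'
)


def parse_ansible_fatal(line):
    """Return (host, result_dict) for a 'fatal: [host]: FAILED! => {...}' line.

    Returns None when the line does not have that shape.
    """
    head, sep, payload = line.partition("FAILED! => ")
    if not sep:
        return None
    start = head.rfind("fatal: [")
    if start == -1:
        return None
    # The host sits between "fatal: [" and the next "]".
    host = head[start + len("fatal: ["):].split("]", 1)[0]
    return host, json.loads(payload)


if __name__ == "__main__":
    host, result = parse_ansible_fatal(SAMPLE)
    print(host, result["rc"], " ".join(result["cmd"]))
    print(result["stderr"])
```

Reading the "cmd" and "stderr" fields this way makes it easier to see that the task was retried 5 times and that each attempt was a stat on the monitor's admin socket inside the container.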

However, the ceph-mon container appears to be up and running [3].
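
The failing check can be repeated by hand on the affected node. The sketch below rebuilds the same "docker exec ... stat" command seen in the log and retries it; it is not the ceph-ansible task itself. The function names, the delay between attempts, and the extra "ls" command for inspecting the socket directory are assumptions. The container name and the monitor id are passed separately because, in the log above, the container name uses the short hostname while the socket name includes ".localdomain".

```python
import subprocess
import time


def mon_socket_check_cmd(container, mon_id):
    """Build the stat command from the log for a given container and monitor id."""
    return ["docker", "exec", container, "stat",
            "/var/run/ceph/ceph-mon.%s.asok" % mon_id]


def list_sockets_cmd(container):
    """Build a command that lists the files present in /var/run/ceph."""
    return ["docker", "exec", container, "ls", "-l", "/var/run/ceph"]


def wait_for_socket(container, mon_id, attempts=5, delay=5.0, run=subprocess.run):
    """Retry the stat check; return True as soon as it exits with status 0.

    attempts=5 mirrors the "attempts" field in the log; delay is a guess.
    """
    cmd = mon_socket_check_cmd(container, mon_id)
    for attempt in range(attempts):
        if run(cmd, capture_output=True).returncode == 0:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # wait only between attempts
    return False
```

If wait_for_socket returns False, running the command from list_sockets_cmd shows which socket files, if any, the monitor actually created, which helps tell a missing socket apart from one created under a different name.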

[1] https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset019-master/1509905/undercloud/home/jenkins/failed_deployment_list.log.txt.gz
[2] https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset019-master/1509905/undercloud/var/log/mistral/ceph-install-workflow.log.txt.gz
[3] https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset019-master/1509905/subnode-2/var/log/extra/docker/containers/ceph-mon-upstream-centos-7-2-node-rdo-cloud-tripleo-53070-22071/log/ceph/ceph-mon.upstream-centos-7-2-node-rdo-cloud-tripleo-53070-22071.log.txt.gz

tags: added: ci promotion-blocker
Changed in tripleo:
milestone: none → queens-2
importance: Undecided → Critical
status: New → Triaged
Emilien Macchi (emilienm) wrote :

I can confirm this; I have the same problem when deploying in RDO Cloud. I remember gfidente having a fix.

Changed in tripleo:
assignee: nobody → Giulio Fidente (gfidente)
Emilien Macchi (emilienm) wrote :

We can close this one once https://review.openstack.org/#/c/523375/ has landed.

Alfredo Moralejo (amoralej) wrote :
Sagi (Sergey) Shnaidman (sshnaidm) wrote :

This still happens and blocks master promotion in the featureset019 job:

https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset019-master/c1a179f/undercloud/var/log/extra/errors.txt.gz#_2017-11-30_11_23_10_016

2017-11-30 11:23:10.016 /var/log/heat/heat-engine.log: 28931 ERROR heat.engine.resource ResourceFailure: resources.AllNodesDeploySteps: Resource CREATE failed: resources.WorkflowTasks_Step2_Execution: ERROR

https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset019-master/c1a179f/undercloud/var/log/extra/errors.txt.gz#_2017-11-30_11_23_07_836699

"stderr":
"stat: cannot stat \'/var/run/ceph/ceph-mon.upstream-centos-7-2-node-rdo-cloud-tripleo-54146-22363.localdomain.asok\': No such file or directory",

Giulio Fidente (gfidente) wrote :

That is an issue with quickstart-extras; it should be fixed by https://review.openstack.org/#/c/523945/

Ronelle Landy (rlandy) wrote :

Master promotion was successful.
Spoke with Giulio; moving this bug to Fix Released.

Changed in tripleo:
status: Triaged → Fix Released