overcloud-deploy fails w/ The Workflow errored and no messages were received.

Bug #1868632 reported by wes hayutin
This bug affects 1 person
Affects   Status         Importance   Assigned to       Milestone
tripleo   Fix Released   Critical     Adriano Petrich

Bug Description

2020-03-23 20:26:30 | Ansible execution success. playbook: /usr/share/ansible/tripleo-playbooks/cli-update-deployment-plan.yaml
2020-03-23 20:26:44 | Pulling role list from: overcloud
2020-03-23 20:26:44 | Indexing roles from: overcloud
2020-03-23 20:26:45 | WARNING: Following parameter(s) are defined but not currently used in the deployment plan. These parameters may be valid but not in use due to the service or deployment configuration. CephPoolDefaultPgNum, CephPoolDefaultSize, CinderLVMLoopDeviceSize, CinderWorkers, ComputeCount, GnocchiMetricdWorkers, HeatWorkers, ManilaCephFSDataPoolPGNum, ManilaCephFSMetadataPoolPGNum, MistralDockerGroup, NovaComputeExtraConfig, SaharaWorkers, SwiftRingGetTempurl, SwiftRingPutTempurl, SwiftWorkers
2020-03-23 20:36:46 | Timed out waiting for messages from Execution (ID: 12909c7c-8221-4863-b57b-b395f59928af, State: ERROR). The Workflow errored and no messages were received.
2020-03-23 20:36:46 |
2020-03-23 20:36:46 | END return value: 1
2020-03-23 20:36:46 | Success.
2020-03-23 20:36:46 | Processing templates in the directory /tmp/tripleoclient-tpcdk0bf/tripleo-heat-templates
2020-03-23 20:36:46 | Deploying templates in the directory /tmp/tripleoclient-tpcdk0bf/tripleo-heat-templates
2020-03-23 20:36:46 | + status_code=1
2020-03-23 20:36:46 | + openstack stack list
2020-03-23 20:36:46 | + grep -q overcloud
2020-03-23 20:36:49 | + echo 'overcloud deployment not started. Check the deploy configurations'
2020-03-23 20:36:49 | overcloud deployment not started. Check the deploy configurations
2020-03-23 20:36:49 | + exit 1

https://logserver.rdoproject.org/openstack-component-tripleo/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset010-tripleo-master/2031709/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

wes hayutin (weshayutin) wrote :

possible required patches:
<cloudnull> weshay|ruck i found a missing path which was being added from within a mistral workflow - https://review.opendev.org/#/c/714492, https://review.opendev.org/#/c/714493 - maybe related to the ansible-pacemaker issues ?

wes hayutin (weshayutin) wrote :

mistral.exceptions.InputException: No module named 'mistral.actions.openstack'
: mistral.exceptions.InputException: No module named 'mistral.actions.openstack'
2020-03-23 20:24:28.142 6 INFO workflow_trace [req-0e41d2e5-ec3b-43dc-aaec-9ee12121f018 a908288487424545b9d56a1df34e2410 8757bd7a0b024c96afc4cf1b5af1eb7a - default default] Task 'get_containers' (5cd8500d-9d5a-4da3-8a3b-71c5b6c06c31) [RUNNING -> ERROR, msg=Failed to run task [error=No module named 'mistral.actions.openstack', wf=tripleo.swift.v1.container_exists, task=get_containers]:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/mistral/engine/actions.py", line 325, in is_sync
    self.action_def.namespace)(
  File "/usr/lib/python3.6/site-packages/mistral/services/action_manager.py", line 151, in get_action_class
    action_db.attributes
  File "/usr/lib/python3.6/site-packages/mistral/actions/action_factory.py", line 20, in construct_action_class
    action_class = importutils.import_class(action_class_str)
  File "/usr/lib/python3.6/site-packages/oslo_utils/importutils.py", line 30, in import_class
    __import__(mod_str)
ModuleNotFoundError: No module named 'mistral.actions.openstack'
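The traceback ends in an import of a dotted class path, so the failure mode can be sketched outside Mistral. The snippet below is a minimal illustration, assuming only the standard library: `import_class` is a simplified stand-in for the `oslo_utils.importutils.import_class` call shown in the traceback, not its actual implementation, and `check_action_class` is a hypothetical helper name. It reports which module is missing instead of raising.

```python
import importlib


def import_class(import_str):
    """Import a class from a dotted path such as 'package.module.ClassName'.

    Simplified stand-in for the import step in the traceback (assumption).
    """
    mod_str, _, class_str = import_str.rpartition(".")
    # Raises ModuleNotFoundError when the module is not installed.
    module = importlib.import_module(mod_str)
    return getattr(module, class_str)


def check_action_class(import_str):
    """Hypothetical helper: return 'ok' or name the missing module."""
    try:
        import_class(import_str)
        return "ok"
    except ModuleNotFoundError as exc:
        return f"missing module: {exc.name}"


if __name__ == "__main__":
    # Standard-library stand-ins: 'json' exists, its submodule below does not,
    # which mirrors a 'mistral' package without 'mistral.actions.openstack'.
    print(check_action_class("collections.OrderedDict"))
    print(check_action_class("json.no_such_submodule_xyz.SomeAction"))
```

Running a check like this inside each Mistral container, with the real action class path, would show which container lacks the module.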

Rabi Mishra (rabi) wrote :

This is caused by a missing mistral-extra package: all OpenStack actions were moved there in https://review.opendev.org/#/c/703972/.

It seems it was added and then reverted https://review.rdoproject.org/r/#/c/25327/. There are not enough details about the issue. I've resubmitted that change. https://review.rdoproject.org/r/#/c/26072/

Rabi Mishra (rabi) wrote :

Looks like mistral-extra had been added as a dep in mistral https://github.com/rdo-packages/mistral-distgit/commit/4322798975ec3305090e7a89ccc31bbfdffdc815 and it's there in the mistral containers. Not sure what's the issue then.

It seems the last run of the job was fine too.

 https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset010-tripleo-master

yatin (yatinkarel) wrote :

<<< Looks like mistral-extra had been added as a dep in mistral https://github.com/rdo-packages/mistral-distgit/commit/4322798975ec3305090e7a89ccc31bbfdffdc815 and it's there in the mistral containers. Not sure what's the issue then.

The issue is that not all mistral containers were updated in the failing job: the mistral-api container had outdated mistral packages and was missing mistral-extra, so the failure most likely occurred because the mistral packages differed between mistral-api [2] and mistral-engine [3]. The container update logs are missing, so I don't know why the mistral-api container was not updated. Related bug: https://bugs.launchpad.net/tripleo/+bug/1866676

[1] https://logserver.rdoproject.org/openstack-component-tripleo/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset010-tripleo-master/2031709/logs/undercloud/var/log/containers/stdouts/mistral_db_populate.log.txt.gz
[2] https://logserver.rdoproject.org/openstack-component-tripleo/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset010-tripleo-master/2031709/logs/undercloud/var/log/extra/podman/containers/mistral_api/podman_info.log.txt.gz
[3] https://logserver.rdoproject.org/openstack-component-tripleo/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset010-tripleo-master/2031709/logs/undercloud/var/log/extra/podman/containers/mistral_engine/podman_info.log.txt.gz

There was a CentOS 8 promotion last night, so the issue should no longer appear: the latest containers have all the required up-to-date packages (mistral-lib, mistral and mistral-extra).
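The mismatch described above can be checked mechanically. The following is only a sketch under stated assumptions: it expects one "name version" pair per line for each container (for example from `rpm -qa --qf '%{NAME} %{VERSION}-%{RELEASE}\n'` run inside the container), it does not parse the podman_info logs linked above, and the package names and version strings in the usage part are placeholders, not values from the failing job.

```python
def parse_package_list(text):
    """Parse lines of the form 'name version' into a dict (assumed format)."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            name, version = parts
            packages[name] = version
    return packages


def find_package_drift(containers, names):
    """Return packages whose version differs (or is missing) across containers.

    containers: dict mapping container name to its parsed package dict.
    names: package names to compare.
    """
    drift = {}
    for name in names:
        versions = {c: pkgs.get(name) for c, pkgs in containers.items()}
        if len(set(versions.values())) > 1:
            drift[name] = versions
    return drift


if __name__ == "__main__":
    # Placeholder data for illustration only.
    api = parse_package_list("mistral 1.0-1\nmistral-lib 1.0-1\n")
    engine = parse_package_list(
        "mistral 2.0-1\nmistral-lib 1.0-1\nmistral-extra 2.0-1\n"
    )
    containers = {"mistral_api": api, "mistral_engine": engine}
    for pkg, versions in find_package_drift(
        containers, ["mistral", "mistral-lib", "mistral-extra"]
    ).items():
        print(pkg, versions)
```

A package reported with `None` for one container is absent there, which is the mistral-extra case described in this comment.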

Adriano Petrich (apetrich) wrote :

Try as I might I was unable to reproduce this. Is this still happening or did the promotion fix it like Yatin said?

Changed in tripleo:
assignee: nobody → Adriano Petrich (apetrich)
wes hayutin (weshayutin) wrote :
Changed in tripleo:
status: Triaged → Fix Released