baremetal introspection workflow fails with new mistral

Bug #1688767 reported by Emilien Macchi
This bug affects 1 person
Affects    Status         Importance   Assigned to   Milestone
Mistral    Invalid        Undecided    Unassigned
tripleo    Fix Released   Critical     Unassigned

Bug Description

When deploying TripleO in the promotion pipeline with OpenStack from trunk, the Ironic introspection workflow fails to run:

2017-05-06 02:52:21.530875 | Started Mistral Workflow tripleo.baremetal.v1.introspect_manageable_nodes. Execution ID: c4bbf53a-20ab-4c0f-a0e9-2bbf5407cdb4
2017-05-06 02:52:21.530961 | Waiting for messages on queue 'f6df8565-5c3c-4c98-96e2-8d7ead5be031' with no timeout.
2017-05-06 02:52:24.877990 | {u'result': u'Failed to run task [error=Failed to find action [action_name=baremetal_introspection.introspect], wf=tripleo.baremetal.v1.introspect, task=start_introspection]:\nTraceback (most recent call last):\n File "/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py", line 58, in run_task\n task.run()\n File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 153, in wrapper\n return f(*args, **kwargs)\n File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 269, in run\n self._run_new()\n File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 153, in wrapper\n return f(*args, **kwargs)\n File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 293, in _run_new\n self._schedule_actions()\n File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 500, in _schedule_actions\n action = self._build_action()\n File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 403, in _build_action\n self.wf_spec.get_name()\n File "/usr/lib/python2.7/site-packages/mistral/engine/actions.py", line 557, in resolve_action_definition\n "Failed to find action [action_name=%s]" % action_spec_name\nInvalidActionException: Failed to find action [action_name=baremetal_introspection.introspect]\n'}
2017-05-06 02:52:24.980004 | Exception introspecting nodes: {u'result': u'Failed to run task [error=Failed to find action [action_name=baremetal_introspection.introspect], wf=tripleo.baremetal.v1.introspect, task=start_introspection]:\nTraceback (most recent call last):\n File "/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py", line 58, in run_task\n task.run()\n File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 153, in wrapper\n return f(*args, **kwargs)\n File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 269, in run\n self._run_new()\n File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 153, in wrapper\n return f(*args, **kwargs)\n File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 293, in _run_new\n self._schedule_actions()\n File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 500, in _schedule_actions\n action = self._build_action()\n File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 403, in _build_action\n self.wf_spec.get_name()\n File "/usr/lib/python2.7/site-packages/mistral/engine/actions.py", line 557, in resolve_action_definition\n "Failed to find action [action_name=%s]" % action_spec_name\nInvalidActionException: Failed to find action [action_name=baremetal_introspection.introspect]\n'}

http://logs.openstack.org/15/359215/99/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/c6e1ef3/console.html.gz#_2017-05-06_02_52_24_877990

Tags: alert ci
Ryan Brady (rbrady) wrote :

The workflow in tripleo-common that refers to baremetal_introspection.introspect, and the action itself in mistral, haven't changed in a year. Is there something else wrong with the packaging? Can you provide the output of `mistral action-list | grep baremetal`?

Emilien Macchi (emilienm) wrote :
Dougal Matthews (d0ugal) wrote :

The action is indeed missing. The one linked is from tripleo-common, but the action we want is provided by Mistral itself.

What we really need to see is the output from mistral-db-populate. This is executed by puppet.

This command finds the installed actions (via setup.py entrypoints) and registers them in the database. If any fail, they won't be added and an error should be output. They could fail for a number of reasons: a bug in the code, or maybe python-ironic-inspector-client isn't fully installed yet.

Dougal Matthews (d0ugal) wrote :

One way to verify whether this is a race condition would be to run the command again at the very end and see if the action is now present. I hit a similar issue in a dev environment recently; running the command again resolved it. I thought it was because the env was quite old.
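
The "is the action now present" check could be scripted along these lines. This is only a sketch: it assumes `mistral action-list` prints a table whose cells are separated by `|`, and the helper names `action_listed` and `fetch_listing` are made up for illustration.

```python
import subprocess


def action_listed(listing: str, action_name: str) -> bool:
    """Return True if any table cell in `listing` equals `action_name` exactly."""
    for line in listing.splitlines():
        # Assumed layout: one row per line, cells separated by '|'.
        cells = [cell.strip() for cell in line.split("|")]
        if action_name in cells:
            return True
    return False


def fetch_listing() -> str:
    """Run `mistral action-list` and return its stdout (needs the CLI and credentials)."""
    result = subprocess.run(
        ["mistral", "action-list"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `action_listed(fetch_listing(), "baremetal_introspection.introspect")` before and after re-running the populate command would show whether the second run registered the action.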

Brad P. Crochet (brad-9) wrote :

From what I can tell, the keystone auth fails when the action is being created. I made a patch [1] that attempted to fix this, but it was not a complete fix. It just moved the problem a bit.

[1] https://review.openstack.org/#/c/462560/

Emilien Macchi (emilienm) wrote :
Emilien Macchi (emilienm) wrote :

I removed the tripleo project and added mistral, because TripleO CI is no longer affected now that we have reverted the patch.

tags: removed: alert
Changed in tripleo:
status: Triaged → Fix Released
Alfredo Moralejo (amoralej) wrote :

The periodic job http://logs.openstack.org/periodic/periodic-tripleo-ci-centos-7-ovb-ha-oooq/b3daee1/ failed with this error during introspection [1]:

Failed to run task [error=Failed to find action [action_name=baremetal_introspection.introspect]

It was using a patched version of puppet-mistral [2]:

puppet-mistral-11.2.0-0.20170706174714.ef9f9e4.el7.centos.noarch

[1] http://logs.openstack.org/periodic/periodic-tripleo-ci-centos-7-ovb-ha-oooq/b3daee1/logs/undercloud/home/jenkins/overcloud_prep_images.log.txt.gz
[2] http://logs.openstack.org/periodic/periodic-tripleo-ci-centos-7-ovb-ha-oooq/b3daee1/logs/undercloud/var/log/extra/rpm-list.txt.gz

Alfredo Moralejo (amoralej) wrote :
Alfredo Moralejo (amoralej) wrote :

Following d0ugal's proposal, I've proposed a revert in https://review.openstack.org/#/c/481535/ and a DNM test review in https://review.openstack.org/#/c/481546/

Feel free to amend it or let me know if the test review is wrong; I'm not very familiar with how tripleo-ci with oooq works.

Alfredo Moralejo (amoralej) wrote :

https://review.openstack.org/#/c/481535 fixes it, as shown in the test review.

tags: added: alert
Dougal Matthews (d0ugal)
Changed in mistral:
status: New → Invalid