openstack overcloud node delete times out, even if the stack update operation finished. YAQL Expression errors in mistral/engine.log

Bug #1802969 reported by James Bagwell
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
tripleo
New
Undecided
Unassigned

Bug Description

Description of problem:
When running a scale-in operation, the heat operations complete with 'UPDATE COMPLETE', but the actual command 'openstack overcloud node delete <uuid>' never returns, it hangs. Then we notice YAQL stacktraces found in the mistral/engine.log

Version-Release number of selected component (if applicable):
openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7.noarch
openstack-tripleo-image-elements-9.0.1-0.20181007200834.2dc678a.el7.noarch
openstack-tripleo-common-containers-10.0.1-0.20181112071049.b8bfff8.el7.noarch
openstack-tripleo-validations-9.3.1-0.20181008110747.4064fb7.el7.noarch
openstack-tripleo-heat-templates-9.0.1-0.20181013060858.ffbe879.el7.noarch
openstack-tripleo-ui-9.3.1-0.20180921180340.df30b55.el7.noarch
openstack-tripleo-common-10.0.1-0.20181112071049.b8bfff8.el7.noarch

How reproducible:

100%

Steps to Reproduce:
1. Deploy overcloud with single controller single compute
2. Remove compute with 'openstack overcloud node delete <uuid>'

Actual results:

Command continues to hang:
[stack@undercloud (stackrc) ~]$ openstack overcloud node delete f46c1f41-7558-4d9a-9f50-805651d105ef
Deleting the following nodes from stack overcloud:
- f46c1f41-7558-4d9a-9f50-805651d105ef
Waiting for messages on queue 'tripleo' with no timeout.

We see that the heat stack has completed.
[stack@undercloud (stackrc) mistral]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| ID | Stack Name | Project | Stack Status | Creation Time | Updated Time |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| fd5f5b10-8c76-433e-9283-fd67a0e7785c | overcloud | da55dbc940c54f8ca2f069b31563e0b4 | UPDATE_COMPLETE | 2018-11-13T00:56:50Z | 2018-11-13T17:27:28Z |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+

We see strackraces in the mistral/engine.log
2018-11-13 17:35:16.923 7 INFO workflow_trace [req-ef8e0eb7-1911-461f-9aff-05095cfa0ce4 a15bdaa1a2ec459b9ff47c1809c64ab5 da55dbc940c54f8ca2f069b31563e0b4 -
efault default] Workflow 'tripleo.scale.v1.delete_node' [RUNNING -> ERROR, msg=Failed to run task [error=Can not evaluate YAQL expression [expression=$.stat
s, error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py", line 63, in run_task
    task.run()
  File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
    result = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 390, in run
    self._run_new()
  File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
    result = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 419, in _run_new
    self._schedule_actions()
  File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 483, in _schedule_actions
    input_dict = self._get_action_input()
  File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
    result = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 514, in _get_action_input
    input_dict = self._evaluate_expression(self.task_spec.get_input(), ctx)
  File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 540, in _evaluate_expression
    ctx_view
  File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 100, in evaluate_recursively
    data[key] = _evaluate_item(data[key], context)
  File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 79, in _evaluate_item
    return evaluate(item, context)
  File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 71, in evaluate
    return evaluator.evaluate(expression, context)
  File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 159, in evaluate
    cls).evaluate(trim_expr, data_context)
  File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 113, in evaluate
    ", data=%s]" % (expression, str(e), data_context)
YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status, error=u'status', data={}]
] (execution_id=3bd65773-9e5f-4f36-a525-51532735b209)

nova list also shows that the node has been removed. node delete command continues to hang.

Expected results:

1: Run the node delete command
2: Node gets deleted, command returns successfully. No YAQL expression stack traces inside the mistral/engine.log

summary: openstack overcloud node delete times out, even if the stack update
- operation finished before
+ operation finished. YAQL Expression errors in mistral/engine.log
description: updated
description: updated
tags: added: tripleo-common
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.