2018-11-12 18:43:06 |
James Bagwell |
bug |
|
|
added bug |
2018-11-13 17:48:11 |
James Bagwell |
summary |
openstack overcloud node delete times out, even if the stack update operation finished before |
openstack overcloud node delete times out, even if the stack update operation finished. YAQL Expression errors in mistral/engine.log |
|
2018-11-13 17:48:12 |
James Bagwell |
description |
How reproducible:
100%
Environment: 14.0 Rocky
Storage: Ceph
Steps to Reproduce:
1. Deploy overcloud with 1 controller + 1 compute
2. Remove one compute node:
openstack overcloud node delete $node_uuid
Actual results:
The command appears to exit after 60 minutes even if the stack update had finished within 15 minutes
[stack@undercloud (stackrc) ~]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| ID | Stack Name | Project | Stack Status | Creation Time | Updated Time |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| 00462a04-88ff-4558-a8e0-d86f16f39241 | overcloud | da55dbc940c54f8ca2f069b31563e0b4 | UPDATE_COMPLETE | 2018-11-12T04:31:08Z | 2018-11-12T17:56:57Z |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
CLI execution hangs, never returns:
Started Mistral Workflow tripleo.scale.v1.delete_node. Execution ID: 39b85ffd-de65-499f-ab53-9c4677c72f7d
Waiting for messages on queue 'tripleo' with no timeout.
openstack workflow execution show 39b85ffd-de65-499f-ab53-9c4677c72f7d -f yaml
ID: 39b85ffd-de65-499f-ab53-9c4677c72f7d
Workflow ID: 4ff89508-2f97-41c8-92fb-6f1490e1ec0e
Workflow name: tripleo.scale.v1.delete_node
Workflow namespace: ''
Description: ''
Task Execution ID: <none>
Root Execution ID: <none>
State: ERROR
State info: "Failed to run task [error=Can not evaluate YAQL expression [expression=$.status,\
\ error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]:\n\
Traceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py\"\
, line 63, in run_task\n task.run()\n File \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\"\
, line 159, in wrapper\n result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
, line 390, in run\n self._run_new()\n File \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\"\
, line 159, in wrapper\n result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
, line 419, in _run_new\n self._schedule_actions()\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
, line 483, in _schedule_actions\n input_dict = self._get_action_input()\n File\
\ \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\", line 159, in wrapper\n\
\ result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\
, line 514, in _get_action_input\n input_dict = self._evaluate_expression(self.task_spec.get_input(),\
\ ctx)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\", line\
\ 540, in _evaluate_expression\n ctx_view\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\"\
, line 100, in evaluate_recursively\n data[key] = _evaluate_item(data[key], context)\n\
\ File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\", line\
\ 79, in _evaluate_item\n return evaluate(item, context)\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\"\
, line 71, in evaluate\n return evaluator.evaluate(expression, context)\n File\
\ \"/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py\", line\
\ 159, in evaluate\n cls).evaluate(trim_expr, data_context)\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py\"\
, line 113, in evaluate\n \", data=%s]\" % (expression, str(e), data_context)\n\
YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status,\
\ error=u'status', data={}]\n"
Created at: '2018-11-12 17:54:00'
Updated at: '2018-11-12 18:04:08'
Expected results:
openstack overcloud node delete command exits after the stack update has finished and that no Yaql stacktrace exists in mistral engine log.
rpms:
[stack@undercloud (stackrc) ~]$ rpm -qa | grep openstack-tripleo
openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7.noarch
openstack-tripleo-image-elements-9.0.1-0.20181007200834.2dc678a.el7.noarch
openstack-tripleo-common-containers-10.0.1-0.20181112071049.b8bfff8.el7.noarch
openstack-tripleo-validations-9.3.1-0.20181008110747.4064fb7.el7.noarch
openstack-tripleo-heat-templates-9.0.1-0.20181013060858.ffbe879.el7.noarch
openstack-tripleo-ui-9.3.1-0.20180921180340.df30b55.el7.noarch
openstack-tripleo-common-10.0.1-0.20181112071049.b8bfff8.el7.noarch |
Description of problem:
When running a scale-in operation, the heat operations complete with 'UPDATE COMPLETE', but the actual command 'openstack overcloud node delete <uuid>' never returns, it hangs. Then we notice YAQL stacktraces found in the mistral/engine.log
Version-Release number of selected component (if applicable):
openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7.noarch
openstack-tripleo-image-elements-9.0.1-0.20181007200834.2dc678a.el7.noarch
openstack-tripleo-common-containers-10.0.1-0.20181112071049.b8bfff8.el7.noarch
openstack-tripleo-validations-9.3.1-0.20181008110747.4064fb7.el7.noarch
openstack-tripleo-heat-templates-9.0.1-0.20181013060858.ffbe879.el7.noarch
openstack-tripleo-ui-9.3.1-0.20180921180340.df30b55.el7.noarch
openstack-tripleo-common-10.0.1-0.20181112071049.b8bfff8.el7.noarch
How reproducible:
100%
Steps to Reproduce:
1. Deploy overcloud with single controller single compute
2. Remove compute with 'openstack overcloud node delete <uuid>'
Actual results:
Command continues to hang:
[stack@undercloud (stackrc) ~]$ openstack overcloud node delete f46c1f41-7558-4d9a-9f50-805651d105ef
Deleting the following nodes from stack overcloud:
- f46c1f41-7558-4d9a-9f50-805651d105ef
Waiting for messages on queue 'tripleo' with no timeout.
We see that the heat stack has completed.
[stack@undercloud (stackrc) mistral]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| ID | Stack Name | Project | Stack Status | Creation Time | Updated Time |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| fd5f5b10-8c76-433e-9283-fd67a0e7785c | overcloud | da55dbc940c54f8ca2f069b31563e0b4 | UPDATE_COMPLETE | 2018-11-13T00:56:50Z | 2018-11-13T17:27:28Z |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
We see strackraces in the mistral/engine.log
2018-11-13 17:35:16.923 7 INFO workflow_trace [req-ef8e0eb7-1911-461f-9aff-05095cfa0ce4 a15bdaa1a2ec459b9ff47c1809c64ab5 da55dbc940c54f8ca2f069b31563e0b4 -
efault default] Workflow 'tripleo.scale.v1.delete_node' [RUNNING -> ERROR, msg=Failed to run task [error=Can not evaluate YAQL expression [expression=$.stat
s, error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]:
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py", line 63, in run_task
task.run()
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
result = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 390, in run
self._run_new()
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
result = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 419, in _run_new
self._schedule_actions()
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 483, in _schedule_actions
input_dict = self._get_action_input()
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
result = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 514, in _get_action_input
input_dict = self._evaluate_expression(self.task_spec.get_input(), ctx)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 540, in _evaluate_expression
ctx_view
File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 100, in evaluate_recursively
data[key] = _evaluate_item(data[key], context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 79, in _evaluate_item
return evaluate(item, context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 71, in evaluate
return evaluator.evaluate(expression, context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 159, in evaluate
cls).evaluate(trim_expr, data_context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 113, in evaluate
", data=%s]" % (expression, str(e), data_context)
YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status, error=u'status', data={}]
] (execution_id=3bd65773-9e5f-4f36-a525-51532735b209)
nova list also shows that the node has been removed. node delete command continues to hang.
Expected results:
1: Run the node delete command
2: Node gets deleted, command returns successfully. No YAQL expression stack traces inside the mistral/engine.log
Additional info:
related to : https://bugzilla.redhat.com/show_bug.cgi?id=1636545
Forced to create this as a new bug since the previous was set to VERIFIED BY QA with an update to the openstack-triple-common rpm openstack-tripleo-common-9.4.1-0.20181012010866.67bab16.el7ost.noarch , however we've updated to an even newer version and still producing the same results as seen in previous bug. |
|
2018-11-13 17:49:35 |
James Bagwell |
description |
Description of problem:
When running a scale-in operation, the heat operations complete with 'UPDATE COMPLETE', but the actual command 'openstack overcloud node delete <uuid>' never returns, it hangs. Then we notice YAQL stacktraces found in the mistral/engine.log
Version-Release number of selected component (if applicable):
openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7.noarch
openstack-tripleo-image-elements-9.0.1-0.20181007200834.2dc678a.el7.noarch
openstack-tripleo-common-containers-10.0.1-0.20181112071049.b8bfff8.el7.noarch
openstack-tripleo-validations-9.3.1-0.20181008110747.4064fb7.el7.noarch
openstack-tripleo-heat-templates-9.0.1-0.20181013060858.ffbe879.el7.noarch
openstack-tripleo-ui-9.3.1-0.20180921180340.df30b55.el7.noarch
openstack-tripleo-common-10.0.1-0.20181112071049.b8bfff8.el7.noarch
How reproducible:
100%
Steps to Reproduce:
1. Deploy overcloud with single controller single compute
2. Remove compute with 'openstack overcloud node delete <uuid>'
Actual results:
Command continues to hang:
[stack@undercloud (stackrc) ~]$ openstack overcloud node delete f46c1f41-7558-4d9a-9f50-805651d105ef
Deleting the following nodes from stack overcloud:
- f46c1f41-7558-4d9a-9f50-805651d105ef
Waiting for messages on queue 'tripleo' with no timeout.
We see that the heat stack has completed.
[stack@undercloud (stackrc) mistral]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| ID | Stack Name | Project | Stack Status | Creation Time | Updated Time |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| fd5f5b10-8c76-433e-9283-fd67a0e7785c | overcloud | da55dbc940c54f8ca2f069b31563e0b4 | UPDATE_COMPLETE | 2018-11-13T00:56:50Z | 2018-11-13T17:27:28Z |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
We see strackraces in the mistral/engine.log
2018-11-13 17:35:16.923 7 INFO workflow_trace [req-ef8e0eb7-1911-461f-9aff-05095cfa0ce4 a15bdaa1a2ec459b9ff47c1809c64ab5 da55dbc940c54f8ca2f069b31563e0b4 -
efault default] Workflow 'tripleo.scale.v1.delete_node' [RUNNING -> ERROR, msg=Failed to run task [error=Can not evaluate YAQL expression [expression=$.stat
s, error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]:
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py", line 63, in run_task
task.run()
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
result = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 390, in run
self._run_new()
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
result = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 419, in _run_new
self._schedule_actions()
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 483, in _schedule_actions
input_dict = self._get_action_input()
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
result = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 514, in _get_action_input
input_dict = self._evaluate_expression(self.task_spec.get_input(), ctx)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 540, in _evaluate_expression
ctx_view
File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 100, in evaluate_recursively
data[key] = _evaluate_item(data[key], context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 79, in _evaluate_item
return evaluate(item, context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 71, in evaluate
return evaluator.evaluate(expression, context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 159, in evaluate
cls).evaluate(trim_expr, data_context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 113, in evaluate
", data=%s]" % (expression, str(e), data_context)
YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status, error=u'status', data={}]
] (execution_id=3bd65773-9e5f-4f36-a525-51532735b209)
nova list also shows that the node has been removed. node delete command continues to hang.
Expected results:
1: Run the node delete command
2: Node gets deleted, command returns successfully. No YAQL expression stack traces inside the mistral/engine.log
Additional info:
related to : https://bugzilla.redhat.com/show_bug.cgi?id=1636545
Forced to create this as a new bug since the previous was set to VERIFIED BY QA with an update to the openstack-triple-common rpm openstack-tripleo-common-9.4.1-0.20181012010866.67bab16.el7ost.noarch , however we've updated to an even newer version and still producing the same results as seen in previous bug. |
Description of problem:
When running a scale-in operation, the heat operations complete with 'UPDATE COMPLETE', but the actual command 'openstack overcloud node delete <uuid>' never returns, it hangs. Then we notice YAQL stacktraces found in the mistral/engine.log
Version-Release number of selected component (if applicable):
openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7.noarch
openstack-tripleo-image-elements-9.0.1-0.20181007200834.2dc678a.el7.noarch
openstack-tripleo-common-containers-10.0.1-0.20181112071049.b8bfff8.el7.noarch
openstack-tripleo-validations-9.3.1-0.20181008110747.4064fb7.el7.noarch
openstack-tripleo-heat-templates-9.0.1-0.20181013060858.ffbe879.el7.noarch
openstack-tripleo-ui-9.3.1-0.20180921180340.df30b55.el7.noarch
openstack-tripleo-common-10.0.1-0.20181112071049.b8bfff8.el7.noarch
How reproducible:
100%
Steps to Reproduce:
1. Deploy overcloud with single controller single compute
2. Remove compute with 'openstack overcloud node delete <uuid>'
Actual results:
Command continues to hang:
[stack@undercloud (stackrc) ~]$ openstack overcloud node delete f46c1f41-7558-4d9a-9f50-805651d105ef
Deleting the following nodes from stack overcloud:
- f46c1f41-7558-4d9a-9f50-805651d105ef
Waiting for messages on queue 'tripleo' with no timeout.
We see that the heat stack has completed.
[stack@undercloud (stackrc) mistral]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| ID | Stack Name | Project | Stack Status | Creation Time | Updated Time |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| fd5f5b10-8c76-433e-9283-fd67a0e7785c | overcloud | da55dbc940c54f8ca2f069b31563e0b4 | UPDATE_COMPLETE | 2018-11-13T00:56:50Z | 2018-11-13T17:27:28Z |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
We see strackraces in the mistral/engine.log
2018-11-13 17:35:16.923 7 INFO workflow_trace [req-ef8e0eb7-1911-461f-9aff-05095cfa0ce4 a15bdaa1a2ec459b9ff47c1809c64ab5 da55dbc940c54f8ca2f069b31563e0b4 -
efault default] Workflow 'tripleo.scale.v1.delete_node' [RUNNING -> ERROR, msg=Failed to run task [error=Can not evaluate YAQL expression [expression=$.stat
s, error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]:
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py", line 63, in run_task
task.run()
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
result = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 390, in run
self._run_new()
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
result = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 419, in _run_new
self._schedule_actions()
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 483, in _schedule_actions
input_dict = self._get_action_input()
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper
result = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 514, in _get_action_input
input_dict = self._evaluate_expression(self.task_spec.get_input(), ctx)
File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 540, in _evaluate_expression
ctx_view
File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 100, in evaluate_recursively
data[key] = _evaluate_item(data[key], context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 79, in _evaluate_item
return evaluate(item, context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 71, in evaluate
return evaluator.evaluate(expression, context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 159, in evaluate
cls).evaluate(trim_expr, data_context)
File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 113, in evaluate
", data=%s]" % (expression, str(e), data_context)
YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status, error=u'status', data={}]
] (execution_id=3bd65773-9e5f-4f36-a525-51532735b209)
nova list also shows that the node has been removed. node delete command continues to hang.
Expected results:
1: Run the node delete command
2: Node gets deleted, command returns successfully. No YAQL expression stack traces inside the mistral/engine.log |
|
2018-11-13 18:41:53 |
James Bagwell |
tags |
|
tripleo-common |
|
2018-11-13 18:46:16 |
Alex Schultz |
marked as duplicate |
|
1796893 |
|