Activity log for bug #1802969

Date Who What changed Old value New value Message
2018-11-12 18:43:06 James Bagwell bug added bug
2018-11-13 17:48:11 James Bagwell summary openstack overcloud node delete times out, even if the stack update operation finished before openstack overcloud node delete times out, even if the stack update operation finished. YAQL Expression errors in mistral/engine.log
2018-11-13 17:48:12 James Bagwell description How reproducible: 100% Environment: 14.0 Rocky Storage: Ceph Steps to Reproduce: 1. Deploy overcloud with 1 controller + 1 compute 2. Remove one compute node: openstack overcloud node delete $node_uuid Actual results: The command appears to exit after 60 minutes even if the stack update had finished within 15 minutes [stack@undercloud (stackrc) ~]$ openstack stack list +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ | ID | Stack Name | Project | Stack Status | Creation Time | Updated Time | +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ | 00462a04-88ff-4558-a8e0-d86f16f39241 | overcloud | da55dbc940c54f8ca2f069b31563e0b4 | UPDATE_COMPLETE | 2018-11-12T04:31:08Z | 2018-11-12T17:56:57Z | +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ CLI execution hangs, never returns: Started Mistral Workflow tripleo.scale.v1.delete_node. Execution ID: 39b85ffd-de65-499f-ab53-9c4677c72f7d Waiting for messages on queue 'tripleo' with no timeout. openstack workflow execution show 39b85ffd-de65-499f-ab53-9c4677c72f7d -f yaml ID: 39b85ffd-de65-499f-ab53-9c4677c72f7d Workflow ID: 4ff89508-2f97-41c8-92fb-6f1490e1ec0e Workflow name: tripleo.scale.v1.delete_node Workflow namespace: '' Description: '' Task Execution ID: <none> Root Execution ID: <none> State: ERROR State info: "Failed to run task [error=Can not evaluate YAQL expression [expression=$.status,\ \ error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]:\n\ Traceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py\"\ , line 63, in run_task\n task.run()\n File \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\"\ , line 159, in wrapper\n result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\ , line 390, in run\n self._run_new()\n File \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\"\ , line 159, in wrapper\n result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\ , line 419, in _run_new\n self._schedule_actions()\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\ , line 483, in _schedule_actions\n input_dict = self._get_action_input()\n File\ \ \"/usr/lib/python2.7/site-packages/osprofiler/profiler.py\", line 159, in wrapper\n\ \ result = f(*args, **kwargs)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\"\ , line 514, in _get_action_input\n input_dict = self._evaluate_expression(self.task_spec.get_input(),\ \ ctx)\n File \"/usr/lib/python2.7/site-packages/mistral/engine/tasks.py\", line\ \ 540, in _evaluate_expression\n ctx_view\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\"\ , line 100, in evaluate_recursively\n data[key] = _evaluate_item(data[key], context)\n\ \ File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\", line\ \ 79, in _evaluate_item\n return evaluate(item, context)\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py\"\ , line 71, in evaluate\n return evaluator.evaluate(expression, context)\n File\ \ \"/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py\", line\ \ 159, in evaluate\n cls).evaluate(trim_expr, data_context)\n File \"/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py\"\ , line 113, in evaluate\n \", data=%s]\" % (expression, str(e), data_context)\n\ YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status,\ \ error=u'status', data={}]\n" Created at: '2018-11-12 17:54:00' Updated at: '2018-11-12 18:04:08' Expected results: openstack overcloud node delete command exits after the stack update has finished and that no Yaql stacktrace exists in mistral engine log. rpms: [stack@undercloud (stackrc) ~]$ rpm -qa | grep openstack-tripleo openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7.noarch openstack-tripleo-image-elements-9.0.1-0.20181007200834.2dc678a.el7.noarch openstack-tripleo-common-containers-10.0.1-0.20181112071049.b8bfff8.el7.noarch openstack-tripleo-validations-9.3.1-0.20181008110747.4064fb7.el7.noarch openstack-tripleo-heat-templates-9.0.1-0.20181013060858.ffbe879.el7.noarch openstack-tripleo-ui-9.3.1-0.20180921180340.df30b55.el7.noarch openstack-tripleo-common-10.0.1-0.20181112071049.b8bfff8.el7.noarch Description of problem: When running a scale-in operation, the heat operations complete with 'UPDATE COMPLETE', but the actual command 'openstack overcloud node delete <uuid>' never returns, it hangs. Then we notice YAQL stacktraces found in the mistral/engine.log Version-Release number of selected component (if applicable): openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7.noarch openstack-tripleo-image-elements-9.0.1-0.20181007200834.2dc678a.el7.noarch openstack-tripleo-common-containers-10.0.1-0.20181112071049.b8bfff8.el7.noarch openstack-tripleo-validations-9.3.1-0.20181008110747.4064fb7.el7.noarch openstack-tripleo-heat-templates-9.0.1-0.20181013060858.ffbe879.el7.noarch openstack-tripleo-ui-9.3.1-0.20180921180340.df30b55.el7.noarch openstack-tripleo-common-10.0.1-0.20181112071049.b8bfff8.el7.noarch How reproducible: 100% Steps to Reproduce: 1. Deploy overcloud with single controller single compute 2. Remove compute with 'openstack overcloud node delete <uuid>' Actual results: Command continues to hang: [stack@undercloud (stackrc) ~]$ openstack overcloud node delete f46c1f41-7558-4d9a-9f50-805651d105ef Deleting the following nodes from stack overcloud: - f46c1f41-7558-4d9a-9f50-805651d105ef Waiting for messages on queue 'tripleo' with no timeout. We see that the heat stack has completed. [stack@undercloud (stackrc) mistral]$ openstack stack list +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ | ID | Stack Name | Project | Stack Status | Creation Time | Updated Time | +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ | fd5f5b10-8c76-433e-9283-fd67a0e7785c | overcloud | da55dbc940c54f8ca2f069b31563e0b4 | UPDATE_COMPLETE | 2018-11-13T00:56:50Z | 2018-11-13T17:27:28Z | +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ We see strackraces in the mistral/engine.log 2018-11-13 17:35:16.923 7 INFO workflow_trace [req-ef8e0eb7-1911-461f-9aff-05095cfa0ce4 a15bdaa1a2ec459b9ff47c1809c64ab5 da55dbc940c54f8ca2f069b31563e0b4 - efault default] Workflow 'tripleo.scale.v1.delete_node' [RUNNING -> ERROR, msg=Failed to run task [error=Can not evaluate YAQL expression [expression=$.stat s, error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]: Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py", line 63, in run_task task.run() File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper result = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 390, in run self._run_new() File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper result = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 419, in _run_new self._schedule_actions() File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 483, in _schedule_actions input_dict = self._get_action_input() File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper result = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 514, in _get_action_input input_dict = self._evaluate_expression(self.task_spec.get_input(), ctx) File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 540, in _evaluate_expression ctx_view File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 100, in evaluate_recursively data[key] = _evaluate_item(data[key], context) File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 79, in _evaluate_item return evaluate(item, context) File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 71, in evaluate return evaluator.evaluate(expression, context) File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 159, in evaluate cls).evaluate(trim_expr, data_context) File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 113, in evaluate ", data=%s]" % (expression, str(e), data_context) YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status, error=u'status', data={}] ] (execution_id=3bd65773-9e5f-4f36-a525-51532735b209) nova list also shows that the node has been removed. node delete command continues to hang. Expected results: 1: Run the node delete command 2: Node gets deleted, command returns successfully. No YAQL expression stack traces inside the mistral/engine.log Additional info: related to : https://bugzilla.redhat.com/show_bug.cgi?id=1636545 Forced to create this as a new bug since the previous was set to VERIFIED BY QA with an update to the openstack-triple-common rpm openstack-tripleo-common-9.4.1-0.20181012010866.67bab16.el7ost.noarch , however we've updated to an even newer version and still producing the same results as seen in previous bug.
2018-11-13 17:49:35 James Bagwell description Description of problem: When running a scale-in operation, the heat operations complete with 'UPDATE COMPLETE', but the actual command 'openstack overcloud node delete <uuid>' never returns, it hangs. Then we notice YAQL stacktraces found in the mistral/engine.log Version-Release number of selected component (if applicable): openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7.noarch openstack-tripleo-image-elements-9.0.1-0.20181007200834.2dc678a.el7.noarch openstack-tripleo-common-containers-10.0.1-0.20181112071049.b8bfff8.el7.noarch openstack-tripleo-validations-9.3.1-0.20181008110747.4064fb7.el7.noarch openstack-tripleo-heat-templates-9.0.1-0.20181013060858.ffbe879.el7.noarch openstack-tripleo-ui-9.3.1-0.20180921180340.df30b55.el7.noarch openstack-tripleo-common-10.0.1-0.20181112071049.b8bfff8.el7.noarch How reproducible: 100% Steps to Reproduce: 1. Deploy overcloud with single controller single compute 2. Remove compute with 'openstack overcloud node delete <uuid>' Actual results: Command continues to hang: [stack@undercloud (stackrc) ~]$ openstack overcloud node delete f46c1f41-7558-4d9a-9f50-805651d105ef Deleting the following nodes from stack overcloud: - f46c1f41-7558-4d9a-9f50-805651d105ef Waiting for messages on queue 'tripleo' with no timeout. We see that the heat stack has completed. [stack@undercloud (stackrc) mistral]$ openstack stack list +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ | ID | Stack Name | Project | Stack Status | Creation Time | Updated Time | +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ | fd5f5b10-8c76-433e-9283-fd67a0e7785c | overcloud | da55dbc940c54f8ca2f069b31563e0b4 | UPDATE_COMPLETE | 2018-11-13T00:56:50Z | 2018-11-13T17:27:28Z | +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ We see strackraces in the mistral/engine.log 2018-11-13 17:35:16.923 7 INFO workflow_trace [req-ef8e0eb7-1911-461f-9aff-05095cfa0ce4 a15bdaa1a2ec459b9ff47c1809c64ab5 da55dbc940c54f8ca2f069b31563e0b4 - efault default] Workflow 'tripleo.scale.v1.delete_node' [RUNNING -> ERROR, msg=Failed to run task [error=Can not evaluate YAQL expression [expression=$.stat s, error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]: Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py", line 63, in run_task task.run() File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper result = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 390, in run self._run_new() File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper result = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 419, in _run_new self._schedule_actions() File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 483, in _schedule_actions input_dict = self._get_action_input() File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper result = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 514, in _get_action_input input_dict = self._evaluate_expression(self.task_spec.get_input(), ctx) File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 540, in _evaluate_expression ctx_view File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 100, in evaluate_recursively data[key] = _evaluate_item(data[key], context) File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 79, in _evaluate_item return evaluate(item, context) File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 71, in evaluate return evaluator.evaluate(expression, context) File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 159, in evaluate cls).evaluate(trim_expr, data_context) File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 113, in evaluate ", data=%s]" % (expression, str(e), data_context) YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status, error=u'status', data={}] ] (execution_id=3bd65773-9e5f-4f36-a525-51532735b209) nova list also shows that the node has been removed. node delete command continues to hang. Expected results: 1: Run the node delete command 2: Node gets deleted, command returns successfully. No YAQL expression stack traces inside the mistral/engine.log Additional info: related to : https://bugzilla.redhat.com/show_bug.cgi?id=1636545 Forced to create this as a new bug since the previous was set to VERIFIED BY QA with an update to the openstack-triple-common rpm openstack-tripleo-common-9.4.1-0.20181012010866.67bab16.el7ost.noarch , however we've updated to an even newer version and still producing the same results as seen in previous bug. Description of problem: When running a scale-in operation, the heat operations complete with 'UPDATE COMPLETE', but the actual command 'openstack overcloud node delete <uuid>' never returns, it hangs. Then we notice YAQL stacktraces found in the mistral/engine.log Version-Release number of selected component (if applicable): openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7.noarch openstack-tripleo-image-elements-9.0.1-0.20181007200834.2dc678a.el7.noarch openstack-tripleo-common-containers-10.0.1-0.20181112071049.b8bfff8.el7.noarch openstack-tripleo-validations-9.3.1-0.20181008110747.4064fb7.el7.noarch openstack-tripleo-heat-templates-9.0.1-0.20181013060858.ffbe879.el7.noarch openstack-tripleo-ui-9.3.1-0.20180921180340.df30b55.el7.noarch openstack-tripleo-common-10.0.1-0.20181112071049.b8bfff8.el7.noarch How reproducible: 100% Steps to Reproduce: 1. Deploy overcloud with single controller single compute 2. Remove compute with 'openstack overcloud node delete <uuid>' Actual results: Command continues to hang: [stack@undercloud (stackrc) ~]$ openstack overcloud node delete f46c1f41-7558-4d9a-9f50-805651d105ef Deleting the following nodes from stack overcloud: - f46c1f41-7558-4d9a-9f50-805651d105ef Waiting for messages on queue 'tripleo' with no timeout. We see that the heat stack has completed. [stack@undercloud (stackrc) mistral]$ openstack stack list +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ | ID | Stack Name | Project | Stack Status | Creation Time | Updated Time | +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ | fd5f5b10-8c76-433e-9283-fd67a0e7785c | overcloud | da55dbc940c54f8ca2f069b31563e0b4 | UPDATE_COMPLETE | 2018-11-13T00:56:50Z | 2018-11-13T17:27:28Z | +--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+ We see strackraces in the mistral/engine.log 2018-11-13 17:35:16.923 7 INFO workflow_trace [req-ef8e0eb7-1911-461f-9aff-05095cfa0ce4 a15bdaa1a2ec459b9ff47c1809c64ab5 da55dbc940c54f8ca2f069b31563e0b4 - efault default] Workflow 'tripleo.scale.v1.delete_node' [RUNNING -> ERROR, msg=Failed to run task [error=Can not evaluate YAQL expression [expression=$.stat s, error=u'status', data={}], wf=tripleo.scale.v1.delete_node, task=send_message]: Traceback (most recent call last):   File "/usr/lib/python2.7/site-packages/mistral/engine/task_handler.py", line 63, in run_task     task.run()   File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper     result = f(*args, **kwargs)   File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 390, in run     self._run_new()   File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper     result = f(*args, **kwargs)   File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 419, in _run_new     self._schedule_actions()   File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 483, in _schedule_actions     input_dict = self._get_action_input()   File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 159, in wrapper     result = f(*args, **kwargs)   File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 514, in _get_action_input     input_dict = self._evaluate_expression(self.task_spec.get_input(), ctx)   File "/usr/lib/python2.7/site-packages/mistral/engine/tasks.py", line 540, in _evaluate_expression     ctx_view   File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 100, in evaluate_recursively     data[key] = _evaluate_item(data[key], context)   File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 79, in _evaluate_item     return evaluate(item, context)   File "/usr/lib/python2.7/site-packages/mistral/expressions/__init__.py", line 71, in evaluate     return evaluator.evaluate(expression, context)   File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 159, in evaluate     cls).evaluate(trim_expr, data_context)   File "/usr/lib/python2.7/site-packages/mistral/expressions/yaql_expression.py", line 113, in evaluate     ", data=%s]" % (expression, str(e), data_context) YaqlEvaluationException: Can not evaluate YAQL expression [expression=$.status, error=u'status', data={}] ] (execution_id=3bd65773-9e5f-4f36-a525-51532735b209) nova list also shows that the node has been removed. node delete command continues to hang. Expected results: 1: Run the node delete command 2: Node gets deleted, command returns successfully. No YAQL expression stack traces inside the mistral/engine.log
2018-11-13 18:41:53 James Bagwell tags tripleo-common
2018-11-13 18:46:16 Alex Schultz marked as duplicate 1796893