Can't set IN_PROGRESS stack to FAILED when engine restarts

Bug #1584724 reported by huangtianhua
This bug affects 1 person
Affects: OpenStack Heat
Status: Fix Released
Importance: High
Assigned to: huangtianhua

Bug Description

1. Create a stack with a Cinder volume resource.
2. The engine goes down while the stack is CREATE_IN_PROGRESS.
3. The engine comes back up, but the stack is still CREATE_IN_PROGRESS. The engine log shows the reason:
2016-05-23 18:24:53.114 ERROR heat.common.context [req-51048731-649a-4440-9fa0-310ced5e5f14 None None] Keystone v3 API connection failed, no password trust or auth_token!
2016-05-23 18:24:53.116 INFO heat.engine.stack_lock [req-7847989c-0114-4345-902f-c3f919e5e959 None None] Stale lock detected on stack 40185c9a-5bbe-40f8-9317-ef4bc4e00c4c. Engine 3618a167-95ed-44e0-8cd2-b827dda3dde6 will attempt to steal the lock
2016-05-23 18:24:53.119 INFO heat.engine.stack_lock [req-e7712da2-8199-4511-aea4-9becae3dd258 None None] Stale lock detected on stack 40185c9a-5bbe-40f8-9317-ef4bc4e00c4c. Engine e59963cf-1c2e-44e1-8f51-9521141680a4 will attempt to steal the lock
2016-05-23 18:24:53.119 INFO heat.engine.stack_lock [req-5b60050d-c524-403c-849a-2380d07c3e1c None None] Stale lock detected on stack 40185c9a-5bbe-40f8-9317-ef4bc4e00c4c. Engine 8673ecfd-9062-4a13-92a8-0c1aef4326ec will attempt to steal the lock
2016-05-23 18:24:53.116 ERROR heat.engine.resource [req-51048731-649a-4440-9fa0-310ced5e5f14 None None] Authorization failed.
2016-05-23 18:24:53.116 TRACE heat.engine.resource Traceback (most recent call last):
2016-05-23 18:24:53.116 TRACE heat.engine.resource File "/opt/stack/heat/heat/engine/resource.py", line 169, in _validate_service_availability
2016-05-23 18:24:53.116 TRACE heat.engine.resource svc_available = cls.is_service_available(context)
2016-05-23 18:24:53.116 TRACE heat.engine.resource File "/opt/stack/heat/heat/engine/resource.py", line 664, in is_service_available
2016-05-23 18:24:53.116 TRACE heat.engine.resource service_name=cls.default_client_name)
2016-05-23 18:24:53.116 TRACE heat.engine.resource File "/opt/stack/heat/heat/engine/clients/client_plugin.py", line 305, in does_endpoint_exist
2016-05-23 18:24:53.116 TRACE heat.engine.resource endpoint_type=endpoint_type)
2016-05-23 18:24:53.116 TRACE heat.engine.resource File "/opt/stack/heat/heat/engine/clients/client_plugin.py", line 203, in url_for
2016-05-23 18:24:53.116 TRACE heat.engine.resource url = get_endpoint()
2016-05-23 18:24:53.116 TRACE heat.engine.resource File "/opt/stack/heat/heat/engine/clients/client_plugin.py", line 188, in get_endpoint
2016-05-23 18:24:53.116 TRACE heat.engine.resource auth_plugin = self.context.auth_plugin
2016-05-23 18:24:53.116 TRACE heat.engine.resource File "/opt/stack/heat/heat/common/context.py", line 236, in auth_plugin
2016-05-23 18:24:53.116 TRACE heat.engine.resource self._auth_plugin = self._create_auth_plugin()
2016-05-23 18:24:53.116 TRACE heat.engine.resource File "/opt/stack/heat/heat/common/context.py", line 225, in _create_auth_plugin
2016-05-23 18:24:53.116 TRACE heat.engine.resource raise exception.AuthorizationFailure()
2016-05-23 18:24:53.116 TRACE heat.engine.resource AuthorizationFailure: Authorization failed.
2016-05-23 18:24:53.116 TRACE heat.engine.resource
2016-05-23 18:24:53.122 ERROR heat.engine.service [req-51048731-649a-4440-9fa0-310ced5e5f14 None None] Unhandled error in asynchronous task
2016-05-23 18:24:53.122 TRACE heat.engine.service Traceback (most recent call last):
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/opt/stack/heat/heat/engine/service.py", line 131, in log_exceptions
2016-05-23 18:24:53.122 TRACE heat.engine.service gt.wait()
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 175, in wait
2016-05-23 18:24:53.122 TRACE heat.engine.service return self._exit_event.wait()
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 125, in wait
2016-05-23 18:24:53.122 TRACE heat.engine.service current.throw(*self._exc)
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
2016-05-23 18:24:53.122 TRACE heat.engine.service result = function(*args, **kwargs)
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/opt/stack/heat/heat/engine/service.py", line 122, in _start_with_trace
2016-05-23 18:24:53.122 TRACE heat.engine.service return func(*args, **kwargs)
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/usr/local/lib/python2.7/dist-packages/osprofiler/profiler.py", line 147, in wrapper
2016-05-23 18:24:53.122 TRACE heat.engine.service return f(*args, **kwargs)
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/opt/stack/heat/heat/engine/service.py", line 2164, in set_stack_and_resource_to_failed
2016-05-23 18:24:53.122 TRACE heat.engine.service for name, rsrc in six.iteritems(stack.resources):
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/opt/stack/heat/heat/engine/stack.py", line 287, in resources
2016-05-23 18:24:53.122 TRACE heat.engine.service return self._find_resources()
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/opt/stack/heat/heat/engine/stack.py", line 296, in _find_resources
2016-05-23 18:24:53.122 TRACE heat.engine.service for (name, data) in res_defns.items())
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/opt/stack/heat/heat/engine/stack.py", line 296, in <genexpr>
2016-05-23 18:24:53.122 TRACE heat.engine.service for (name, data) in res_defns.items())
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/opt/stack/heat/heat/engine/resource.py", line 161, in __new__
2016-05-23 18:24:53.122 TRACE heat.engine.service definition.resource_type
2016-05-23 18:24:53.122 TRACE heat.engine.service File "/opt/stack/heat/heat/engine/resource.py", line 176, in _validate_service_availability
2016-05-23 18:24:53.122 TRACE heat.engine.service raise ex
2016-05-23 18:24:53.122 TRACE heat.engine.service ResourceTypeUnavailable: HEAT-E99001 Service cinder is not available for resource type OS::Cinder::Volume, reason: Authorization failed.
2016-05-23 18:24:53.122 TRACE heat.engine.service
2016-05-23 18:24:53.127 INFO heat.engine.stack_lock [req-7847989c-0114-4345-902f-c3f919e5e959 None None] Failed to steal lock on stack 40185c9a-5bbe-40f8-9317-ef4bc4e00c4c. Engine 315ecd40-6a4c-414c-94de-1b0a1f7f789d stole the lock first
2016-05-23 18:24:53.128 INFO heat.engine.stack_lock [req-e7712da2-8199-4511-aea4-9becae3dd258 None None] Failed to steal lock on stack 40185c9a-5bbe-40f8-9317-ef4bc4e00c4c. Engine 315ecd40-6a4c-414c-94de-1b0a1f7f789d stole the lock first
2016-05-23 18:24:53.128 INFO heat.engine.stack_lock [req-5b60050d-c524-403c-849a-2380d07c3e1c None None] Failed to steal lock on stack 40185c9a-5bbe-40f8-9317-ef4bc4e00c4c. Engine 315ecd40-6a4c-414c-94de-1b0a1f7f789d stole the lock first
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
    timer()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/opt/stack/heat/heat/engine/service.py", line 122, in _start_with_trace
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/osprofiler/profiler.py", line 147, in wrapper
    return f(*args, **kwargs)
  File "/opt/stack/heat/heat/engine/service.py", line 2164, in set_stack_and_resource_to_failed
    for name, rsrc in six.iteritems(stack.resources):
  File "/opt/stack/heat/heat/engine/stack.py", line 287, in resources
    return self._find_resources()
  File "/opt/stack/heat/heat/engine/stack.py", line 296, in _find_resources
    for (name, data) in res_defns.items())
  File "/opt/stack/heat/heat/engine/stack.py", line 296, in <genexpr>
    for (name, data) in res_defns.items())
  File "/opt/stack/heat/heat/engine/resource.py", line 161, in __new__
    definition.resource_type
  File "/opt/stack/heat/heat/engine/resource.py", line 176, in _validate_service_availability
    raise ex
ResourceTypeUnavailable: HEAT-E99001 Service cinder is not available for resource type OS::Cinder::Volume, reason: Authorization failed.

4. I think this was introduced by change I8109c622b7661254a658d78d98f8dc8f756d8755, which was merged on 5/21/2016.
5. Let's revert it first.
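
For readers following the traceback, here is a toy model of the failure path, not Heat's actual code. It assumes (as the trace suggests) that building a resource object checks service availability, and that this check needs credentials the reset-time context does not have. All class and function names below, including `has_credentials` and `try_reset`, are made up for illustration.

```python
class AuthorizationFailure(Exception):
    pass


class ResourceTypeUnavailable(Exception):
    pass


class Context:
    """Toy request context; `has_credentials` is a hypothetical flag."""

    def __init__(self, has_credentials):
        self.has_credentials = has_credentials

    def auth_plugin(self):
        if not self.has_credentials:
            raise AuthorizationFailure("Authorization failed.")
        return object()


class Resource:
    def __init__(self, name, context):
        # As in the traceback: constructing a resource checks the service first.
        try:
            context.auth_plugin()
        except AuthorizationFailure as exc:
            raise ResourceTypeUnavailable(
                "Service is not available, reason: %s" % exc)
        self.name = name
        self.status = "IN_PROGRESS"


class Stack:
    def __init__(self, context, resource_names):
        self.context = context
        self.resource_names = resource_names
        self.status = "IN_PROGRESS"

    @property
    def resources(self):
        # Resource objects are built lazily, on access.
        return {n: Resource(n, self.context) for n in self.resource_names}


def reset_stack(stack):
    # Accessing stack.resources can raise before any status is changed.
    for name, rsrc in stack.resources.items():
        rsrc.status = "FAILED"
    stack.status = "FAILED"


def try_reset(stack):
    # Stand-in for the engine's startup task, which only logs the error.
    try:
        reset_stack(stack)
    except ResourceTypeUnavailable:
        pass
    return stack.status
```

In this sketch, a stack reset with a credential-less context keeps its IN_PROGRESS status because the exception escapes before the status update, which matches the symptom reported above.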

Changed in heat:
assignee: nobody → huangtianhua (huangtianhua)
importance: Undecided → High
Changed in heat:
status: New → In Progress
Thomas Herve (therve) wrote :

Can we try to not check for service availability instead? It's pointless.

Rabi Mishra (rabi) wrote :

+1, I think we can avoid that by setting service_check_defer=True for the stacks during restart/reset?
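
The suggestion above can be sketched as follows. This is a simplified illustration of deferring the availability check, not Heat's implementation: only the name `service_check_defer` comes from the comment, while `service_available`, the dict-based context, and `reset_resources` are hypothetical stand-ins.

```python
class ResourceTypeUnavailable(Exception):
    pass


def service_available(context):
    # Hypothetical check: without a token, endpoints cannot be looked up.
    return bool(context.get("auth_token"))


class Resource:
    def __init__(self, name, context, service_check_defer=False):
        # Skip the availability check when the caller asks to defer it.
        if not service_check_defer and not service_available(context):
            raise ResourceTypeUnavailable(name)
        self.name = name
        self.status = "IN_PROGRESS"


def reset_resources(names, context, service_check_defer):
    # Build every resource, then mark each one FAILED.
    resources = [Resource(n, context, service_check_defer) for n in names]
    for rsrc in resources:
        rsrc.status = "FAILED"
    return {rsrc.name: rsrc.status for rsrc in resources}
```

With the check deferred, the reset path in this sketch never needs credentials, so an empty context is enough to mark the resources FAILED.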

OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/319856
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=af7f31758fbc9343571563d5638b01cee49c644f
Submitter: Jenkins
Branch: master

commit af7f31758fbc9343571563d5638b01cee49c644f
Author: huangtianhua <email address hidden>
Date: Mon May 23 11:14:03 2016 +0000

    Revert "Don't use stored context to reset stacks"

    This reverts commit 026cc94eba48364b35a561fd21e33d6d482ab39d.

    Closes-Bug: #1584724
    Change-Id: I019122081525762565900d95f8f88a4fe3ae1660

Changed in heat:
status: In Progress → Fix Released
huangtianhua (huangtianhua) wrote :

Thanks, all. I will fix this problem and then propose commit 026cc94eba48364b35a561fd21e33d6d482ab39d again.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/heat 7.0.0.0b1

This issue was fixed in the openstack/heat 7.0.0.0b1 development milestone.
