Cannot recover from a heat stack stuck in "UPDATE_IN_PROGRESS"

Bug #1286185 reported by Cian O'Driscoll
This bug affects 3 people
Affects          Status         Importance   Assigned to            Milestone
OpenStack Heat   Fix Released   Medium       Pavlo Shchelokovskyy
tripleo          Fix Released   Medium       Unassigned

Bug Description

Tried to add an additional compute node to my overcloud heat stack using "heat stack-update", but there weren't any compute resources available.
Got "no valid host" during instance creation on the overcloud.
Heat stack is now stuck in "UPDATE_IN_PROGRESS".

I can't see a way to recover from this other than updating the heat DB to set the status to "UPDATE_COMPLETE", or deleting the entire stack.
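
For illustration only, the manual database workaround described above might look roughly like the following sketch. It runs against an in-memory SQLite table that merely stands in for the Heat database: the table name `stack`, its columns, and the helper `force_stack_status` are all assumptions made for this example, since the real schema is not shown in this report and may well differ.

```python
import sqlite3

# Toy stand-in for the Heat database. The table and column names are
# hypothetical; the real schema is not shown in this bug report.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stack (id TEXT PRIMARY KEY, name TEXT, status TEXT)")
conn.execute(
    "INSERT INTO stack VALUES (?, ?, ?)",
    ("b4fd17ef-63fc-4ed0-bf8c-7bf50e09100b", "overcloud", "UPDATE_IN_PROGRESS"),
)


def force_stack_status(db, stack_id, expected, new_status):
    """Set a stack's status only if it is still in the expected (stuck) state.

    Returns the number of rows changed (0 or 1).
    """
    cur = db.execute(
        "UPDATE stack SET status = ? WHERE id = ? AND status = ?",
        (new_status, stack_id, expected),
    )
    db.commit()
    return cur.rowcount


changed = force_stack_status(
    conn,
    "b4fd17ef-63fc-4ed0-bf8c-7bf50e09100b",
    expected="UPDATE_IN_PROGRESS",
    new_status="UPDATE_COMPLETE",
)
status = conn.execute("SELECT status FROM stack").fetchone()[0]
print(changed, status)
```

The `expected` guard is a design choice in the sketch: it avoids overwriting a stack that has left the stuck state in the meantime. Note that an edit of this kind only changes the recorded status; it does nothing about the resource that failed to create.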

root@stratus5:~# heat stack-list
+--------------------------------------+------------+--------------------+----------------------+
| id                                   | stack_name | stack_status       | creation_time        |
+--------------------------------------+------------+--------------------+----------------------+
| b4fd17ef-63fc-4ed0-bf8c-7bf50e09100b | overcloud  | UPDATE_IN_PROGRESS | 2014-02-28T10:29:06Z |
+--------------------------------------+------------+--------------------+----------------------+

heat resource-list overcloud
+---------------------------------+------------------------------------------+-----------------+----------------------+
| resource_name                   | resource_type                            | resource_status | updated_time         |
+---------------------------------+------------------------------------------+-----------------+----------------------+
| AccessPolicy                    | OS::Heat::AccessPolicy                   | CREATE_COMPLETE | 2014-02-28T10:29:07Z |
| ComputeAccessPolicy             | OS::Heat::AccessPolicy                   | CREATE_COMPLETE | 2014-02-28T10:29:07Z |
| ComputeUser                     | AWS::IAM::User                           | CREATE_COMPLETE | 2014-02-28T10:29:08Z |
| User                            | AWS::IAM::User                           | CREATE_COMPLETE | 2014-02-28T10:29:08Z |
| NovaCompute0Key                 | AWS::IAM::AccessKey                      | CREATE_COMPLETE | 2014-02-28T10:29:09Z |
| NovaCompute1Key                 | AWS::IAM::AccessKey                      | CREATE_COMPLETE | 2014-02-28T10:29:09Z |
| NovaCompute2Key                 | AWS::IAM::AccessKey                      | CREATE_COMPLETE | 2014-02-28T10:29:09Z |
| notCompute0Key                  | AWS::IAM::AccessKey                      | CREATE_COMPLETE | 2014-02-28T10:29:09Z |
| NovaCompute0                    | OS::Nova::Server                         | CREATE_COMPLETE | 2014-02-28T10:37:24Z |
| NovaCompute1                    | OS::Nova::Server                         | CREATE_COMPLETE | 2014-02-28T10:41:29Z |
| NovaCompute2                    | OS::Nova::Server                         | CREATE_FAILED   | 2014-02-28T10:41:29Z |
| notCompute0                     | OS::Nova::Server                         | CREATE_COMPLETE | 2014-02-28T10:47:24Z |
| NovaCompute0Config              | AWS::AutoScaling::LaunchConfiguration    | CREATE_COMPLETE | 2014-02-28T10:47:26Z |
| NovaCompute1Config              | AWS::AutoScaling::LaunchConfiguration    | CREATE_COMPLETE | 2014-02-28T10:47:31Z |
| NovaCompute2Config              | AWS::AutoScaling::LaunchConfiguration    | CREATE_COMPLETE | 2014-02-28T10:47:31Z |
| notCompute0Config               | AWS::AutoScaling::LaunchConfiguration    | CREATE_COMPLETE | 2014-02-28T10:47:32Z |
..........
....
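
In a long listing like the one above, the failed resource is easy to miss. As a rough sketch, the pipe-delimited CLI output can be parsed to pick out any resource whose status ends in `_FAILED`. The helper names `parse_table` and `failed_resources` are hypothetical, and `SAMPLE` is a shortened copy of a few rows from the listing; nothing here is part of the heat client itself.

```python
# A few rows copied from the resource listing above (borders omitted).
SAMPLE = """\
| resource_name | resource_type | resource_status | updated_time |
| NovaCompute1 | OS::Nova::Server | CREATE_COMPLETE | 2014-02-28T10:41:29Z |
| NovaCompute2 | OS::Nova::Server | CREATE_FAILED | 2014-02-28T10:41:29Z |
| notCompute0 | OS::Nova::Server | CREATE_COMPLETE | 2014-02-28T10:47:24Z |
"""


def parse_table(text):
    """Turn pipe-delimited table text into a list of dicts keyed by header."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ borders, blank lines, and elisions
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first pipe-delimited line is the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def failed_resources(text):
    """Return the names of resources whose status ends in _FAILED."""
    return [
        row["resource_name"]
        for row in parse_table(text)
        if row["resource_status"].endswith("_FAILED")
    ]


print(failed_resources(SAMPLE))
```

Because each cell is stripped after splitting, the parser does not depend on how the columns are padded.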

James Slagle (james-slagle) wrote :

Hi, I think this bug is probably better suited to Heat, so I've marked it as affecting that project as well.

Changed in tripleo:
status: New → Triaged
importance: Undecided → Medium
Ladislav Smola (lsmola) wrote :

Seems like an auth failure when sending:

curl -X POST "http://192.0.2.3:8000/v1/waitcondition/arn%3Aopenstack%3Aheat%3A%3Ace7d690c041e46dc982846f4e4d0fa5e%3Astacks%2Fovercloud%2F917434b0-3592-43a2-ae24-a6a0904f5a15%2Fresources%2FNovaCompute1CompletionHandle?Timestamp=2014-03-13T09%3A48%3A55Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=3ab500ceebfb43c9a8ce41b1f90a52df&SignatureVersion=2&Signature=JeMY902pIZz1lqwhM9J2Stdkfo5FwSGyKIlEte66M6U%3D"

I got:

Mar 13 10:08:15 undercloud-undercloud-ojsfmisefovy heat-api-cfn[4183]: 2014-03-13 10:08:15.599 4183 INFO heat.api.aws.ec2token [-] Checking AWS credentials..
Mar 13 10:08:15 undercloud-undercloud-ojsfmisefovy heat-api-cfn[4183]: 2014-03-13 10:08:15.599 4183 INFO heat.api.aws.ec2token [-] AWS credentials found, checking against keystone.
Mar 13 10:08:15 undercloud-undercloud-ojsfmisefovy heat-api-cfn[4183]: 2014-03-13 10:08:15.599 4183 INFO heat.api.aws.ec2token [-] Authenticating with http://127.0.0.1:5000/v2.0/ec2tokens
Mar 13 10:08:15 undercloud-undercloud-ojsfmisefovy heat-api-cfn[4183]: 2014-03-13 10:08:15.602 4183 INFO requests.packages.urllib3.connectionpool [-] Starting new HTTP connection (1): 127.0.0.1
Mar 13 10:08:15 undercloud-undercloud-ojsfmisefovy keystone-all[16507]: 2014-03-13 10:08:15.610 16507 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 127.0.0.1
Mar 13 10:08:15 undercloud-undercloud-ojsfmisefovy heat-api-cfn[4183]: 2014-03-13 10:08:15.612 4183 DEBUG requests.packages.urllib3.connectionpool [-] "POST /v2.0/ec2tokens HTTP/1.1" 401 114 _make_request /opt/stack/venvs/heat/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:344
Mar 13 10:08:15 undercloud-undercloud-ojsfmisefovy heat-api-cfn[4183]: 2014-03-13 10:08:15.613 4183 INFO heat.api.aws.ec2token [-] AWS authentication failure.
Mar 13 10:08:15 undercloud-undercloud-ojsfmisefovy heat-api-cfn[4183]: 2014-03-13 10:08:15.613 4183 DEBUG root [-] XML response : <ErrorResponse><Error><Message>User is not authorized to pe

I am trying to compare stack-create and stack-update now.
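
To make the signed request above easier to read, here is a small sketch that splits the wait condition URL into its endpoint, the percent-decoded ARN, and the signature-related query parameters. It uses only the Python standard library; the helper name `describe_signed_url` is hypothetical, and the sketch only decodes the URL. It does not verify the signature or contact any service.

```python
from urllib.parse import parse_qs, unquote, urlsplit

# The URL from the curl command above, split across lines for readability.
URL = (
    "http://192.0.2.3:8000/v1/waitcondition/"
    "arn%3Aopenstack%3Aheat%3A%3Ace7d690c041e46dc982846f4e4d0fa5e"
    "%3Astacks%2Fovercloud%2F917434b0-3592-43a2-ae24-a6a0904f5a15"
    "%2Fresources%2FNovaCompute1CompletionHandle"
    "?Timestamp=2014-03-13T09%3A48%3A55Z"
    "&SignatureMethod=HmacSHA256"
    "&AWSAccessKeyId=3ab500ceebfb43c9a8ce41b1f90a52df"
    "&SignatureVersion=2"
    "&Signature=JeMY902pIZz1lqwhM9J2Stdkfo5FwSGyKIlEte66M6U%3D"
)


def describe_signed_url(url):
    """Break a signed wait condition URL into readable parts."""
    parts = urlsplit(url)
    # Everything after /waitcondition/ is the percent-encoded ARN.
    arn = unquote(parts.path.rsplit("/waitcondition/", 1)[-1])
    # parse_qs returns lists of values; each parameter appears once here.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"endpoint": parts.netloc, "arn": arn, "params": params}


info = describe_signed_url(URL)
print(info["endpoint"])
print(info["arn"])
for key in sorted(info["params"]):
    print(key, "=", info["params"][key])
```

Decoding the request this way shows which access key and timestamp were sent, which may help when comparing against the keystone log lines.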

Changed in tripleo:
importance: Medium → High
importance: High → Medium
Ladislav Smola (lsmola) wrote :

Sorry, I was reading too quickly; this is probably unrelated. This happens when the instance is successfully created in Nova.

Changed in heat:
status: New → Triaged
importance: Undecided → Medium
Steven Hardy (shardy) wrote :
Changed in heat:
milestone: none → juno-rc1
status: Triaged → Fix Committed
assignee: nobody → Pavlo Shchelokovskyy (pshchelo)
Zane Bitter (zaneb) wrote :

Fixed by the linked blueprint.

Thierry Carrez (ttx)
Changed in heat:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in heat:
milestone: juno-rc1 → 2014.2
Ben Nemec (bnemec) wrote :

The heat bug is fixed, so that should take care of tripleo as well.

Changed in tripleo:
status: Triaged → Fix Released