Cannot delete overcloud heat stack when the stack creation failed
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Expired | Medium | Unassigned |
Bug Description
Creation of the overcloud heat stack failed, after which I tried to delete the stack:
root@stratus37:
+------
| id | stack_name | stack_status | creation_time |
+------
| ea739758-
+------
root@stratus37:
+------
| id | stack_name | stack_status | creation_time |
+------
| ea739758-
+------
root@stratus37:
+------
| id | stack_name | stack_status | creation_time |
+------
| ea739758-
+------
- This got stuck in this state for some time (heat stack-delete is usually pretty quick):
root@stratus37:
+------
| ID | Name | Status | Task State | Power State | Networks |
+------
| 8f5dec9e-
| 120f9e5c-
| c67a13bb-
+------
- As a result, I tried to delete the nova instances, which didn't work. I then tried a force-delete, which is not allowed:
root@stratus37:
ERROR: Cannot 'forceDelete' while instance is in vm_state building (HTTP 409) (Request-ID: req-6b87f27c-
- At this point (apart from database hacking), I think I have to rerun devtest to bring up my overcloud again. The resource list is stuck in this state:
root@stratus37:
+------
| resource_name | resource_type | resource_status | updated_time |
+------
| AccessPolicy | OS::Heat:
| CompletionHandle | AWS::CloudForma
| ComputeAccessPolicy | OS::Heat:
| ComputeUser | AWS::IAM::User | CREATE_COMPLETE | 2014-04-
| User | AWS::IAM::User | CREATE_COMPLETE | 2014-04-
| ComputeKey | AWS::IAM::AccessKey | CREATE_COMPLETE | 2014-04-
| Key | AWS::IAM::AccessKey | CREATE_COMPLETE | 2014-04-
| notcompute1 | OS::Nova::Server | CREATE_FAILED | 2014-04-
| NovaCompute0 | OS::Nova::Server | DELETE_IN_PROGRESS | 2014-04-
| notcompute2 | OS::Nova::Server | CREATE_FAILED | 2014-04-
| notcompute0 | OS::Nova::Server | DELETE_IN_PROGRESS | 2014-04-
| CompletionCondition | AWS::CloudForma
| NovaCompute0Config | AWS::AutoScalin
| RabbitCookie | OS::Heat:
| notcompute0Config | AWS::AutoScalin
| notcompute1Config | AWS::AutoScalin
| notcompute2Config | AWS::AutoScalin
+------
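To see at a glance which resources are holding up the delete, the resource list can be grouped by status. The following is only a rough sketch: it works on a hand-copied excerpt of the table above (name, type, and status columns only, with the rows whose status is cut off left out), and `group_by_status` is a made-up helper name, not part of any heat client.

```python
from collections import defaultdict

# Excerpt of the resource list above: name, type, and status columns only.
# Rows whose status is cut off in the report are left out.
RESOURCE_LIST = """\
+------
| resource_name | resource_type | resource_status |
+------
| ComputeUser | AWS::IAM::User | CREATE_COMPLETE |
| User | AWS::IAM::User | CREATE_COMPLETE |
| ComputeKey | AWS::IAM::AccessKey | CREATE_COMPLETE |
| Key | AWS::IAM::AccessKey | CREATE_COMPLETE |
| notcompute1 | OS::Nova::Server | CREATE_FAILED |
| NovaCompute0 | OS::Nova::Server | DELETE_IN_PROGRESS |
| notcompute2 | OS::Nova::Server | CREATE_FAILED |
| notcompute0 | OS::Nova::Server | DELETE_IN_PROGRESS |
+------
"""


def group_by_status(table_text):
    """Map each resource_status to the list of resource names in that state."""
    groups = defaultdict(list)
    for line in table_text.splitlines():
        # Split on the table's pipe separators and drop the empty edge cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) < 3 or cells[0] == "resource_name":
            continue  # skip border lines, the header row, and incomplete rows
        name, _resource_type, status = cells[:3]
        groups[status].append(name)
    return dict(groups)


groups = group_by_status(RESOURCE_LIST)
stuck = groups.get("DELETE_IN_PROGRESS", [])
print("stuck in DELETE_IN_PROGRESS:", stuck)
```

In this excerpt the only resources still in DELETE_IN_PROGRESS are two of the OS::Nova::Server resources, which points at the server deletes rather than the other resources.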
Changed in tripleo:
status: New → Triaged
importance: Undecided → Medium
I think this is more a nova problem than a heat problem: nova can't delete the instances while they're spawning, so Heat can't delete the stack. I might change this to Incomplete for now, because we'd need to see the nova logs to work out how it got into this bad state and what needs to be fixed (but I agree this should definitely be fixed).
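To illustrate the dependency described above, here is a toy model. It is not nova or heat code. It assumes, for the sake of the sketch, that a server still in vm_state building rejects deletion and that the stack delete cannot finish until every server is gone. The per-server states are hypothetical, since the report does not show them.

```python
class Conflict(Exception):
    """Stands in for the HTTP 409 response quoted in the report."""


def delete_server(server):
    # Assumption for this sketch: a server that is still building cannot be deleted.
    if server["vm_state"] == "building":
        raise Conflict(
            "Cannot delete while instance is in vm_state %s" % server["vm_state"]
        )
    server["vm_state"] = "deleted"


def delete_stack(servers):
    """Try to delete every server and report which ones block the stack."""
    blocked = []
    for server in servers:
        try:
            delete_server(server)
        except Conflict:
            blocked.append(server["name"])
    status = "DELETE_IN_PROGRESS" if blocked else "DELETE_COMPLETE"
    return status, blocked


# Hypothetical states: the report does not show which vm_state each server is in.
servers = [
    {"name": "notcompute0", "vm_state": "building"},
    {"name": "notcompute1", "vm_state": "error"},
]
status, blocked = delete_stack(servers)
print(status, blocked)
```

In this model, a single server stuck in building is enough to keep the whole stack in DELETE_IN_PROGRESS, which would match the symptom in the report. The nova logs would still be needed to confirm why the instance never left that state.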