Improve instance state recovery for Compute service failure during Create Server
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenStack Compute (nova) | Expired | Undecided | Unassigned | |
Bug Description
Scenario:
The Compute service spawns an instance but crashes just before the instance's state is updated to Active in the database, although the instance has already started running on the hypervisor.
In this situation, the recovery of the instance requires admin intervention:
- When compute service resumes, the check_instance_
- To recover the instance, the admin now has to reset the instance's state to Active (the task state gets reset to None).
The instance is then usable. The sync power state periodic task eventually sets the power state to Running.
However, this is a tedious workflow that needs admin intervention; it should be handled in the code.
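A minimal sketch of what such automatic recovery could look like is given here for illustration. It is not nova's code: the `Instance` class, the `recover_interrupted_spawn` function, and the "building"/"spawning" state values are hypothetical stand-ins, and the sketch assumes the compute service can ask the hypervisor for the instance's real power state when it restarts.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class VmState(Enum):
    # Hypothetical state names, chosen only for this illustration.
    BUILDING = "building"
    ACTIVE = "active"


class PowerState(Enum):
    NOSTATE = "nostate"
    RUNNING = "running"


@dataclass
class Instance:
    """Hypothetical stand-in for the instance record kept in the database."""
    uuid: str
    vm_state: VmState
    task_state: Optional[str]
    power_state: PowerState = PowerState.NOSTATE


def recover_interrupted_spawn(instance: Instance,
                              hypervisor_power_state: PowerState) -> bool:
    """Promote an instance to ACTIVE if its spawn finished on the hypervisor.

    Returns True when the record was repaired, False when it was left as-is.
    """
    # Assumption: an interrupted create leaves the record in a building
    # state with a "spawning" task state.
    interrupted = (instance.vm_state is VmState.BUILDING
                   and instance.task_state == "spawning")
    if not interrupted:
        return False

    # Only repair the record when the hypervisor confirms the guest runs.
    if hypervisor_power_state is not PowerState.RUNNING:
        return False

    # Same end result as the manual reset described above:
    # state Active, task state None, power state Running.
    instance.vm_state = VmState.ACTIVE
    instance.task_state = None
    instance.power_state = PowerState.RUNNING
    return True


stuck = Instance(uuid="example-uuid", vm_state=VmState.BUILDING,
                 task_state="spawning")
repaired = recover_interrupted_spawn(stuck, PowerState.RUNNING)
print(repaired, stuck.vm_state.value, stuck.task_state)
```

The sketch deliberately leaves the record untouched when the hypervisor does not report the guest as running, so that case would still fall through to whatever handling already exists.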
Changed in nova: | |
status: | New → Triaged |
importance: | Undecided → Medium |
Changed in nova: | |
assignee: | nobody → Grzegorz Grasza (xek) |
status: | Triaged → In Progress |
Changed in nova: | |
assignee: | Grzegorz Grasza (xek) → nobody |
Changed in nova: | |
status: | In Progress → Confirmed |
To reproduce the error, I stopped the compute service in the _update_instance_after_spawn method.