Rescheduling loses reasons
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Medium | Andrew Laski |
Bug Description
In nova.compute.
For example:
Suppose instance 1 is scheduled to hypervisor A and fails with error X. It is then rescheduled to hypervisor B, which fails with error Y. The next reschedule cannot happen because no hypervisors remain that can be scheduled to (that is, no more compute nodes). At that point you basically get an error saying there are no more hosts to schedule on, which is not connected to the original errors in any fashion.
Likely there needs to be a record of the rescheduling exceptions, or rescheduling needs to be rethought so that an orchestration unit performs the rescheduling and is more aware of the rescheduling attempts (and their successes and failures).
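The idea of keeping a record of rescheduling exceptions can be sketched roughly as follows. This is a minimal standalone illustration, not nova code: `Attempt`, `ReschedulingExhausted`, `schedule_with_history`, and the `build` callable are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One failed build attempt (hypothetical record type)."""
    host: str
    error: str


class ReschedulingExhausted(Exception):
    """Raised when no hosts remain; carries every earlier failure.

    Hypothetical exception, not part of nova.
    """

    def __init__(self, attempts):
        self.attempts = list(attempts)
        details = "; ".join(f"{a.host}: {a.error}" for a in self.attempts)
        super().__init__(
            "no more hosts to schedule on; prior failures: "
            + (details or "none")
        )


def schedule_with_history(hosts, build):
    """Try each host in order, remembering why each attempt failed.

    `build` is an assumed callable that takes a host name and either
    returns a result or raises.
    """
    attempts = []
    for host in hosts:
        try:
            return build(host)
        except Exception as exc:
            # Keep the original reason instead of discarding it.
            attempts.append(Attempt(host, f"{type(exc).__name__}: {exc}"))
    # The final error now stays connected to the original errors.
    raise ReschedulingExhausted(attempts)
```

With this shape, the final "no more hosts" error for the example above would mention both error X from hypervisor A and error Y from hypervisor B.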
Changed in nova: | |
milestone: | none → kilo-3 |
status: | Fix Committed → Fix Released |
Changed in nova: | |
milestone: | kilo-3 → 2015.1.0 |
The exceptions are stored as instance faults, but that information is not exposed. There is another place to keep this that is exposed: the instance actions and events tables. Currently, scheduling events are recorded in the scheduler manager, which may not catch all exceptions that can occur. That recording should probably move up a level or be extended in order to capture all exceptions.
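One way to picture recording events "a level up" so that every exception is captured is a wrapper around the whole operation. The sketch below is an assumption-laden illustration only: `record_event` and the plain `events` list are hypothetical stand-ins for the instance actions and events tables, not nova's actual API.

```python
import contextlib
import traceback


@contextlib.contextmanager
def record_event(events, name):
    """Record the outcome of the wrapped block as an event.

    `events` is a plain list standing in for an events table
    (hypothetical; not nova's real storage).
    """
    event = {"event": name, "result": "Success", "traceback": None}
    try:
        yield event
    except Exception:
        # Capture the failure details, then let the error propagate.
        event["result"] = "Error"
        event["traceback"] = traceback.format_exc()
        raise
    finally:
        # The event is stored whether the block succeeded or failed.
        events.append(event)
```

Because the wrapper sits outside the scheduling call rather than inside one manager method, any exception raised within the block ends up in the recorded event.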