cpu_pinning errors after evacuation of instance with cpu_policy
Bug #1688599 reported by Chris Friesen
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenStack Compute (nova) | Confirmed | Medium | Unassigned | |
Bug Description
We recently hit an issue where an evacuating instance with a dedicated cpu_policy was pinned to the same host CPUs as other instances with a dedicated cpu_policy. During subsequent resource audits we saw CPU pinning errors.
The root cause appears to be that the resource audit skips the evacuating instance during the migration phase of the audit while the instance is rebuilding on the new host. It appears that _instance_
summary: resource audit races against evacuating instance → cpu_pinning errors after evacuation of instance with cpu_policy
Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
tags: added: evacuate
Even after updating _instance_in_resize_state() to account for rebuilds from vm_states.ERROR, I think there is a further race condition. Down towards the end of _do_rebuild_instance() we call:
This sets the task_state to "None", but the new instance host doesn't get updated until a bit later, down at the bottom of rebuild_instance(). During that window, the newly-rebuilt instance will not get accounted for in either _update_usage_from_instances() or _update_usage_from_migrations().
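The window described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not nova's actual resource tracker: the Instance and Migration classes and the audited_pinned_cpus() helper are hypothetical, and the rule that the migration phase only counts an instance while its task_state is set is a simplification of the behaviour described in this comment.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical, simplified stand-ins for nova objects (not nova's real classes).
@dataclass
class Instance:
    uuid: str
    host: str                  # host currently recorded for the instance
    task_state: Optional[str]  # None once the rebuild task is marked finished
    pinned_cpus: frozenset     # host CPUs the instance is pinned to


@dataclass
class Migration:
    instance: Instance
    dest_host: str


def audited_pinned_cpus(host, instances, migrations):
    """Return the pinned CPUs a simplified audit on `host` would count."""
    used = set()
    # Phase 1 (instances): only instances whose recorded host is this host.
    for inst in instances:
        if inst.host == host:
            used |= inst.pinned_cpus
    # Phase 2 (migrations): ASSUMPTION: an incoming instance is counted
    # only while a task is still in flight (task_state is not None).
    for mig in migrations:
        if mig.dest_host == host and mig.instance.task_state is not None:
            used |= mig.instance.pinned_cpus
    return used


evacuee = Instance("evac", host="source", task_state="rebuild_spawning",
                   pinned_cpus=frozenset({2, 3}))
migration = Migration(evacuee, dest_host="dest")

# While rebuilding, the migration phase still accounts for the instance.
during_rebuild = audited_pinned_cpus("dest", [evacuee], [migration])

# task_state is cleared first; the host field still names the source.
evacuee.task_state = None
in_window = audited_pinned_cpus("dest", [evacuee], [migration])

# Later the host is updated and the instance phase picks it up again.
evacuee.host = "dest"
after_host_update = audited_pinned_cpus("dest", [evacuee], [migration])

print(during_rebuild, in_window, after_host_update)
```

In this model an audit that runs inside the window sees CPUs 2 and 3 as free on the destination host, so another dedicated-policy instance could be pinned to them, which would match the overlapping pinning reported in the bug description.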