ipmi cmds run too fast, cause BMC to run out of resources
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Ironic | Fix Released | High | aeva black | |
OpenStack Compute (nova) | Won't Fix | High | Dan Prince | |
Bug Description
When using Nova baremetal, the IPMI power commands are still being issued too quickly. I routinely get stack traces like the following when deleting baremetal instances:
2 10:39:33.351 5112 TRACE nova.compute.
Mar 12 10:39:33 undercloud-
Mar 12 10:39:33 undercloud-
Mar 12 10:39:33 undercloud-
Mar 12 10:39:33 undercloud-
Mar 12 10:39:33 undercloud-
Mar 12 10:39:33 undercloud-
----
The root cause seems to be in the _power_off routine, which repeatedly calls "power status" to determine whether the instance has powered down after issuing "power off". Once this fails, simply resetting the instance state and retrying the delete usually fixes the issue.
Running the same commands from the CLI always seems to work as well.
It does seem like our retry code is still too aggressive, and we need to wait longer between IPMI retries.
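To illustrate what "waiting longer" between status checks could look like, here is a minimal sketch of a polling loop with an increasing delay. It is not the code from the proposed fix: the function name `wait_for_power_state`, its parameters, and the doubling backoff schedule are all assumptions made for illustration, and the status query is passed in as a callable rather than invoking any real IPMI tool.

```python
import time


def wait_for_power_state(get_status, target, max_retries=5, base_delay=1.0,
                         sleep=time.sleep):
    """Poll get_status() until it returns target, backing off between polls.

    get_status is a hypothetical callable that returns the current power
    state (for example "on" or "off"). Returns True if the target state
    was observed within max_retries polls, False otherwise.
    """
    for attempt in range(max_retries):
        try:
            if get_status() == target:
                return True
        except Exception:
            # Assumption: a failed status query means the BMC is busy,
            # so treat it as "not there yet" instead of aborting.
            pass
        if attempt < max_retries - 1:
            # Wait 1x, 2x, 4x, ... the base delay so the BMC is not
            # hit with back-to-back commands.
            sleep(base_delay * (2 ** attempt))
    return False
```

With the defaults, the loop issues at most five status queries spread over roughly 15 seconds (1 + 2 + 4 + 8), instead of querying in a tight loop. The right delay values would depend on the BMC in question.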
Changed in nova: | |
assignee: | nobody → Dan Prince (dan-prince) |
importance: | Undecided → High |
status: | New → In Progress |
Changed in ironic: | |
assignee: | nobody → Dan Prince (dan-prince) |
status: | New → In Progress |
summary: |
- ipmi cmds run to fast
+ ipmi cmds run too fast, cause BMC to run out of resources
Changed in ironic: | |
importance: | Undecided → High |
milestone: | none → icehouse-rc1 |
Changed in ironic: | |
status: | Fix Committed → Fix Released |
Changed in ironic: | |
milestone: | icehouse-rc1 → 2014.1 |
Changed in nova: | |
status: | In Progress → Won't Fix |
Fix proposed to branch: master
Review: https://review.openstack.org/80397