Continual warnings in n-cpu logs about being unable to delete inventory for an ironic node with an instance on it
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenStack Compute (nova) | Fix Released | High | Dmitry Tantsur | |
Ocata | New | Undecided | Belmiro Moreira | |
Bug Description
Seen here:
Aug 09 19:31:21.450705 ubuntu-
As soon as an ironic node has an instance built on it, the node state is ACTIVE which means that this method returns True:
That tells nova the node is unavailable, presumably because it's wholly consumed.
That's used here:
And that's checked here when reporting inventory to the resource tracker:
Which then tries to delete the inventory for the node resource provider in placement, which fails because it's already got an instance running on it that is consuming inventory:
Aug 09 19:31:21.391146 ubuntu-
Aug 09 19:31:21.450705 ubuntu-
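To make the chain above easier to follow, here is a minimal, self-contained sketch of it. This is not the nova or placement code: `FakePlacement`, `InventoryInUse`, `report_inventory`, the simplified availability check and the `CUSTOM_GOLD` resource class are all hypothetical stand-ins, and the real availability check looks at more than the provision state.

```python
# Simplified model of the failure chain; every name here is hypothetical.
AVAILABLE = "available"
ACTIVE = "active"


class InventoryInUse(Exception):
    """Stand-in for placement refusing to delete inventory that is allocated."""


class FakePlacement:
    """Tiny in-memory substitute for the placement service."""

    def __init__(self):
        self.inventory = {}    # provider uuid -> {resource class: total}
        self.allocations = {}  # provider uuid -> {resource class: used}

    def set_inventory(self, uuid, new_inventory):
        used = self.allocations.get(uuid, {})
        for resource_class, amount in used.items():
            # Dropping a resource class that is still consumed is a conflict.
            if amount > 0 and resource_class not in new_inventory:
                raise InventoryInUse(resource_class)
        self.inventory[uuid] = dict(new_inventory)


def node_resources_unavailable(provision_state):
    # Assumption: only nodes in the available state count as usable.
    return provision_state != AVAILABLE


def get_inventory(provision_state, resource_class):
    # An unavailable node reports no inventory at all.
    if node_resources_unavailable(provision_state):
        return {}
    return {resource_class: 1}


def report_inventory(placement, uuid, provision_state, resource_class):
    """Return True if placement accepted the update, False on conflict."""
    inventory = get_inventory(provision_state, resource_class)
    try:
        placement.set_inventory(uuid, inventory)
    except InventoryInUse:
        # This is where a warning would be logged, and it would repeat
        # on every periodic update while the instance exists.
        return False
    return True


if __name__ == "__main__":
    placement = FakePlacement()
    report_inventory(placement, "node-1", AVAILABLE, "CUSTOM_GOLD")
    # An instance lands on the node and consumes its single unit.
    placement.allocations["node-1"] = {"CUSTOM_GOLD": 1}
    # The node is now ACTIVE, so the empty report conflicts with the allocation.
    print(report_inventory(placement, "node-1", ACTIVE, "CUSTOM_GOLD"))
```

In this model the empty report for the ACTIVE node is rejected every time it is sent, which matches the repeated warnings described in the bug.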
This is also bad because if the node was updated with a resource_class, that resource class won't be automatically created in Placement here:
Because the driver didn't report it in the get_inventory method.
And that has an impact on this code to migrate instance.
https:/
So we've got a bit of a chicken and egg problem here.
Manually testing the ironic flavor migration code hits this problem, as seen here:
Changed in nova: | |
status: | New → Triaged |
importance: | Undecided → High |
tags: | added: pike-rc-potential |
Changed in nova: | |
assignee: | nobody → Dmitry Tantsur (divius) |
status: | Triaged → In Progress |
tags: | removed: pike-rc-potential |
Changed in nova: | |
status: | In Progress → Fix Released |
One question is, why don't we report inventory for an ACTIVE node? If the inventory is 1 but an instance is also allocating that 1 of whatever resource class, then isn't that sufficient? In other words, if an instance is consuming all of the node inventory, that should take the node out of scheduling decisions for building new instances, which is also how things work for regular compute nodes for building VMs.
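A rough sketch of what that suggestion would mean, under simplifying assumptions: the node always reports one unit of its resource class, and whether it can take a new instance is decided purely by total minus used. The function names and `CUSTOM_GOLD` are hypothetical, and real placement inventories carry more fields (reserved amounts, allocation ratios) that are ignored here.

```python
def ironic_inventory(resource_class):
    # Hypothetical: report one unit regardless of the node's provision state.
    return {resource_class: 1}


def free_units(inventory, allocations, resource_class):
    # Capacity left after subtracting what existing instances consume.
    total = inventory.get(resource_class, 0)
    used = allocations.get(resource_class, 0)
    return total - used


def can_schedule(inventory, allocations, resource_class, requested=1):
    # A fully allocated node drops out of scheduling without its
    # inventory ever being deleted.
    return free_units(inventory, allocations, resource_class) >= requested


if __name__ == "__main__":
    inventory = ironic_inventory("CUSTOM_GOLD")
    print(can_schedule(inventory, {}, "CUSTOM_GOLD"))
    print(can_schedule(inventory, {"CUSTOM_GOLD": 1}, "CUSTOM_GOLD"))
```

With this approach the inventory record stays stable for the life of the node, so there is nothing to delete while an instance is consuming it.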