Ironic Nova Virt driver tries to act on exclusively locked node during tear down
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ironic | Won't Fix | Undecided | Unassigned |
OpenStack Compute (nova) | Confirmed | Undecided | Unassigned |
Bug Description
When a node is unprovisioned with cleaning enabled, it moves into the
CLEANING state, which exclusively locks the node.
A node remains in the CLEANING state, and therefore locked, until it
moves into the CLEAN_WAIT state. That takes as long as it takes to
decommission the node and power it back on to boot the cleaning
ramdisk, which can be a surprisingly long time on real hardware.
There are several tasks that require a lock on the Ironic node,
which Nova can't claim while the node is already exclusively locked
in the CLEANING state.
This means that people deploying Nova have to tune their Nova timeouts and retries to match their equipment, so that Nova keeps retrying until the node is unlocked. This can slow down the turnaround time of Ironic nodes, and it confuses users who wonder why their node is taking so long to delete.
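The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not Nova's actual code: `NodeLockedError`, `call_with_retries`, `max_retries` and `retry_interval` are hypothetical names standing in for whatever the deployment's retry settings and the locked-node error look like.

```python
import time


class NodeLockedError(Exception):
    """Hypothetical stand-in for the error seen while a node is locked."""


def call_with_retries(action, max_retries=5, retry_interval=2.0,
                      sleep=time.sleep):
    """Call action() until it succeeds or the retry budget runs out.

    Waits retry_interval seconds between attempts, so the longest total
    wait before giving up is (max_retries - 1) * retry_interval seconds.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return action()
        except NodeLockedError:
            if attempt == max_retries:
                # Budget exhausted: surface the lock error to the caller.
                raise
            sleep(retry_interval)
```

With a scheme like this, the retry budget has to be longer than the time the node spends in CLEANING on the slowest hardware in the deployment, otherwise the tear down fails even though the node would have unlocked shortly afterwards.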
What exactly is Nova doing that it can't do with the node locked? If Nova is telling it to delete, it should be OK with seeing CLEANING or CLEAN_WAIT after a delete to indicate it's deleted.