Terminating an instance while attaching a volume leads to both actions failing
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Undecided | Andrew Laski | 2014.2
Bug Description
This is happening with the xenapi driver, but it's possible that this can happen with others. The sequence of events I'm witnessing is:
An attach_volume request is made, and shortly afterwards a terminate_instance request is made.
From the attach_volume request, the block device mapping has been updated and the volume has been connected to the hypervisor, but the volume has not yet been attached to the instance. The terminate request begins processing before that attachment completes, so when it detaches volumes and their connections it misses the latest one, which is still attaching. This leads to a failure when asking Cinder to clean up the volume, such as:
2014-08-06 20:30:14.324 30737 TRACE nova.compute.
In turn, when attach_volume then tries to attach the volume to the instance, it finds that the instance no longer exists because of the terminate request. This leaves the instance undeletable and the volume stuck.
Having attach_volume share the instance lock with terminate_instance should resolve this. Virt drivers may also want to cope with this internally rather than rely on a lock.
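The window can be sketched deterministically (hypothetical names and states, not nova's actual code): terminate's volume cleanup only considers block device mappings it sees as fully attached, so a mapping still in an "attaching" state slips through.

```python
# Hypothetical simulation of the race described above. State names and
# function names are illustrative; this is not nova's implementation.
bdms = []

def attach_volume_begin(volume_id):
    # BDM recorded and volume connected to the hypervisor,
    # but not yet attached to the instance.
    bdms.append({"volume_id": volume_id, "state": "attaching"})

def terminate_instance_cleanup():
    # Cleans up only volumes it considers fully attached;
    # the in-flight mapping is missed.
    detached = [b for b in bdms if b["state"] == "attached"]
    missed = [b for b in bdms if b["state"] != "attached"]
    return detached, missed

attach_volume_begin("vol-1")                  # attach starts
detached, missed = terminate_instance_cleanup()  # terminate interleaves
# "vol-1" ends up in `missed`: neither detached nor ever fully attached,
# leaving Cinder cleanup to fail later.
```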
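A minimal sketch of the proposed fix, assuming a simple per-instance lock (the names `with_instance_lock`, `attach_volume`, and `terminate_instance` here are illustrative, not nova's actual API): if both operations take the same per-instance lock, terminate cannot begin until an in-flight attach has fully completed.

```python
import threading
from collections import defaultdict

# Hypothetical per-instance lock table; nova's real mechanism differs,
# this only illustrates the serialization the bug report suggests.
_instance_locks = defaultdict(threading.Lock)

def with_instance_lock(instance_uuid, fn, *args):
    """Run fn while holding the lock for this instance."""
    with _instance_locks[instance_uuid]:
        return fn(*args)

def attach_volume(log):
    log.append("bdm updated")
    log.append("volume connected to hypervisor")
    log.append("volume attached to instance")

def terminate_instance(log):
    log.append("detach volumes")
    log.append("destroy instance")

log = []
with_instance_lock("uuid-1", attach_volume, log)
with_instance_lock("uuid-1", terminate_instance, log)
# With the shared lock, the attach steps always complete before
# terminate begins, so no volume is left mid-attach.
```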
Changed in nova:
milestone: none → juno-rc1
status: Fix Committed → Fix Released

Changed in nova:
milestone: juno-rc1 → 2014.2
Fix proposed to branch: master
Review: https://review.openstack.org/113341