Nova might orphan volumes when it's racing to delete a volume-backed instance
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Compute (nova) | Invalid | Medium | ChangBo Guo(gcb) | |
Bug Description
Discussed in the -dev mailing list here:
http://
When nova deletes a volume-backed instance, it detaches the volume first here:
And then deletes the volume here (if the delete_
The problem is that this code races: the detach is asynchronous, so nova gets back a 202 and then goes on to delete the volume, which can fail if the volume status is not yet 'available', as seen here:
2015-12-18 13:59:16.071 WARNING nova.compute.
This isn't surfaced as an error in nova, because the compute manager's _delete_instance method calls _cleanup_volumes with raise_exc=False, but it orphans volumes in cinder, which then require manual cleanup on the cinder side.
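The racy cleanup path described above can be sketched as follows. This is a minimal illustration, not nova's actual code: `volume_api`, `bdm.volume_id`, and `bdm.delete_on_termination` are hypothetical stand-ins for the compute manager's volume API wrapper and block device mappings.

```python
# Sketch of the racy cleanup ordering (hypothetical volume_api, not
# nova's real implementation).

def cleanup_volumes(volume_api, bdms, raise_exc=False):
    """Detach each volume, then delete it if flagged for deletion.

    Mirrors the problematic ordering: delete is issued immediately
    after an asynchronous detach.
    """
    for bdm in bdms:
        # Detach returns immediately (HTTP 202); the volume may still
        # be in 'detaching' state when we reach the next line.
        volume_api.detach(bdm.volume_id)
        if bdm.delete_on_termination:
            try:
                # Can fail if the volume is not yet 'available',
                # leaving an orphaned volume behind in cinder.
                volume_api.delete(bdm.volume_id)
            except Exception:
                if raise_exc:
                    raise
                # With raise_exc=False the failure is swallowed,
                # which is why the orphan goes unnoticed by nova.
```

With `raise_exc=False` the delete failure is logged-and-ignored territory, so the instance delete succeeds while the volume lingers in cinder.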
Changed in nova:
assignee: nobody → Zhihai Song (szhsong)
Changed in nova:
assignee: Zhihai Song (szhsong) → Chris Friesen (cbf123)
Changed in nova:
assignee: Chris Friesen (cbf123) → ChangBo Guo(gcb) (glongwave)
Changed in nova:
assignee: ChangBo Guo(gcb) (glongwave) → Swami Reddy (swamireddy)
Changed in nova:
assignee: Swami Reddy (swamireddy) → ChangBo Guo(gcb) (glongwave)
We could wait for detach to complete or timeout, similar to what we do with boot from volume when creating the volume and attaching it to the instance:
https://github.com/openstack/nova/blob/5508e11cf873384a28dc7416168d34e85f2c06cf/nova/compute/manager.py#L1398
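The wait-or-timeout approach suggested above could look like the sketch below. This is an illustrative polling helper, not code from nova; `volume_api.get` returning an object with a `status` attribute is an assumed interface.

```python
import time

def wait_for_volume_available(volume_api, volume_id, timeout=60, interval=1):
    """Poll a volume until it reaches 'available' status.

    Raises TimeoutError if the volume has not finished detaching
    within `timeout` seconds, so the caller can decide whether to
    give up rather than silently orphaning the volume.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Hypothetical API: fetch the volume and inspect its status.
        if volume_api.get(volume_id).status == 'available':
            return
        time.sleep(interval)
    raise TimeoutError('volume %s did not become available in time'
                       % volume_id)
```

Calling this between the detach and the delete would close the race at the cost of a synchronous wait, which is the same trade-off nova already accepts on the boot-from-volume attach path linked above.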