xenapi: bad handling of "volume in use" errors
Bug #1030108 reported by Chuck Thier
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Invalid | Medium | Unassigned |
Bug Description
We have noticed that in several circumstances (for example, when a volume is part of a RAID array or is otherwise in use), if a detach call is issued, the volume stays in the in-use state and Xen holds on to it. When the volume is no longer in use (for example, after you remove it from the RAID array or unmount it), Xen still has the detach command queued and detaches the device. This can be confusing for the user. If a detach fails because the volume is in use, the queued detach command should be removed from the instance so that the volume isn't detached once it is no longer in use.
Changed in nova: | |
assignee: | nobody → Citrix OpenStack development team (citrix-openstack) |
summary: |
- Detaching a volume from a Xen instance fails if Xen thinks it is in use
+ Detaching a volume from a XenAPI instance fails if XenAPI thinks it is in use
Changed in nova: | |
status: | New → Confirmed |
Changed in nova: | |
importance: | Undecided → Medium |
tags: | added: xenserver |
Changed in nova: | |
assignee: | John Garbutt (johngarbutt) → nobody |
summary: |
- Detaching a volume from a XenAPI instance fails if it is in use
+ xenapi: bad handling of "volume in use" errors
http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/api/?c=VBD
This highlights the following error: DEVICE_DETACH_REJECTED
This means the VM will not release the volume (i.e. it is still in use).
While this should cause the volume detach to fail, it should not put the volume into the "error" state; the volume should simply remain "in use".
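As a rough illustration of that behaviour, here is a minimal sketch of how a rejected detach could be mapped to a volume status. Everything in it is an assumption made for illustration, not nova's actual code: `XenAPIFailure` is a stand-in for the exception the XenAPI bindings raise (with the error code as the first entry of `details`), `status_after_detach` is a hypothetical helper, and the status strings are placeholders.

```python
class XenAPIFailure(Exception):
    """Hypothetical stand-in for the XenAPI failure exception.

    Assumes the error code is the first element of `details`.
    """

    def __init__(self, details):
        super().__init__(details)
        self.details = details


def status_after_detach(detach):
    """Call detach() and return the status the volume should end up in.

    `detach` is any zero-argument callable that performs the unplug
    and raises XenAPIFailure on error.
    """
    try:
        detach()
    except XenAPIFailure as exc:
        if exc.details and exc.details[0] == "DEVICE_DETACH_REJECTED":
            # The guest refused to release the device: the detach failed,
            # but the volume is still attached and healthy.
            return "in-use"
        # Any other failure is treated as a real error.
        return "error"
    return "available"
```

The point of the sketch is only that the "in use" rejection gets its own branch, so it is reported as a failed detach rather than as a broken volume.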
It looks like, by default, 12 attempts are made to detach the volume. We should really time out earlier and inform the user that the volume cannot be detached.
We could move to making async calls, so that we can cancel this operation, and all the retry attempts, after a flag-determined timeout.
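A simpler variant of the timeout idea, without async calls, is to bound the retry loop by a deadline instead of a fixed attempt count. The sketch below shows that synchronous variant only; it does not cancel an in-flight XenAPI call. The names `detach_with_deadline`, `DetachTimeout`, `timeout` and `interval` are hypothetical, and `timeout` stands in for whatever configuration flag would be used.

```python
import time


class DetachTimeout(Exception):
    """Raised when the volume could not be detached before the deadline."""

    def __init__(self, attempts):
        super().__init__("detach gave up after %d attempts" % attempts)
        self.attempts = attempts


def detach_with_deadline(try_detach, timeout=10.0, interval=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Retry try_detach() until it returns True or the deadline passes.

    `try_detach` is a zero-argument callable returning True on success.
    `clock` and `sleep` are injectable so the loop can be tested
    without waiting. Returns the number of attempts made.
    """
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        if try_detach():
            return attempts
        # Stop if another wait would take us past the deadline.
        if clock() + interval > deadline:
            raise DetachTimeout(attempts)
        sleep(interval)
```

A caller would catch `DetachTimeout` and tell the user that the volume could not be detached, leaving it in its current state.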