xenapi: bad handling of "volume in use" errors

Bug #1030108 reported by Chuck Thier
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

We have noticed that in several circumstances (for example, if a volume is part of a RAID, or otherwise in use) where a detach call is issued, the volume stays in the in-use state and Xen holds on to that volume. When the volume is no longer in use (for example, after removing it from the RAID or unmounting it), Xen still has the detach command queued, and detaches the device. This can lead to confusing behavior for the user. If a detach fails because the volume is in use, the queued detach command should be removed from the instance so that the device is not detached later, once it is no longer in use.

Tags: xenserver
Renuka Apte (renukaapte)
Changed in nova:
assignee: nobody → Citrix OpenStack development team (citrix-openstack)
Revision history for this message
John Garbutt (johngarbutt) wrote :

http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/api/?c=VBD

Highlights the following error:
DEVICE_DETACH_REJECTED
This means the VM will not release the volume (i.e. it is still in use).

While this should cause the volume detach to fail, it should not cause the volume to enter the "error" state; the volume should simply remain "in-use".

It looks like, by default, 12 attempts are made to detach the volume; we should really time out earlier and inform the user that the volume cannot be detached.

We could move to making async calls, so we can cancel this operation, and all the retry attempts, after a flag-determined timeout.
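
A minimal sketch of what a bounded retry against the raw XenAPI Python bindings might look like; the function name, attempt count, and sleep interval here are illustrative assumptions, not nova's actual values:

    import time

    import XenAPI  # XenServer's Python bindings


    def unplug_with_timeout(session, vbd_ref, max_attempts=3, interval=5):
        """Try to unplug a VBD, giving up early instead of retrying 12 times."""
        for _ in range(max_attempts):
            try:
                session.xenapi.VBD.unplug(vbd_ref)
                return True
            except XenAPI.Failure as exc:
                # exc.details[0] carries the error code from the API docs.
                if exc.details[0] == 'DEVICE_DETACH_REJECTED':
                    # The guest refused to release the device; the volume
                    # should stay "in-use" rather than flip to "error".
                    time.sleep(interval)
                    continue
                raise
        return False  # caller can now tell the user the detach timed out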

summary: - Detaching a volume from a Xen instance fails if Xen thinks it is in use
+ Detaching a volume from a XenAPI instance fails if XenAPI thinks it is
+ in use
Revision history for this message
John Garbutt (johngarbutt) wrote : Re: Detaching a volume from a XenAPI instance fails if XenAPI thinks it is in use

I have just spoken to the XenAPI team.

Issue 1:
Certain versions of XenServer (up to 6.0.2) have an issue where the first call to XenAPI returns quickly, telling you that the disk is in use, but the second call might take up to 20 minutes. This has been fixed in the upcoming XenServer 6.1. OpenStack retries 12 times by default.

Issue 2:
In Xen(Server), once you request to detach a VBD, the detach happens the next time the disk is not in use. This may be at the next VM restart, or the next time the user unmounts the disk. Apparently there is no way around this without changes to Xen.

I guess we need to ensure we expose this to the user in the way we update the Volume state.

A possible fix might be to make the nova code do this (to reflect how XenServer works; see the sketch after this list):
- only attempt to call VBD.unplug once
- retry "forever" to check whether the VBD has been successfully unplugged (or successfully destroyed, in case we miss the point where it was unplugged but not yet destroyed during a terminate); XenAPI has an event mechanism that may be a more efficient way to do this
- ensure the volume remains in some kind of "unplugging" state until it has been detached
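
A rough sketch of that polling approach. It assumes, as comment #2 above describes, that a rejected unplug is still queued by XenServer, and that calls on a destroyed VBD raise HANDLE_INVALID; the names and interval are illustrative:

    import time

    import XenAPI


    def unplug_and_wait(session, vbd_ref, poll_interval=10):
        """Issue VBD.unplug exactly once, then poll until the VBD is gone."""
        try:
            session.xenapi.VBD.unplug(vbd_ref)
        except XenAPI.Failure as exc:
            if exc.details[0] != 'DEVICE_DETACH_REJECTED':
                raise
            # Assumption: XenServer has queued the detach for when the
            # disk is no longer in use, so we just wait for it.
        while True:
            try:
                if not session.xenapi.VBD.get_currently_attached(vbd_ref):
                    return  # unplug completed
            except XenAPI.Failure as exc:
                if exc.details[0] == 'HANDLE_INVALID':
                    return  # VBD destroyed, e.g. during a terminate
                raise
            time.sleep(poll_interval)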

I am talking with the XenAPI SDK doc team to ensure XenServer documents these cases more clearly in the API docs.

Changed in nova:
assignee: Citrix OpenStack development team (citrix-openstack) → John Garbutt (johngarbutt)
Changed in nova:
status: New → Confirmed
Thierry Carrez (ttx)
Changed in nova:
importance: Undecided → Medium
tags: added: xenserver
Changed in nova:
assignee: John Garbutt (johngarbutt) → nobody
Revision history for this message
John Garbutt (johngarbutt) wrote :

Let's look at using force=true to do the detach if the normal detach doesn't happen.

Although there may be data loss, allowing that option could be a good way forward.

I guess OpenStack should keep trying unless we make a "force detach" call (that would be a new call added to OpenStack; it could be made admin-only).

Revision history for this message
Bob Ball (bob-ball) wrote :

There is also a force=true parameter on the vbd-unplug command.

A test on an iSCSI connection suggests that using this after unplug without force=true does indeed cause the detach to happen immediately.

The guest will clearly experience errors if an uncooperative detach is used.
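
A minimal sketch of that fallback via the XenAPI bindings; VBD.unplug_force is the API counterpart of vbd-unplug force=true, and the wrapper name is an illustrative assumption:

    import XenAPI


    def unplug_or_force(session, vbd_ref):
        """Try a clean unplug first; force it only if the guest refuses.

        A forced unplug detaches immediately, but the guest may hit I/O
        errors if it is still using the disk, so this should not be the
        default path.
        """
        try:
            session.xenapi.VBD.unplug(vbd_ref)
        except XenAPI.Failure as exc:
            if exc.details[0] != 'DEVICE_DETACH_REJECTED':
                raise
            session.xenapi.VBD.unplug_force(vbd_ref)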

summary: - Detaching a volume from a XenAPI instance fails if XenAPI thinks it is
- in use
+ Detaching a volume from a XenAPI instance fails if it is in use
Revision history for this message
John Garbutt (johngarbutt) wrote : Re: Detaching a volume from a XenAPI instance fails if it is in use

So a force detach crashes the guest; it is better to reboot the guest instead.

What we need to do is make sure OpenStack stays in the detaching state until the disk really does detach, even if that is after the next reboot.

We should probably use the same polling logic as resize-confirm and similar periodic tasks, rather than the current, more broken solution. We also need to make it robust across nova-compute restarts.

Revision history for this message
Bob Ball (bob-ball) wrote :

It's worth clarifying that the "guest crash" depends on the guest kernel. As far as Xen is concerned, the disk is detached and the guest continues running. However, in my test case of CentOS 6.0, the guest stopped responding because it was trying to use a disk that was no longer present.

Other guests may respond with different error conditions.

Revision history for this message
Aditi Raveesh (aditirav) wrote :

Problem:

If detach_volume fails in the unplug_vbd step, we catch the exception
and roll back cinder, which puts the volume in the 'in-use' state.
Xen might succeed in detaching the volume later on, so even though the
volume is detached, cinder still has its state as 'in-use'.

Proposed solution:

On such a failure, add an error state 'failed_detach' to the volume.
Add a periodic task that polls for all such failed detaches and issues
a detach_volume (see the sketch below).
Alternatively, the user can retry detaching the volume.
Once this call goes through, we can do the post-detach_volume steps, such
as updating the cinder db and destroying the bdm in the nova db.

Does this approach have any other repercussions?
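
A sketch of what such a periodic task might look like; the 'failed_detach' state, the BDM lookup helper, and the method names are all hypothetical, for illustration only:

    def retry_failed_detaches(context, compute_manager, volume_api):
        """Hypothetical periodic task: retry volumes stuck in
        'failed_detach' and, on success, finish the post-detach steps.
        """
        for bdm in compute_manager.get_failed_detach_bdms(context):
            try:
                compute_manager.detach_volume(context, bdm.volume_id,
                                              bdm.instance_uuid)
            except Exception:
                continue  # still in use; try again next period
            volume_api.detach(context, bdm.volume_id)  # update cinder db
            bdm.destroy()  # destroy the bdm in the nova db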

Revision history for this message
John Garbutt (johngarbutt) wrote :

I would first concentrate on ensuring that if you call detach on an already-detached volume, the cinder state gets updated correctly.

I worry about the cost of making the cinder API call to decide if there are any failed_detaches to process. Some careful caching might solve that.

Perhaps a better solution is to use the XenAPI event system to wait for the detach; however, we need to confirm with Citrix that the correct events will be produced. This is quite related to:
https://blueprints.launchpad.net/nova/+spec/xenapi-compute-driver-events

I would prefer to see that as a continued "detaching" state, rather than setting it to error, but I will let the Cinder people decide what to do. If we do add an error state, something like "detach_timeout" might represent it better, but I haven't given that much thought yet.
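
A sketch of the event-based wait, using XenAPI's legacy event.register/event.next API. It assumes XenServer emits 'mod' events when a VBD's currently_attached field changes, which is exactly what would need confirming with Citrix:

    import XenAPI


    def wait_for_vbd_detach(session, vbd_ref):
        """Block until the given VBD detaches or is destroyed."""
        session.xenapi.event.register(['vbd'])
        while True:
            for event in session.xenapi.event.next():
                if event['ref'] != vbd_ref:
                    continue
                if event['operation'] == 'del':
                    return  # VBD record destroyed
                snapshot = event.get('snapshot')
                if snapshot and not snapshot['currently_attached']:
                    return  # detach completed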

Revision history for this message
John Garbutt (johngarbutt) wrote :

Interesting alternative, with this change:
https://review.openstack.org/#/c/32145/

Users can reboot a VM to remove the ERROR state.

If we fix up volumes on reboot, then when the detach timeout occurs, maybe put the VM in the ERROR state and leave the volume in "detaching" in Cinder?

Revision history for this message
John Garbutt (johngarbutt) wrote :

Manually marking as committed, because I got the bug wrong in the commit.

Changed in nova:
status: Confirmed → Fix Committed
status: Fix Committed → Triaged
Revision history for this message
John Garbutt (johngarbutt) wrote :

My bad, wrong bug.

summary: - Detaching a volume from a XenAPI instance fails if it is in use
+ xenapi: bad handling of "volume in use" errors
Revision history for this message
John Garbutt (johngarbutt) wrote :

This now raises:
Reached maximum number of retries trying to unplug VBD OpaqueRef:50be57f5-7c5a-0e72-7659-38b37608ea2a

The instance goes into the error state.

If the user calls reboot, everything will return to normal.

Changed in nova:
status: Triaged → Invalid