Volume detach error when using NFS as the cinder backend

Bug #1340552 reported by SamP
This bug affects 3 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: SamP

Bug Description

Tested Environment
--------------------------
OS: Ubuntu 14.04 LTS
Cinder NFS driver:
volume_driver=cinder.volume.drivers.nfs.NfsDriver

Error description
--------------------------
I used NFS as the cinder storage backend and successfully attached multiple volumes to nova instances.
However, when I tried to detach one of them, I found the following error in nova-compute.log.

-----------Error log------
2014-07-07 17:48:46.175 3195 ERROR nova.virt.libvirt.volume [req-a07d077f-2ad1-4558-91fa-ab1895ca4914 c8ac60023a794aed8cec8552110d5f12 fdd538eb5dbf48a98d08e6d64def73d7] Couldn't unmount the NFS share 172.23.58.245:/NFSThinLun2
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume Traceback (most recent call last):
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume File "/usr/local/lib/python2.7/dist-packages/nova/virt/libvirt/volume.py", line 675, in disconnect_volume
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume utils.execute('umount', mount_path, run_as_root=True)
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume File "/usr/local/lib/python2.7/dist-packages/nova/utils.py", line 164, in execute
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume return processutils.execute(*cmd, **kwargs)
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume File "/usr/local/lib/python2.7/dist-packages/nova/openstack/common/processutils.py", line 193, in execute
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume cmd=' '.join(cmd))
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume ProcessExecutionError: Unexpected error while running command.
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume Command: sudo nova-rootwrap /etc/nova/rootwrap.conf umount /var/lib/nova/mnt/16a381ac60f3e130cf26e7d6eb832cb6
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume Exit code: 16
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume Stdout: ''
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume Stderr: 'umount.nfs: /var/lib/nova/mnt/16a381ac60f3e130cf26e7d6eb832cb6: device is busy\numount.nfs: /var/lib/nova/mnt/16a381ac60f3e130cf26e7d6eb832cb6: device is busy\n'
2014-07-07 17:48:46.175 3195 TRACE nova.virt.libvirt.volume

-----------End of the Log--

For NFS volumes, nova tries to umount the device path every time a volume is detached.
The relevant code is in /nova/virt/libvirt/volume.py:
Line 632: class LibvirtNFSVolumeDriver(LibvirtBaseVolumeDriver):
Line 653: def disconnect_volume(self, connection_info, disk_dev):
Line 661: utils.execute('umount', mount_path, run_as_root=True)

This works when the device path is not busy.
If the device path is busy (or in use), nova should write a message to the log and continue.
The problem is that, instead of writing a log message, it raises an exception, which causes the above error.

I think the reason is that the ‘if’ statement at Line 663 fails to match the device-busy message in exc.message. It looks for ‘target is busy’ in exc.message, but the umount error output says ‘device is busy’.
Therefore, the current code skips the ‘if’ branch, runs the ‘else’ branch, and raises the exception.
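
The mismatch can be illustrated with a minimal, self-contained sketch. It does not import nova: FakeProcessExecutionError and is_busy_error are hypothetical names made up for this example, and the stderr text is copied from the log above.

```python
class FakeProcessExecutionError(Exception):
    """Hypothetical stand-in for the error raised when umount fails."""

    def __init__(self, stderr):
        super().__init__(stderr)
        # Set explicitly so the sketch mirrors the exc.message check
        # described in this report.
        self.message = stderr


def is_busy_error(exc, needle="target is busy"):
    """Return True when the error text contains the expected 'busy' string."""
    return needle in exc.message


# Stderr text taken from the nova-compute.log excerpt in this report.
stderr = ("umount.nfs: /var/lib/nova/mnt/16a381ac60f3e130cf26e7d6eb832cb6: "
          "device is busy\n")
exc = FakeProcessExecutionError(stderr)

# The check described in the report looks for 'target is busy' and misses.
matched_current = is_busy_error(exc)
# The text umount.nfs actually printed matches only the other wording.
matched_actual = is_busy_error(exc, "device is busy")
print(matched_current, matched_actual)
```

With the log text above, the first check does not match and the second does, which is why the busy case falls through to the error path.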

How to reproduce
--------------------------
(1) Prepare an NFS share and set it as the storage backend of your cinder
(refer to http://docs.openstack.org/grizzly/openstack-block-storage/admin/content/NFS-driver.html)
In cinder.conf:
volume_driver=cinder.volume.drivers.nfs.NfsDriver
nfs_shares_config=<path to your nfs share list file>
(2) Create 2 empty volumes from cinder
(3) Create a nova instance and attach the above 2 volumes
(4) Then, try to detach one of them.
You will get the error in nova-compute.log “Couldn't unmount the NFS share <your NFS mount path on nova-compute>”

Proposed Fix
--------------------------
I’m not sure whether any other OS outputs ‘target is busy’ in the umount error message.
Therefore, the first fix that comes to mind is to change the ‘if’ statement:
Before fix:
if 'target is busy' in exc.message:
After fix:
if 'device is busy' in exc.message:
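
Since it is unclear which systems print which wording, a more tolerant variant would accept both strings. The following is only a sketch under that assumption: handle_umount_failure and BUSY_MARKERS are hypothetical names, not nova code, and the function takes the error text directly so it can run on its own.

```python
import logging

LOG = logging.getLogger(__name__)

# Both wordings are treated as "still in use". This tuple is an assumption
# for illustration, not a list taken from nova.
BUSY_MARKERS = ("device is busy", "target is busy")


def handle_umount_failure(message, export):
    """Return True if the umount failure only means the share is in use."""
    if any(marker in message for marker in BUSY_MARKERS):
        # Expected when other attached volumes live on the same share.
        LOG.debug("The NFS share %s is still in use.", export)
        return True
    # Anything else is a real failure and is reported as an error.
    LOG.error("Couldn't unmount the NFS share %s", export)
    return False
```

Keeping the accepted strings in one tuple means another wording can be added later without touching the control flow.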

Thang Pham (thang-pham)
Changed in nova:
assignee: nobody → Thang Pham (thang-pham)
Thang Pham (thang-pham) wrote :

The feature above was put in by this commit: https://github.com/openstack/nova/commit/dc716bd0ce77b56f4aabe54d6633b7f3bf9b0a5d. I agree with your proposed fix. Most of the time, I see "device is busy" and not "target is busy". This should be a quick fix.

Thang Pham (thang-pham) wrote :

Re-assigning this bug to Sam, since he asked to fix it.

Changed in nova:
assignee: Thang Pham (thang-pham) → nobody
assignee: nobody → Sampath Priyankara (sampath-priyankara)
SamP (sampath-priyankara) wrote :

Thanks Thang, I will push the fix

Tracy Jones (tjones-i)
tags: added: volumes
Changed in nova:
status: New → In Progress
description: updated
SamP (sampath-priyankara) wrote :

Adding the link manually, since it was not posted automatically.
Here is the code review:
https://review.openstack.org/#/c/111553/

tags: added: icehouse-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/111553
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bd5c5d5bbd808b4f83da58dce433cac711575bee
Submitter: Jenkins
Branch: master

commit bd5c5d5bbd808b4f83da58dce433cac711575bee
Author: Sampath Priyankara <email address hidden>
Date: Sun Aug 3 15:23:23 2014 +0900

    Fix for volume detach error when use NFS as the cinder backend

    For NFS volumes, every time you detach a volume, nova tries to umount
    the device path. If the device path is busy (or in use), it should
    output a message to log and continue.
    In current code, if the device is busy, it cannot catch the ‘device is busy’
    message returned by umount, because it looking for the ‘target is busy’.
     Therefore, current code skip the ‘if’ statement and run the ‘else’ and
    raise the exception.
    Fix: Add ‘device is busy’ to if statement.

    Add a mock test to check the behaviour of the
    virt.libvirt.volume.LibvirtNFSVolumeDriver.disconnect_volume
    when it has umount errors.

    Closes-Bug: #1340552

    Change-Id: Iac946c37064c5f5bf5a102305de40d21d16846c1
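
A mock test of the kind the commit message mentions could look roughly like the sketch below. It is not the test from the review: UmountError and disconnect are hypothetical stand-ins so the example runs without nova, and only the standard unittest.mock library is used.

```python
import unittest
from unittest import mock


class UmountError(Exception):
    """Hypothetical stand-in for the error raised when umount fails."""


def disconnect(execute, mount_path):
    """Hypothetical helper: unmount the path, tolerating 'busy' failures."""
    try:
        execute("umount", mount_path, run_as_root=True)
    except UmountError as exc:
        text = str(exc)
        if "device is busy" in text or "target is busy" in text:
            return "in-use"
        raise
    return "unmounted"


class DisconnectTest(unittest.TestCase):
    def test_busy_is_tolerated(self):
        # The mocked execute fails the way umount.nfs did in the log.
        execute = mock.Mock(
            side_effect=UmountError("umount.nfs: /mnt/x: device is busy"))
        self.assertEqual(disconnect(execute, "/mnt/x"), "in-use")
        execute.assert_called_once_with("umount", "/mnt/x", run_as_root=True)

    def test_other_errors_propagate(self):
        # Any other failure must not be swallowed.
        execute = mock.Mock(side_effect=UmountError("permission denied"))
        with self.assertRaises(UmountError):
            disconnect(execute, "/mnt/x")
```

Passing the execute callable in as an argument is a choice made for this sketch; it lets the test replace the real umount call with a mock without patching any module.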

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-rc1 → 2014.2