libvirt: nova's detach_volume silently fails sometimes
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Libvirt Python |
New
|
Undecided
|
Unassigned | ||
OpenStack Compute (nova) |
Confirmed
|
Low
|
Unassigned |
Bug Description
This behavior has been observed on the following platforms:
* Nova Icehouse, Debian 12.04, QEMU 1.5.3, libvirt 1.1.3.5, with the Cinder Icehouse NFS driver, CirrOS 0.3.2 guest
* Nova Icehouse, Debian 12.04, QEMU 1.5.3, libvirt 1.1.3.5, with the Cinder Icehouse RBD (Ceph) driver, CirrOS 0.3.2 guest
* Nova master, Debian 14.04, QEMU 2.0.0, libvirt 1.2.2, with the Cinder master iSCSI driver, CirrOS 0.3.2 guest
Nova's "detach_volume" fires the detach method into libvirt, which claims success, but the device is still attached according to "virsh domblklist". Nova then finishes the teardown, releasing the resources, which then causes I/O errors in the guest, and subsequent volume_attach requests from Nova to fail spectacularly due to it trying to use an in-use resource.
This appears to be a race condition, in that it does occasionally work fine.
Steps to Reproduce:
This script will usually trigger the error condition:
#!/bin/bash -vx
: Setup
img=$(glance image-list --disk-format ami | awk '/cirros-
vol1_
sleep 5
: Launch
nova boot --flavor m1.tiny --image "$img" --block-device source=
: Measure
nova show test | grep "volumes_
: Poke the bear
nova volume-detach test "$vol1_id"
sudo virsh list --all --uuid | xargs -r -n 1 sudo virsh domblklist
sleep 10
sudo virsh list --all --uuid | xargs -r -n 1 sudo virsh domblklist
vol2_
nova volume-attach test "$vol2_id"
sleep 1
: Measure again
nova show test | grep "volumes_
Expected behavior:
The volumes attach/
Actual behavior:
The second attachment fails, and n-cpu throws the following exception:
Failed to attach volume at mountpoint: /dev/vdb
Traceback (most recent call last):
File "/opt/stack/
File "/usr/local/
result = proxy_call(
File "/usr/local/
rv = execute(f, *args, **kwargs)
File "/usr/local/
File "/usr/local/
rv = meth(*args, **kwargs)
File "/usr/local/
if ret == -1: raise libvirtError ('virDomainAtta
libvirtError: operation failed: target vdb already exists
Workaround:
"sudo virsh detach-disk $SOME_UUID $SOME_DISK_ID" appears to cause the guest to properly detach the device, and also seems to ward off whatever gremlins caused the problem in the first place; i.e., the problem gets much less likely to present itself after firing a virsh command.
Changed in nova: | |
status: | New → Confirmed |
importance: | Undecided → Low |
Changed in nova: | |
status: | Confirmed → In Progress |
Changed in nova: | |
status: | In Progress → Confirmed |
What version of libvirt/qemu used with master nova?