nova-compute stacktrace in log when reattaching same volume

Bug #917245 reported by David Kranz
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

I created two VMs and a volume. I attached the volume to the first VM with

nova volume-attach oneiric-1 2 /dev/vdd

I used volume-detach and then attached the volume to the other VM.
I then detached it from the second VM and tried to attach it again to the first VM with the same command. The
attach did not work, and the following error was in the nova-compute log. I tried again using /dev/vde and it worked. This was using KVM.

I know that with KVM there is no relation between the device name you specify with volume-attach and the actual device the VM sees. This is confusing, because if I attach more than one volume and mount a filesystem on each, there seems to be no way to reattach them and know they end up in the right place. Should they be mounted by UUID? Am I missing something obvious?
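For what it's worth, one way to make mounts independent of whichever device name the hypervisor assigns is to mount by filesystem UUID inside the guest. A minimal sketch, assuming the volume shows up as /dev/vdd and already carries a filesystem (the mount point /mnt/data is made up for illustration):

# inside the guest: read the filesystem UUID from the attached volume
sudo blkid /dev/vdd
# mount by UUID instead of by device name
sudo mount UUID=<uuid-printed-by-blkid> /mnt/data
# or persist it with an /etc/fstab entry such as:
# UUID=<uuid-printed-by-blkid>  /mnt/data  ext4  defaults,nofail  0  2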

2012-01-16 11:07:57,853 DEBUG nova.utils [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] Running cmd (subprocess): sudo iscsiadm -m node -T iqn.2010-10.org.openstack:volume-00000002 -p 172.18.0.131:3260 --rescan from (pid=21157) execute /usr/lib/python2.7/dist-packages/nova/utils.py:167
2012-01-16 11:07:57,872 DEBUG nova.volume.driver [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] iscsiadm ('--rescan',): stdout=Rescanning session [sid: 5, target: iqn.2010-10.org.openstack:volume-00000002, portal: 172.18.0.131,3260]
 stderr= from (pid=21157) _run_iscsiadm /usr/lib/python2.7/dist-packages/nova/volume/driver.py:514
2012-01-16 11:07:58,872 DEBUG nova.volume.driver [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] Found iSCSI node /dev/disk/by-path/ip-172.18.0.131:3260-iscsi-iqn.2010-10.org.openstack:volume-00000002-lun-0 (after 1 rescans) from (pid=21157) discover_volume /usr/lib/python2.7/dist-packages/nova/volume/driver.py:570
2012-01-16 11:08:00,326 ERROR nova.compute.manager [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] instance 14: attach failed /dev/vdd, removing
(nova.compute.manager): TRACE: Traceback (most recent call last):
(nova.compute.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1280, in attach_volume
(nova.compute.manager): TRACE: mountpoint)
(nova.compute.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 100, in wrapped
(nova.compute.manager): TRACE: return f(*args, **kw)
(nova.compute.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 373, in attach_volume
(nova.compute.manager): TRACE: virt_dom.attachDevice(xml)
(nova.compute.manager): TRACE: File "/usr/lib/python2.7/dist-packages/libvirt.py", line 298, in attachDevice
(nova.compute.manager): TRACE: if ret == -1: raise libvirtError ('virDomainAttachDevice() failed', dom=self)
(nova.compute.manager): TRACE: libvirtError: operation failed: adding virtio-blk-pci,bus=pci.0,addr=0x8,drive=drive-virtio-disk3,id=virtio-disk3 device failed: Duplicate ID 'virtio-disk3' for device
(nova.compute.manager): TRACE:
(nova.compute.manager): TRACE:
2012-01-16 11:08:00,362 WARNING nova.volume.driver [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] ISCSI provider_location not stored, using discovery
2012-01-16 11:08:00,362 DEBUG nova.utils [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] Running cmd (subprocess): sudo iscsiadm -m discovery -t sendtargets -p xg03 from (pid=21157) execute /usr/lib/python2.7/dist-packages/nova/utils.py:167
2012-01-16 11:08:00,382 DEBUG nova.volume.driver [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] ISCSI Discovery: Found 172.18.0.131:3260,1 iqn.2010-10.org.openstack:volume-00000002 from (pid=21157) _get_iscsi_properties /usr/lib/python2.7/dist-packages/nova/volume/driver.py:487
2012-01-16 11:08:00,383 DEBUG nova.utils [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] Running cmd (subprocess): sudo iscsiadm -m node -T iqn.2010-10.org.openstack:volume-00000002 -p 172.18.0.131:3260 --op update -n node.startup -v manual from (pid=21157) execute /usr/lib/python2.7/dist-packages/nova/utils.py:167
2012-01-16 11:08:00,398 DEBUG nova.volume.driver [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] iscsiadm ('--op', 'update', '-n', 'node.startup', '-v', 'manual'): stdout= stderr= from (pid=21157) _run_iscsiadm /usr/lib/python2.7/dist-packages/nova/volume/driver.py:514
2012-01-16 11:08:00,399 DEBUG nova.utils [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] Running cmd (subprocess): sudo iscsiadm -m node -T iqn.2010-10.org.openstack:volume-00000002 -p 172.18.0.131:3260 --logout from (pid=21157) execute /usr/lib/python2.7/dist-packages/nova/utils.py:167
2012-01-16 11:08:01,054 DEBUG nova.volume.driver [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] iscsiadm ('--logout',): stdout=Logging out of session [sid: 5, target: iqn.2010-10.org.openstack:volume-00000002, portal: 172.18.0.131,3260]
Logout of [sid: 5, target: iqn.2010-10.org.openstack:volume-00000002, portal: 172.18.0.131,3260]: successful
 stderr= from (pid=21157) _run_iscsiadm /usr/lib/python2.7/dist-packages/nova/volume/driver.py:514
2012-01-16 11:08:01,055 DEBUG nova.utils [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] Running cmd (subprocess): sudo iscsiadm -m node -T iqn.2010-10.org.openstack:volume-00000002 -p 172.18.0.131:3260 --op delete from (pid=21157) execute /usr/lib/python2.7/dist-packages/nova/utils.py:167
2012-01-16 11:08:01,070 DEBUG nova.volume.driver [95bd2914-7ba4-46c3-a4c0-22327816df0e tester testproject] iscsiadm ('--op', 'delete'): stdout= stderr= from (pid=21157) _run_iscsiadm /usr/lib/python2.7/dist-packages/nova/volume/driver.py:514
2012-01-16 11:08:01,070 ERROR nova.rpc [-] Exception during message handling
(nova.rpc): TRACE: Traceback (most recent call last):
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 620, in _process_data
(nova.rpc): TRACE: rval = node_func(context=ctxt, **node_args)
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 118, in decorated_function
(nova.rpc): TRACE: function(self, context, instance_id, *args, **kwargs)
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1303, in attach_volume
(nova.rpc): TRACE: raise exc
(nova.rpc): TRACE: libvirtError: operation failed: adding virtio-blk-pci,bus=pci.0,addr=0x8,drive=drive-virtio-disk3,id=virtio-disk3 device failed: Duplicate ID 'virtio-disk3' for device

Tags: volume
Vish Ishaya (vishvananda) wrote :

I've noticed this before. It seems to be impossible to reuse the same mountpoint for a second attach.

It would be nice to find out if there is a way to stop this from happening, but otherwise perhaps we should be tracking used device names and just picking the next one instead of using what is passed in by the user.
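As a rough illustration of what tracking used device names could look like, the names a domain already has in use can be read from libvirt on the compute host (the instance name below is only an example), and the attach path would then pick the next free one:

# list the target device names currently attached to this domain
virsh domblklist instance-00000014
# (alternatively: virsh dumpxml instance-00000014 | grep 'target dev')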

Brian Waldon (bcwaldon)
tags: added: volume
Changed in nova:
status: New → Confirmed
importance: Undecided → Low
importance: Low → Medium
John Griffith (john-griffith) wrote :

I was interested in looking at this and trying to come up with a fix. One thing I came across in the libvirt documentation is that they seem to discourage using /dev/xx for iSCSI mount points and recommend using /dev/disk/by-path/xxx or /dev/disk/by-id/xxx instead. Any thoughts on me trying to figure out a way to use /dev/disk/by-id for mount points?

I should have access to my iSCSI volumes tomorrow and I can test this out a bit.
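For context, those persistent names can be inspected on the compute host once the iSCSI session is logged in; the target IQN below is the one from the log in the bug description:

# stable by-path name for the iSCSI LUN
ls -l /dev/disk/by-path/ | grep iqn.2010-10.org.openstack:volume-00000002
# by-id names for the same block device
ls -l /dev/disk/by-id/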

John Griffith (john-griffith) wrote :

I tried reproducing this in a devstack environment, using both local drives for volumes and iSCSI volumes, but could not reproduce it.

David Kranz (david-kranz) wrote :

I saw this in a diablo cluster.

yong sheng gong (gongysh) wrote :

I failed to reproduce it with the latest master branch.

wangpan (hzwangpan) wrote :

I saw this in an Essex cluster too, when doing these steps:
1. nova volume-attach 24e52914-f066-49b0-8eb3-a06514bea4ea 13 /dev/vda
2. nova volume-detach 24e52914-f066-49b0-8eb3-a06514bea4ea 13
3. nova volume-attach 24e52914-f066-49b0-8eb3-a06514bea4ea 13 /dev/vda (this time the attach failed)
If I attach the volume to another device such as /dev/vdb, e.g. nova volume-attach 24e52914-f066-49b0-8eb3-a06514bea4ea 13 /dev/vdb, it succeeds.
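One thing worth confirming between steps 2 and 3 is that the detach has actually completed before reattaching. A small check, assuming the same volume ID as above:

# the row for volume 13 should be back to 'available' (not 'in-use') before step 3
nova volume-list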

Vish Ishaya (vishvananda) wrote :

/dev/vda should be the root drive. Can you reproduce by using /dev/vdc twice?

wangpan (hzwangpan) wrote :

Sorry for responding so late.
I got the same result using /dev/vdc after reattaching the volume.
But when I try the same steps with virsh attach-disk/detach-disk (as well as attach-device/detach-device), I get an error like this:
root@114-113-199-9:/home/hzwangpan# virsh attach-disk 25 /home/hzwangpan/zero vdc
Disk attached successfully

root@114-113-199-9:/home/hzwangpan# virsh detach-disk 25 /home/hzwangpan/zero
Disk detached successfully

root@114-113-199-9:/home/hzwangpan# virsh attach-disk 25 /home/hzwangpan/zero vdc
error: Failed to attach disk
error: internal error unable to execute QEMU command 'device_add': Duplicate ID 'virtio-disk2' for device
But I can successfully attach the same disk (/home/hzwangpan/zero) to a device name that is being used for the first time, such as 'vdd'.
So I think this is a bug in libvirt or KVM.
My libvirt version is:
root@114-113-199-9:/home/hzwangpan# virsh version
Compiled against library: libvir 0.9.11
Using library: libvir 0.9.11
Using API: QEMU 0.9.11
Running hypervisor: QEMU 1.0.0
kvm version info:
root@114-113-199-9:/home/hzwangpan# kvm -version
QEMU emulator version 1.0 (qemu-kvm-1.0+dfsg-11, Debian), Copyright (c) 2003-2008 Fabrice Bellard

Dave Spano (dspano) wrote :

Vish,

I ran into the same problem with my Essex install. Mounting as /dev/vdd works. Then if you detach and reattach again using /dev/vdc, it works too. It seems to only care about whatever device the previous attach used.

Dan Smith (danms) wrote :

Do you have acpiphp loaded in the guest? IIRC, this can't-attach-in-the-same-place-twice issue is related to the guest not being able to do proper hotplug (when that module isn't loaded), which prevents the detach from actually taking place, and thus the subsequent attach fails.
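A quick way to check that inside a Linux guest, assuming acpiphp is built as a module rather than into the kernel:

# check whether the ACPI PCI hotplug module is loaded in the guest
lsmod | grep acpiphp
# if there is no output, load it and retry the detach/attach cycle
sudo modprobe acpiphp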

David Kranz (david-kranz) wrote :

I just tried this again on a stable/essex cluster using precise-server-cloudimg-amd64-disk1.img and did not have this problem.
The guest had acpid running. Perhaps oneiric was the cause of the problem. I suppose if it is really necessary for the guest to be running this service, that should be documented.
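For anyone comparing guest images, a quick way to confirm acpid is actually running in the guest:

# check for a running acpid process inside the guest
pgrep -l acpid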

David Blundell (david-blundell) wrote :

The bug looks identical to one that Red Hat recently closed: https://bugzilla.redhat.com/show_bug.cgi?id=712266

Their fix was to add a cleanup call to libvirt - see comment 6 on the Red Hat page.

I have been able to reproduce this on a Folsom-2 cluster using Precise by repeatedly adding and removing a volume with the same mountpoint. Strangely, the fault did not show up on all instances I tried it on.

Christian Berendt (berendt) wrote :

Tried reproducing this behavior on Folsom (2012.2) with Cinder as the volume service (LVM2/iSCSI) and libvirt/Xen for virtualization, without success. I can cycle volumes between instances without problems.

Chuck Short (zulcss) wrote :

This sounds like it's been fixed in libvirt; please re-open if you can still reproduce it.

Changed in nova:
status: Confirmed → Fix Released