cannot attach a volume when using multiple ceph backends

Bug #1502028 reported by Jay Lee
This bug affects 6 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Low
Assigned to: Matt Riedemann

Bug Description

1. Exact version of Nova/OpenStack you are running: Kilo Stable

2. Relevant log files:

I'm testing attaching Ceph RADOS block device (RBD) volumes to VMs; however, I've hit an issue when the VM's root disk and the volume live in different Ceph clusters.

<--error message-->
2015-09-24 11:32:31 13083 DEBUG nova.virt.libvirt.config [req-b9bbd744-cf75-477b-b6a6-ea5b72f6181f 9504f2c4fe6b4b34a1bb0330f2faba35 0788824d5d1f46f2b014597ba8dc0585] Generated XML ('<disk type="network" device="disk">\n <driver name="qemu" type="raw" cache="none"/>\n <source protocol="rbd" name="rbd/volume-727c5319-1926-44ac-ba52-de55485faf2b">\n <host name="10.40.100.115" port="6789"/>\n <host name="10.40.100.116" port="6789"/>\n <host name="10.40.100.119" port="6789"/>\n </source>\n <auth username="cinder">\n <secret type="ceph" uuid="457eb676-33da-42ec-9a8c-9293d545c337"/>\n </auth>\n <target bus="virtio" dev="vdb"/>\n <serial>727c5319-1926-44ac-ba52-de55485faf2b</serial>\n</disk>\n',) to_xml /opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/nova/virt/libvirt/config.py:82
2015-09-24 11:32:31 13083 ERROR nova.virt.libvirt.driver [req-b9bbd744-cf75-477b-b6a6-ea5b72f6181f 9504f2c4fe6b4b34a1bb0330f2faba35 0788824d5d1f46f2b014597ba8dc0585] [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] Failed to attach volume at mountpoint: /dev/vdb
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] Traceback (most recent call last):
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1092, in attach_volume
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] virt_dom.attachDeviceFlags(conf.to_xml(), flags)
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] result = proxy_call(self._autowrap, f, *args, **kwargs)
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] rv = execute(f, *args, **kwargs)
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] six.reraise(c, e, tb)
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] rv = meth(*args, **kwargs)
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/libvirt.py", line 528, in attachDeviceFlags
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] libvirtError: internal error: unable to execute QEMU command 'device_add': Property 'virtio-blk-device.drive' can't find value 'drive-virtio-disk1'
2015-09-24 11:32:31.923 13083 TRACE nova.virt.libvirt.driver [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165]
2015-09-24 11:32:31 13083 ERROR nova.virt.block_device [req-b9bbd744-cf75-477b-b6a6-ea5b72f6181f 9504f2c4fe6b4b34a1bb0330f2faba35 0788824d5d1f46f2b014597ba8dc0585] [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] Driver failed to attach volume 727c5319-1926-44ac-ba52-de55485faf2b at /dev/vdb
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] Traceback (most recent call last):
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/nova/virt/block_device.py", line 255, in attach
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] device_type=self['device_type'], encryption=encryption)
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1103, in attach_volume
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] self._disconnect_volume(connection_info, disk_dev)
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] six.reraise(self.type_, self.value, self.tb)
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1092, in attach_volume
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] virt_dom.attachDeviceFlags(conf.to_xml(), flags)
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] result = proxy_call(self._autowrap, f, *args, **kwargs)
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] rv = execute(f, *args, **kwargs)
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] six.reraise(c, e, tb)
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] rv = meth(*args, **kwargs)
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] File "/opt/stack/venv/nova-20150831T151915Z/lib/python2.7/site-packages/libvirt.py", line 528, in attachDeviceFlags
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165] libvirtError: internal error: unable to execute QEMU command 'device_add': Property 'virtio-blk-device.drive' can't find value 'drive-virtio-disk1'
2015-09-24 11:32:31.926 13083 TRACE nova.virt.block_device [instance: 3aa05494-88ef-44c3-a7ad-705437b5f165]

This is the VM's libvirt XML. The root disk is created in the "ceph1" cluster.

<--xml-->
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <auth username='cinder'>
        <secret type='ceph' uuid='457eb676-33da-42ec-9a8c-9293d545c337'/>
      </auth>
      <source protocol='rbd' name='volumes/def50421-13d9-4bbd-ad93-4f95b9d38bf6_disk'>
        <host name='10.40.100.36' port='6789'/>
        <host name='10.40.100.37' port='6789'/>
        <host name='10.40.100.38' port='6789'/>
      </source>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>

In my environment, however, I have two Ceph clusters configured as Cinder multi-backends, and the volume is created in the "ssdceph" cluster.

<--cinder config for multi backends-->
# Configure the enabled backends
enabled_backends= ceph1,ssdceph

[ssdceph]
rbd_max_clone_depth = 5
rbd_flatten_volume_from_snapshot = False
rbd_user = admin
rbd_pool = rbd
rbd_ceph_conf = /etc/ceph/ssd/ceph.conf
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ssdceph
rbd_secret_uuid = 7b16f0cb-1276-4ba5-a7c5-74e2bb06d836

[ceph1]
rbd_max_clone_depth = 5
rbd_flatten_volume_from_snapshot = False
rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
rbd_user = cinder
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf

backup_driver = cinder.backup.drivers.ceph
backup_ceph_conf = /etc/ceph/ceph.conf
backup_ceph_user = cinder-backup
backup_ceph_chunk_size = 134217728
backup_ceph_pool = backups
backup_ceph_stripe_unit = 0
backup_ceph_stripe_count = 0
restore_discard_excess_bytes = true
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph
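
For reference, the nova.conf [libvirt] settings on the compute node are presumably along these lines (inferred from the generated attach XML above, which carries the ceph1 user and secret; the actual nova.conf is not part of this report):

<--nova.conf [libvirt] (inferred, illustrative)-->
images_type = rbd
images_rbd_pool = volumes
images_rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337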

3. Reproduce steps:
(1) Create VM "X" on Ceph cluster "A"; its root disk is created in cluster "A".
(2) Create volume "Y" on Ceph cluster "B" using Cinder multi-backends.
(3) Attach volume "Y" to "X"; libvirt fails to attach the volume (see the command sketch below).

Expected result: the volume is attached and its status is "in-use".
Actual result: libvirt fails to attach the volume and the volume status returns to "available".
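
A minimal command-line sketch of these steps (illustrative only; the volume type name "ssdceph" and the image, flavor and size placeholders are not from this report, and it assumes a Cinder volume type mapped to the second backend via volume_backend_name):

<--reproduce sketch (illustrative)-->
# Volume type pointing at the "B" (ssdceph) backend
cinder type-create ssdceph
cinder type-key ssdceph set volume_backend_name=ssdceph

# VM "X" boots with its root disk in the "A" (ceph1) cluster
nova boot --image <image> --flavor <flavor> X

# Volume "Y" lands in the "B" (ssdceph) cluster
cinder create --volume-type ssdceph --display-name Y 10

# Attaching "Y" to "X" fails as shown in the log above
nova volume-attach X <volume-id-of-Y> /dev/vdb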

nova-compute tries to attach the volume using the secret_uuid and username parameters from nova.conf. If the volume's backend uses a different secret_uuid and username than those in nova.conf, libvirt cannot find the volume in the correct Ceph cluster and the error above occurs.

If the backend of the volume being attached is Ceph, the volume's user and secret_uuid should come from the connection info provided by Cinder, not from nova.conf.
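
A minimal sketch of the behaviour being asked for here, not the actual Nova code (the function and attribute names below are simplified placeholders): prefer the auth details Cinder supplies in connection_info, and fall back to the nova.conf values only when Cinder sends none.

<--sketch (illustrative, not Nova's implementation)-->
    def choose_rbd_auth(disk_conf, connection_info, conf_rbd_user, conf_rbd_secret_uuid):
        # disk_conf stands in for the libvirt disk config object;
        # conf_rbd_user / conf_rbd_secret_uuid stand in for nova.conf [libvirt] values.
        data = connection_info.get('data', {})
        # Cinder's RBD driver includes these fields when auth is enabled on its backend.
        disk_conf.auth_username = data.get('auth_username') or conf_rbd_user
        disk_conf.auth_secret_type = 'ceph'
        disk_conf.auth_secret_uuid = data.get('secret_uuid') or conf_rbd_secret_uuid
        return disk_conf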

Tags: ceph libvirt
Jay Lee (hyangii)
Changed in nova:
assignee: nobody → Jay Lee (hyangii)
Changed in nova:
status: New → Confirmed
importance: Undecided → Low
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/318405

Changed in nova:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Sean Dague (<email address hidden>) on branch: master
Review: https://review.openstack.org/318405
Reason: This in no way looks like a valid bug fix. Please don't upload no-op changes without a description in the commit message explaining why.

Revision history for this message
Sujitha (sujitha-neti) wrote :

It's been a long time since this bug was assigned and there has been no progress. The last patch submitted was invalid and abandoned. Removing the assignee.

Jay Lee: Feel free to add yourself as assignee and push a patch for it.

Changed in nova:
assignee: Jay Lee (hyangii) → nobody
status: In Progress → Confirmed
Revision history for this message
Augustina Ragwitz (auggy) wrote :

It looks like the version this bug was reported against is Kilo, which is at end of life. Has anyone confirmed whether this is an issue in Mitaka or master?

Changed in nova:
status: Confirmed → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired
Revision history for this message
Kevin Zhao (kevin-zhao) wrote :

In Mitaka, when I use nova to attach a volume to an instance (with Ceph), I get this error:
2016-08-22 03:30:04.879 DEBUG nova.virt.libvirt.guest [req-af1bf2bd-5346-4b56-b68f-e22b44e32715 demo demo] attach device xml: <disk type="network" device="disk">
  <driver name="qemu" type="raw" cache="none"/>
  <source protocol="rbd" name="volumes/volume-aa3152f6-db7b-4893-8c5f-ff05de3ac36e">
    <host name="10.20.100.3" port="6789"/>
    <host name="10.20.100.4" port="6789"/>
    <host name="10.20.100.5" port="6789"/>
  </source>
  <auth username="nova">
    <secret type="ceph" uuid="06214a0d-dc30-4e9b-856b-c8f0c0e63d9d"/>
  </auth>
  <target bus="scsi" dev="sdb"/>
  <serial>aa3152f6-db7b-4893-8c5f-ff05de3ac36e</serial>
</disk>

2016-08-22 03:30:04.946 ERROR nova.virt.libvirt.driver [req-af1bf2bd-5346-4b56-b68f-e22b44e32715 demo demo] [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] Failed to attach volume at mountpoint: /dev/sdb
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] Traceback (most recent call last):
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] File "/srv/nova/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1225, in attach_volume
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] guest.attach_device(conf, persistent=True, live=live)
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] File "/srv/nova/local/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 296, in attach_device
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] self._domain.attachDeviceFlags(device_xml, flags=flags)
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] File "/srv/nova/local/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] File "/srv/nova/local/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] rv = execute(f, *args, **kwargs)
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] File "/srv/nova/local/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] six.reraise(c, e, tb)
2016-08-22 03:30:04.946 TRACE nova.virt.libvirt.driver [instance: bfa9db55-e55b-4aaa-9dab-7d2ebf85c009] ...


Changed in nova:
status: Expired → Confirmed
assignee: nobody → Kevin Zhao (kevin-zhao)
Revision history for this message
Shorton (shorton3) wrote :

I am seeing this bug in Mitaka on Ubuntu. I am trying to get Magnum containers working, but it's broken because Magnum is trying to attach my Ceph-backed Cinder volume to a Nova instance. Is there a possible short-term fix? Thanks!

Revision history for this message
Michael Gugino (gugino-michael) wrote :

This issue also affects my org.

This looks to be addressed (in master, at the time of writing the Ocata cycle) here:
https://github.com/openstack/nova/commit/b89efa3ef611a1932df0c2d6e6f30315b5111a57

Looks like a good candidate for backport.

Revision history for this message
Jacolex (jacolex) wrote :

Hello
I have the same issue (Mitaka). nova-compute is passing rbd_user and rbd_secret_uuid from nova.conf to libvirt, overwriting the user and secret passed by the cinder-volume service when attaching a volume from another Ceph storage backend.
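
For clarity, the connection info that cinder-volume hands to Nova for an RBD volume carries the backend's own auth fields, roughly as below (field names per the Cinder RBD driver of this era; values taken from the ssdceph backend config and the generated XML in the bug description). It is these auth fields that end up overridden by the nova.conf values:

<--connection_info (illustrative)-->
    {'driver_volume_type': 'rbd',
     'data': {'name': 'rbd/volume-727c5319-1926-44ac-ba52-de55485faf2b',
              'hosts': ['10.40.100.115', '10.40.100.116', '10.40.100.119'],
              'ports': ['6789', '6789', '6789'],
              'auth_enabled': True,
              'auth_username': 'admin',
              'secret_type': 'ceph',
              'secret_uuid': '7b16f0cb-1276-4ba5-a7c5-74e2bb06d836'}}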

Is there any solution implemented already?

Revision history for this message
melanie witt (melwitt) wrote :

As mentioned in comment 8, I think this got fixed by https://review.openstack.org/#/c/389399 in Ocata (15.0.0.0b2).

As for a backport, the best way is to ask in #openstack-nova on IRC, in a Nova meeting, or on the openstack-dev mailing list.

Changed in nova:
assignee: Kevin Zhao (kevin-zhao) → Matt Riedemann (mriedem)
status: Confirmed → Fix Released