copy_image_to_volume() doesn't work because there's no connection

Bug #1648972 reported by Turbo Fredriksson
This bug affects 1 person
Affects: Cinder
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

I've forked a Cinder/ZFSonLinux driver and got it working on Mitaka a few months ago.

After a lot of trouble caused by an accidental upgrade to Newton, I decided to scrap everything and start over, this time using Debian GNU/Linux Stretch (instead of Sid), which comes with Newton by default.

But when I try to create an instance with a ZoL root volume, it fails to boot because the volume is not a bootable volume.

Creating a "stand alone" volume works fine, but it's the copying of the image (from Glance) onto the volume that fails.

Looking through the logs and adding some debugging here and there, the cause is that copy_image_to_volume() is called _before_ create_export() and initialize_connection(), so the copy has no destination device (dest='False' in the log).

These are the excerpts from the logs:

Updating volume stats _update_volume_stats /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:231
create_volume(volume-826bed7f-8d5f-488a-9ea7-2cb96e25a267) => 826bed7f-8d5f-488a-9ea7-2cb96e25a267 create_volume /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:207
_find_iscsi_block_device(826bed7f-8d5f-488a-9ea7-2cb96e25a267) _find_iscsi_block_device /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:556
copy_image_to_volume: volume_id='826bed7f-8d5f-488a-9ea7-2cb96e25a267', dest='False' copy_image_to_volume /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:702
create_export(826bed7f-8d5f-488a-9ea7-2cb96e25a267) create_export /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:646
create_export(): Trying to share "share/VirtualMachines/Blade_Center/volume-826bed7f-8d5f-488a-9ea7-2cb96e25a267" create_export /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:649
initialize_connection(826bed7f-8d5f-488a-9ea7-2cb96e25a267) initialize_connection /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:583
_find_target: return iqn.2012-11.com.bayour:share.virtualmachines.blade.center.volume.826bed7f.8d5f.488a.9ea7.2cb96e25a267 _find_target /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:482
initialize_connection: target=iqn.2012-11.com.bayour:share.virtualmachines.blade.center.volume.826bed7f.8d5f.488a.9ea7.2cb96e25a267 initialize_connection /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:591
_login_target(10.0.3.253:3260, iqn.2012-11.com.bayour:share.virtualmachines.blade.center.volume.826bed7f.8d5f.488a.9ea7.2cb96e25a267) _login_target /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:489
_find_iscsi_block_device(826bed7f-8d5f-488a-9ea7-2cb96e25a267) _find_iscsi_block_device /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:556
_find_target: return iqn.2012-11.com.bayour:share.virtualmachines.blade.center.volume.826bed7f.8d5f.488a.9ea7.2cb96e25a267 _find_target /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:482
_find_iscsi_block_device: target=iqn.2012-11.com.bayour:share.virtualmachines.blade.center.volume.826bed7f.8d5f.488a.9ea7.2cb96e25a267 _find_iscsi_block_device /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:562
initialize_connection: block_dev=/dev/sdc initialize_connection /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:601
Updating volume stats _update_volume_stats /usr/lib/python2.7/dist-packages/cinder/volume/drivers/zol.py:231

Because this is a remote storage SAN, create_volume() uses ssh to the storage/SAN to create the volume, create_export() also uses ssh to create an iSCSI target on the remote storage/SAN, and initialize_connection() then uses iscsiadm to log in to the target from the Cinder host (which creates the device that the image copy writes to).

But because all this is done _after_ the attempt to copy the image to the volume, the whole thing fails.
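
To make the ordering problem concrete, here is a minimal, self-contained sketch. It is not the real zol.py driver or Cinder code: the class ToyRemoteDriver, the helpers observed_order() and expected_order(), and the simplified method signatures are all hypothetical. It only models the dependency chain described above (volume, then export, then connection, then copy).

```python
class ToyRemoteDriver:
    """Hypothetical stand-in for a remote-SAN driver (not the real zol.py)."""

    def __init__(self):
        self.volumes = set()   # volumes allocated on the SAN
        self.exports = set()   # volumes that have an iSCSI target
        self.devices = {}      # volume id -> local block device path

    def create_volume(self, vol_id):
        self.volumes.add(vol_id)

    def create_export(self, vol_id):
        if vol_id not in self.volumes:
            raise RuntimeError("no such volume: %s" % vol_id)
        self.exports.add(vol_id)

    def initialize_connection(self, vol_id):
        if vol_id not in self.exports:
            raise RuntimeError("no target for %s" % vol_id)
        # Logging in to the target is what makes a local device appear.
        self.devices[vol_id] = "/dev/sdc"
        return self.devices[vol_id]

    def find_block_device(self, vol_id):
        # Mirrors the dest='False' seen in the log when nothing is attached.
        return self.devices.get(vol_id, False)

    def copy_image_to_volume(self, vol_id, image):
        dest = self.find_block_device(vol_id)
        if not dest:
            raise RuntimeError("no destination device for %s" % vol_id)
        return "wrote %d bytes to %s" % (len(image), dest)


def observed_order(driver, vol_id, image):
    """Call order seen in the log: copy right after create."""
    driver.create_volume(vol_id)
    return driver.copy_image_to_volume(vol_id, image)


def expected_order(driver, vol_id, image):
    """Call order the driver needs: export and connect before copy."""
    driver.create_volume(vol_id)
    driver.create_export(vol_id)
    driver.initialize_connection(vol_id)
    return driver.copy_image_to_volume(vol_id, image)
```

With these definitions, observed_order() raises because no device exists yet, while expected_order() succeeds.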

I'm not sure what changed between Mitaka and Newton or if it's something I have failed to 'port' or configure (my cinder.conf is identical), but I've looked at the lvm.py driver as well as http://docs.openstack.org/developer/cinder/devref/drivers.html and as far as I can tell, I'm doing "The Right Thing (tm)".

The driver can be found at https://github.com/FransUrbo/Openstack-ZFS/blob/master/zol.py.

description: updated
John Griffith (john-griffith) wrote :

Some of the base code for this is a bit confusing and there are different ways to do it. I'm unclear on what changes may have impacted you, but looking at your driver I think I see the problem.

Take a look at the base driver copy_image_to_volume call:
    https://github.com/openstack/cinder/blob/master/cinder/volume/driver.py#L785

IIRC the LVM driver is now truly just LVM; when it actually gets instantiated it picks up the iSCSI class from driver.py and uses that, so you end up using the call above in driver.py.

You can also check some of the other drivers that chose to implement the method themselves. The other option is that you don't need to implement this at all: you can just omit it and pick up the method from the base driver.
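
As a rough illustration of the "omit it and inherit" option, the sketch below uses hypothetical classes (BaseDriver, InheritingDriver, OverridingDriver) with simplified signatures; none of this is the actual Cinder code. It only shows the general Python behaviour at play: a driver that defines its own copy_image_to_volume() bypasses whatever attach step the base class performs, while a driver that omits the method gets that step for free.

```python
class BaseDriver:
    """Hypothetical base class standing in for the shared code in driver.py."""

    def copy_image_to_volume(self, volume_id, image):
        device = self._attach(volume_id)   # base class attaches first
        try:
            return "copied %d bytes to %s" % (len(image), device)
        finally:
            self._detach(volume_id)        # and always cleans up

    def _attach(self, volume_id):
        raise NotImplementedError

    def _detach(self, volume_id):
        raise NotImplementedError


class InheritingDriver(BaseDriver):
    """Implements only the backend-specific parts; the copy is inherited."""

    def _attach(self, volume_id):
        return "/dev/sdc"

    def _detach(self, volume_id):
        pass


class OverridingDriver(InheritingDriver):
    """Overrides the copy and so never runs the base attach step."""

    def copy_image_to_volume(self, volume_id, image):
        device = False                     # nothing attached this volume
        return "copied %d bytes to %s" % (len(image), device)
```

Here InheritingDriver ends up with a real device path, while OverridingDriver reproduces the dest='False' situation from the log.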

This is all sort of confusing and needs to be cleaned up a good bit, but check those things out as I think that's where your problem lies. I would be interested in digging in to the changes that exposed this for you though.

Turbo Fredriksson (turbo-bayour) wrote :

This looks good, but I'm unsure on how to use it.

I can see that it does a volume attach, which in turn first tries to create an export.

But because my SAN is remote and requires a very special command to be run on the remote host, as well as the creation of a specific target, I don't know how, or if, this could solve my problem.

My hack was to do pretty much the same thing in create_volume() - https://github.com/FransUrbo/Openstack-ZFS/blob/master/zol.py#L229-231 - but I guess those lines are better off in copy_image_to_volume(). Is this the correct way to do it?

As in, does copy_image_to_volume() have to initiate and log in to the target? From reading the documentation about the driver API (above), it seems this is/should be automatic. And it was in Mitaka..

From what I remember of porting it to Mitaka, if there was a san/remote variable set, it would first call create_export() and initialize_connection() before it called copy_image_to_volume().

But I'm not sure, it was a while ago :).

In either case, this sounds like a much saner way to do it - let Cinder figure this out, and let the driver do the actual work.

Turbo Fredriksson (turbo-bayour) wrote :

Reading the documentation again, the attach should be done in the copy_*() functions, if anywhere, not in the create function:

  This method is responsible only for storage allocation on the backend. _It should not export
  a LUN or actually make this storage available for use, this is done in a later call_.

but it doesn't say that copy_*() should or must do the attaching.

But "this is done in a later call" kind of indicates that Cinder would call initialize_connection() "where/when appropriate", not that I should do it myself.

I'm slightly confused; better and more detailed information would be appreciated.

John Griffith (john-griffith) wrote :

Don't do it in the create; add it in copy_image_to_volume before the call to fetch the image, and pass in the handle. In other words, do just what the driver.py file does. Or remove it altogether and use the base class method if you can. That's what I do in the SolidFire driver. We should sync up via IRC.
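
The shape of that suggestion might look roughly like the sketch below. It is a simplified illustration, not the code in driver.py: SketchDriver, its log attribute, the fetch_image callable, and the one-argument method signatures are all assumptions made for the example. The point is the ordering (export, connect, fetch, then tear down) and the use of try/finally so the connection and export are released even if the fetch fails.

```python
class SketchDriver:
    """Hypothetical driver showing attach-before-copy inside the copy method."""

    def __init__(self):
        self.log = []  # records the order of operations for illustration

    def create_export(self, volume_id):
        self.log.append("create_export")

    def initialize_connection(self, volume_id):
        self.log.append("initialize_connection")
        return "/dev/sdc"  # assumed local device path after login

    def terminate_connection(self, volume_id):
        self.log.append("terminate_connection")

    def remove_export(self, volume_id):
        self.log.append("remove_export")

    def copy_image_to_volume(self, volume_id, fetch_image):
        # fetch_image is an assumed callable that writes the image
        # to the device path it is given.
        self.create_export(volume_id)
        try:
            device = self.initialize_connection(volume_id)
            try:
                fetch_image(device)
                self.log.append("fetch_image")
            finally:
                self.terminate_connection(volume_id)
        finally:
            self.remove_export(volume_id)
```

Calling copy_image_to_volume() on this sketch records the export and connection steps before the fetch and the teardown steps after it, whether or not the fetch raises.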

Turbo Fredriksson (turbo-bayour) wrote :

That's what I just did. In practice, it's the same thing, but it DOES make more sense to do it in copy_image_to_volume().

But my real question is: Is this the correct way to do it now (from Newton)? As in, the driver should deal with this, not Cinder?

Turbo Fredriksson (turbo-bayour) wrote :

Personally, I see this as both a regression and a bug. The driver should not have to deal with this; Cinder should, with the appropriate config option ("san_is_local = false" was the setting for Mitaka).

Eric Harney (eharney)
Changed in cinder:
status: New → Invalid