volume from image fails for Nexenta iSCSI

Bug #1213785 reported by James Clark
This bug affects 3 people
Affects: Cinder
Status: Fix Released
Importance: High
Assigned to: John Griffith

Bug Description

When creating a new iSCSI volume from an image, volume creation fails during copy_image_to_volume (log attached).

This appears to be a problem with the volume creation flow rather than with the Nexenta driver in use. Volumes created without an image source work as expected.

The order of operations in cinder/volume/flows/create_volume.py - CreateVolumeFromSpecTask(CinderTask) is:
  * look up, then call, create_functor (which will be _create_from_image)
  * on success, call create_export

_create_from_image winds its way into copy_image_to_volume in the driver.py base class, which tries to attach the volume being created. This fails because the iSCSI export is not yet created and registered (the call to create_export has not yet happened).

log fragment:

2013-08-18 17:06:24.643 27490 ERROR cinder.volume.flows.create_volume [req-92ff01ef-5c98-443c-88c9-4ceb5e63d9c0 cf456b20c3be423abf1683eee04dd481 700ef723bbd84f5eb50065338758d3e0] Failed to copy image 851e90d1-00b3-4a1d-a7ff-97400dff3050 to volume: 6a31cd71-96ac-4ef0-a705-8b1e9b4cf9a0, error: iscsiadm: Connection to Discovery Address 127.0.1.1 failed
iscsiadm: Login I/O error, failed to receive a PDU
iscsiadm: retrying discovery login to 127.0.1.1

The attach fails because volumes.provider_location is NULL, and no target exists yet anyway.
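
The ordering problem described above can be sketched in a few lines of Python. This is a minimal illustration, not Cinder code: SeparateExportDriver, ExportNotReady, create_from_image_then_export, the dict-based volume, and the 192.0.2.10 target string are all hypothetical stand-ins. The only thing taken from the report is the order of calls (create functor first, create_export second) and the fact that the attach needs provider_location.

```python
class ExportNotReady(Exception):
    """Raised when an attach is attempted before an export exists."""


class SeparateExportDriver:
    """Hypothetical driver where create and export are distinct steps."""

    def create_volume(self, volume):
        # Allocates storage only; returns no model update.
        return None

    def create_export(self, volume):
        # Creates the target and reports where to find it (made-up values).
        return {"provider_location":
                "192.0.2.10:3260,1 iqn.2013-08.example:" + volume["id"]}

    def copy_image_to_volume(self, volume, image_id):
        # Attaching needs provider_location, which only create_export() sets.
        if not volume.get("provider_location"):
            raise ExportNotReady("no iSCSI target for volume " + volume["id"])
        volume["image_id"] = image_id


def create_from_image_then_export(driver, volume, image_id):
    """Mirror the reported ordering: create functor first, export second."""
    volume.update(driver.create_volume(volume) or {})
    driver.copy_image_to_volume(volume, image_id)  # attach happens here
    volume.update(driver.create_export(volume) or {})  # too late
    return volume


volume = {"id": "vol-1", "provider_location": None}
try:
    create_from_image_then_export(SeparateExportDriver(), volume, "img-1")
except ExportNotReady as exc:
    print("copy failed:", exc)
```

In this sketch the copy step raises before create_export is ever reached, which matches the symptom in the log: provider_location is still empty at attach time.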

James Clark (jamiec) wrote :
John Griffith (john-griffith) wrote :

I believe this is a problem in the Nexenta driver and not in the task-flow (or at least an incompatibility between the two). It appears that your discovery address is not being set correctly. I wonder if it's possible that your model update or export methods are wrong?

Have you tried attaching a nexenta volume to an instance and compared the provider info?

I've run a number of tests with LVM and SolidFire and have not seen any issues, leading me to believe there's something specific to the Nexenta driver.

John Griffith (john-griffith) wrote :

The Nexenta create method doesn't set provider info.

summary: - volume from image fails for iSCSI
+ volume from image fails for Nexenta iSCSI
tags: added: nexenta
tags: added: drivers
James Clark (jamiec) wrote :

There are separate driver API calls for create_volume() and create_export().

The SolidFire driver effectively does a create_export inside the create_volume call. Compare the last 10 or so lines of _get_model_info with _do_export in solidfire.py and you'll see. The create_export method appears redundant (duplicated code).

In the Nexenta driver create_volume() and create_export() do exactly what they say. You must call them both before you can attach.

At the very least, the API contract/expectations for these driver calls need to be explicitly defined somewhere before you can say which is wrong. Should create_volume implicitly export the volume as well? If so, then what is the purpose of create_export()?

Logically, Nexenta looks correct here.

It still looks like the _create_from_image flow is broken. It tries to connect to a volume that has not yet been exported. That it works with SolidFire is a side effect of the SolidFire driver choosing to make create_export() unnecessary and redundant.
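
The two driver styles contrasted in this comment can be shown side by side. This is a hedged sketch with invented names (ExportOnCreateDriver, SeparateExportDriver, make_location, attachable_after_create) and a made-up target string; it does not reproduce the SolidFire or Nexenta drivers, only the behavioral difference the comment describes.

```python
def make_location(volume):
    # Placeholder target string; the address and IQN are made up.
    return "192.0.2.10:3260,1 iqn.2013-08.example:" + volume["id"]


class ExportOnCreateDriver:
    """Style attributed to SolidFire: create_volume() also exports."""

    def create_volume(self, volume):
        return {"provider_location": make_location(volume)}

    def create_export(self, volume):
        # Returns the same information again, hence "redundant".
        return {"provider_location": make_location(volume)}


class SeparateExportDriver:
    """Style attributed to Nexenta: each call does exactly one thing."""

    def create_volume(self, volume):
        return None

    def create_export(self, volume):
        return {"provider_location": make_location(volume)}


def attachable_after_create(driver):
    """Report whether a volume has provider info right after create_volume()."""
    volume = {"id": "vol-1"}
    volume.update(driver.create_volume(volume) or {})
    return "provider_location" in volume


print(attachable_after_create(ExportOnCreateDriver()))  # True
print(attachable_after_create(SeparateExportDriver()))  # False
```

Only the first style leaves the volume attachable before create_export() runs, which is why a flow that attaches during creation works with one driver and fails with the other.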

John Griffith (john-griffith) wrote :

Hmm... well, look at the code from the past year:

Grizzly:
https://github.com/openstack/cinder/blob/stable/grizzly/cinder/volume/manager.py#L168

Folsom:
https://github.com/openstack/cinder/blob/stable/folsom/cinder/volume/manager.py#L149

Essex was the same way. My point is that if the manager is expecting the model_update (which, yes, is the same as the export), it's probably correct to return the model update. Also, the task-flow code does the same thing that the old manager code has been doing for quite some time, so I certainly wouldn't say it's a bug in the task-flow code.

Finally... a percentage of drivers failed to implement this in the past, but it was never an issue because we didn't do things like general iSCSI attaches during create operations, so nobody cared. Also, for quite a while we explicitly called export at the end of the create method (even though this seems a bit redundant). Now, however, we actually rely on attach before create finishes, and drivers that didn't implement what the manager expects (such as Nexenta in this case, as well as a number of others) fail.

The fact is there was some sloppiness here. Personally, I don't see the objection to the driver setting up the export at create time; why have an unnecessary second call back out to the driver?

That being said, for now, rather than trying to fix the drivers that don't return the expected information on create, it would be better to add the explicit export call if the model update isn't present. I'll look at submitting a fix for that tomorrow.

Thanks!
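
The fallback proposed in the comment above (call the export explicitly when create returns no model update) might look roughly like the following. This is only a sketch of the idea under stated assumptions, not the fix that was eventually committed: RecordingDriver, create_volume_with_export_fallback, the dict-based volume, and the target string are all hypothetical.

```python
class RecordingDriver:
    """Hypothetical driver with separate create/export; records call order."""

    def __init__(self):
        self.calls = []

    def create_volume(self, volume):
        self.calls.append("create_volume")
        return None  # no model update, as reported for Nexenta

    def create_export(self, volume):
        self.calls.append("create_export")
        # Made-up target address and IQN.
        return {"provider_location":
                "192.0.2.10:3260,1 iqn.2013-08.example:" + volume["id"]}

    def copy_image_to_volume(self, volume, image_id):
        self.calls.append("copy_image_to_volume")
        if not volume.get("provider_location"):
            raise RuntimeError("volume is not exported yet")
        volume["image_id"] = image_id


def create_volume_with_export_fallback(driver, volume, image_id):
    """Export explicitly when create_volume() gives no provider info."""
    model_update = driver.create_volume(volume) or {}
    if not model_update.get("provider_location"):
        # Fallback: the driver did not export on create, so ask it to now.
        model_update.update(driver.create_export(volume) or {})
    volume.update(model_update)
    driver.copy_image_to_volume(volume, image_id)
    return volume


driver = RecordingDriver()
result = create_volume_with_export_fallback(driver, {"id": "vol-1"}, "img-1")
print(driver.calls)
print(result["provider_location"])
```

With this ordering the export exists before the attach is attempted, and drivers that already return provider info from create_volume() skip the extra call.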

Changed in cinder:
status: New → Triaged
importance: Undecided → High
milestone: none → havana-3
assignee: nobody → John Griffith (john-griffith)
James Clark (jamiec) wrote :

Just an update on this. Victor Rodionov has a patch in review (Change I9dc72509) which addresses this for the Nexenta case by overriding the base class copy_image_to_volume. There are several drivers that take this approach since there was originally no base class implementation.

John Griffith (john-griffith) wrote :
Changed in cinder:
status: Triaged → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: havana-3 → 2013.2