Retyping with migrate to new backend not renaming volume on backend device

Bug #1450649 reported by Sean McGinnis
This bug affects 1 person
Affects: Cinder
Status: Fix Released
Importance: High
Assigned to: Vincent Hou

Bug Description

When doing a retype, volumes are being lost. Cinder calls into the driver to have a new volume created on the second backend. It then copies the data over from the first backend to the new LUN on the second backend. After the migration of the data completes, the original LUN on the first backend is deleted.

The problem is that the new volume is given a new ID, but the cinder database keeps the original volume ID. There is a missing step where cinder should instruct the driver to update the new volume's ID so that it matches that of the old, deleted volume.

Subsequent calls into the driver to manipulate the volume fail because cinder still has the old ID but the array only has the new ID.
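
The failure sequence described above can be sketched with a toy model. This is only an illustration, not cinder code: `ToyBackend` and `retype_with_migration` are hypothetical names, and the backend is assumed to name its volumes by the cinder volume ID.

```python
import uuid


class ToyBackend:
    """Hypothetical array that names volumes by the cinder volume ID."""

    def __init__(self):
        self.volumes = {}

    def create(self, volume_id):
        self.volumes[volume_id] = bytearray()

    def delete(self, volume_id):
        del self.volumes[volume_id]

    def write(self, volume_id, data):
        # Raises KeyError if the array has no volume under this name.
        self.volumes[volume_id] += data


def retype_with_migration(src, dst, volume_id):
    """Copy a volume to a second backend, then delete the source copy."""
    new_id = str(uuid.uuid4())  # the destination volume gets a new ID
    dst.create(new_id)
    dst.write(new_id, bytes(src.volumes[volume_id]))
    src.delete(volume_id)
    # Missing step: nothing tells dst to rename new_id to volume_id.
    return new_id


src, dst = ToyBackend(), ToyBackend()
volume_id = str(uuid.uuid4())
src.create(volume_id)
src.write(volume_id, b"data")
new_id = retype_with_migration(src, dst, volume_id)

try:
    dst.write(volume_id, b"more")  # the caller still uses the original ID
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

In this sketch the data survives on the second backend under `new_id`, but any operation addressed by the original ID fails, which is how the volume appears to be lost.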

<jdg>
Just a little clarification:
The issue here is that some backend devices use the volume UUID from Cinder at create time to create their internal mapping. The migration process however does a UUID swap in the DB but doesn't actively send an update request or notification to the backend driver to tell it that it also needs to update its internal mapping information WRT the volume.

The suggestion is to add a general "update_volume_info" method, or something along those lines, so that drivers can be informed that they have to change something on the backend (i.e. update_volume_info(notification=id_change, 'old_id: xxx, new_id: xxx')).

maybe a generalized method like that is more trouble than it's worth, but I can see other places where a notification like this would be handy.
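
A minimal sketch of what such a hook might look like, assuming a driver that keeps an ID-to-LUN mapping. Everything here is hypothetical: `ToyDriver`, the keyword arguments, and the boolean return value are illustration only and do not correspond to an existing cinder interface.

```python
class ToyDriver:
    """Hypothetical driver keeping a cinder-ID-to-LUN mapping."""

    def __init__(self):
        self.luns = {}  # cinder volume ID -> backend LUN name

    def update_volume_info(self, notification, **info):
        # Hypothetical hook; name and arguments only follow the suggestion.
        if notification == "id_change":
            # Re-key the internal mapping from the old ID to the new one.
            self.luns[info["new_id"]] = self.luns.pop(info["old_id"])
            return True
        return False  # unknown notifications are ignored


driver = ToyDriver()
driver.luns["aaa"] = "lun-7"
handled = driver.update_volume_info("id_change", old_id="aaa", new_id="bbb")
```

A generic notification argument keeps the hook open for other kinds of updates, at the cost of a looser contract than a dedicated method per event.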

Changed in cinder:
status: New → Confirmed
Mike Perez (thingee)
Changed in cinder:
importance: Undecided → High
milestone: none → liberty-1
summary: - Retyping to new backend not renaming volume on array
+ Retyping with migrate to new backend not renaming volume on backend
+ device
description: updated
Changed in cinder:
status: Confirmed → Triaged
Vincent Hou (houshengbo) wrote :

I will take this issue into account for the migration improvement (https://etherpad.openstack.org/p/volume-migration-improvement) for the L release. Look forward to meeting you in Vancouver for discussion.

Changed in cinder:
assignee: nobody → Vincent Hou (houshengbo)
Vincent Hou (houshengbo) wrote :

Sean, what kind of back-ends did you use?

I have done some tests with LVM and Storwize V7000.
When I retyped a volume from LVM to Storwize V7000, the database record ended up as follows:
+--------------------------------------+--------------------------------------+
| id                                   | _name_id                             |
+--------------------------------------+--------------------------------------+
| 90f8ed5c-177f-4eee-acd7-1094ba786077 | 02d59537-7227-4a97-822f-a9510aca25a9 |
+--------------------------------------+--------------------------------------+

_name_id is used to map the cinder volume id to the back-end volume id. The two can certainly be different. If the two volume IDs are the same, _name_id will be NULL. I guess this issue has already been taken care of in the retype code. I can also delete the newly created volume using the old volume id, which is the expected behavior.

I need folks to help me with more tests. Try retyping between two other back-ends and report back your results.
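
The mapping described above can be sketched as follows. This is a toy stand-in, not cinder's actual model class: `VolumeRecord` is a hypothetical name, and the "volume-<uuid>" naming template is an assumption.

```python
class VolumeRecord:
    """Toy stand-in for a volume row; not cinder's actual model class."""

    def __init__(self, id, _name_id=None):
        self.id = id
        self._name_id = _name_id

    @property
    def name_id(self):
        # Fall back to the cinder ID when no separate back-end ID is stored.
        return self._name_id or self.id

    @property
    def name(self):
        # Assumed "volume-<uuid>" naming template.
        return "volume-%s" % self.name_id


plain = VolumeRecord("90f8ed5c-177f-4eee-acd7-1094ba786077")
migrated = VolumeRecord(
    "90f8ed5c-177f-4eee-acd7-1094ba786077",
    "02d59537-7227-4a97-822f-a9510aca25a9",
)
```

Here `migrated.id` stays the ID that users see, while `migrated.name` resolves to the volume that exists on the back-end after the migration. A driver that builds names from `id` directly instead of `name_id` would miss the migrated volume.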

Sean McGinnis (sean-mcginnis) wrote :

Thanks Vincent. I think John identified the root issue. This probably only impacts those drivers that use the volume ID as the identifier on the backend. So we probably either need to update those drivers to not use ID, or use his suggestion to have an update call to give those drivers a chance to update their internal information to be able to correctly locate the new volume.

I could see having a call like update_volume_info() being useful for other purposes in the future, so my inclination is to go in that direction.

Erlon R. Cruz (sombrafam) wrote :

Hi Sean, on what backend did you reproduce that? I tried to reproduce it with the Hitachi HUS driver[1]. As John said, the problem should happen when the backend uses the ID to identify the volume in the array, so, as expected (the HUS driver stores the volume identity in the metadata), it worked well.

But when I did the same with our NFS and iSCSI backends[2], which use the ID to name the volume in the array, the retype also worked with no problems. After retyping I can access and use the volumes in the new backends.

What I think is happening is that cinder is using the name_id to access the volume, and the name_id is always updated to the volume name present in the ID.

[1] http://paste.openstack.org/show/215539/
[2] http://paste.openstack.org/show/215618/

John Griffith (john-griffith) wrote :

@Erlon
Keep in mind the uuid swap only occurs in the case of a migrate. If you don't do a migrate as part of the retype I wouldn't expect any sort of problem. Not sure, but maybe that's what you did (hus-iscsi-1 and hus-iscsi-2 being two separate physical backends?).

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/180873

Changed in cinder:
status: Triaged → In Progress
Vincent Hou (houshengbo) wrote :

The current migration code has already provided a chance for different drivers to implement the method

def update_migrated_volume(self, ctxt, volume, backend_volume):

to do some necessary updates after the migration is done.

I have provided a patch that implements this method for the LVM and Storwize drivers by adding code to rename the back-end volume.
I think other storage back-ends can implement it in a similar way.
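
A minimal sketch of such a rename, using the method signature quoted above. The rest is assumed for illustration: `ToyRenamingDriver` is hypothetical, a dict stands in for the array, the "volume-<uuid>" names are an assumption, and so is the convention of returning a dict with the `_name_id` value to store.

```python
class ToyRenamingDriver:
    """Hypothetical driver; the dict stands in for volumes on the array."""

    def __init__(self):
        self.backend = {}  # back-end volume name -> data

    def update_migrated_volume(self, ctxt, volume, backend_volume):
        old_name = "volume-%s" % backend_volume["id"]
        new_name = "volume-%s" % volume["id"]
        if old_name not in self.backend or new_name in self.backend:
            # Rename is not possible: keep pointing at the back-end name.
            return {"_name_id": backend_volume["id"]}
        self.backend[new_name] = self.backend.pop(old_name)
        # Names match again, so no separate back-end ID is needed.
        return {"_name_id": None}


driver = ToyRenamingDriver()
driver.backend["volume-new"] = b"data"
update = driver.update_migrated_volume(None, {"id": "orig"}, {"id": "new"})
```

After a successful rename the back-end volume carries the original ID again, so drivers that locate volumes by ID keep working. When the rename cannot be done, the sketch falls back to recording the back-end ID.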

Erlon R. Cruz (sombrafam) wrote :

@John
In fact those 2 backends are in the same storage, but I configured 2 backends in cinder.conf, and it seemed to me that the migration occurred, as I used a distinct volume_backend_name for each. The operation takes a long time, I can see 'dd' copying the data, and as you can see in [1], after the migration is triggered, cinder list shows 3 volumes, and then, when it is finished, 2, with the same UUIDs as before. HUS does not use the volume ID to name the volume on storage.

In the second example (migrating v3 to hnas-nfs) the migration also happens (the physical backend is the same, but it's another protocol). Both backends use the volume ID as the volume name, and I had no problems performing other operations after the migration.

I am still curious to know on which backends this problem is happening and why it does not happen with HNAS.

Tom Swanson (tom-swanson) wrote :

Dell SC.

I've just implemented https://review.openstack.org/181160 which fixes the issue.

So you can migrate the volume and then migrate it back?

I'm using the volume['id'] as the volume name on my backend. This is my only link between the cinder volume and the backend volume. Is there something else I should have used instead?

Vincent Hou (houshengbo) wrote :

Folks, I have been working on the mismatch issues between the LVM name and the iscsi target name for the migration of attached volumes.

Three patches combine to fix them:
A patch for Nova: https://review.openstack.org/#/c/181818/
A patch for cinderclient: https://review.openstack.org/#/c/181819/
A patch for cinder: https://review.openstack.org/#/c/180873/

This solution can certainly be improved later, but for now it ensures that, after a successful migration, both non-attached and attached volumes keep the names they had before, without the link issue between the volume and the iscsi target.

Mike Perez (thingee)
Changed in cinder:
milestone: liberty-1 → liberty-2
Changed in cinder:
assignee: Vincent Hou (houshengbo) → Peter Penchev (openstack-dev-s)
Changed in cinder:
assignee: Peter Penchev (openstack-dev-s) → Vincent Hou (houshengbo)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (master)

Change abandoned by Vincent Hou (<email address hidden>) on branch: master
Review: https://review.openstack.org/195932

Vincent Hou (houshengbo) wrote :
Changed in cinder:
status: In Progress → Fix Committed
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: liberty-2 → 7.0.0