live migration fails with xenapi virt driver and SRs with old-style naming convention

Bug #1566622 reported by Corey Wright
16
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Confirmed
High
Unassigned

Bug Description

version: commit ce5a2fb419f999bec0fb2c67413387c8b67a691a

1. create a boot-from-volume instance prior to deploying commit 5bd222e8d854ca7f03ee6936454ee57e0d6e1a78
2. upgrade nova to commit 5bd222e8d854ca7f03ee6936454ee57e0d6e1a78
3. live-migrate instance
4. observe live-migrate action fail

based on my analysis of logs and code:
1. destination uses new-style SR naming convention in sr_uuid_map.
2. source tries to use new-style SR naming convention in talking to XenAPI (in nova.virt.xenapi.vmops.py:VMOps.live_migrate() -> _call_live_migrate_command())
3. xenapi throws XenAPI.Failure exception because it "Got exception UUID_INVALID" because it only knows the SR by the old-style naming convention

example destination nova-compute, source nova-compute, and xenapi logs from a live-migrate request to follow.

Revision history for this message
Corey Wright (coreywright) wrote :
tags: added: live-migration xen
removed: live-migrate nova xenapi
Bob Ball (bob-ball)
tags: added: xenapi
removed: xen
tags: added: xenserver
removed: xenapi
Revision history for this message
John Garbutt (johngarbutt) wrote :

Does this mean we need to make sure that when we generate the sr_map_uuid on the source host, we need to convert the destination uuid to the new format, else we hit this bug?

Revision history for this message
Corey Wright (coreywright) wrote :

@johngarbutt: i don't yet know a solution (ie still researching), but with the current code, if the SR uuid in the sr_uuid_map communicated from the destination to the host differs (ie new-style vs old-style naming convention), then when _call_live_migrate_command() tries to get the SR reference by calling xenapi's SR.get_by_uuid with the mismatching destination-provided SR UUID, then the xenapi call fails and the whole migration with it.

the root cause is this assumption in the code: "Source and destination SRs have the same UUID, so get the reference for the local SR [by using the destination's SR UUID]" (see nova/virt/xenapi/vmops.py:VMOps._call_live_migrate_command()).

Revision history for this message
Corey Wright (coreywright) wrote :

brainstorming potential solutions:
1. expand the sr_uuid_map to include both old and new style naming conventions (or create a shadow copy which has only old-style names) so that when lookup fails in _call_live_migrate_command() on one, it fallsback and tries the other
2. provide enough information to _call_live_migrate_command() so that if lookup fails, then it can fall back and recreate the old-style uuid and try it
3. ...

Revision history for this message
Bob Ball (bob-ball) wrote :

I'm worried this may be a bigger problem than described above, and it may not be fixable without reverting the SR naming: https://github.com/xapi-project/xen-api/blob/master/ocaml/xapi/xapi_vm_migrate.ml#L481

Here XAPI explicitly checks the SR uuid on source and destination being the same before it decides that it does not have to mirror; i.e. the VDI UUIDs matching is not sufficient.
If the UUIDs of the SRs differ between source and destination then XAPI will attempt to copy the VDI during the migrate phase, which will be reading from the remote SR from the source host and writing back to the same remote SR from the destination host. This is clearly not suitable.

I think that perhaps the only fix would be to specify old style when connecting the SR on the destination host.

Revision history for this message
Bob Ball (bob-ball) wrote :

Perhaps the change was too ambitious and we should add a config option to switch between the styles globally. It could default to the current Mitaka behaviour but enable upgrades by still using the old style where there are existing VMs.

Matt Riedemann (mriedem)
Changed in nova:
status: New → Confirmed
Revision history for this message
Corey Wright (coreywright) wrote :
Download full text (4.3 KiB)

live migrations are broken for any deployer using xen with pre-mitaka instances, so a configuration option is insufficient (as i see it).

the requirement: maintain the SR and VDI UUIDs between source and destination during live-migration

* VDI UUID naming convention appears to be unchanged.
* SR UUID naming convention changed:
** old-style: @"FA15E-D15C-<volume id>"@
** new-style: @uuid.uuid5(uuid.UUID("3cca4135-a809-5bb3-af62-275fbfe87178"), "<target>/<port>/<iqn>")@

so the derived requirements are:

# determine whether the source has an old or new-style named SR
# pass that determination to the destination for use when it creates its SR name

because this is all specific to the XenAPI virt driver and the destination creates the SR name in @pre_live_migration()@, the determination will have to be before then in @check_can_live_migrate_source()@ (ie the only place where the source's XenAPI virt driver is called beforehand). this is somewhat of a hack because ideally this would be done in @pre_live_migration@ on the source, which doesn't exist (@pre_live_migration()@ is only done on the destination) and this isn't really "checking if the source can live-migrate" (ie @check_can_live_migrate_source()@).

to pass this naming convention determination from source to destination, a determination that has to be done for all volumes attached to an instance (because, correct me if i'm wrong, but the instance could be created pre-mitaka with an old-style-SR volume, but then an additional volume added in mitaka with a new-style SR volume) so a @sr_uuid_naming_convention_map@ mirroring @block_device_mapping@ needs to be added to @XenapiLiveMigrateData@, bumping it's version to 1.0, and adding @obj_make_compatible()@ to it (handling the new @sr_uuid_naming_convention_map@ attribute by dropping it like how @LibvirtLiveMigrateData@ handles @target_connect_addr@ which was added in its version bump from 1.0 to 1.1).

so @sr_uuid_naming_convention_map@ will be passed from source to destination in XenapiLiveMigrateData and the destination can pass it down from @pre_live_migration()@ to @parse_sr_info()@ by way of adding it to @block_device_info['block_device_mapping'][*]['connection_info']['data']@ or by modifying all 5 interfaces called on the way down to @parse_sr_info()@ to pass it separate from block_device_info data.

still need to think about the source side and how exactly to mechanize determining the naming convention style. (well, obviously look at the SR uuids and if they start with "FA15E-D15C-" then it's old style.)

supporting documentation:

live-migration call-tree

# destination:nova/compute/manager.py:ComputeManager.check_can_live_migrate_destination()
## destination:nova/compute/manager.py:ComputeManager._do_check_can_live_migrate_destination()
### nova/virt/xenapi/driver.py:XenAPIDriver.check_can_live_migrate_destination()
#### nova/virt/xenapi/vmops.py:VMOps.check_can_live_migrate_destination()
### source:nova/compute/manager.py:ComputeManager.check_can_live_migrate_source()
#### nova/virt/xenapi/driver.py:XenAPIDriver.check_can_live_migrate_source()
##### nova/virt/xenapi/vmops.py:VMOps.check_can_live_migrate_source()
## nova/virt/xenap...

Read more...

Revision history for this message
Bob Ball (bob-ball) wrote :

If we're going to pass information from the source host to the destination _before_ we map the block device, can't we just pass the expected UUID and bypass the calculation entirely?

volume_utils.parse_sr_info will honor the sr_uuid if it is present in the connection_data for the block device.
Can we inject the UUID from the source into the connection_data when passing it to the dest?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/303348

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by John Garbutt (<email address hidden>) on branch: master
Review: https://review.openstack.org/303348
Reason: just for a discussion.

Revision history for this message
Corey Wright (coreywright) wrote :

@bob-ball

yes, we can pass a volume-SR mapping from the source to the destination... *if* i can figure out how to best determine the volume from the SR. (me limiting my thinking to passing the SR UUID *naming convention*, instead of the SR UUID *itself*, was just my daftness.)

i haven't found an existing function/method in the XenAPI virt driver that takes a SR and returns a volume. i investigated the output of the xe command (eg "xe sr-list") on the dom0 of a host with Nova-spawned boot-from-volume instances, and the closest i can get is the volume is encoded in the sr-associated iSCSI IQN as the storage target name (but reverse engineering it from there feels so dirty). is there a xen-api function that supports SR-to-volume lookups but doesn't have a "wrapper" implemented in the XenAPI virt driver?

i can always brute force the SR-to-volume mapping by working it backwards: calculating SRs names from volumes (as originally done when creating the SR), except i might have to calculate twice as many SR names as needed, because each volume can result in two different SR names (old and new style). that process is doable, but feels brutish, where something more clever is definitely preferred (if possible).

Revision history for this message
Bob Ball (bob-ball) wrote :

As far as I know I don't think there is a key we can use. I see that the attach_volume / detach_volume also take a mountpoint, which is used as the unique key - is that anywhere in the block device info? In which case, it would be good to use the same logic as detach.

I would think that the existing method to calculate the name (new style first, fall back to old style) would be fine as a fall back.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/307541

Changed in nova:
assignee: nobody → Corey Wright (coreywright)
status: Confirmed → In Progress
Bob Ball (bob-ball)
Changed in nova:
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Michael Still (<email address hidden>) on branch: master
Review: https://review.openstack.org/307541
Reason: This code hasn't been updated in a long time, and is in merge conflict. I am going to abandon this review, but feel free to restore it if you're still working on this.

Revision history for this message
Paul Murray (pmurray) wrote :

Unassigned as the patch has been abandoned and there has been no other progress for 5 months. Feel free to take back if there is work being done.

Changed in nova:
status: In Progress → Confirmed
assignee: Corey Wright (coreywright) → nobody
affects: nova → ilh-facebook
Revision history for this message
Takashi Kajinami (kajinamit) wrote :

I don't understand what can be a reason to change the affected project. Please describe it in case the change was appropriate and intentional.

affects: ilh-facebook → nova
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.