live migration with encrypted volume fails

Bug #1633033 reported by Paul Carlton
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Lee Yarwood

Bug Description

When live migrating an instance with an encrypted volume, the encrypted volume fails to detach from the source host and is attached on the target host as an unencrypted volume.

I do see the encrypted volume connector on the source but not on the target.

Source

ls -l /dev/disk/by-path
total 0
lrwxrwxrwx 1 root root 121 Oct 13 10:26 ip-192.168.16.20:3260-iscsi-iqn.2010-10.org.openstack:volume-1a17d61f-7f44-450e-b040-a0baebdb0466-lun-1 -> /dev/mapper/crypt-ip-192.168.16.20:3260-iscsi-iqn.2010-10.org.openstack:volume-1a17d61f-7f44-450e-b040-a0baebdb0466-lun-1
lrwxrwxrwx 1 root root 9 Oct 13 10:36 ip-192.168.16.20:3260-iscsi-iqn.2010-10.org.openstack:volume-76909639-61bc-4abd-9a1e-fd5624bb8fc1-lun-1 -> ../../sdd

Target

ls -l /dev/disk/by-path
total 0
lrwxrwxrwx 1 root root 9 Oct 13 10:48 ip-192.168.16.20:3260-iscsi-iqn.2010-10.org.openstack:volume-1a17d61f-7f44-450e-b040-a0baebdb0466-lun-1 -> ../../sdb
lrwxrwxrwx 1 root root 9 Oct 13 10:48 ip-192.168.16.20:3260-iscsi-iqn.2010-10.org.openstack:volume-76909639-61bc-4abd-9a1e-fd5624bb8fc1-lun-1 -> ../../sdc
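The difference is visible in where the by-path links resolve: on the source the encrypted volume's link points at a /dev/mapper/crypt-* dm-crypt device, while on the target it points at a raw ../../sdX disk. A minimal sketch of that check (`looks_encrypted` is a hypothetical helper for illustration, not Nova code):

```python
def looks_encrypted(symlink_target):
    """Return True when a /dev/disk/by-path link resolves to a dm-crypt
    mapper device rather than a raw SCSI disk (hypothetical helper)."""
    return symlink_target.startswith("/dev/mapper/crypt-")

# Link targets taken from the listings above.
source_target = ("/dev/mapper/crypt-ip-192.168.16.20:3260-iscsi-"
                 "iqn.2010-10.org.openstack:volume-1a17d61f-7f44-450e-"
                 "b040-a0baebdb0466-lun-1")
dest_target = "../../sdb"

print(looks_encrypted(source_target))  # True: dm-crypt mapping is in place
print(looks_encrypted(dest_target))    # False: raw device, no encryption layer
```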

The instance can still access the encrypted volume, but the data disappears when you umount/mount the device, so I guess it looked OK at first due to filesystem caching.

The live migration fails in post-migration on the source due to an error while trying to detach the encrypted volume (see bug https://bugs.launchpad.net/os-brick/+bug/1631318, which I'll close now).

Subsequent attempts to detach the volume from the instance also fail (even after manually updating the instance record to say it is on the target and active; see https://bugs.launchpad.net/nova/+bug/1628606).

Revision history for this message
Lee Yarwood (lyarwood) wrote :

I assume this is reproducible in devstack with the LVM/iSCSI cinder backend but could you just confirm?

Also as discussed on irc I'd be interested in seeing the connection_info captured on the destination host by a call to os-initialize_connection prior to connect_volume being called. Did it include the encrypted flag?
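For reference, checking that flag on the destination could look like this minimal sketch (the connection_info dict below is abridged, hypothetical data in the shape Cinder returns from os-initialize_connection):

```python
# Abridged, hypothetical connection_info as returned by os-initialize_connection.
connection_info = {
    "driver_volume_type": "iscsi",
    "data": {
        "target_portal": "192.168.16.20:3260",
        "encrypted": True,  # the flag in question
    },
}

def volume_is_encrypted(connection_info):
    """Return True when Cinder marked the volume as encrypted."""
    return bool(connection_info.get("data", {}).get("encrypted", False))

print(volume_is_encrypted(connection_info))  # True
```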

Revision history for this message
Paul Carlton (paul-carlton2) wrote :

Yes, it is reproducible in devstack.
The connection_info includes encrypted=True.
I'm trying to get the following fix to work:

        for bdm in block_device_mapping:
            connection_info = bdm['connection_info']
            disk_info = blockinfo.get_info_from_bdm(
                instance, CONF.libvirt.virt_type,
                instance.image_meta, bdm)
            # New code: attach the encryptor on the destination before
            # connecting the volume.
            LOG.debug("bdm: %s", bdm)
            volume_id = connection_info.get('serial')
            data = connection_info.get('data', {})
            if data.get('encrypted', False):
                encryption = encryptors.get_encryption_metadata(
                    context, self._volume_api, volume_id, connection_info)
                if encryption:
                    encryptor = self._get_volume_encryptor(connection_info,
                                                           encryption)
                    encryptor.attach_volume(context, **encryption)
            # end new code
            self._connect_volume(connection_info, disk_info)

But it fails in _get_volume_encryptor because the connection_info['data'] dict has no 'device_path' key.
This seems to be set by connect_volume in volume/iscsi.py; I'm currently trying to figure out how to make that get called first.
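The ordering constraint being described could be expressed roughly like this (a sketch with hypothetical callables, not the actual Nova fix): connect_volume must run first so that connection_info['data']['device_path'] exists by the time the encryptor is built.

```python
def attach_with_encryptor(connection_info, connect_volume, build_encryptor):
    """Sketch: guarantee 'device_path' is populated before the encryptor
    is constructed (connect_volume and build_encryptor are hypothetical)."""
    data = connection_info.setdefault("data", {})
    if "device_path" not in data:
        connect_volume(connection_info)  # expected to set data['device_path']
    return build_encryptor(connection_info)

# Tiny fakes to exercise the ordering.
def fake_connect(ci):
    ci["data"]["device_path"] = "/dev/sdb"

def fake_build(ci):
    return ci["data"]["device_path"]

print(attach_with_encryptor({"data": {}}, fake_connect, fake_build))  # /dev/sdb
```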

tags: added: live-migration
Revision history for this message
Paul Carlton (paul-carlton2) wrote :

Got a solution working for this bug in the HPE Helion 4.0 version of OpenStack (Mitaka based).
However, when I try to get the same fix to work in devstack running off master, I get an error relating to the passphrase when trying to connect the volume on the target in pre-live-migration:

2016-10-21 08:19:28.779 78521 WARNING nova.keymgr.conf_key_mgr [req-09805499-5647-49c2-a3ea-e257ac627391 admin admin] This key manager is insecure and is not recommended for production deployments
2016-10-21 08:19:28.782 78521 DEBUG nova.volume.encryptors [req-09805499-5647-49c2-a3ea-e257ac627391 admin admin] Using volume encryptor '<nova.volume.encryptors.luks.LuksEncryptor object at 0x7f7dec14a5d0>' for connection: {u'driver_volume_type': u'iscsi', 'connector': {'platform': 'x86_64', 'host': 'devstack-hlinux-c2', 'do_local_attach': False, 'ip': '192.168.16.22', 'os_type': 'linux2', 'multipath': False, 'initiator': 'iqn.1993-08.org.debian:01:d6a93bd1597f'}, 'serial': u'9e897b1b-599d-489a-b080-c32a47eab2cd', u'data': {u'access_mode': u'rw', u'target_discovered': False, u'encrypted': True, u'qos_specs': None, u'target_iqn': u'iqn.2010-10.org.openstack:volume-9e897b1b-599d-489a-b080-c32a47eab2cd', u'target_portal': u'192.168.16.20:3260', u'volume_id': u'9e897b1b-599d-489a-b080-c32a47eab2cd', u'target_lun': 1, 'device_path': u'/dev/disk/by-path/ip-192.168.16.20:3260-iscsi-iqn.2010-10.org.openstack:volume-9e897b1b-599d-489a-b080-c32a47eab2cd-lun-1', u'auth_password': u'***', u'auth_username': u'FFAtKjUUbEpUmLaEH6SZ', u'auth_method': u'CHAP'}} get_volume_encryptor /home/pcarlton/openstack/nova/nova/volume/encryptors/__init__.py:57
2016-10-21 08:19:28.782 78521 DEBUG nova.volume.encryptors.luks [req-09805499-5647-49c2-a3ea-e257ac627391 admin admin] opening encrypted volume /dev/sdb _open_volume /home/pcarlton/openstack/nova/nova/volume/encryptors/luks.py:83
2016-10-21 08:19:28.783 78521 DEBUG oslo_concurrency.processutils [req-09805499-5647-49c2-a3ea-e257ac627391 admin admin] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf cryptsetup luksOpen --key-file=- /dev/sdb crypt-ip-192.168.16.20:3260-iscsi-iqn.2010-10.org.openstack:volume-9e897b1b-599d-489a-b080-c32a47eab2cd-lun-1 execute /usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py:344
2016-10-21 08:19:32.030 78521 DEBUG oslo_concurrency.processutils [req-09805499-5647-49c2-a3ea-e257ac627391 admin admin] CMD "sudo nova-rootwrap /etc/nova/rootwrap.conf cryptsetup luksOpen --key-file=- /dev/sdb crypt-ip-192.168.16.20:3260-iscsi-iqn.2010-10.org.openstack:volume-9e897b1b-599d-489a-b080-c32a47eab2cd-lun-1" returned: 2 in 3.247s execute /usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py:374
2016-10-21 08:19:32.031 78521 DEBUG oslo_concurrency.processutils [req-09805499-5647-49c2-a3ea-e257ac627391 admin admin] u'sudo nova-rootwrap /etc/nova/rootwrap.conf cryptsetup luksOpen --key-file=- /dev/sdb crypt-ip-192.168.16.20:3260-iscsi-iqn.2010-10.org.openstack:volume-9e897b1b-599d-489a-b080-c32a47eab2cd-lun-1' failed. Not Retrying. execute /usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py:422
2016-10-21 08:19:32.033 78521 ERROR nova.virt.libvirt.driver [req-098054...
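For context on the failure above: cryptsetup documents a small set of return codes, and exit status 2 has a specific meaning that fits the passphrase theory (table transcribed from the cryptsetup man page; worth double-checking against the installed version):

```python
# Return codes as documented in the cryptsetup(8) man page.
CRYPTSETUP_STATUS = {
    0: "success",
    1: "wrong parameters",
    2: "no permission (bad passphrase)",
    3: "out of memory",
    4: "wrong device specified",
    5: "device already exists or device is busy",
}

print(CRYPTSETUP_STATUS[2])  # matches the 'returned: 2' in the log above
```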


Changed in nova:
assignee: nobody → Paul Carlton (paul-carlton2)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/389608

Changed in nova:
status: New → In Progress
Revision history for this message
Brianna Poulos (brianna-poulos) wrote :

Note that there is a bug related to mangling the passphrase used for encrypting/decrypting: https://bugs.launchpad.net/nova/+bug/1633518

This might be related to the passphrase error that was encountered when testing the fix on master (mentioned in comment #3: https://bugs.launchpad.net/nova/+bug/1633033/comments/3).

Changed in nova:
assignee: Paul Carlton (paul-carlton2) → Lee Yarwood (lyarwood)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Sean Dague (<email address hidden>) on branch: master
Review: https://review.openstack.org/389608
Reason: This review is > 4 weeks without comment, and is not mergeable in its current state. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
melanie witt (melwitt) wrote :

lyarwood confirmed on IRC in the #openstack-nova channel today that this bug has been fixed by:

https://review.openstack.org/460243

which landed in Queens. So the fix is available as of version 17.0.0.0b3.

Changed in nova:
importance: Undecided → Medium
status: In Progress → Fix Released