Revert to snapshot fails with mounted share in LVM driver

Bug #1658133 reported by Alyson
This bug affects 2 people
Affects: OpenStack Shared File Systems Service (Manila)
Status: Fix Released
Importance: High
Assigned to: Ben Swartzlander

Bug Description

I'm trying to use the new "revert to snapshot" feature in the Manila LVM driver, following these steps:

1 - created a share "lvm_share1" in the LVM backend;
2 - added access rules to use this share and added files to it;
3 - created a snapshot "snap1";
4 - modified files in the share;
5 - tried to revert to snap1:

$ manila revert-to-snapshot snap1
This operation fails and the share status becomes reverting_error. Traceback from the log:

2017-01-20 13:04:46.380 DEBUG oslo_concurrency.processutils [req-841c7a5e-3000-45b9-a3b1-a7ac2cbb815b 04ee1b57f2c34ec784d10b9524b8296e 536f12c7855047b8a90d351fe1b69ad1] CMD "sudo manila-rootwrap /etc/manila/rootwrap.conf umount /opt/stack/data/manila/mnt/share-947b0995-b279-46bc-9749-7a6be6ed49f4" returned: 1 in 0.958s
2017-01-20 13:04:46.517 ERROR oslo_messaging.rpc.server [req-841c7a5e-3000-45b9-a3b1-a7ac2cbb815b 04ee1b57f2c34ec784d10b9524b8296e 536f12c7855047b8a90d351fe1b69ad1] Exception during message handling
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server Traceback (most recent call last):
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 155, in _process_incoming
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 222, in dispatch
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 192, in _do_dispatch
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server result = func(ctxt, **new_args)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/opt/stack/manila/manila/share/manager.py", line 165, in wrapped
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server return f(self, *args, **kwargs)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/opt/stack/manila/manila/utils.py", line 493, in wrapper
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server return func(self, *args, **kwargs)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/opt/stack/manila/manila/share/manager.py", line 2157, in revert_to_snapshot
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server self._revert_to_snapshot(context, share, snapshot, reservations)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/opt/stack/manila/manila/share/manager.py", line 2196, in _revert_to_snapshot
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server {'status': constants.STATUS_AVAILABLE})
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server self.force_reraise()
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/opt/stack/manila/manila/share/manager.py", line 2177, in _revert_to_snapshot
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server share_server=share_server)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/opt/stack/manila/manila/share/drivers/lvm.py", line 372, in revert_to_snapshot
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server self._unmount_device(share)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/opt/stack/manila/manila/share/drivers/lvm.py", line 335, in _unmount_device
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server self._execute('umount', mount_path, run_as_root=True)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/opt/stack/manila/manila/utils.py", line 67, in execute
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server return processutils.execute(*cmd, **kwargs)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py", line 394, in execute
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server cmd=sanitized_cmd)
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server ProcessExecutionError: Unexpected error while running command.
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server Command: sudo manila-rootwrap /etc/manila/rootwrap.conf umount /opt/stack/data/manila/mnt/share-947b0995-b279-46bc-9749-7a6be6ed49f4
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server Exit code: 1
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server Stdout: u''
2017-01-20 13:04:46.517 TRACE oslo_messaging.rpc.server Stderr: u'umount: /opt/stack/data/manila/mnt/share-947b0995-b279-46bc-9749-7a6be6ed49f4: device is busy.\n (In some cases useful info about processes that use\n the device is found by lsof(8) or fuser(1))

After this error, the snapshot is left in the merging state, and other LVM operations fail, for example delete:
$ manila snapshot-delete snap1

2017-01-20 13:09:17.893 DEBUG oslo_concurrency.processutils [req-8fd5b0f9-bfc8-4b0d-a8ad-226aa3b53f8a 04ee1b57f2c34ec784d10b9524b8296e 536f12c7855047b8a90d351fe1b69ad1] CMD "sudo manila-rootwrap /etc/manila/rootwrap.conf lvremove -f lvm-shares/share-snapshot-7fc29b6d-abb1-454d-b087-c731f25a12d0" returned: 5 in 0.734s from (pid=8686) execute /usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py:379
2017-01-20 13:09:17.895 TRACE manila.share.driver Traceback (most recent call last):
2017-01-20 13:09:17.895 TRACE manila.share.driver File "/opt/stack/manila/manila/share/driver.py", line 205, in _try_execute
2017-01-20 13:09:17.895 TRACE manila.share.driver self._execute(*command, **kwargs)
2017-01-20 13:09:17.895 TRACE manila.share.driver File "/opt/stack/manila/manila/utils.py", line 67, in execute
2017-01-20 13:09:17.895 TRACE manila.share.driver return processutils.execute(*cmd, **kwargs)
2017-01-20 13:09:17.895 TRACE manila.share.driver File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py", line 394, in execute
2017-01-20 13:09:17.895 TRACE manila.share.driver cmd=sanitized_cmd)
2017-01-20 13:09:17.895 TRACE manila.share.driver ProcessExecutionError: Unexpected error while running command.
2017-01-20 13:09:17.895 TRACE manila.share.driver Command: sudo manila-rootwrap /etc/manila/rootwrap.conf lvremove -f lvm-shares/share-snapshot-7fc29b6d-abb1-454d-b087-c731f25a12d0
2017-01-20 13:09:17.895 TRACE manila.share.driver Exit code: 5
2017-01-20 13:09:17.895 TRACE manila.share.driver Stdout: u''
2017-01-20 13:09:17.895 TRACE manila.share.driver Stderr: u' Can\'t remove merging snapshot logical volume "share-snapshot-7fc29b6d-abb1-454d-b087-c731f25a12d0"\n'

As a workaround, I have to remove all access rules from the LVM share before using revert-to-snapshot; then it works.
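
The workaround above can be sketched as a small helper that only builds the sequence of manila CLI invocations (deny every rule, revert, re-allow every rule) without running them. This is an illustration under assumptions, not a tested procedure: the function name and the rule dictionary keys (id, access_type, access_to, access_level) are hypothetical, and a real script would also need to wait for the rules to be removed and for the revert to finish before re-adding access.

```python
def build_workaround_commands(share, snapshot, rules):
    """Return the CLI calls for the deny / revert / re-allow workaround.

    `rules` is a list of dicts with the hypothetical keys
    id, access_type, access_to and access_level.
    """
    commands = []
    # 1. Remove every access rule so that no client keeps the share busy.
    for rule in rules:
        commands.append(["manila", "access-deny", share, rule["id"]])
    # 2. Revert the share to the snapshot.
    commands.append(["manila", "revert-to-snapshot", snapshot])
    # 3. Re-create the rules that were removed in step 1.
    for rule in rules:
        commands.append([
            "manila", "access-allow", share,
            rule["access_type"], rule["access_to"],
            "--access-level", rule["access_level"],
        ])
    return commands


if __name__ == "__main__":
    example_rules = [
        {"id": "rule-1", "access_type": "ip",
         "access_to": "10.0.0.5", "access_level": "rw"},
    ]
    for cmd in build_workaround_commands("lvm_share1", "snap1", example_rules):
        print(" ".join(cmd))
```

The helper returns argument lists rather than shell strings so that each command could later be passed to a process runner without quoting issues.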

Changed in manila:
importance: Undecided → High
Changed in manila:
assignee: nobody → Ben Swartzlander (bswartz)
Changed in manila:
status: New → Confirmed
Changed in manila:
milestone: none → ocata-rc1
Ben Swartzlander (bswartz) wrote :

I've confirmed that the correct fix here is to remove all access during the revert and reapply the access afterwards. My concern is the access code itself. It appears to be written in a way that makes all access rules volatile, which would cause a different bug.

Fixing that bug will involve a lot of code change, and it will overlap with the fix for this bug so I'm not sure it makes sense to try to fix this first.
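
A minimal sketch of the ordering described in this comment (remove access, unmount, merge the snapshot, remount, reapply access), assuming a stand-in backend object. The FakeBackend class and all of its method names are hypothetical and only record calls; this is not the Manila LVM driver code.

```python
class FakeBackend:
    """Hypothetical stand-in that records the order of operations."""

    def __init__(self):
        self.calls = []

    def update_access(self, share_name, rules):
        self.calls.append(("update_access", share_name, list(rules)))

    def unmount(self, share_name):
        self.calls.append(("unmount", share_name))

    def merge_snapshot(self, snapshot_name):
        self.calls.append(("merge_snapshot", snapshot_name))

    def mount(self, share_name):
        self.calls.append(("mount", share_name))


def revert_to_snapshot(backend, share_name, snapshot_name, access_rules):
    """Revert a share while its access rules are temporarily removed."""
    # Drop all access first so the later unmount does not hit "device is busy".
    backend.update_access(share_name, [])
    backend.unmount(share_name)
    # Merge the snapshot back into the share volume.
    backend.merge_snapshot(snapshot_name)
    backend.mount(share_name)
    # Reapply the rules that were in place before the revert.
    backend.update_access(share_name, access_rules)


if __name__ == "__main__":
    backend = FakeBackend()
    revert_to_snapshot(backend, "share-1", "snap-1", ["10.0.0.5"])
    for call in backend.calls:
        print(call)
```

Passing the access rules in as an argument is what makes the final step possible: the revert code needs to know which rules to put back.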

Rodrigo Barbieri (rodrigo-barbieri2010) wrote :

Ben, a minimal fix is required to avoid the lv getting stuck in merging state. Could you please elaborate on how an access bug interacts/overlaps with this bug?

OpenStack Infra (hudson-openstack) wrote : Fix proposed to manila (master)

Fix proposed to branch: master
Review: https://review.openstack.org/428398

Changed in manila:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to manila (master)

Reviewed: https://review.openstack.org/428398
Committed: https://git.openstack.org/cgit/openstack/manila/commit/?id=a7d8363d32c687a01b41133ad609c3949791cd58
Submitter: Jenkins
Branch: master

commit a7d8363d32c687a01b41133ad609c3949791cd58
Author: Ben Swartzlander <email address hidden>
Date: Thu Feb 2 13:46:35 2017 -0500

    Pass access rules to driver on snapshot revert

    In order to revert to a snapshot in the LVM driver (and
    very likely other drivers) the list of access rules is
    needed, so this change modifies the driver interface to
    provide this extra information.

    This change requires preventing a revert to snapshot
    operation while access rules on the affected share are
    out of sync.

    Closes bug: 1658133

    Change-Id: Ia6678bb0e484f9c8f8b05d90e514801ae9baa94b
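
The precondition mentioned in the commit message (no revert while access rules are out of sync) could look roughly like the guard below. This is a sketch under assumptions: the exception class, the function name, the "access_rules_status" key, and the "active" value used to mean "in sync" are all illustrative and not taken from the merged change.

```python
class AccessRulesNotInSync(Exception):
    """Hypothetical error raised when a revert must be refused."""


def check_revert_allowed(share_instances):
    """Raise if any share instance has access rules that are not in sync.

    Each instance is a dict with the assumed keys "id" and
    "access_rules_status"; "active" is assumed to mean "in sync".
    """
    for instance in share_instances:
        if instance["access_rules_status"] != "active":
            raise AccessRulesNotInSync(
                "access rules of share instance %s are not in sync"
                % instance["id"])


if __name__ == "__main__":
    check_revert_allowed([{"id": "a", "access_rules_status": "active"}])
    try:
        check_revert_allowed([{"id": "b", "access_rules_status": "syncing"}])
    except AccessRulesNotInSync as exc:
        print("revert refused:", exc)
```

Running the check before the revert starts means the operation is rejected up front instead of failing halfway through the merge.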

Changed in manila:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/manila 4.0.0.0rc1

This issue was fixed in the openstack/manila 4.0.0.0rc1 release candidate.
