A swapped volume can't be swapped again

Bug #1490236 reported by Chung Chih, Hung on 2015-08-30
This bug affects 4 people
Affects: OpenStack Compute (nova)
Importance: Low
Assigned to: Takashi NATSUME

Bug Description

Suppose we have two volumes: one is attached to an instance and the other is still available.
Because of https://bugs.launchpad.net/nova/+bug/1489744, those volumes are left in the wrong status after a swap.
Once that issue is fixed, the first swap succeeds, and the volumes end up in the available and in-use statuses as expected.
But when I then try to swap the in-use volume with the other, available volume,
nova-compute throws the following exception:

2015-08-30 04:55:13.772 ERROR oslo_messaging.rpc.dispatcher [req-4d999362-7a13-4b43-8c6a-0d85f3b9aa5b admin admin] Exception during message handling: No volume Block Device Mapping with id cafc833a-8645-47db-b464-999142afa7be.
Traceback (most recent call last):

  File "/opt/stack/nova/nova/conductor/manager.py", line 443, in _object_dispatch
    return getattr(target, method)(*args, **kwargs)

  File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 169, in wrapper
    result = fn(cls, context, *args, **kwargs)

  File "/opt/stack/nova/nova/objects/block_device.py", line 204, in get_by_volume_id
    raise exception.VolumeBDMNotFound(volume_id=volume_id)

VolumeBDMNotFound: No volume Block Device Mapping with id cafc833a-8645-47db-b464-999142afa7be.

2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher executor_callback))
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher executor_callback)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/exception.py", line 89, in wrapped
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher payload)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 119, in __exit__
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/exception.py", line 72, in wrapped
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher return f(self, context, *args, **kw)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 345, in decorated_function
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher LOG.warning(msg, e, instance_uuid=instance_uuid)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 119, in __exit__
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 316, in decorated_function
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 373, in decorated_function
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher kwargs['instance'], e, sys.exc_info())
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 119, in __exit__
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 361, in decorated_function
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 4706, in swap_volume
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher resize_to)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 4658, in _swap_volume
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher self.volume_api.unreserve_volume(context, new_volume_id)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 119, in __exit__
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 4639, in _swap_volume
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher resize_to)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 1217, in swap_volume
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher nova_context.get_admin_context(), volume_id)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 167, in wrapper
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher args, kwargs)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/conductor/rpcapi.py", line 239, in object_class_action
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher objver=objver, args=args, kwargs=kwargs)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher retry=self.retry)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher timeout=timeout, retry=retry)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 431, in send
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher retry=retry)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 422, in _send
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher raise result
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher VolumeBDMNotFound_Remote: No volume Block Device Mapping with id cafc833a-8645-47db-b464-999142afa7be.
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/conductor/manager.py", line 443, in _object_dispatch
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher return getattr(target, method)(*args, **kwargs)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 169, in wrapper
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher result = fn(cls, context, *args, **kwargs)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/objects/block_device.py", line 204, in get_by_volume_id
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher raise exception.VolumeBDMNotFound(volume_id=volume_id)
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher VolumeBDMNotFound: No volume Block Device Mapping with id cafc833a-8645-47db-b464-999142afa7be.
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher
2015-08-30 04:55:13.772 TRACE oslo_messaging.rpc.dispatcher

During the first swap, nova saves a serial containing the old volume ID in the block device mapping.
On the second swap, nova therefore tries to find the block device mapping by the old volume ID.
But the mapping's volume ID has already been changed to the new volume ID, so nova raises VolumeBDMNotFound.
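
The failure mode described above can be illustrated with a small, self-contained simulation. This is only a sketch: the dict-based BDM list and the helper names (get_bdm_by_volume_id, swap_volume) are hypothetical stand-ins and are not nova's actual code; only the exception name and the idea that 'serial' is carried over from the old connection_info come from this report.

```python
class VolumeBDMNotFound(Exception):
    """Stand-in for nova's VolumeBDMNotFound exception."""


def get_bdm_by_volume_id(bdms, volume_id):
    # Hypothetical lookup: find the mapping whose volume_id matches.
    for bdm in bdms:
        if bdm["volume_id"] == volume_id:
            return bdm
    raise VolumeBDMNotFound(
        "No volume Block Device Mapping with id %s." % volume_id)


def swap_volume(bdms, old_volume_id, new_volume_id):
    # Assumed pre-fix behaviour: 'serial' is copied from the old
    # connection_info into the new one instead of being updated.
    old_cinfo = get_bdm_by_volume_id(bdms, old_volume_id)["connection_info"]
    new_cinfo = {"serial": old_cinfo["serial"]}
    # The driver step then looks the BDM up by that (possibly stale) serial.
    bdm = get_bdm_by_volume_id(bdms, new_cinfo["serial"])
    bdm["volume_id"] = new_volume_id
    bdm["connection_info"] = new_cinfo


bdms = [{"volume_id": "vol-X", "connection_info": {"serial": "vol-X"}}]
swap_volume(bdms, "vol-X", "vol-Y")      # first swap succeeds
try:
    swap_volume(bdms, "vol-Y", "vol-X")  # second swap uses the stale serial
except VolumeBDMNotFound as exc:
    print(exc)
```

After the first swap the mapping points at vol-Y while its serial still says vol-X, so the second lookup by serial finds nothing.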

Changed in nova:
assignee: nobody → Chung Chih, Hung (lyanchih)
tags: added: swap-disk volumes

Hung,

Are you still working on this bug?
If you are not, may I become the assignee?

Changed in nova:
assignee: Chung Chih, Hung (lyanchih) → Takashi NATSUME (natsume-takashi)

Fix proposed to branch: master
Review: https://review.openstack.org/257135

Changed in nova:
status: New → In Progress
Changed in nova:
assignee: Takashi NATSUME (natsume-takashi) → Ryan McNair (rdmcnair)
Changed in nova:
assignee: Ryan McNair (rdmcnair) → Takashi NATSUME (natsume-takashi)
Sarafraj Singh (sarafraj-singh) wrote :

@Takashi: It's been more than 60 days without any update. Are you still working on it?

@Sarafraj: Yes.
The patch has already been pushed and needs to be reviewed.

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.openstack.org/304714

Matt Riedemann (mriedem) wrote :

I've marked this as low priority simply because it's a long-standing latent bug and swap volume is an admin-only API that was written for volume migration/retype operations in cinder. It shouldn't even really exist as a nova API, but it predates the server external events API.

Changed in nova:
importance: Undecided → Low
Matt Riedemann (mriedem) wrote :

To clarify the description of this bug, is this the recreate scenario:

1. create a server, A
2. create two volumes, X and Y
3. attach volume X to server A
4. swap volume from X to Y on server A
5. swap volume from Y to X on server A

Is that correct? The failure happens in step 5?

Matt Riedemann (mriedem) wrote :

OK, I was able to recreate this, at least with the steps in comment 7. The error is here:

http://paste.openstack.org/show/566107/

These are the volumes when I tried swapping them the 2nd time:

stack@newton:~$ cinder list
+--------------------------------------+-----------+------+------+-------------+----------+--------------------------------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------+------+-------------+----------+--------------------------------------+
| 3ac8e20c-b821-43cf-b189-8baafbf09d69 | in-use | vol2 | 1 | lvmdriver-1 | false | f719dd72-23e1-497c-badd-da7f8fe28f12 |
| 62ee56ef-5f01-43fc-b4de-176a2299ba56 | available | vol1 | 1 | lvmdriver-1 | false | |
+--------------------------------------+-----------+------+------+-------------+----------+--------------------------------------+

Those were successfully swapped once, but then failed on the 2nd attempt to swap from vol2 to vol1.

Reviewed: https://review.openstack.org/257135
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=be553fb15591c6fc212ef3a07c1dd1cbc43d6866
Submitter: Jenkins
Branch: master

commit be553fb15591c6fc212ef3a07c1dd1cbc43d6866
Author: Takashi NATSUME <email address hidden>
Date: Thu Jun 9 13:01:51 2016 +0900

    Set 'serial' to new volume ID in swap volumes

    In swap_volume method of nova/virt/libvirt/driver.py,
    before BDM was got by using the instance's UUID and
    'serial' of new connection_info as the volume ID,
    and driver BDM was updated by using the BDM.
    ('serial' has the volume ID information.)
    But in _init_volume_connection method in ComputeManager class,
    'serial' is passed from old connection_info to new connection_info.

    It works fine in the case that cinder initiates swapping volumes
    because the ID of the attached volume isn't changed after
    swapping volumes.
    But in the case that nova initiates swapping volumes,
    the ID of the attached volume is changed.

    So in the case that nova initiated swapping volumes,
    after swap volume function was performed once,
    BDM was got by wrong old volume id (serial)
    when swap volume function was performed for the second time.

    So if 'serial' of new connection_info is None,
    it is set to new volume ID.
    And if cinder 'migrate_volume_completion' API returns
    old volume ID (the case that cinder initiates swapping volumes),
    the 'serial' of new connection_info is set to old volume ID.
    If cinder 'migrate_volume_completion' API returns new volume ID
    (the case that nova initiated swapping volumes),
    the 'serial' is left as it is (new volume ID).

    Change-Id: I86b8fbb09b0f1ed4c667683de3827cd9b63bca7f
    Closes-Bug: #1490236
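
The serial-selection rule in the commit message above can be sketched as a small pure function. This is an illustration of the rule as worded in the commit message, not the merged nova code: the function name resolve_serial and its parameters are hypothetical, and save_volume_id stands for whichever volume ID cinder's 'migrate_volume_completion' API returns.

```python
def resolve_serial(new_cinfo, old_volume_id, new_volume_id, save_volume_id):
    """Return a copy of new_cinfo with 'serial' chosen per the commit message.

    Hypothetical helper; save_volume_id is the ID assumed to be returned
    by cinder's migrate_volume_completion call.
    """
    cinfo = dict(new_cinfo)
    # If the new connection_info has no serial, default to the new volume ID.
    if cinfo.get("serial") is None:
        cinfo["serial"] = new_volume_id
    # Cinder-initiated swap: the attached volume ID does not change,
    # so the serial is set back to the old volume ID.
    if save_volume_id == old_volume_id:
        cinfo["serial"] = old_volume_id
    # Nova-initiated swap: serial is left as it is (the new volume ID).
    return cinfo


# Nova-initiated swap: cinder reports the new volume ID.
nova_case = resolve_serial({}, "vol-X", "vol-Y", save_volume_id="vol-Y")
# Cinder-initiated swap (migration/retype): cinder reports the old volume ID.
cinder_case = resolve_serial({}, "vol-X", "vol-Y", save_volume_id="vol-X")
print(nova_case["serial"], cinder_case["serial"])
```

With the serial tracking the ID that is actually attached, a later lookup by serial would match the updated mapping in the nova-initiated case.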

Changed in nova:
status: In Progress → Fix Released
Sylvain Bauza (sylvain-bauza) wrote :

Please note that the merged solution introduced an end-user regression on detaching volumes https://bugs.launchpad.net/nova/+bug/1625660

Re: the original bug here, I reproduced the problem, but I was able to work around it by first detaching the volume from the instance and attaching it back before swapping the volume, so I think it's definitely not a huge priority.

This issue was fixed in the openstack/nova 14.0.0.0rc1 release candidate.

Change abandoned by Matt Riedemann (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/369116

Matt Riedemann (mriedem) on 2017-01-29
no longer affects: nova/mitaka