NEC driver: Live-migration failed with FC

Bug #1887908 reported by Naoki Saito
This bug affects 1 person
Affects: Cinder
Status: Fix Released
Importance: Undecided
Assigned to: Naoki Saito
Milestone: (none listed)

Bug Description

When I perform a live migration with the FC driver, it sometimes fails with the following error messages.

nova-compute.log
2020-07-16 01:25:51.011 7 ERROR os_brick.initiator.connectors.fibre_channel [-] Fibre Channel volume device not found.
2020-07-16 01:25:51.012 7 ERROR oslo.service.loopingcall [-] Fixed interval looping call 'os_brick.initiator.connectors.fibre_channel.FibreChannelConnector.connect_volume.<locals>._wait_for_device_discovery' failed: os_brick.exception.NoFibreChannelVolumeDeviceFound: Unable to find a Fibre Channel volume device.
2020-07-16 01:25:51.012 7 ERROR oslo.service.loopingcall Traceback (most recent call last):
2020-07-16 01:25:51.012 7 ERROR oslo.service.loopingcall File "/usr/lib/python3.6/site-packages/oslo_service/loopingcall.py", line 150, in _run_loop
2020-07-16 01:25:51.012 7 ERROR oslo.service.loopingcall result = func(*self.args, **self.kw)
2020-07-16 01:25:51.012 7 ERROR oslo.service.loopingcall File "/usr/lib/python3.6/site-packages/os_brick/initiator/connectors/fibre_channel.py", line 230, in _wait_for_device_discovery
2020-07-16 01:25:51.012 7 ERROR oslo.service.loopingcall raise exception.NoFibreChannelVolumeDeviceFound()
2020-07-16 01:25:51.012 7 ERROR oslo.service.loopingcall os_brick.exception.NoFibreChannelVolumeDeviceFound: Unable to find a Fibre Channel volume device.
2020-07-16 01:25:51.012 7 ERROR oslo.service.loopingcall
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server [req-195196fc-af01-4098-bbca-4b80ff9e461e 07390ef4f5bc44e9953a4829066763b4 2d4abf67683b48c193427720e98e91aa - default default] Exception during message handling: os_brick.exception.NoFibreChannelVolumeDeviceFound: Unable to find a Fibre Channel volume device.
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 274, in dispatch
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/exception_wrapper.py", line 79, in wrapped
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server function_name, call_dict, binary, tb)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server self.force_reraise()
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server raise value
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/exception_wrapper.py", line 69, in wrapped
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/utils.py", line 1372, in decorated_function
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 219, in decorated_function
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info())
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server self.force_reraise()
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server raise value
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 207, in decorated_function
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 6985, in pre_live_migration
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server bdm.save()
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server self.force_reraise()
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server raise value
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 6950, in pre_live_migration
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server migrate_data)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 9186, in pre_live_migration
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server allow_native_luks=src_native_luks)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 1579, in _connect_volume
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server vol_driver.connect_volume(connection_info, instance)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/volume/fibrechannel.py", line 54, in connect_volume
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server device_info = self.connector.connect_volume(connection_info['data'])
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/os_brick/utils.py", line 137, in trace_logging_wrapper
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 328, in inner
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/os_brick/initiator/connectors/fibre_channel.py", line 244, in connect_volume
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server timer.start(interval=2).wait()
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/event.py", line 125, in wait
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server result = hub.switch()
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 298, in switch
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server return self.greenlet.switch()
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_service/loopingcall.py", line 150, in _run_loop
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server result = func(*self.args, **self.kw)
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/os_brick/initiator/connectors/fibre_channel.py", line 230, in _wait_for_device_discovery
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server raise exception.NoFibreChannelVolumeDeviceFound()
2020-07-16 01:25:51.854 7 ERROR oslo_messaging.rpc.server os_brick.exception.NoFibreChannelVolumeDeviceFound: Unable to find a Fibre Channel volume device.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.opendev.org/741602

Changed in cinder:
assignee: nobody → Naoki Saito (n-saito)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.opendev.org/741602
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=94c1d241533397bb938682abeaa51ae74d069cb7
Submitter: Zuul
Branch: master

commit 94c1d241533397bb938682abeaa51ae74d069cb7
Author: Naoki Saito <email address hidden>
Date: Fri Jul 17 17:30:17 2020 +0900

    NEC driver: fix live-migration failure with FC

    The initialize_connection() in the driver sometimes returns wrong LUN
    and causes a live-migration failure.
    The function searches the LUN from LD sets (attached hosts) and returns
    the first hit.
    The function must return an LUN of the destination host, but the first
    hit may be an LUN of the source host.

    This patch fixes initialize_connection() to return the correct LUN by
    searching with the WWPN of the destination host.

    Change-Id: I102ae84204e0d88814a7d2e028f7cec118ad6b60
    Closes-Bug: #1887908

Changed in cinder:
status: In Progress → Fix Released
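
To illustrate the failure mode described in the commit message above, here is a minimal sketch in Python. It is not the NEC driver's code: the `ldsets` layout, the function names, the WWPN strings, and the LUN numbers are all hypothetical and chosen only to show the difference between returning the first hit and matching on the destination host's WWPN. It assumes that during a live migration the volume is mapped to both the source and the destination host, each with its own LUN.

```python
from typing import Dict, Iterable, Optional

# Hypothetical, simplified model of LD sets: one entry per attached host,
# holding that host's WWPNs and a map of logical disk name -> LUN.
ldsets: Dict[str, dict] = {
    "source-host": {"wwpns": ["10000090faa0786a"], "lds": {"vol-1": 3}},
    "dest-host": {"wwpns": ["10000090faa0786b"], "lds": {"vol-1": 7}},
}


def find_lun_first_hit(ldsets: Dict[str, dict], ldname: str) -> Optional[int]:
    """Old behavior (sketch): return the LUN from the first LD set that
    contains the logical disk, regardless of which host it belongs to."""
    for ldset in ldsets.values():
        if ldname in ldset["lds"]:
            return ldset["lds"][ldname]
    return None


def find_lun_for_host(
    ldsets: Dict[str, dict], ldname: str, wwpns: Iterable[str]
) -> Optional[int]:
    """Fixed behavior (sketch): only consider LD sets that share a WWPN
    with the connecting host, then look up the logical disk there."""
    wanted = {w.lower() for w in wwpns}
    for ldset in ldsets.values():
        host_wwpns = {w.lower() for w in ldset["wwpns"]}
        if wanted & host_wwpns and ldname in ldset["lds"]:
            return ldset["lds"][ldname]
    return None


# The destination host asks for vol-1 while it is still attached to the source.
first_hit = find_lun_first_hit(ldsets, "vol-1")
by_wwpn = find_lun_for_host(ldsets, "vol-1", ["10000090FAA0786B"])
```

In this sketch the first-hit lookup returns the source host's LUN, so the destination host would scan for a device that is never presented to it, which is consistent with the NoFibreChannelVolumeDeviceFound error in the log. The WWPN-based lookup returns the LUN mapped to the destination host.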
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/751572

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/ussuri)

Reviewed: https://review.opendev.org/751572
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=c55e259d59af6a29ebef5deb61ea01633162f7a0
Submitter: Zuul
Branch: stable/ussuri

commit c55e259d59af6a29ebef5deb61ea01633162f7a0
Author: Naoki Saito <email address hidden>
Date: Fri Jul 17 17:30:17 2020 +0900

    NEC driver: fix live-migration failure with FC

    The initialize_connection() in the driver sometimes returns wrong LUN
    and causes a live-migration failure.
    The function searches the LUN from LD sets (attached hosts) and returns
    the first hit.
    The function must return an LUN of the destination host, but the first
    hit may be an LUN of the source host.

    This patch fixes initialize_connection() to return the correct LUN by
    searching with the WWPN of the destination host.

    Change-Id: I102ae84204e0d88814a7d2e028f7cec118ad6b60
    Closes-Bug: #1887908
    (cherry picked from commit 94c1d241533397bb938682abeaa51ae74d069cb7)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/752782

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/train)

Reviewed: https://review.opendev.org/752782
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=d77ae8785faac9084de401024178ed785fdd3725
Submitter: Zuul
Branch: stable/train

commit d77ae8785faac9084de401024178ed785fdd3725
Author: Naoki Saito <email address hidden>
Date: Fri Jul 17 17:30:17 2020 +0900

    NEC driver: fix live-migration failure with FC

    The initialize_connection() in the driver sometimes returns wrong LUN
    and causes a live-migration failure.
    The function searches the LUN from LD sets (attached hosts) and returns
    the first hit.
    The function must return an LUN of the destination host, but the first
    hit may be an LUN of the source host.

    This patch fixes initialize_connection() to return the correct LUN by
    searching with the WWPN of the destination host.

    Change-Id: I102ae84204e0d88814a7d2e028f7cec118ad6b60
    Closes-Bug: #1887908
    (cherry picked from commit 94c1d241533397bb938682abeaa51ae74d069cb7)
    (cherry picked from commit c55e259d59af6a29ebef5deb61ea01633162f7a0)

tags: added: in-stable-train
Igal Katzir (ikatzir) wrote :

Can this fix also be merged into Queens?

Naoki Saito (n-saito) wrote :

No, the patch conflicts with the older test code.
If you need this fix in Queens, you can apply the patch manually on your system, excluding the test code.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 15.4.1

This issue was fixed in the openstack/cinder 15.4.1 release.
