"No such file or directory" exception when detaching iSCSI volume

Bug #1719719 reported by Gorka Eguileor
22
This bug affects 3 people
Affects Status Importance Assigned to Milestone
os-brick
Fix Released
Undecided
Gorka Eguileor

Bug Description

When we detach an iSCSI volume we may encounter an exception such as:

2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [req-4c1c8b2d-4768-4935-b740-18372a124381 00272a48a75f455c95702ea16ec0f809 e78da976712d47ac9184f8d5e443fa17 - - -] [instance: 5be90dd0-7783-4280-a9e8-846f93662178] Failed to detach volume f6638638-ba36-4845-a8e8-779061fc75d1 from /dev/vdb
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] Traceback (most recent call last):
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4739, in _driver_detach_volume
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] encryption=encryption)
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1336, in detach_volume
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] self._disconnect_volume(connection_info, disk_dev, instance)
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1105, in _disconnect_volume
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] vol_driver.disconnect_volume(connection_info, disk_dev, instance)
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/volume/iscsi.py", line 74, in disconnect_volume
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] self.connector.disconnect_volume(connection_info['data'], None)
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib/python2.7/site-packages/os_brick/utils.py", line 145, in trace_logging_wrapper
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] result = f(*args, **kwargs)
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] return f(*args, **kwargs)
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", line 830, in disconnect_volume
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] ignore_errors=ignore_errors)
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", line 867, in _cleanup_connection
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] force, exc)
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line 235, in remove_connection
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] self._remove_scsi_symlinks(devices_names)
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line 242, in _remove_scsi_symlinks
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] if os.path.realpath(link) in devices]
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib64/python2.7/posixpath.py", line 367, in realpath
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] path, ok = _joinrealpath('', filename, {})
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] File "/usr/lib64/python2.7/posixpath.py", line 406, in _joinrealpath
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] path, ok = _joinrealpath(path, os.readlink(newpath), seen)
2017-09-26 15:31:27.901 12918 ERROR nova.compute.manager [instance: 5be90dd0-7783-4280-a9e8-846f93662178] OSError: [Errno 2] No such file or directory: '/dev/disk/by-id/scsi-20024f40058540081'

Which results in the final detach exception:

2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server [req-4c1c8b2d-4768-4935-b740-18372a124381 00272a48a75f455c95702ea16ec0f809 e78da976712d47ac9184f8d5e443fa17 - - -] Exception during message handling
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 75, in wrapped
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server function_name, call_dict, binary)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server self.force_reraise()
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 66, in wrapped
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 216, in decorated_function
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info())
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server self.force_reraise()
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 204, in decorated_function
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4867, in detach_volume
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server attachment_id=attachment_id)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4802, in _detach_volume
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server self._driver_detach_volume(context, instance, bdm, connection_info)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4751, in _driver_detach_volume
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server self.volume_api.roll_detaching(context, volume_id)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server self.force_reraise()
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4739, in _driver_detach_volume
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server encryption=encryption)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1336, in detach_volume
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server self._disconnect_volume(connection_info, disk_dev, instance)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1105, in _disconnect_volume
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server vol_driver.disconnect_volume(connection_info, disk_dev, instance)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/volume/iscsi.py", line 74, in disconnect_volume
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server self.connector.disconnect_volume(connection_info['data'], None)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/utils.py", line 145, in trace_logging_wrapper
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server result = f(*args, **kwargs)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", line 830, in disconnect_volume
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server ignore_errors=ignore_errors)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", line 867, in _cleanup_connection
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server force, exc)
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line 235, in remove_connection
2017-09-26 15:31:28.022 12918 ERROR oslo_messaging.rpc.server self._remove_scsi_symlinks(devices_names)

And this detach operation should not have failed.

Gorka Eguileor (gorka)
Changed in os-brick:
assignee: nobody → Gorka Eguileor (gorka)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (master)

Fix proposed to branch: master
Review: https://review.openstack.org/507653

Changed in os-brick:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (master)

Reviewed: https://review.openstack.org/507653
Committed: https://git.openstack.org/cgit/openstack/os-brick/commit/?id=de89635ab70b36852d638efd6e45d245217613cc
Submitter: Zuul
Branch: master

commit de89635ab70b36852d638efd6e45d245217613cc
Author: Gorka Eguileor <email address hidden>
Date: Tue Sep 26 21:01:14 2017 +0200

    Protect against race within os.path.realpath

    On iSCSI connections, when we are making sure that there no links remain
    like /dev/disk/by-id/scsi-* we make several calls to os.path.realpath
    which normal behavior is:

    - If file path doesn't exist, returns the same file path
    - If file path exists, return the real path

    But there is a third option, and that is when the file did exist but it
    dissapear right when the call to os.readlink in posixpath:_joinrealpath

       path, ok = _joinrealpath(path, os.readlink(newpath), seen)

    Which ends up raising an exception such as:

       OSError: [Errno 2] No such file or directory:
                '/dev/disk/by-id/scsi-20024f40058540081'

    And because of this exception the detach will fail when it shouldn't.

    This patch adds includes the call to os.path.realpath within a
    try...except clause to prevent this race condition from unexpectely
    making the detach operation fail.

    Change-Id: Ieb58826b28c62094c941fce10863c0a75fb4e8aa
    Closes-Bug: #1719719

Changed in os-brick:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/518035

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 2.0.0

This issue was fixed in the openstack/os-brick 2.0.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/pike)

Reviewed: https://review.openstack.org/518035
Committed: https://git.openstack.org/cgit/openstack/os-brick/commit/?id=cad9fb07959e7b63208706e4970bd6171eda786b
Submitter: Zuul
Branch: stable/pike

commit cad9fb07959e7b63208706e4970bd6171eda786b
Author: Gorka Eguileor <email address hidden>
Date: Tue Sep 26 21:01:14 2017 +0200

    Protect against race within os.path.realpath

    On iSCSI connections, when we are making sure that there no links remain
    like /dev/disk/by-id/scsi-* we make several calls to os.path.realpath
    which normal behavior is:

    - If file path doesn't exist, returns the same file path
    - If file path exists, return the real path

    But there is a third option, and that is when the file did exist but it
    dissapear right when the call to os.readlink in posixpath:_joinrealpath

       path, ok = _joinrealpath(path, os.readlink(newpath), seen)

    Which ends up raising an exception such as:

       OSError: [Errno 2] No such file or directory:
                '/dev/disk/by-id/scsi-20024f40058540081'

    And because of this exception the detach will fail when it shouldn't.

    This patch adds includes the call to os.path.realpath within a
    try...except clause to prevent this race condition from unexpectely
    making the detach operation fail.

    Change-Id: Ieb58826b28c62094c941fce10863c0a75fb4e8aa
    Closes-Bug: #1719719
    (cherry picked from commit de89635ab70b36852d638efd6e45d245217613cc)

tags: added: in-stable-pike
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 1.15.4

This issue was fixed in the openstack/os-brick 1.15.4 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.