Race condition in iscsi disconnect_volume

Bug #1375382 reported by Patrick East
This bug affects 1 person
Affects: Cinder
Status: Fix Released
Importance: Undecided
Assigned to: Patrick East

Bug Description

I am seeing an error caused by a race condition in the disconnect_volume method of ISCSIConnector in cinder/brick/connector.py. The method expects that, once we do the SCSI delete, the symlink in /dev/disk/by-path is removed if the device is no longer in use. We then call self.driver.get_all_block_devices() and assume that any symlink still present means the device is still in use. Only if the device is no longer in that list do we disconnect the iSCSI portal.

The issue I am seeing is that there is some delay between the SCSI delete and the removal of the symlink, so we still see the broken symlink when calling os.listdir(). My understanding is that udev runs its rules for cleaning up the symlinks asynchronously, which means the symlink is not guaranteed to be gone before we call self.driver.get_all_block_devices(). As a result, the iSCSI portal is left open indefinitely when it should have been closed.
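
The behaviour described above can be reproduced without any iSCSI hardware. The following is a minimal sketch, not Cinder code: it uses a temporary directory as a stand-in for /dev/disk/by-path, and the helper names and the link name are hypothetical. It shows that os.listdir() still reports a symlink whose target has been deleted, while os.path.exists() returns False for it.

```python
import os
import tempfile


def classify_entries(directory):
    """Map each entry name to (is_symlink, target_exists)."""
    result = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # islink() inspects the link itself; exists() follows the link
        # and returns False when the target is gone.
        result[name] = (os.path.islink(path), os.path.exists(path))
    return result


def simulate_dangling_link():
    """Create a symlink, delete its target, and classify what is left."""
    with tempfile.TemporaryDirectory() as tmp:
        by_path = os.path.join(tmp, "by-path")   # stand-in for /dev/disk/by-path
        os.mkdir(by_path)
        device = os.path.join(tmp, "sdb")        # stand-in for the SCSI device node
        open(device, "w").close()
        # Hypothetical link name, loosely modelled on by-path entries.
        os.symlink(device, os.path.join(by_path, "ip-192.0.2.10-iscsi-lun-1"))
        # Device is deleted, but nothing has removed the symlink yet.
        os.remove(device)
        return classify_entries(by_path)
```

Calling simulate_dangling_link() yields one entry that is a symlink but whose target does not exist, which is the state the connector observes while udev has not yet finished its cleanup.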

Changed in cinder:
assignee: nobody → Patrick East (patrick-east)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/124821

Changed in cinder:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/124821
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=5d22ec17c7548f3de85ba9e3ad54ce5799dc5fff
Submitter: Jenkins
Branch: master

commit 5d22ec17c7548f3de85ba9e3ad54ce5799dc5fff
Author: Patrick East <email address hidden>
Date: Mon Sep 29 10:54:22 2014 -0700

    Fix race condition in ISCSIConnector disconnect_volume

    The list of devices returned by driver.get_all_block_devices() will
    sometimes contain broken symlinks as the SCSI device has been deleted
    but the udev rule for the symlink has not yet completed.

    Adding in a check to os.path.exists() will ensure that we will not
    consider the broken symlinks as an “in use” device.

    Change-Id: Ibb869e10976f894f9e18e9edec6739c2c3bea68c
    Closes-Bug: #1375382
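
The idea in the commit message can be sketched as follows. This is an illustration under stated assumptions, not the merged patch: the function names, the device_prefix parameter, and the paths are hypothetical, and the list of paths stands in for whatever driver.get_all_block_devices() returns.

```python
import os
import tempfile


def devices_in_use(device_paths, device_prefix):
    """Keep paths for this target whose symlink still resolves."""
    return [
        path for path in device_paths
        # os.path.exists() follows symlinks, so a dangling link is dropped.
        if path.startswith(device_prefix) and os.path.exists(path)
    ]


def should_disconnect_portal(device_paths, device_prefix):
    """Disconnect only when no live device remains for the target."""
    return not devices_in_use(device_paths, device_prefix)


def demo():
    """Return the decision before and after the link target is deleted."""
    with tempfile.TemporaryDirectory() as tmp:
        prefix = os.path.join(tmp, "ip-192.0.2.10-iscsi-lun-")  # hypothetical
        target = os.path.join(tmp, "sdb")
        open(target, "w").close()
        link = prefix + "1"
        os.symlink(target, link)
        while_attached = should_disconnect_portal([link], prefix)
        os.remove(target)  # link is now dangling
        after_delete = should_disconnect_portal([link], prefix)
        return while_attached, after_delete
```

Without the os.path.exists() condition, the dangling link would still count as an in-use device after the delete, and the portal would stay connected.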

Changed in cinder:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
milestone: none → juno-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: juno-rc1 → 2014.2