copy image to volume is broken for non-local volumes

Bug #1243980 reported by John Griffith
Affects | Status | Importance | Assigned to | Milestone
Cinder | Fix Released | High | John Griffith |
Havana | Fix Released | High | John Griffith |

Bug Description

Changes made late in Havana altered the behavior of image copy and disconnect. As a result, we now disconnect and remove the disk/by-path entry for the generalized iSCSI attach BEFORE we enter the brick/initiator remove routines.

When the generalized iSCSI attach to the cinder node creates only a single disk/by-path entry, the device is cleaned up once the image copy completes, and if nothing is left in the disk/by-path directory, the directory itself is removed.

The initiator/host_driver, however, expects the disk/by-path directory to be present. It lists the directory for devices during its cleanup and hits an unhandled exception because the directory no longer exists. We should make this safe by checking that the directory exists before listing the devices that reside in it.
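
The guard described above could look roughly like the following. This is a minimal sketch and not the merged patch: the function name `list_by_path_devices` and its `by_path_dir` parameter are made up for illustration, and only `os.path.isdir` and `os.listdir` are real standard-library calls.

```python
import os


def list_by_path_devices(by_path_dir="/dev/disk/by-path"):
    """Return full paths of the entries in by_path_dir.

    Hypothetical helper: returns an empty list when the directory has
    already been removed, instead of letting os.listdir raise.
    """
    if not os.path.isdir(by_path_dir):
        # The directory disappears once its last device entry is cleaned up.
        return []
    return [os.path.join(by_path_dir, name)
            for name in sorted(os.listdir(by_path_dir))]
```

A check followed by a listing still leaves a small window in which the directory can vanish between the two calls, so catching `OSError` around `os.listdir` would be a reasonable alternative.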

The result is that the create fails, even though the create actually succeeded and the image copy succeeded as well.
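
To illustrate why a cleanup error turns a successful create into a failure, here is a small sketch. Everything in it is hypothetical: `run_create_flow` is a stand-in for whatever drives the create, not Cinder code, and the two cleanup functions exist only to contrast an unguarded listing with a guarded one.

```python
import os


def run_create_flow(copy_image, cleanup):
    """Hypothetical flow: any uncaught exception marks the volume 'error'."""
    try:
        copy_image()
        cleanup()
    except Exception:
        return "error"
    return "available"


def blind_cleanup(by_path_dir):
    # Unguarded listing: raises OSError if the directory was already removed.
    return os.listdir(by_path_dir)


def safe_cleanup(by_path_dir):
    # Guarded listing: a missing directory just means nothing is left to clean.
    return os.listdir(by_path_dir) if os.path.isdir(by_path_dir) else []
```

With a directory that no longer exists, the flow using `blind_cleanup` ends in `"error"` even though the copy step itself ran without problems, while the flow using `safe_cleanup` ends in `"available"`.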

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/53483

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/53483
Committed: http://github.com/openstack/cinder/commit/1766a5acc5c948288b4cd81c62d0c1507c55f727
Submitter: Jenkins
Branch: master

commit 1766a5acc5c948288b4cd81c62d0c1507c55f727
Author: John Griffith <email address hidden>
Date: Wed Oct 23 18:04:51 2013 -0600

    Check if dir exists before calling listdir

    Changes along the way to how we clean up and detach after
    copying an image to a volume exposed a problem in the cleanup
    of the brick/initiator routines.

    The clean up in the initiator detach was doing a blind listdir
    of /dev/disk/by-path, however due to detach and cleanup being
    called upon completion of the image download to the volume if
    there are no other devices mapped in this directory the directory
    is removed.

    The result was that even though the create and copy of the image
    were successful, the HostDriver code called os.listdir on a directory
    that no longer existed and raised an unhandled exception that
    caused the taskflow mechanism to mark the volume as failed.

    Change-Id: I488755c1a49a77f42efbb58a7a4eb6f4f084df07
    Closes-bug: #1243980

Changed in cinder:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/havana)

Fix proposed to branch: stable/havana
Review: https://review.openstack.org/53723

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to cinder (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/53738

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/havana)

Reviewed: https://review.openstack.org/53723
Committed: http://github.com/openstack/cinder/commit/fb8db56ca83a18860ed1ae279d3f390456e224fe
Submitter: Jenkins
Branch: stable/havana

commit fb8db56ca83a18860ed1ae279d3f390456e224fe
Author: John Griffith <email address hidden>
Date: Wed Oct 23 18:04:51 2013 -0600

    Check if dir exists before calling listdir

    Changes along the way to how we clean up and detach after
    copying an image to a volume exposed a problem in the cleanup
    of the brick/initiator routines.

    The clean up in the initiator detach was doing a blind listdir
    of /dev/disk/by-path, however due to detach and cleanup being
    called upon completion of the image download to the volume if
    there are no other devices mapped in this directory the directory
    is removed.

    The result was that even though the create and copy of the image
    were successful, the HostDriver code called os.listdir on a directory
    that no longer existed and raised an unhandled exception that
    caused the taskflow mechanism to mark the volume as failed.

    Change-Id: I488755c1a49a77f42efbb58a7a4eb6f4f084df07
    Closes-bug: #1243980
    (cherry picked from commit 1766a5acc5c948288b4cd81c62d0c1507c55f727)

Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: icehouse-1 → 2014.1