Failure to create tmp file in image_conversion_dir can lead to the creation of 3 volumes

Bug #1224211 reported by Mathieu Gagné on 2013-09-12
This bug affects 1 person
Affects    Status    Importance    Assigned to      Milestone
Cinder               High          Joshua Harlow
Grizzly              High          Mathieu Gagné

Bug Description

When creating a volume from an image, if the creation of a temporary file in image_conversion_dir fails, up to 3 volumes are created on the backend. The exact number depends on the value of scheduler_max_attempts.

When tested against the SolidFire driver, the volume status goes to 'error'. However, Cinder still creates 3 volumes on the SolidFire cluster. The same applies to other drivers, including the reference LVM driver.

This has a side effect: the volume in error can no longer be deleted by the SolidFire driver, because the driver is confused by the existence of 3 volumes with the same name on the cluster.

Errors raised by this kind of failure should be handled properly so Cinder does not reschedule the creation of the volume.
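
The behavior described above can be illustrated with a small toy model. This is only a sketch, not Cinder code: FakeBackend, copy_image_to_volume and create_with_reschedule are hypothetical stand-ins, and the assumption (taken from the report) is that the backend volume is created before the image copy fails and that every failure leads to another attempt.

```python
class FakeBackend:
    """Hypothetical stand-in for a storage cluster; records created volumes."""

    def __init__(self):
        self.volumes = []

    def create_volume(self, name):
        self.volumes.append(name)


def copy_image_to_volume(conversion_dir_writable):
    # Hypothetical stand-in: fails when the temporary file cannot be created.
    if not conversion_dir_writable:
        raise OSError("cannot create temporary file in image_conversion_dir")


def create_with_reschedule(backend, name, max_attempts,
                           conversion_dir_writable=False):
    """Retry on every failure, as an unfiltered reschedule would."""
    for _ in range(max_attempts):
        backend.create_volume(name)  # the volume exists before the copy runs
        try:
            copy_image_to_volume(conversion_dir_writable)
            return "available"
        except OSError:
            continue  # rescheduled; the leftover volume is not cleaned up
    return "error"


backend = FakeBackend()
status = create_with_reschedule(backend, "volume-1234", max_attempts=3)
print(status, len(backend.volumes))
```

With max_attempts set to 3 (playing the role of scheduler_max_attempts), the model ends in 'error' with three same-named volumes left on the fake backend, which matches the situation in the report.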

Changed in cinder:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Joshua Harlow (harlowja)
milestone: none → havana-rc1
description: updated
Changed in cinder:
status: Triaged → In Progress

Reviewed: https://review.openstack.org/46177
Committed: http://github.com/openstack/cinder/commit/4227b128a42a1c5785ac13245de511fbcd358e37
Submitter: Jenkins
Branch: master

commit 4227b128a42a1c5785ac13245de511fbcd358e37
Author: Joshua Harlow <email address hidden>
Date: Wed Sep 11 19:36:18 2013 -0700

    Catch generic exceptions

    When the driver copy_image_to_volume fails it can
    at the current time raise more than just CinderException
    as its root exception type. This causes rescheduling due
    to the blacklisted exception list that is used to determine
    if a exception is 'bad enough' to trigger rescheduling or
    should the volume creation action just set the volume to
    error state.

    To avoid the situation where this would cause a rescheduling
    we should make sure (for now) that any exception that is
    emitted on copying an image to a volume is translated
    into a image copy failure and reraised.

    Fixes: bug 1224211

    Change-Id: Ia4a0a81d9e0967b1e7de07577d77084462304c60

Changed in cinder:
status: In Progress → Fix Committed

Reviewed: https://review.openstack.org/46176
Committed: http://github.com/openstack/cinder/commit/25be69584f8a27fd0a9ac808596648905c1716ef
Submitter: Jenkins
Branch: stable/grizzly

commit 25be69584f8a27fd0a9ac808596648905c1716ef
Author: Mathieu Gagné <email address hidden>
Date: Wed Sep 11 22:34:36 2013 -0400

    Do not reschedule if copy_image_to_volume fails

    Catch any exception raised by copy_image_to_volume and wrap them
    in ImageCopyFailure. Raising ImageCopyFailure will prevent
    the volume from being rescheduled.

    Fixes: bug #1224211
    Change-Id: I1366f86927813cadefe742769eb7cff27ed41b7e
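
The approach described in both commit messages, translating any exception from the image copy into a single failure type that is excluded from rescheduling, might look roughly like the sketch below. It is not the actual patch: ImageCopyFailure here is a local stand-in for the exception named in the commits, and NO_RESCHEDULE, copy_image_safely and should_reschedule are hypothetical names used only for illustration.

```python
class ImageCopyFailure(Exception):
    """Local stand-in for the exception named in the commit messages."""


# Hypothetical list of exception types that must not trigger a reschedule.
NO_RESCHEDULE = (ImageCopyFailure,)


def copy_image_safely(copy_fn):
    """Run the copy and translate any error into ImageCopyFailure."""
    try:
        copy_fn()
    except Exception as exc:
        # Whatever the driver raised, re-raise it as an image copy failure.
        raise ImageCopyFailure(str(exc)) from exc


def should_reschedule(exc):
    """Return True only for errors that are not in the no-reschedule list."""
    return not isinstance(exc, NO_RESCHEDULE)


def failing_copy():
    # Simulates the temporary-file failure from the bug description.
    raise OSError("cannot create temporary file in image_conversion_dir")


try:
    copy_image_safely(failing_copy)
except ImageCopyFailure as exc:
    caught = exc

print(should_reschedule(caught))
```

In this sketch a raw OSError would be treated as reschedulable, while the wrapped ImageCopyFailure is not, so the volume would simply be set to the error state after one attempt.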

Thierry Carrez (ttx) on 2013-10-04
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2013-10-17
Changed in cinder:
milestone: havana-rc1 → 2013.2