Race conditions on xtremio with client3

Bug #1521143 reported by Gorka Eguileor
This bug affects 1 person
Affects   Status          Importance   Assigned to      Milestone
Cinder    Fix Released    Medium       Gorka Eguileor

Bug Description

Description of problem:
When using FC with an XtremIO backend that uses XtremIOClient3, FC's terminate_connection can raise exceptions when it shouldn't:

    VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: Unable to terminate volume connection: Resource could not be found.

Expected behavior:
terminate_connection should complete without raising any exception.

Cause:
Client3 is open to race conditions in 2 methods, find_lunmap and num_of_mapped_volumes, where a list of mappings is first retrieved and then we iterate over this list to retrieve additional info on each mapping.

The race happens if one of the mappings is removed from the backend after we have retrieved the list but before we have retrieved the additional info for that mapping.
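
The list-then-fetch pattern described above can be reproduced with a small sketch. This is only an illustration under stated assumptions: FakeBackend, its methods, the "initiator" field and the between_calls hook are all hypothetical stand-ins, not the real XtremIO client or REST API.

```python
class NotFound(Exception):
    """Stand-in for the driver's 'resource could not be found' error."""


class FakeBackend:
    """Hypothetical in-memory backend; not the real XtremIO client."""

    def __init__(self, mappings):
        self._mappings = dict(mappings)

    def list_mappings(self):
        # Step 1: names of the mappings that exist at this instant.
        return list(self._mappings)

    def get_mapping(self, name):
        # Step 2: one lookup per mapping to fetch its details.
        try:
            return self._mappings[name]
        except KeyError:
            raise NotFound(name)

    def remove_mapping(self, name):
        del self._mappings[name]


def num_of_mapped_volumes_racy(backend, initiator, between_calls=None):
    """Count an initiator's mappings with an unprotected list-then-fetch."""
    names = backend.list_mappings()
    if between_calls is not None:
        # Simulates another request changing the backend inside the window.
        between_calls()
    count = 0
    for name in names:
        # Raises NotFound if the mapping vanished after it was listed.
        if backend.get_mapping(name)["initiator"] == initiator:
            count += 1
    return count
```

Passing a between_calls callback that removes one mapping makes the function raise NotFound, which is the kind of failure that surfaces as the VolumeBackendAPIException shown above.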

Gorka Eguileor (gorka)
Changed in cinder:
assignee: nobody → Gorka Eguileor (gorka)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/251305

Changed in cinder:
status: New → In Progress
Xing Yang (xing-yang)
tags: added: drivers emc xtremio
Jay Bryant (jsbryant)
Changed in cinder:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/251305
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=89480f159231eae154119da6b46432849b9df20a
Submitter: Jenkins
Branch: master

commit 89480f159231eae154119da6b46432849b9df20a
Author: Gorka Eguileor <email address hidden>
Date: Mon Nov 30 12:04:58 2015 +0100

    Take into consideration races in XtremIOClient3

    When working with FC and xtremio is using XtremIOClient3 on FC's
    terminate_connection we can get VolumeBackendAPIException saying a
    resource could not be found when we shouldn't.

    The cause is that Client3 is open to race conditions on 2 methods,
    find_lunmap and num_of_mapped_volumes, where a list of mappings is first
    retrieved and then we iterate this list to retrieve additional info on
    the mappings.

    The race would happen if one of the mappings is removed from the backend
    in the time it takes to retrieve the additional info after we have
    retrieved the list.

    This patch fixes this issue by ignoring any mappings that have been
    removed and are now NotFound when retrieving additional information for
    a mapping in those 2 methods.

    This patch also fixes this kind of race problems on volume creation
    since it uses the find_lunmap method.

    Closes-Bug: #1521143
    Change-Id: I40831e04093ff475395870a333211dd0cb60440f
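
The approach described in the commit message (ignore mappings that have become NotFound) might look roughly like the sketch below. It is not the merged patch: the function name, the list_mappings and get_mapping callables, and the "ig-name" and "vol-name" keys are assumptions made for illustration.

```python
class NotFound(Exception):
    """Stand-in for the driver's 'resource could not be found' error."""


def find_lunmap_tolerant(list_mappings, get_mapping, ig_name, vol_name):
    """Return the first mapping matching ig_name and vol_name, or None.

    list_mappings and get_mapping are hypothetical callables standing in
    for the backend requests; the dictionary keys are assumed as well.
    """
    for name in list_mappings():
        try:
            details = get_mapping(name)
        except NotFound:
            # The mapping was removed after it was listed; skip it
            # instead of failing the whole operation.
            continue
        if details["ig-name"] == ig_name and details["vol-name"] == vol_name:
            return details
    return None
```

Only NotFound is swallowed here, so any other backend error still propagates to the caller.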

Changed in cinder:
status: In Progress → Fix Committed
Eric Harney (eharney)
tags: added: liberty-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/252360

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/liberty)

Reviewed: https://review.openstack.org/252360
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=2c12909705e2536dd39f22096167fbf5e4719e2e
Submitter: Jenkins
Branch: stable/liberty

commit 2c12909705e2536dd39f22096167fbf5e4719e2e
Author: Gorka Eguileor <email address hidden>
Date: Mon Nov 30 12:04:58 2015 +0100

    Take into consideration races in XtremIOClient3

    When working with FC and xtremio is using XtremIOClient3 on FC's
    terminate_connection we can get VolumeBackendAPIException saying a
    resource could not be found when we shouldn't.

    The cause is that Client3 is open to race conditions on 2 methods,
    find_lunmap and num_of_mapped_volumes, where a list of mappings is first
    retrieved and then we iterate this list to retrieve additional info on
    the mappings.

    The race would happen if one of the mappings is removed from the backend
    in the time it takes to retrieve the additional info after we have
    retrieved the list.

    This patch fixes this issue by ignoring any mappings that have been
    removed and are now NotFound when retrieving additional information for
    a mapping in those 2 methods.

    This patch also fixes this kind of race problems on volume creation
    since it uses the find_lunmap method.

    Closes-Bug: #1521143
    Change-Id: I40831e04093ff475395870a333211dd0cb60440f
    (cherry picked from commit 89480f159231eae154119da6b46432849b9df20a)

tags: added: in-stable-liberty
Thierry Carrez (ttx) wrote : Fix included in openstack/cinder 8.0.0.0b1

This issue was fixed in the openstack/cinder 8.0.0.0b1 development milestone.

Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) wrote : Fix included in cinder 7.0.1

This issue was fixed in the cinder 7.0.1 release (liberty).
