VMAX initialize_conn failure when concurrent terminate req for different volume

Bug #1416035 reported by Carl Pecinovsky
This bug affects 1 person
Affects: Cinder
Status: Fix Released
Importance: Medium
Assigned to: Xing Yang
Milestone: 2015.1.0

Bug Description

This concurrency issue has the following pieces:

* A storage group with a single volume, Vol1, behind a masking view.
* An initialize_connection request is in progress for the masking view associated with that storage group; the request is to add a second volume, Vol2, to the masking view.
* A terminate_connection request is also in progress for Vol1 and the same masking view.
* The initialize_connection request gets to the point of looking up the masking view and storage group to use.
* terminate_connection then runs, and since Vol1 is the last volume in the storage group, the storage group and the masking view are removed.
* initialize_connection now continues processing but blows up when it tries to add Vol2 to the now non-existent storage group (a minimal sketch of this interleaving follows below).
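A minimal, self-contained sketch of that interleaving. The storage_groups dict, the 'OS-SG' name, and the sleeps below are illustrative stand-ins for the array-side calls, not the driver's actual code:

    import threading
    import time

    storage_groups = {'OS-SG': ['Vol1']}   # storage group behind the masking view


    def initialize_connection(volume):
        sg_name = 'OS-SG' if 'OS-SG' in storage_groups else None   # look up the SG
        time.sleep(0.2)                     # window in which terminate_connection runs
        # Blows up: the storage group was deleted after the lookup.
        storage_groups[sg_name].append(volume)


    def terminate_connection(volume):
        sg = storage_groups['OS-SG']
        sg.remove(volume)
        if not sg:                          # Vol1 was the last volume...
            del storage_groups['OS-SG']     # ...so the SG (and masking view) go away


    init = threading.Thread(target=initialize_connection, args=('Vol2',))
    term = threading.Thread(target=terminate_connection, args=('Vol1',))
    init.start()
    time.sleep(0.1)
    term.start()
    init.join()   # the init thread's KeyError('OS-SG') traceback is printed here
    term.join()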

So the use case here is concurrent cloud deployments: a Nova host has just come into use, and a new VM is deployed at about the same time the only existing VM is deleted from that host.

It seems like initialize_connection and terminate_connection need to introduce some level of locking around the masking view/storage group name, such that terminate_connection will not remove those entities unless the storage group is both empty AND not locked.
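A minimal sketch of that kind of per-masking-view lock, using oslo.concurrency's lockutils; the storage_groups dict, the lock-name prefix, and the function signatures are illustrative assumptions, not the driver's actual code:

    from oslo_concurrency import lockutils

    # In-memory stand-in for the array's storage groups, keyed by masking view name.
    storage_groups = {'OS-MV': ['Vol1']}


    def initialize_connection(volume, masking_view_name):
        # Hold the lock across the whole lookup-then-add sequence so the
        # storage group cannot be deleted in between.
        with lockutils.lock('emc-mv-' + masking_view_name):
            sg = storage_groups.setdefault(masking_view_name, [])
            sg.append(volume)


    def terminate_connection(volume, masking_view_name):
        with lockutils.lock('emc-mv-' + masking_view_name):
            sg = storage_groups.get(masking_view_name, [])
            if volume in sg:
                sg.remove(volume)
            # Delete the now-empty group only while still holding the lock.
            if not sg:
                storage_groups.pop(masking_view_name, None)

Note that an in-process lock like this only serializes threads within one cinder-volume service; if multiple processes can touch the same masking view, an external (file-based) lock, e.g. lockutils.lock(..., external=True) with a configured lock_path, would be needed instead.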

Tags: drivers emc vmax
Carl Pecinovsky (csky)
description: updated
tags: added: drivers
tags: added: emc
Xing Yang (xing-yang)
Changed in cinder:
assignee: nobody → Xing Yang (xing-yang)
tags: added: vmax
Revision history for this message
Xing Yang (xing-yang) wrote :

I wonder if this problem still exists with all the bug fixes we submitted lately. Locking could potentially introduce performance problems and other issues. We are actually trying to get rid of locks in Cinder. We will investigate this issue.

Xing Yang (xing-yang)
Changed in cinder:
milestone: none → kilo-3
importance: Undecided → Medium
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/157683

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/157683
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=8113c8e1b153ad20ebf1692e8a8b903515d974c6
Submitter: Jenkins
Branch: master

commit 8113c8e1b153ad20ebf1692e8a8b903515d974c6
Author: Xing Yang <email address hidden>
Date: Fri Feb 20 01:38:09 2015 -0500

    Fixed a concurrency issue in VMAX driver

    This patch fixed the following problem:

    When trying to add a second volume to the same masking view,
    the first volume got removed at the same time, causing
    the operation on the second volume to fail.

    When two attach requests happen at the same time on the same
    volume, the second one will fail.

    Also fixed a W503 pep8 issue (line break before binary operator)
    in emc_vmax_common.py.

    Closes-Bug: #1416035
    Closes-Bug: #1403160
    Change-Id: I52975b399c2bd8e2a91bdd09004ee277e54c9a89

Changed in cinder:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: kilo-3 → 2015.1.0