VMAX initialize_conn failure when concurrent terminate req for different volume

Bug #1416035 reported by Carl Pecinovsky
This bug affects 1 person
Affects: Cinder
Status: Fix Released
Importance: Medium
Assigned to: Xing Yang
Milestone: 2015.1.0

Bug Description

This concurrency issue has the following pieces:

* A storage group with a single volume, Vol1, behind a masking view.
* An initialize_connection request is in progress for the masking view associated with that storage group; the request is to add a second volume, Vol2, to the masking view.
* A terminate_connection request is also in progress for Vol1 and the same masking view.
* The initialize_connection request gets to the point of looking up the masking view and storage group to use.
* terminate_connection then runs, and since Vol1 is the last volume in the storage group, the storage group and the masking view are removed.
* initialize_connection now continues processing but blows up when it tries to add Vol2 to the now non-existent storage group (a minimal sketch of this interleaving follows below).
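A minimal, self-contained sketch of that interleaving. The storage_groups dict, the 'OS-SG' name, and the sleeps below are illustrative stand-ins for the array-side calls, not the driver's actual code:

    import threading
    import time

    storage_groups = {'OS-SG': ['Vol1']}   # storage group behind the masking view


    def initialize_connection(volume):
        sg_name = 'OS-SG' if 'OS-SG' in storage_groups else None   # look up the SG
        time.sleep(0.2)                     # window in which terminate_connection runs
        # Blows up: the storage group was deleted after the lookup.
        storage_groups[sg_name].append(volume)


    def terminate_connection(volume):
        sg = storage_groups['OS-SG']
        sg.remove(volume)
        if not sg:                          # Vol1 was the last volume...
            del storage_groups['OS-SG']     # ...so the SG (and masking view) go away


    init = threading.Thread(target=initialize_connection, args=('Vol2',))
    term = threading.Thread(target=terminate_connection, args=('Vol1',))
    init.start()
    time.sleep(0.1)
    term.start()
    init.join()   # the init thread's KeyError('OS-SG') traceback is printed here
    term.join()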

So the use case here is concurrent cloud deployments: a Nova host has just come into use, and a new VM is deployed at about the same time the only existing VM is deleted from that host.

It seems like initialize_connection and terminate_connection need to introduce some level of locking around the masking view/storage group name, such that terminate_connection will not remove those entities unless the storage group is both empty AND not locked.
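A minimal sketch of that kind of per-masking-view lock, using oslo.concurrency's lockutils; the storage_groups dict, the lock-name prefix, and the function signatures are illustrative assumptions, not the driver's actual code:

    from oslo_concurrency import lockutils

    # In-memory stand-in for the array's storage groups, keyed by masking view name.
    storage_groups = {'OS-MV': ['Vol1']}


    def initialize_connection(volume, masking_view_name):
        # Hold the lock across the whole lookup-then-add sequence so the
        # storage group cannot be deleted in between.
        with lockutils.lock('emc-mv-' + masking_view_name):
            sg = storage_groups.setdefault(masking_view_name, [])
            sg.append(volume)


    def terminate_connection(volume, masking_view_name):
        with lockutils.lock('emc-mv-' + masking_view_name):
            sg = storage_groups.get(masking_view_name, [])
            if volume in sg:
                sg.remove(volume)
            # Delete the now-empty group only while still holding the lock.
            if not sg:
                storage_groups.pop(masking_view_name, None)

Note that an in-process lock like this only serializes threads within one cinder-volume service; if multiple processes can touch the same masking view, an external (file-based) lock, e.g. lockutils.lock(..., external=True) with a configured lock_path, would be needed instead.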

Tags: drivers emc vmax
Carl Pecinovsky (csky)
description: updated
tags: added: drivers
tags: added: emc
Xing Yang (xing-yang)
Changed in cinder:
assignee: nobody → Xing Yang (xing-yang)
tags: added: vmax
Revision history for this message
Xing Yang (xing-yang) wrote :

I wonder if this problem still exists with all the bug fixes we submitted lately. Locking could potentially introduce performance problems and other issues. We are actually trying to get rid of locks in Cinder. We will investigate this issue.

Xing Yang (xing-yang)
Changed in cinder:
milestone: none → kilo-3
importance: Undecided → Medium
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/157683

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/157683
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=8113c8e1b153ad20ebf1692e8a8b903515d974c6
Submitter: Jenkins
Branch: master

commit 8113c8e1b153ad20ebf1692e8a8b903515d974c6
Author: Xing Yang <email address hidden>
Date: Fri Feb 20 01:38:09 2015 -0500

    Fixed a concurrency issue in VMAX driver

    This patch fixed the following problem:

    When trying to add a second volume to the same masking view,
    the first volume got removed at the same time, causing
    the operation on the second volume to fail.

    When two attach requests happen at the same time on the same
    volume, the second one will fail.

    Also fixed a W503 pep8 issue (line break before binary operator)
    in emc_vmax_common.py.

    Closes-Bug: #1416035
    Closes-Bug: #1403160
    Change-Id: I52975b399c2bd8e2a91bdd09004ee277e54c9a89

Changed in cinder:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: kilo-3 → 2015.1.0