In large scale environment, deployment fails (EMC VMAX cinder driver)

Bug #1403160 reported by Sridhar Venkat
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
Cinder    Fix Released   Medium       Xing Yang

Bug Description

During deployment, masking views are created, deleted when the last volume in them is deleted, and recreated when needed. The recreation of a masking view fails.

2014-12-16 03:17:15.420 20616 ERROR powervc_cinder.volume.drivers.emc.emc_vmax_common_ext [req-37c5c125-7e64-49e2-a37d-a7ce16ec8594 0 2620fa88fc4241afb9806c0b5dcdfb52 - - -] Failed to get or create masking view OS-8231E2D_105F9AT_1-VP_FC10_R53-F-MV. Examine the cinder volume log for more details about the cause.
2014-12-16 03:17:15.421 20616 ERROR powervc_cinder.volume.drivers.emc.emc_vmax_product [req-37c5c125-7e64-49e2-a37d-a7ce16ec8594 0 2620fa88fc4241afb9806c0b5dcdfb52 - - -] Bad or unexpected response from the storage volume backend API: Failed to get or create masking view OS-8231E2D_105F9AT_1-VP_FC10_R53-F-MV. Examine the cinder volume log for more details about the cause.
2014-12-16 03:17:15.421 20616 TRACE powervc_cinder.volume.drivers.emc.emc_vmax_product Traceback (most recent call last):
2014-12-16 03:17:15.421 20616 TRACE powervc_cinder.volume.drivers.emc.emc_vmax_product File "/usr/lib/python2.6/site-packages/powervc_cinder/volume/drivers/emc/emc_vmax_product.py", line 128, in initialize_connection
2014-12-16 03:17:15.421 20616 TRACE powervc_cinder.volume.drivers.emc.emc_vmax_product volume, connector)
2014-12-16 03:17:15.421 20616 TRACE powervc_cinder.volume.drivers.emc.emc_vmax_product File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/emc/emc_vmax_common.py", line 368, in initialize_connection
2014-12-16 03:17:15.421 20616 TRACE powervc_cinder.volume.drivers.emc.emc_vmax_product self.conn, maskingViewDict)
2014-12-16 03:17:15.421 20616 TRACE powervc_cinder.volume.drivers.emc.emc_vmax_product File "/usr/lib/python2.6/site-packages/powervc_cinder/volume/drivers/emc/emc_vmax_common_ext.py", line 914, in get_or_create_masking_view_and_map_lun
2014-12-16 03:17:15.421 20616 TRACE powervc_cinder.volume.drivers.emc.emc_vmax_product raise exception.VolumeBackendAPIException(data=errorMessage)
2014-12-16 03:17:15.421 20616 TRACE powervc_cinder.volume.drivers.emc.emc_vmax_product VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: Failed to get or create masking view OS-8231E2D_105F9AT_1-VP_FC10_R53-F-MV. Examine the cinder volume log for more details about the cause.
2014-12-16 03:17:15.421 20616 TRACE powervc_cinder.volume.drivers.emc.emc_vmax_product

The same masking view was created earlier and then deleted (because the last volume attached to it was deleted). The exception above occurs while trying to create it again.

The masking view in this problem is OS-8231E2D_105F9AT_1-VP_FC10_R53-F-MV.

This masking view was created at:
2014-12-15 22:45:33.368 20616 INFO cinder.volume.drivers.emc.emc_vmax_masking [req-da3670bf-3d4b-4b3d-8e1c-a6d60e2279c1 0 2620fa88fc4241afb9806c0b5dcdfb52 - - -] Created new masking view : OS-8231E2D_105F9AT_1-VP_FC10_R53-F-MV

It was deleted when the last volume associated with it was removed:

2014-12-16 02:23:44.685 20616 DEBUG cinder.volume.drivers.emc.emc_vmax_masking [req-8e7cd370-c7f1-47c8-883d-05907cb14b36 0 2620fa88fc4241afb9806c0b5dcdfb52 - - -] Last volume in the storage group, deleting masking view //9.114.181.80/root/emc:Symm_LunMaskingView.CreationClassName="Symm_LunMaskingView",SystemName="SYMMETRIX+000198701426",DeviceID="SYMMETRIX+000198701426+OS-8231E2D_105F9AT_1-VP_FC10_R53-F-MV",SystemCreationClassName="Symm_StorageSystem" remove_and_reset_members /usr/lib/python2.6/site-packages/cinder/volume/drivers/emc/emc_vmax_masking.py:1309

It was created again at:
2014-12-16 03:16:03.847 20616 INFO cinder.volume.drivers.emc.emc_vmax_masking [req-30fea4db-f84c-40f1-a62f-2640f9428308 0 2620fa88fc4241afb9806c0b5dcdfb52 - - -] Created new masking view : OS-8231E2D_105F9AT_1-VP_FC10_R53-F-MV

And the failure:
2014-12-16 03:17:15.415 20616 ERROR cinder.volume.drivers.emc.emc_vmax_masking [req-37c5c125-7e64-49e2-a37d-a7ce16ec8594 0 2620fa88fc4241afb9806c0b5dcdfb52 - - -] Error Create Masking View: OS-8231E2D_105F9AT_1-VP_FC10_R53-F-MV. Return code: 1. Error: Create Masking View at step Start of run() failed: C:ERROR_CLASS_SOFTWARE F:ERROR_FAMILY_FAILED R:1000037 L:2 C:ERROR_CLASS_SOFTWARE F:ERROR_FAMILY_FAILED R:1000037 A specified object name is not unique : "StorMaskViewCreate failed" : 2 : 10260 : "Cannot use the specified name because it's already in use" @
  [1] com.emc.cmp.osls.se.osl.Masking.StorMaskViewCreate():2067
  [0] com.emc.cmp.osls.se.array.job.JOB_MaskingViewCreate.run():124

So the deletion at 2014-12-16 02:23:44.685 does not appear to have been clean.

Tags: drivers emc
Sridhar Venkat (svenkat) wrote :
Xing Yang (xing-yang)
Changed in cinder:
assignee: nobody → Xing Yang (xing-yang)
importance: Undecided → Medium
Carl Pecinovsky (csky) wrote :

Actually, the masking view deletion is clean. The problem here is that there are two concurrent attach requests in progress (being serviced by the cinder driver). The first one succeeds in creating the masking view. The second request fails with the error message shown in the description because the masking view now exists, though it did not exist when the initialize_connection request first checked for the masking view existence. So this is another gap in the concurrency handling of multiple attach requests.
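The race described in this comment is a check-then-create gap. A minimal sketch of one way to tolerate it follows, assuming a hypothetical in-memory array (FakeArray) and a hypothetical NameInUseError standing in for the "already in use" error from the log. None of these names come from the cinder driver, and this is not necessarily how the merged fix works.

```python
import threading


class NameInUseError(Exception):
    """Hypothetical stand-in for the array's 'name already in use' error."""


class FakeArray:
    """Hypothetical in-memory stand-in for the storage backend."""

    def __init__(self):
        self._views = set()
        self._lock = threading.Lock()  # models the array's own atomicity

    def find_view(self, name):
        with self._lock:
            return name if name in self._views else None

    def create_view(self, name):
        with self._lock:
            if name in self._views:
                raise NameInUseError(name)
            self._views.add(name)
            return name


def get_or_create_view(array, name):
    """Check-then-create that tolerates losing the race to another request."""
    view = array.find_view(name)
    if view is not None:
        return view
    try:
        return array.create_view(name)
    except NameInUseError:
        # Another request created the view after our check; look it up again.
        view = array.find_view(name)
        if view is None:
            raise
        return view
```

With this shape, the second of two concurrent attach requests would reuse the view created by the first instead of failing on the duplicate name.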

Carl Pecinovsky (csky) wrote :

Xing,
Can you "confirm" this defect? Thanks.

Mike Perez (thingee)
tags: added: drivers emc
Xing Yang (xing-yang)
Changed in cinder:
status: New → Confirmed
Xing Yang (xing-yang)
Changed in cinder:
milestone: none → kilo-3
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/157684

Changed in cinder:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (master)

Change abandoned by xing-yang (<email address hidden>) on branch: master
Review: https://review.openstack.org/157684
Reason: Fixed this issue in https://review.openstack.org/#/c/157683/ as these two patches are related.

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/157683
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=8113c8e1b153ad20ebf1692e8a8b903515d974c6
Submitter: Jenkins
Branch: master

commit 8113c8e1b153ad20ebf1692e8a8b903515d974c6
Author: Xing Yang <email address hidden>
Date: Fri Feb 20 01:38:09 2015 -0500

    Fixed a concurrency issue in VMAX driver

    This patch fixed the following problem:

    When trying to add a second volume to the same masking view,
    the first volume got removed at the same time, causing
    the operation on the second volume to fail.

    When two attach requests happen at the same time on the same
    volume, the second one will fail.

    Also fixed a W503 pep8 issue (line break before binary operator)
    in emc_vmax_common.py.

    Closes-Bug: #1416035
    Closes-Bug: #1403160
    Change-Id: I52975b399c2bd8e2a91bdd09004ee277e54c9a89
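
The first problem in the commit message (a volume being added to a masking view while the last existing volume is removed from it) can be illustrated with a per-view lock that serializes both operations. This is only a sketch under assumptions: MaskingViews and its methods are hypothetical, the state is held in memory, and the commit message does not say whether the merged change takes this approach.

```python
import threading
from collections import defaultdict


class MaskingViews:
    """Hypothetical in-memory model: view name -> set of volume ids."""

    def __init__(self):
        self._views = {}
        self._locks = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()

    def _lock_for(self, name):
        # Guard the lock table itself so two threads get the same lock object.
        with self._locks_guard:
            return self._locks[name]

    def add_volume(self, name, volume_id):
        # Creates the view if it is missing, then adds the volume.
        with self._lock_for(name):
            self._views.setdefault(name, set()).add(volume_id)

    def remove_volume(self, name, volume_id):
        # Deletes the view when its last volume is removed.
        with self._lock_for(name):
            volumes = self._views.get(name)
            if volumes is None:
                return
            volumes.discard(volume_id)
            if not volumes:
                del self._views[name]

    def volumes(self, name):
        with self._lock_for(name):
            return set(self._views.get(name, ()))
```

Because add and remove on the same view name hold the same lock, removing the last volume cannot delete the view halfway through a concurrent add; whichever runs second sees a consistent state.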

Changed in cinder:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: kilo-3 → 2015.1.0