Cinder

VSA volume creation fails with "the same name already exists."

Bug #1779654 reported by Eiji Kobayashi on 2018-07-02

This bug affects 1 person

Affects   Status         Importance   Assigned to   Milestone
Cinder    Fix Released   Undecided    Vivek Soni

Bug Description

nova boot fails with "--max-count" option

$ nova boot --min-count=3 --max-count=3 --flavor XXflavor --block-device source=image,id=${MY_CENTOS_IMG},dest=volume,size=10,shutdown=remove,bootindex=0 --key-name key-for-internal --security-groups sg-all-from-private-net --availability-zone AZ2 --nic net-id=${MY_WORK_NET} --nic net-id=${MY_PRV_NET} web_ext_az2

<cinder-volume.log>
2018-06-07 08:03:48.787 17974 WARNING cinder.volume.manager [req-8a322777-ae06-4765-807c-d987f049288c 45336d44e7174de5b7d0430639437ccf 7ce30e9231de44009f10b555896105d4 - default default] Task 'cinder.volume.flows.manager.create_volume.ExtractVolumeRefTask;volume:create' (34a3e9a3-fca4-4ff5-8290-fd7c3e3528b7) transitioned into state 'REVERTED' from state 'REVERTING' with result 'None'
2018-06-07 08:03:48.790 17974 WARNING cinder.volume.manager [req-8a322777-ae06-4765-807c-d987f049288c 45336d44e7174de5b7d0430639437ccf 7ce30e9231de44009f10b555896105d4 - default default] Flow 'volume_create_manager' (074daf4a-94d5-4884-b420-6684e061b6a8) transitioned into state 'REVERTED' from state 'RUNNING'
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server [req-8a322777-ae06-4765-807c-d987f049288c 45336d44e7174de5b7d0430639437ccf 7ce30e9231de44009f10b555896105d4 - default default] Exception during message handling
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/cinder/volume/manager.py", line 4367, in create_volume
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server allow_reschedule=allow_reschedule, volume=volume)
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/cinder/volume/manager.py", line 635, in create_volume
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server _run_flow()
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/cinder/volume/manager.py", line 627, in _run_flow
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server flow_engine.run()
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/taskflow/engines/action_engine/engine.py", line 247, in run
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server for _state in self.run_iter(timeout=timeout):
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/taskflow/engines/action_engine/engine.py", line 340, in run_iter
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server failure.Failure.reraise_if_any(er_failures)
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/taskflow/types/failure.py", line 336, in reraise_if_any
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server failures[0].reraise()
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/taskflow/types/failure.py", line 343, in reraise
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server six.reraise(*self._exc_info)
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server result = task.execute(**arguments)
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/cinder/volume/flows/manager/create_volume.py", line 853, in execute
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server **volume_spec)
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/cinder/volume/flows/manager/create_volume.py", line 790, in _create_from_image
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server image_service
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/cinder/volume/flows/manager/create_volume.py", line 669, in _create_from_image_download
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server image_service)
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server File "/opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/cinder/volume/flows/manager/create_volume.py", line 550, in _copy_image_to_volume
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server raise exception.ImageCopyFailure(reason=ex)
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server ImageCopyFailure: Failed to copy image to volume: Bad or unexpected response from the storage volume backend API: Unable to fetch connection information from backend: Error (HTTP 500) OPERATION_FAILED - The operation failed: The volume list 'helion-cp1-c1-m1-mgmtVolumeList1528358624' cannot be created because a volume list with the same name already exists. Enter a different name for the new volume list.
2018-06-07 08:03:48.792 17974 ERROR oslo_messaging.rpc.server

The Cinder volume is not created when this issue occurs.

If I set "hplefthand_debug = true", the following message is recorded.

2018-06-07 08:45:57.440 30555 DEBUG cinder.volume.drivers.hpe.hpe_lefthand_iscsi [req-9bb0e9b4-33dc-4956-9a8b-baf8add719d5 45336d44e7174de5b7d0430639437ccf 7ce30e9231de44009f10b555896105d4 - default default] <== initialize_connection: exception (247ms) VolumeBackendAPIException(u'Error (HTTP 500) SERVER_ALREADY_EXISTS - The server with the name "$1" already exists. Use a unique name and try again.',) trace_logging_wrapper /opt/stack/venv/cinder-20171028T231402Z/lib/python2.7/site-packages/cinder/utils.py:901

Increasing the --max-count value makes the failure occur more often.

Revision history for this message

Sean McGinnis (sean-mcginnis) wrote on 2018-07-02:

Might be better to bring this up through your Helion support. It appears there is some configuration or connectivity issue with your storage backend that may have caused a retry that created the same thing twice, but that can be better diagnosed by them and not the upstream community.

tags:

added: drivers helion hpe

Eiji Kobayashi (kobayashie) wrote on 2018-07-03:

This problem occurred with the following configurations.

Release   LeftHand OS      hpelefthand driver   Reproduced
Liberty   11.5.01.1.0079   1.0.9                Yes
Mitaka    11.5.01.1.0079   2.0.8                Yes
Mitaka    11.5.01.0099     2.0.8                No
Mitaka    11.5.01.0099     2.0.8                No
Newton    11.5.01.0099     2.0.10               Yes
Rocky     12.7             2.0.15               Only once

There is no difference in the VSA settings.

Vivek Soni (viveksoni) wrote on 2018-07-17:

Furthermore, the nova boot command executed in your case creates a bootable volume and spawns a new nova instance from it. That volume is attached, so its status should be “in-use”.
In this case, only two driver APIs are invoked: create_volume() and initialize_connection().

There are a couple of error messages in the provided log file “cinder-volume.log.2”:
1) “Delete snapshot failed, due to snapshot busy.”
2) “2018-06-07 08:03:48.693 17974 ERROR cinder.volume.manager ImageCopyFailure: Failed to copy image to volume: Bad or unexpected response from the storage volume backend API: Unable to fetch connection information from backend: Error (HTTP 500) OPERATION_FAILED - The operation failed: The volume list 'helion-cp1-c1-m1-mgmtVolumeList1528358624' cannot be created because a volume list with the same name already exists. Enter a different name for the new volume list.”

3) 2018-06-07 08:45:57.441 30555 ERROR cinder.volume.driver [req-9bb0e9b4-33dc-4956-9a8b-baf8add719d5 45336d44e7174de5b7d0430639437ccf 7ce30e9231de44009f10b555896105d4 - default default] Unable to fetch connection information from backend: Error (HTTP 500) SERVER_ALREADY_EXISTS - The server with the name "$1" already exists. Use a unique name and try again.

>> The WSAPI returns the error "SERVER_ALREADY_EXISTS" when an attachment tries to create a SERVER that is already present.
However, the driver handles this scenario: it first checks whether the SERVER exists (source code link) and creates a new SERVER only if it is not found. I tried to reproduce the error with python-lefthandclient alone, not with cinder.

Vivek Soni (viveksoni) wrote on 2018-07-17:

This 'SERVER_ALREADY_EXISTS' error occurs only when we try to create a server that already exists. I am able to get this error using a standalone script:

# creating NEW server
>>> server_info = cl.createServer("cld6b2", "iqn.1993-08.org.debian:01:686b42f5dcc8")
>>> server_info
{u'inServerCluster': False, u'chapTargetSecret': u'', u'iscsiEnabled': True, u'modified': u'', u'iscsiIQN': u'iqn.1993-08.org.debian:01:686b42f5dcc8', u'id': 5904, u'wwnnList': [], u'wwpnList': [], u'chapInitiatorSecret': u'', u'type': u'server', u'bootVolumeID': 0, u'iscsiLoadBalancingEnabled': True, u'description': u'', u'chapName': u'', u'chapAuthenticationRequired': False, u'bootVolumeLun': 0, u'controllingServerName': u'', u'fibreChannelEnabled': False, u'name': u'cld6b2', u'created': u'', u'uri': u'/lhos/servers/5904', u'fibreChannelPaths': None}

# creating a server that already EXISTS -- ERROR
>>> server_info = cl.createServer("cld6b2", "iqn.1993-08.org.debian:01:686b42f5dcc8")
/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py:857: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/hpelefthandclient/client.py", line 263, in createServer
    response, body = self.http.post('/servers', body=info)
  File "/usr/local/lib/python2.7/dist-packages/hpelefthandclient/http.py", line 343, in post
    return self._cs_request(url, 'POST', **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/hpelefthandclient/http.py", line 291, in _cs_request
    **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/hpelefthandclient/http.py", line 267, in _time_request
    resp, body = self.request(url, method, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/hpelefthandclient/http.py", line 261, in request
    raise exceptions.from_response(resp, body)
hpelefthandclient.exceptions.HTTPServerError: Error (HTTP 500) SERVER_ALREADY_EXISTS - The server with the name "$1" already exists. Use a unique name and try again.

--------------------------------

This will not occur in the VSA driver flow, because the driver first checks whether the server exists:
if server exists:
    # it returns and uses that server
    # code reference - https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/hpe/hpe_lefthand_iscsi.py#L964
else:
    # it creates a new server
    # code reference - https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/hpe/hpe_lefthand_iscsi.py#L985
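
A minimal sketch of the check-then-create flow outlined above. It uses a hypothetical in-memory stand-in (FakeBackend) instead of the real hpelefthandclient, so the class, the exception names and the helper get_or_create_server are illustrative assumptions and not the driver's code. Only the method names getServerByName and createServer are taken from this thread.

```python
class ServerNotFound(Exception):
    """Hypothetical stand-in for the client's 'not found' error."""


class ServerAlreadyExists(Exception):
    """Hypothetical stand-in for the SERVER_ALREADY_EXISTS error."""


class FakeBackend:
    """In-memory stand-in for the array; not the real hpelefthandclient."""

    def __init__(self):
        self._servers = {}

    def getServerByName(self, name):
        if name not in self._servers:
            raise ServerNotFound(name)
        return self._servers[name]

    def createServer(self, name, iqn):
        if name in self._servers:
            raise ServerAlreadyExists(name)
        self._servers[name] = {"name": name, "iscsiIQN": iqn}
        return self._servers[name]


def get_or_create_server(backend, name, iqn):
    """Reuse the server if it exists, otherwise create it."""
    try:
        return backend.getServerByName(name)
    except ServerNotFound:
        return backend.createServer(name, iqn)


backend = FakeBackend()
iqn = "iqn.1993-08.org.debian:01:686b42f5dcc8"
first = get_or_create_server(backend, "cld6b2", iqn)   # creates the server
second = get_or_create_server(backend, "cld6b2", iqn)  # reuses it, no error
print(first is second)
```

When the two calls run one after the other, the second call finds the existing server and never reaches createServer().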

--------------------------------

Vivek Soni (viveksoni) wrote on 2018-07-18:

Can you please provide the test setup?
I tried to set up a Newton environment, but it failed because of a dependency error.

Vivek Soni (viveksoni) wrote on 2018-07-31:

This occurs because of concurrent requests. Both requests call the 'getServerByName' API (https://github.com/openstack/cinder/blob/driverfixes/newton/cinder/volume/drivers/hpe/hpe_lefthand_iscsi.py#L898), and both calls fail because the server does not exist yet. There is then a race to reach the call at https://github.com/openstack/cinder/blob/driverfixes/newton/cinder/volume/drivers/hpe/hpe_lefthand_iscsi.py#L919

The request that reaches createServer() first succeeds, and the other request fails with the error:
hpelefthandclient.exceptions.HTTPServerError: Error (HTTP 500) SERVER_ALREADY_EXISTS - The server with the name "$1" already exists. Use a unique name and try again.
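
The race described here can be reproduced in isolation. The sketch below is only an illustration: FakeBackend, the exception classes, the attach function and the host name and IQN are hypothetical stand-ins, and a threading.Barrier is used to force both requests to finish the existence check before either one creates the server.

```python
import threading


class ServerNotFound(Exception):
    """Hypothetical stand-in for the client's 'not found' error."""


class ServerAlreadyExists(Exception):
    """Hypothetical stand-in for the SERVER_ALREADY_EXISTS error."""


class FakeBackend:
    """In-memory stand-in for the array; not the real hpelefthandclient."""

    def __init__(self):
        self._servers = {}
        self._guard = threading.Lock()  # keeps the fake's own dict consistent

    def getServerByName(self, name):
        with self._guard:
            if name not in self._servers:
                raise ServerNotFound(name)
            return self._servers[name]

    def createServer(self, name, iqn):
        with self._guard:
            if name in self._servers:
                raise ServerAlreadyExists(name)
            self._servers[name] = {"name": name, "iscsiIQN": iqn}
            return self._servers[name]


def attach(backend, name, iqn, both_checked, results):
    """Unlocked check-then-create, as in the failing flow."""
    try:
        backend.getServerByName(name)
        found = True
    except ServerNotFound:
        found = False
    both_checked.wait()  # both requests have now completed the check
    if found:
        results.append("reused")
        return
    try:
        backend.createServer(name, iqn)
        results.append("created")
    except ServerAlreadyExists:
        results.append("SERVER_ALREADY_EXISTS")


backend = FakeBackend()
both_checked = threading.Barrier(2)
results = []
threads = [
    threading.Thread(
        target=attach,
        args=(backend, "host-a", "iqn.2018-08.example:host-a", both_checked, results),
    )
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))
```

One request records "created" and the other records "SERVER_ALREADY_EXISTS", which matches the behaviour seen when several instances are booted at once.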

Fix:
------
We need a locking mechanism to avoid this race condition.
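
One way such locking could look is sketched below, assuming a plain in-process threading.Lock per host name. This is not the merged change, which may be implemented differently; FakeBackend, the exception classes, lock_for and get_or_create_server are hypothetical names used only for illustration.

```python
import threading
from collections import defaultdict


class ServerNotFound(Exception):
    """Hypothetical stand-in for the client's 'not found' error."""


class ServerAlreadyExists(Exception):
    """Hypothetical stand-in for the SERVER_ALREADY_EXISTS error."""


class FakeBackend:
    """In-memory stand-in for the array; not the real hpelefthandclient."""

    def __init__(self):
        self._servers = {}
        self._guard = threading.Lock()
        self.create_calls = 0  # counts createServer() calls for the demo

    def getServerByName(self, name):
        with self._guard:
            if name not in self._servers:
                raise ServerNotFound(name)
            return self._servers[name]

    def createServer(self, name, iqn):
        with self._guard:
            self.create_calls += 1
            if name in self._servers:
                raise ServerAlreadyExists(name)
            self._servers[name] = {"name": name, "iscsiIQN": iqn}
            return self._servers[name]


_name_locks = defaultdict(threading.Lock)
_name_locks_guard = threading.Lock()


def lock_for(name):
    """Return the lock dedicated to one host name."""
    with _name_locks_guard:
        return _name_locks[name]


def get_or_create_server(backend, name, iqn):
    """Check and create under one lock so the two steps cannot interleave."""
    with lock_for(name):
        try:
            return backend.getServerByName(name)
        except ServerNotFound:
            return backend.createServer(name, iqn)


backend = FakeBackend()
errors = []


def worker():
    try:
        get_or_create_server(backend, "host-a", "iqn.2018-08.example:host-a")
    except ServerAlreadyExists as exc:
        errors.append(exc)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(errors), backend.create_calls)
```

With the lock held across both steps, only the first request calls createServer() and the rest reuse the result. Note that an in-process lock like this only serializes threads inside one process; requests handled by separate processes or hosts would need a shared lock instead.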

Eiji Kobayashi (kobayashie) on 2018-08-01

description:

updated

OpenStack Infra (hudson-openstack) wrote on 2018-08-02: Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/588171

Changed in cinder:
assignee:	nobody → Vivek Soni (viveksoni)
status:	New → In Progress

OpenStack Infra (hudson-openstack) wrote on 2018-08-02: Fix merged to cinder (master)

Reviewed: https://review.openstack.org/588171
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=8a039bb5cd00baac5e97081621e57820586a33d5
Submitter: Zuul
Branch: master

commit 8a039bb5cd00baac5e97081621e57820586a33d5
Author: Vivek Soni <email address hidden>
Date: Thu Aug 2 04:05:23 2018 -0400

VSA: Concurrent request handling in attachment

    Issue: VSA cannot have two server created with
    same host name, so when concurrent request tries
    to create server only one request succeed and
    subsequent request gets response of exception
    of already exists.

    This fix handle concurrent request and ensure
    only one request with same host name request
    server creation on VSA.

Change-Id: I41815510f0e6bf865a46d1efd7b73f613ddbe736
Closes-Bug: #1779654

Changed in cinder:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2018-08-06: Fix included in openstack/cinder 13.0.0.0b3

This issue was fixed in the openstack/cinder 13.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote on 2018-08-07: Fix proposed to cinder (driverfixes/newton)

#10

Fix proposed to branch: driverfixes/newton
Review: https://review.openstack.org/589532

OpenStack Infra (hudson-openstack) wrote on 2018-08-07: Change abandoned on cinder (driverfixes/newton)

#11

Change abandoned by Keith Berger (<email address hidden>) on branch: driverfixes/newton
Review: https://review.openstack.org/589532

OpenStack Infra (hudson-openstack) wrote on 2018-08-07: Fix proposed to cinder (stable/queens)

#12

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/589649

OpenStack Infra (hudson-openstack) wrote on 2018-08-07: Fix proposed to cinder (stable/pike)

#13

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/589650

OpenStack Infra (hudson-openstack) wrote on 2018-08-09: Fix merged to cinder (stable/queens)

#14

Reviewed: https://review.openstack.org/589649
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=ff4696b41e42e56112f475b0c64f7fcc61f9ac58
Submitter: Zuul
Branch: stable/queens

commit ff4696b41e42e56112f475b0c64f7fcc61f9ac58
Author: Vivek Soni <email address hidden>
Date: Thu Aug 2 04:05:23 2018 -0400

VSA: Concurrent request handling in attachment

    This fix handle concurrent request and ensure
    only one request with same host name request
    server creation on VSA.

    Change-Id: I41815510f0e6bf865a46d1efd7b73f613ddbe736
    Closes-Bug: #1779654
    (cherry picked from commit 8a039bb5cd00baac5e97081621e57820586a33d5)

tags:

added: in-stable-queens

OpenStack Infra (hudson-openstack) wrote on 2018-08-10: Fix merged to cinder (stable/pike)

#15

Reviewed: https://review.openstack.org/589650
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=04c8241433a5ffa5aeba295d461f70dcc137870d
Submitter: Zuul
Branch: stable/pike

commit 04c8241433a5ffa5aeba295d461f70dcc137870d
Author: Vivek Soni <email address hidden>
Date: Thu Aug 2 04:05:23 2018 -0400

VSA: Concurrent request handling in attachment

    This fix handle concurrent request and ensure
    only one request with same host name request
    server creation on VSA.

    Change-Id: I41815510f0e6bf865a46d1efd7b73f613ddbe736
    Closes-Bug: #1779654
    (cherry picked from commit 8a039bb5cd00baac5e97081621e57820586a33d5)

tags:

added: in-stable-pike

OpenStack Infra (hudson-openstack) wrote on 2018-08-20: Fix proposed to cinder (stable/ocata)

#16

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/593638

OpenStack Infra (hudson-openstack) wrote on 2018-08-20: Fix merged to cinder (stable/ocata)

#17

Reviewed: https://review.openstack.org/593638
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=fec61686a0ef7cfe586e44428f7d5dfc2f571fd7
Submitter: Zuul
Branch: stable/ocata

commit fec61686a0ef7cfe586e44428f7d5dfc2f571fd7
Author: Vivek Soni <email address hidden>
Date: Thu Aug 2 04:05:23 2018 -0400

VSA: Concurrent request handling in attachment

    This fix handle concurrent request and ensure
    only one request with same host name request
    server creation on VSA.

    Change-Id: I41815510f0e6bf865a46d1efd7b73f613ddbe736
    Closes-Bug: #1779654
    (cherry picked from commit 8a039bb5cd00baac5e97081621e57820586a33d5)

tags:

added: in-stable-ocata

OpenStack Infra (hudson-openstack) wrote on 2018-08-24: Fix merged to cinder (driverfixes/newton)

#18

Reviewed: https://review.openstack.org/589532
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=d92115c2ef91f64720a92f9357ad262d7c106d29
Submitter: Zuul
Branch: driverfixes/newton

commit d92115c2ef91f64720a92f9357ad262d7c106d29
Author: Vivek Soni <email address hidden>
Date: Thu Aug 2 04:05:23 2018 -0400

VSA: Concurrent request handling in attachment

    This fix handle concurrent request and ensure
    only one request with same host name request
    server creation on VSA.

    Change-Id: I41815510f0e6bf865a46d1efd7b73f613ddbe736
    Closes-Bug: #1779654
    (cherry picked from commit 8a039bb5cd00baac5e97081621e57820586a33d5)