DB Deadlock on quota reserve

Bug #1518936 reported by Valeriy Ponomaryov on 2015-11-23
This bug affects 1 person
Affects: Manila
Status: Confirmed
Importance: Low
Assigned to: Unassigned
Milestone: none

Bug Description

See the logs of the Rally job results [1]. There is the following error:

2015-11-23 11:22:39.134 12164 ERROR oslo_db.api DBDeadlock: (pymysql.err.InternalError) (1213, u'Deadlock found when trying to get lock; try restarting transaction') [SQL: u'INSERT INTO quota_usages (created_at, updated_at, deleted_at, deleted, project_id, user_id, resource, in_use, reserved, until_refresh) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)'] [parameters: (datetime.datetime(2015, 11, 23, 11, 22, 39, 117932), datetime.datetime(2015, 11, 23, 11, 22, 39, 117688), None, 0, u'4c36967975304f639790e12e783a0c72', u'6490ad4bc45240e4b89958152e23b251', 'share_networks', 0, 0, None)]

[1] http://logs.openstack.org/15/248615/1/check/gate-rally-dsvm-manila-multibackend/b9889c1/logs/screen-m-api.txt.gz?level=TRACE#_2015-11-23_11_22_39_134
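
For context, the usual oslo.db mitigation for MySQL error 1213 is to restart the whole transaction, which is exactly what the error text itself suggests. A minimal sketch of the retry decorator (not Manila's actual quota code; the counter only fakes two deadlocks so the retry is visible):

    from oslo_db import api as oslo_db_api
    from oslo_db import exception as db_exc

    _attempts = {"n": 0}

    @oslo_db_api.wrap_db_retry(max_retries=5, retry_on_deadlock=True)
    def reserve_quota():
        # Toy body: raise DBDeadlock twice, then succeed, to show that
        # the decorator transparently re-invokes the function with backoff.
        _attempts["n"] += 1
        if _attempts["n"] < 3:
            raise db_exc.DBDeadlock()
        return "reserved on attempt %d" % _attempts["n"]

    print(reserve_quota())  # -> reserved on attempt 3

Each retried attempt must open a fresh session/transaction for this to work; retrying inside an already-aborted transaction only reproduces the failure.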

tags: added: api db
Changed in manila:
importance: Undecided → Low
milestone: none → newton-1
Peter Wang (peter.wang) wrote :

Is there anyone working on this issue?

I would appreciate it if someone could suggest a workaround.

Thanks
Peter

Tom Barron (tpb) on 2018-06-19
tags: added: quotas
Changed in manila:
milestone: newton-1 → none
Goutham Pacha Ravi (gouthamr) wrote :

Sorry to be chiming in on this really old bug report.
Peter: do you still see this issue occurring? If yes, please feel free to set this bug's status to "Confirmed". I haven't seen this issue in our rally jobs; we'll keep a lookout for it.

tags: added: concurrency races
Changed in manila:
status: New → Incomplete
Tom Barron (tpb) wrote :

As Goutham just pointed out to me, we just had an instance of what is almost certainly the same issue at root. The report here is for an INSERT, but here's a failure just like it, except that it's a deadlock on the deletion of quota usages:

From the manila-api log:

Mar 21 18:08:46.263907 ubuntu-bionic-ovh-bhs1-0004095398 manila-api[16852]: DEBUG manila.quota [None req-91556767-fbbd-43ba-81f2-5a49c7eb7f86 None None] Created reservations ['e2ffd686-3ec9-4129-bf46-909eca546638'] {{(pid=16959) reserve /opt/stack/new/manila/manila/quota.py:916}}
Mar 21 18:08:46.278873 ubuntu-bionic-ovh-bhs1-0004095398 manila-api[16852]: DEBUG eventlet.wsgi.server [-] (16959) accepted ('::ffff:158.69.74.30', 57108, 0, 0) {{(pid=16959) server /usr/local/lib/python3.6/dist-packages/eventlet/wsgi.py:956}}
Mar 21 18:08:46.279440 ubuntu-bionic-ovh-bhs1-0004095398 manila-api[16852]: DEBUG eventlet.wsgi.server [-] (16959) accepted ('::ffff:158.69.74.30', 57110, 0, 0) {{(pid=16959) server /usr/local/lib/python3.6/dist-packages/eventlet/wsgi.py:956}}
Mar 21 18:08:46.295613 ubuntu-bionic-ovh-bhs1-0004095398 manila-api[16852]: INFO manila.api.openstack.wsgi [None req-fa8107c9-1bd2-423c-ba8d-e2a6e93a5620 None None] DELETE https://158.69.74.30:8786/v1/081d92bd03fb49288548b31f48c36340/share-networks/7c850b58-8ce8-433e-8f53-fba572dc4644
Mar 21 18:08:46.296256 ubuntu-bionic-ovh-bhs1-0004095398 manila-api[16852]: DEBUG manila.api.openstack.wsgi [None req-fa8107c9-1bd2-423c-ba8d-e2a6e93a5620 None None] Empty body provided in request {{(pid=16959) get_body /opt/stack/new/manila/manila/api/openstack/wsgi.py:653}}
Mar 21 18:08:46.296789 ubuntu-bionic-ovh-bhs1-0004095398 manila-api[16852]: DEBUG manila.api.openstack.wsgi [None req-fa8107c9-1bd2-423c-ba8d-e2a6e93a5620 None None] Calling method 'ShareNetworkController.delete' {{(pid=16959) _process_stack /opt/stack/new/manila/manila/api/openstack/wsgi.py:799}}
Mar 21 18:08:46.347740 ubuntu-bionic-ovh-bhs1-0004095398 manila-api[16852]: ERROR manila.api.middleware.fault [None req-3d6475a9-72fb-42c5-9f66-8deaa54d4e49 None None] Caught error: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') [SQL: 'UPDATE quota_usages SET updated_at=quota_usages.updated_at, deleted_at=%(deleted_at)s, deleted=quota_usages.id WHERE quota_usages.deleted = %(deleted_1)s AND quota_usages.share_type_id = %(share_type_id_1)s'] [parameters: {'deleted_at': datetime.datetime(2019, 3, 21, 18, 8, 46, 267266), 'deleted_1': 0, 'share_type_id_1': 'd01f19cc-c877-4404-b34f-52fd743e3c5e'}] (Background on this error at: http://sqlalche.me/e/e3q8): oslo_db.exception.DBDeadlock: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') [SQL: 'UPDATE quota_usages SET updated_at=quota_usages.updated_at, deleted_at=%(deleted_at)s, deleted=quota_usages.id WHERE quota_usages.deleted = %(deleted_1)s AND quota_usages.share_type_id = %(share_type_id_1)s'] [parameters: {'deleted_at': datetime.datetime(2019, 3, 21, 18, 8, 46, 267266), 'deleted_1': 0, 'share_type_id_1': 'd01f19cc-c877-4404-b34f-52fd743e3c5e'}] (Background on this error at: http://sqlalche.me/e/e3q8)
Mar 21 18:08:46.347...
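
The UPDATE in that trace has the exact shape produced by oslo.db's Query.soft_delete() helper (updated_at kept as-is, deleted set to the row id, a "deleted = 0" filter in the WHERE clause). A hedged reconstruction of a call site that would emit it; model_query, QuotaUsage, and get_session are illustrative placeholders, not Manila's verbatim code:

    def quota_usage_destroy_all_by_share_type(context, share_type_id):
        session = get_session()  # placeholder for the enginefacade session helper
        with session.begin():
            # model_query is the usual OpenStack DB helper that adds the
            # "quota_usages.deleted = 0" filter seen in the WHERE clause.
            (model_query(context, QuotaUsage, session=session)
                .filter_by(share_type_id=share_type_id)
                .soft_delete(synchronize_session=False))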


Changed in manila:
status: Incomplete → Confirmed
Jason Grosso (jgrosso) wrote :

Tom, this looks like an active problem. Why is this a low priority? Is it something that is hard to reproduce, or something most customers would not run into very often?

Tom Barron (tpb) wrote :

Jason, yeah, I don't think we have a good sense of how to reproduce the issue, and so far the reports are from developers with random-ish failures in CI rather than from operators with field deployments.
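
For what it's worth, the generic shape of a 1213 deadlock is easy to reproduce outside Manila: two transactions that lock the same rows in opposite order. A standalone sketch against a scratch MySQL table t(id INT PRIMARY KEY, v INT) pre-populated with rows 1 and 2 (host and credentials are assumptions):

    import threading
    import pymysql

    barrier = threading.Barrier(2)

    def worker(first_id, second_id):
        conn = pymysql.connect(host="127.0.0.1", user="root",
                               password="secret", database="test")
        try:
            with conn.cursor() as cur:
                cur.execute("BEGIN")
                cur.execute("UPDATE t SET v = v + 1 WHERE id = %s", (first_id,))
                barrier.wait()  # both transactions now hold one row lock each
                cur.execute("UPDATE t SET v = v + 1 WHERE id = %s", (second_id,))
            conn.commit()
        except (pymysql.err.OperationalError, pymysql.err.InternalError) as exc:
            # InnoDB kills one transaction: (1213, 'Deadlock found when
            # trying to get lock; try restarting transaction')
            print("deadlock victim:", exc)
            conn.rollback()

    t1 = threading.Thread(target=worker, args=(1, 2))
    t2 = threading.Thread(target=worker, args=(2, 1))
    t1.start(); t2.start(); t1.join(); t2.join()

The hard part in Manila's case is that the lock ordering comes from concurrent API requests touching quota_usages, so the collision window is tiny and timing-dependent, which would explain the random-looking CI failures.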

Goutham Pacha Ravi (gouthamr) wrote :

One more failure: http://logs.openstack.org/34/671134/5/gate/manila-tempest-minimal-dsvm-dummy/02a769c/logs/screen-m-api.txt.gz#_Jul_25_19_58_14_319488

m-api logs attached.

This is happening only sporadically, though. The test class we were trying to execute was SchedulerStatsAdminTest:

2019-07-25 19:58:14.321 28321 INFO tempest.lib.common.rest_client [req-785d8f85-4bb0-47da-b0eb-3461e8e697af ] Request (SchedulerStatsAdminTest:tearDownClass): 500 DELETE https://104.130.132.195:8786/v2/09ac67e686dd45f5a3db0940a38b4b34/types/be13b002-75e7-4a17-bede-f7a813f1854b 0.228s
2019-07-25 19:58:14.322 28321 DEBUG tempest.lib.common.rest_client [req-785d8f85-4bb0-47da-b0eb-3461e8e697af ] Request - Headers: {'x-openstack-manila-api-version': '2.49', 'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '500', u'content-length': '128', 'content-location': 'https://104.130.132.195:8786/v2/09ac67e686dd45f5a3db0940a38b4b34/types/be13b002-75e7-4a17-bede-f7a813f1854b', u'x-compute-request-id': 'req-785d8f85-4bb0-47da-b0eb-3461e8e697af', u'vary': 'X-OpenStack-Manila-API-Version', u'x-openstack-manila-api-version': '2.49', u'server': 'Apache/2.4.29 (Ubuntu)', u'connection': 'close', u'date': 'Thu, 25 Jul 2019 19:58:14 GMT', u'content-type': 'application/json'}
        Body: {"computeFault": {"code": 500, "message": "The server has either erred or is incapable of performing the requested operation."}} _log_request_full tempest/lib/common/rest_client.py:440
2019-07-25 19:58:14.322 28321 ERROR tempest.test [-] teardown of resources failed: Got server fault
Details: The server has either erred or is incapable of performing the requested operation.: ServerFault: Got server fault
Details: The server has either erred or is incapable of performing the requested operation.
2019-07-25 19:58:14.322 28321 ERROR tempest.test Traceback (most recent call last):
2019-07-25 19:58:14.322 28321 ERROR tempest.test File "tempest/test.py", line 197, in tearDownClass
2019-07-25 19:58:14.322 28321 ERROR tempest.test teardown()
2019-07-25 19:58:14.322 28321 ERROR tempest.test File "/opt/stack/new/manila-tempest-plugin/manila_tempest_tests/tests/api/base.py", line 317, in resource_cleanup
2019-07-25 19:58:14.322 28321 ERROR tempest.test cls.clear_resources(cls.class_resources)
2019-07-25 19:58:14.322 28321 ERROR tempest.test File "/opt/stack/new/manila-tempest-plugin/manila_tempest_tests/tests/api/base.py", line 962, in clear_resources
2019-07-25 19:58:14.322 28321 ERROR tempest.test client.delete_share_type(res_id)
2019-07-25 19:58:14.322 28321 ERROR tempest.test File "/opt/stack/new/manila-tempest-plugin/manila_tempest_tests/services/share/v2/json/shares_client.py", line 928, in delete_share_type
2019-07-25 19:58:14.322 28321 ERROR tempest.test resp, body = self.delete("types/%s" % share_type_id, version=version)
2019-07-25 19:58:14.322 28321 ERROR tempest.test File "/opt/stack/new/manila-tempest-plugin/manila_tempest_tests/services/share/v2/json/shares_client.py", line 87, in delete
2019-07-25 19:58:14.322 28321 ERROR tempest.test body=body)
2019-07-25 19:58:14...


Goutham Pacha Ravi (gouthamr) wrote :

Another failure observed on a stable/rocky job: https://review.opendev.org/#/c/691006/2
Log from manila-api for the request ID that failed: http://paste.openstack.org/show/785563/

Interestingly, the last three occurrences of this issue seem to be when deleting a share type.
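
When this next reproduces in CI or a devstack, InnoDB keeps the two colliding transactions of the most recent deadlock in its status report, which would show exactly which quota_usages locks collided during these share-type deletions. A small diagnostic sketch (connection details are assumptions):

    import pymysql

    conn = pymysql.connect(host="127.0.0.1", user="root", password="secret")
    with conn.cursor() as cur:
        cur.execute("SHOW ENGINE INNODB STATUS")
        status_text = cur.fetchone()[2]  # row is (Type, Name, Status)

    # The "LATEST DETECTED DEADLOCK" section names both transactions and
    # the record locks each one held and was waiting for.
    idx = status_text.find("LATEST DETECTED DEADLOCK")
    print(status_text[idx:idx + 2000] if idx >= 0
          else "no deadlock recorded since last restart")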
