DBDeadlock when creating & deleting multiple volumes

Bug #1901973 reported by Nguyen Thanh Cong
This bug affects 2 people
Affects: Cinder
Status: New
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

When I create multiple volumes (200 volumes) or delete multiple volumes at the same time, cinder-volume raises a DBDeadlock when getting quota usages. I am using the OpenStack Train release.

Steps to reproduce:
1. Create 200 volumes on Horizon, or create them with the following command:
for i in {1..200}; do openstack volume create --size 17 --image <image-id> --type HDD4 vccorp-volume-$i > /dev/null & done

2. Check the cinder-volume log; it sometimes raises a DBDeadlock (not every time).

OR

1. Delete multiple volumes:
for i in $(openstack volume list --project admin --limit 500 | grep "vccorp-volume" | awk -F "|" '{printf $2}'); do openstack volume delete $i; done

How to fix:
I tried adding the decorator @oslo_db_api.wrap_db_retry(max_retries=5, retry_on_deadlock=True) to the _get_quota_usages function in cinder/db/sqlalchemy/api.py.
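
For reference, a minimal sketch of that change (the decorator and its parameters are as described above; the _get_quota_usages signature shown here is approximate and may differ between Cinder releases):

    from oslo_db import api as oslo_db_api  # import if not already present

    @oslo_db_api.wrap_db_retry(max_retries=5, retry_on_deadlock=True)
    def _get_quota_usages(context, session, project_id, resources=None):
        # Re-run the SELECT ... FOR UPDATE on quota_usages if the database
        # reports a deadlock while many creates/deletes race for the same rows.
        ...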

I tried creating & deleting volumes again and no error was detected.

Can I submit this patch to the master branch?

LOG ERROR:
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api [req-bc643b55-4a50-4fa3-85c0-ba7ce70c1a4d 7d04438e232746a5b81b3e50481a6a23 d819c820bff7410496a5a1187fa5a789 - default default] DB exceeded retry limit.: DBDeadlock: (pymysql.err.InternalError) (1205, u'Lock wait timeout exceeded; try restarting transaction') [SQL: u'SELECT quota_usages.created_at AS quota_usages_created_at, quota_usages.updated_at AS quota_usages_updated_at, quota_usages.deleted_at AS quota_usages_deleted_at, quota_usages.deleted AS quota_usages_deleted, quota_usages.id AS quota_usages_id, quota_usages.project_id AS quota_usages_project_id, quota_usages.resource AS quota_usages_resource, quota_usages.in_use AS quota_usages_in_use, quota_usages.reserved AS quota_usages_reserved, quota_usages.until_refresh AS quota_usages_until_refresh \nFROM quota_usages \nWHERE quota_usages.deleted = false AND quota_usages.project_id = %(project_id_1)s AND quota_usages.resource IN (%(resource_1)s, %(resource_2)s, %(resource_3)s, %(resource_4)s) ORDER BY quota_usages.id ASC FOR UPDATE'] [parameters: {u'resource_3': u'gigabytes_HDD4', u'project_id_1': u'd819c820bff7410496a5a1187fa5a789', u'resource_1': 'gigabytes', u'resource_4': 'volumes', u'resource_2': u'volumes_HDD4'}] (Background on this error at: http://sqlalche.me/e/2j85)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api Traceback (most recent call last):
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 142, in wrapper
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api return f(*args, **kwargs)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/cinder/db/sqlalchemy/api.py", line 1194, in quota_reserve
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api resources=deltas.keys())
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/cinder/db/sqlalchemy/api.py", line 1153, in _get_quota_usages
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api rows = query.order_by(models.QuotaUsage.id.asc()).\
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2773, in all
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api return list(self)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2925, in __iter__
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api return self._execute_and_instances(context)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2948, in _execute_and_instances
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api result = conn.execute(querycontext.statement, self._params)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 948, in execute
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api return meth(self, multiparams, params)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api return connection._execute_clauseelement(self, multiparams, params)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api compiled_sql, distilled_params
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1200, in _execute_context
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api context)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1409, in _handle_dbapi_exception
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api util.raise_from_cause(newraise, exc_info)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api reraise(type(exception), exception, tb=exc_tb, cause=cause)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api context)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 507, in do_execute
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api cursor.execute(statement, parameters)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 170, in execute
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api result = self._query(query)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 328, in _query
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api conn.query(q)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 516, in query
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 727, in _read_query_result
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api result.read()
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1073, in read
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api self._read_result_packet(first_packet)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1143, in _read_result_packet
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api self._read_rowdata_packet()
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1177, in _read_rowdata_packet
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api packet = self.connection._read_packet()
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 683, in _read_packet
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api packet.check_error()
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/pymysql/protocol.py", line 220, in check_error
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api err.raise_mysql_exception(self._data)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api raise errorclass(errno, errval)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api DBDeadlock: (pymysql.err.InternalError) (1205, u'Lock wait timeout exceeded; try restarting transaction') [SQL: u'SELECT quota_usages.created_at AS quota_usages_created_at, quota_usages.updated_at AS quota_usages_updated_at, quota_usages.deleted_at AS quota_usages_deleted_at, quota_usages.deleted AS quota_usages_deleted, quota_usages.id AS quota_usages_id, quota_usages.project_id AS quota_usages_project_id, quota_usages.resource AS quota_usages_resource, quota_usages.in_use AS quota_usages_in_use, quota_usages.reserved AS quota_usages_reserved, quota_usages.until_refresh AS quota_usages_until_refresh \nFROM quota_usages \nWHERE quota_usages.deleted = false AND quota_usages.project_id = %(project_id_1)s AND quota_usages.resource IN (%(resource_1)s, %(resource_2)s, %(resource_3)s, %(resource_4)s) ORDER BY quota_usages.id ASC FOR UPDATE'] [parameters: {u'resource_3': u'gigabytes_HDD4', u'project_id_1': u'd819c820bff7410496a5a1187fa5a789', u'resource_1': 'gigabytes', u'resource_4': 'volumes', u'resource_2': u'volumes_HDD4'}] (Background on this error at: http://sqlalche.me/e/2j85)
2020-10-29 10:54:41.583 3107889 ERROR oslo_db.api

Revision history for this message
Sean McGinnis (sean-mcginnis) wrote :

"Can I submit this patch to master branch?"

If the existing deadlock patch does not cover enough and you are seeing the need for it to be expanded to cover other calls, then yes, please do propose a patch to add that decorator where needed.

Thanks!

Revision history for this message
Nguyen Thanh Cong (congnt95) wrote :

Hi Sean McGinnis,

I tested again with retry-on-deadlock added to the _get_quota_usages function, and it still raises the error. Can you support me in fixing this bug?

Thank you!

Revision history for this message
michael-mcaleer (mmcaleer) wrote :

Hi,

Can you please confirm the version of Cinder you are using?

And approximately how many volumes can you create before you hit this deadlock error?

Michael

tags: added: database deadlock oslodb
Changed in cinder:
status: New → Incomplete
Revision history for this message
Nguyen Thanh Cong (congnt95) wrote :

We are using Cinder version 15.0.0. Sometimes all volumes are created successfully, but other times only 5-10 volumes are created before the error occurs.

Revision history for this message
Son Do Xuan (sondx25) wrote :

Hi michael-mcaleer,
I and "Nguyen Thanh Cong" am using Cinder version 15.0.0, the database is running Percona XtraDB Cluster.

Changed in cinder:
status: Incomplete → New
importance: Undecided → Low