Parallel deletion triggers DB deadlocks

Bug #1685818 reported by Arne Wiebalck
Affects: Cinder
Status: Incomplete
Importance: Medium
Assigned to: Tushar Trambak Gite

Bug Description

When doing parallel deletions, i.e. "delete vol1 vol2 ... volN", rather than "delete vol1; delete vol2; ... ; delete volN", our c-vol server reports database deadlocks:

"Deadlock detected when running 'reservation_commit': Retrying..."
"Deadlock detected when running 'quota_reserve': Retrying..."

I can reproduce this with N=10, on Mitaka and Newton, on two different Cinder setups (different DBs, different Ceph backends).

This is the query that seems to get stuck:

SELECT quota_usages.created_at AS quota_usages_created_at, quota_usages.updated_at AS quota_usages_updated_at, quota_usages.deleted_at AS quota_usages_deleted_at, quota_usages.deleted AS quota_usages_deleted, quota_usages.id AS quota_usages_id, quota_usages.project_id AS quota_usages_project_id, quota_usages.resource AS quota_usages_resource, quota_usages.in_use AS quota_usages_in_use, quota_usages.reserved AS quota_usages_reserved, quota_usages.until_refresh AS quota_usages_until_refresh
FROM quota_usages
WHERE quota_usages.deleted = false AND quota_usages.project_id = 'MYPROJECTID' ORDER BY quota_usages.id ASC FOR UPDATE

In my test setup with N=10, this seems to recover after a while (the first 2-5 deletions go fast, then the "deadlock" occurs, things are stuck for a couple of minutes, and then the rest of the deletions are done). With N sufficiently large, however, this seriously impacts our production service.

I didn't try parallel creations, but from https://bugs.launchpad.net/cinder/+bug/1294855 I understand that there are known races that were addressed by abort/retry; could it be that I am hitting one of these?

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

I've upgraded one of the Cinder instances to Ocata now and the issue persists (I cannot exclude that this is due to my config, DB setup, or Ceph backend setup, of course).

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

I cannot reproduce this on devstack with LVM.

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

One mitigation is to reduce innodb_lock_wait_timeout (from the default 50 seconds to 1 second in our case): the deadlock gets detected faster, retries are triggered, and deletions finish quicker. The underlying problem is of course still there.
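
For reference, a minimal sketch of this mitigation, assuming MySQL/MariaDB with InnoDB and sufficient privileges (the 1-second value is the one used above); the variable can be changed at runtime or persisted in my.cnf:

-- set at runtime (server-wide)
SET GLOBAL innodb_lock_wait_timeout = 1;

# or persist in my.cnf
[mysqld]
innodb_lock_wait_timeout = 1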

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

 https://bugs.launchpad.net/cinder/+bug/1613947 seems to describe the same issue.

Changed in cinder:
importance: Undecided → Medium
status: New → Triaged
Revision history for this message
Jay Bryant (jsbryant) wrote :

I spent time trying to recreate this on our Ceph cluster this afternoon without any luck. I was able to do a parallel delete of 100 volumes, multiple times, without any failures or DB deadlock messages.

Just to be clear, the command you are running when you see this is 'cinder delete volume1 volume2 volume3 ... volumeN'?

If so, are you able to recreate this as both the admin and non-admin user?

I will keep looking into this but it wasn't as easy to recreate in our environment.

Revision history for this message
James Dempsey (jamespd) wrote :

Jay,

I can't comment for Arne, but I believe that my bug is the same issue (see comment #4 on bug 1613947), and I have found that I can more reliably encounter the error when deleting an instance with multiple attached volumes. See the reproduction steps in https://bugs.launchpad.net/cinder/+bug/1613947/comments/1

Hope this helps.

Revision history for this message
TommyLike (hu-husheng) wrote :

I tried the steps below in my devstack environment, but failed to reproduce the issue.

1. Create 100 volumes in parallel.
2. Wait for the volumes to become available.
3. Delete the volumes in parallel (separate delete calls, not "cinder delete vol1 vol2 ...").

I guess there is something different in the configuration, package versions, or elsewhere. Looking forward to more feedback on this.

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

Would you mind retrying with "cinder delete vol1 vol2 ... vol100"? That is how I reproduce the issue.

Revision history for this message
Jay Bryant (jsbryant) wrote :

For comparison, I tried doing this using one of my Lenovo backends and I was not able to recreate the problem with that driver either. I was able to delete 100 volumes in parallel and, though it took some time, it completed successfully.

Arne, you said that you knew where the lock was that you could remove and avoid this problem. Can you send a pointer to that? I would like to look around the code and see if that approach helps.

James, forgive my ignorance, but what are you doing to cause your volumes to be deleted when the instance is deleted? I have not seen that behavior before.

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :
Revision history for this message
Gerhard Muntingh (gerhard-1) wrote :

You should use: connection = mysql+pymysql://cinder:REDACTED@REDACTED/cinder

Otherwise the greenthreads will block each other while waiting for I/O.
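
For reference, a minimal sketch of where this goes, assuming the usual oslo.db layout of cinder.conf (placeholders kept from the comment above):

[database]
connection = mysql+pymysql://cinder:REDACTED@REDACTED/cinder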

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

Thanks, Gerhard! That seems to work: with the connection string changed following your suggestion, I can no longer reproduce the problem.

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

I have reconfigured our production setup now: the parallel deletion of 100 volumes still triggers the above deadlock messages, but thanks to the concurrency introduced by the pymysql driver (in combination with the reduction of innodb_lock_wait_timeout to 1 second) the deletions went through without triggering client-side timeouts. Seems to be OK now.

Revision history for this message
MarkSeger (mark-seger) wrote :

I'm seeing something very similar, except on creates. Specifically, I have a script that fires off 10 threads; each does a cinder create, gets the ID from the create call, and then waits for the volume to become available. Normally all 10 calls to 'create' return within the same second or two, but every once in a while there is a 50-second delay and I then also see the deadlock message in the API log, so it's not strictly related to deletes.
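
For reference, a minimal sketch of this kind of test, assuming the cinder CLI is installed and the usual OS_* credentials are exported; the volume size and names are arbitrary, and it only times how long each 'create' call takes to return, which is where the 50-second stalls show up:

import subprocess
import threading
import time

def timed_create(i):
    # Each thread issues one "cinder create" and reports how long the call took.
    start = time.time()
    subprocess.run(["cinder", "create", "1", "--name", "deadlock-test-%d" % i],
                   check=True, capture_output=True)
    print("create %d returned after %.1fs" % (i, time.time() - start))

threads = [threading.Thread(target=timed_create, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()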

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

I've seen previous bug reports for this issue that mention problems on creations as well, yes. It's just that I never tried to reproduce this with creations.

As mentioned, specifying the pymysql driver and reducing the innodb_lock_wait_timeout basically solves this issue (for us).

Revision history for this message
Eric Harney (eharney) wrote :

Should we document a particular config setting to get around this?

Changed in cinder:
status: Triaged → Incomplete
Revision history for this message
Alex Walender (awalende) wrote :

The given solutions did NOT fix this bug for us.
The mysql+pymysql connection string was already set, and increasing the innodb_lock_wait_timeout did not help either.

We ran a Rally test that creates 100 volumes (20 concurrent) and deletes them afterwards.
On creation we see about 10% fail with "Lost connection to mysql during query", resulting in volumes stuck in 'creating'.

We therefore increased connection_timeout in our MariaDB from 10 to 30, with mixed results. Volume creation still slows down after the first 5 volumes have been created.

Deletion of 100 volumes takes even longer, throwing a bunch of DB lock exceptions at the end. We also sometimes see missed RabbitMQ heartbeats.

We are running:
OpenStack Rocky+Stein
Ceph RBD as backend
MariaDB with Galera
A Ceph volume pool with ~350 volumes

We have already increased osapi_volume_workers, tuned InnoDB, checked on Ceph, and so on.
My assumption is that there is a blocking call in the RBD driver that prevents locks from being released in time for the other volume creations. It blocks so hard that even RabbitMQ starts timing out.

Changed in cinder:
assignee: nobody → Tushar Trambak Gite (tushargite96)