The given solutions did NOT fix this bug for us.
The mysql+pymysql driver was already set, and increasing innodb_lock_wait_timeout did not help either.
We run a Rally test that creates 100 volumes (20 concurrently) and deletes them afterwards.
During creation, about 10% fail with "Lost connection to mysql during query", leaving those volumes stuck in the creating state.
We therefore increased connection_timeout in our MariaDB from 10 to 30, with mixed results. Volume creation still slows down after the first 5 volumes have been created.
Deleting the 100 volumes takes even longer and throws a bunch of DB lock exceptions at the end. We also sometimes see missed RabbitMQ heartbeats.
We are running:
OpenStack Rocky+Stein
Ceph RBD as backend
MariaDB with Galera
Ceph volume pool with ~350 volumes
We have already increased osapi_volume_workers, tuned InnoDB, checked Ceph, and so on.
My assumption is that there must be a blocking call in the RBD driver that prevents locks from being released in time for the other volume creations. It blocks so hard that even RabbitMQ starts timing out.
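
To illustrate what I mean by a blocking call starving everything else, here is a minimal sketch. It is NOT Cinder or RBD driver code: it uses Python's asyncio as a stand-in for the cooperative scheduling the services rely on, and all names (heartbeat, worker, max_heartbeat_gap, run_demo) are made up for the example. It only shows that one call which does not yield is enough to make a periodic heartbeat miss its interval.

```python
import asyncio
import time


async def heartbeat(interval, stamps, stop):
    """Record a timestamp every `interval` seconds until `stop` is set."""
    while not stop.is_set():
        stamps.append(time.monotonic())
        await asyncio.sleep(interval)


async def worker(blocking, duration):
    """Hypothetical stand-in for a driver call taking `duration` seconds."""
    if blocking:
        time.sleep(duration)  # does not yield: no other task can run
    else:
        await asyncio.sleep(duration)  # yields: other tasks keep running


async def max_heartbeat_gap(blocking, duration=0.5, interval=0.05):
    """Return the longest gap between two heartbeats around the worker call."""
    stamps = []
    stop = asyncio.Event()
    hb = asyncio.create_task(heartbeat(interval, stamps, stop))
    await asyncio.sleep(interval * 2)  # let a few heartbeats through first
    await worker(blocking, duration)
    await asyncio.sleep(interval * 2)  # and a few more afterwards
    stop.set()
    await hb
    return max(b - a for a, b in zip(stamps, stamps[1:]))


def run_demo():
    cooperative = asyncio.run(max_heartbeat_gap(blocking=False))
    blocked = asyncio.run(max_heartbeat_gap(blocking=True))
    return cooperative, blocked


if __name__ == "__main__":
    cooperative, blocked = run_demo()
    print(f"max heartbeat gap, cooperative worker: {cooperative:.3f}s")
    print(f"max heartbeat gap, blocking worker:    {blocked:.3f}s")
```

With the cooperative worker the heartbeat gap stays close to the interval. With the blocking worker the gap grows to at least the duration of the blocking call. If something similar happens inside the volume service, it would explain both the late lock releases and the missed RabbitMQ heartbeats, but that is still only my assumption.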