Cinder

workers table has many volumes stuck with service_id being NULL

Bug #2077172 reported by Walt Boring on 2024-08-16

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Cinder	New	Undecided	Unassigned

Bug Description

If cinder or rabbitmq restarts (or bounces) before a volume has been scheduled on a host, the volume can get stuck in 'creating' and the workers table is left with an entry that has no service_id. The volume then cannot be cleaned up and never leaves 'creating' status, even across restarts.

The first set of counts below is from our QA system, where we frequently redeploy cinder pods on kubernetes to update rabbitmq or cinder services.

These entries will NEVER get cleaned up, because the volume service requires service_id to be set for the db query to find them at do_cleanup() time in the volume manager.

https://github.com/openstack/cinder/blob/master/cinder/manager.py#L241-L246
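The lookup problem can be illustrated with a minimal, self-contained sketch. It does not use Cinder's code: the table below is a simplified, hypothetical stand-in for the workers table (only the columns mentioned in this report, plus an assumed resource_id), and cleanup_candidates() is an assumed approximation of a query that filters on the service's own id. The point is only that an equality filter on service_id can never match rows where service_id is NULL.

```python
import sqlite3


def build_workers_db():
    """Create an in-memory, simplified stand-in for the workers table.

    Assumption: the real schema has more columns; resource_id is a
    hypothetical name used here only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE workers ("
        " id INTEGER PRIMARY KEY,"
        " resource_id TEXT,"
        " status TEXT,"
        " service_id INTEGER,"
        " deleted INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO workers (resource_id, status, service_id) VALUES (?, ?, ?)",
        [
            ("vol-a", "creating", 7),     # scheduled: has a service_id
            ("vol-b", "creating", None),  # interrupted before scheduling
            ("vol-c", "deleting", None),  # interrupted before scheduling
        ],
    )
    return conn


def cleanup_candidates(conn, service_id):
    """Rows a service would find when it filters on its own service_id."""
    rows = conn.execute(
        "SELECT resource_id FROM workers"
        " WHERE deleted = 0 AND service_id = ?"
        " ORDER BY resource_id",
        (service_id,),
    )
    return [row[0] for row in rows]


def orphaned_workers(conn):
    """Rows with no service_id; these need an explicit IS NULL test."""
    rows = conn.execute(
        "SELECT resource_id FROM workers"
        " WHERE deleted = 0 AND service_id IS NULL"
        " ORDER BY resource_id"
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_workers_db()
    print(cleanup_candidates(conn, 7))
    print(orphaned_workers(conn))
```

In this sketch the rows for vol-b and vol-c are only reachable through the IS NULL query, which mirrors the manual queries shown in this report.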

MariaDB root@127.0.0.1:cinder> select count(*), status from workers where deleted=0 and service_id is NULL group by status;
+----------+----------+
| count(*) | status   |
+----------+----------+
|     1951 | creating |
|      211 | deleting |
|        1 | OK       |
+----------+----------+

The same query on one of our production systems:

MariaDB root@127.0.0.1:cinder> select count(*), status from workers where deleted=0 and service_id is NULL group by status;
+----------+----------+
| count(*) | status   |
+----------+----------+
|      292 | creating |
|       42 | deleting |
|        1 | OK       |
+----------+----------+
