Cinder

Bug #2048396
Comment #0

Comment 0 for bug 2048396

Revision history for this message

Christian Rohmann (christian-rohmann) wrote on 2024-01-05: backups db table lacks indexes to speed up common queries

I observed strangely long MariaDB duration for queries to the backups table which is used by cinder-backup. This is very likely due to the fact that there no indexes helping to make common queries efficient:

```

As you can see I have >500k entries in this table, but most of them are (marked as) deleted:

```
SELECT deleted, count(deleted) FROM backups GROUP BY deleted;
+---------+----------------+
| deleted | count(deleted) |
+---------+----------------+
| 0 | 35689 |
| 1 | 482261 |
+---------+----------------+

```

I know about the `cinder-manage db purge` [2] tool. But on larger clouds a few (ten)thousand rows of deleted volumes / backups / ... do accumulate before they are purged away.

And so very common queries for e.g. backups for a certain project_id or volume_id will end up as sequential scans using where.

You might argue that the table is simple and likely cached. But it's not about a single query for a list of backups for a certain project. It's rather about tooling making more intensive use of the API being slowed down unnecessarily.

The volumes table in comparison certainly has some indexes:

```
SHOW indexes FROM volumes;
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| volumes | 0 | PRIMARY | 1 | id | A | 78598 | NULL | NULL | | BTREE | | | NO |
| volumes | 1 | ix_volumes_consistencygroup_id | 1 | consistencygroup_id | A | 2 | NULL | NULL | YES | BTREE | | | NO |
| volumes | 1 | ix_volumes_group_id | 1 | group_id | A | 2 | NULL | NULL | YES | BTREE | | | NO |
| volumes | 1 | volumes_service_uuid_idx | 1 | service_uuid | A | 2 | NULL | NULL | YES | BTREE | | | NO |
| volumes | 1 | volumes_service_uuid_idx | 2 | deleted | A | 4 | NULL | NULL | YES | BTREE | | | NO |
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
```

and volumes_service_uuid_idx also covering the deleted column, One might argue that NOT having the deleted column as first in the compound index [1] makes looking up non-deleted volumes without service_uuid inefficient as the index will not be used:

Would it not make sense to have indexes on volume_id, project_id and also the deleted column to cover the most common selections?

[1] https://mariadb.com/kb/en/compound-composite-indexes/
[2] https://docs.openstack.org/cinder/latest/cli/cinder-manage.html

```

As you can see I have >500k entries in this table, but most of them are (marked as) deleted:

```
SELECT deleted, count(deleted) FROM backups GROUP BY deleted;
+---------+----------------+
| deleted | count(deleted) |
+---------+----------------+
|       0 |          35689 |
|       1 |         482261 |
+---------+----------------+

```

I know about the `cinder-manage db purge` [2] tool. But on larger clouds a few (ten)thousand rows of deleted volumes / backups / ... do accumulate before they are purged away.

And so very common queries for e.g. backups for a certain project_id or volume_id will end up as sequential scans using where.

The volumes table in comparison certainly has some indexes:

```
SHOW indexes FROM volumes;
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table   | Non_unique | Key_name                       | Seq_in_index | Column_name         | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| volumes |          0 | PRIMARY                        |            1 | id                  | A         |       78598 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| volumes |          1 | ix_volumes_consistencygroup_id |            1 | consistencygroup_id | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| volumes |          1 | ix_volumes_group_id            |            1 | group_id            | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| volumes |          1 | volumes_service_uuid_idx       |            1 | service_uuid        | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| volumes |          1 | volumes_service_uuid_idx       |            2 | deleted             | A         |           4 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
```

Would it not make sense to have indexes on volume_id, project_id and also the deleted column to cover the most common selections?

[1] https://mariadb.com/kb/en/compound-composite-indexes/
[2] https://docs.openstack.org/cinder/latest/cli/cinder-manage.html