backups db table lacks any indices which is slowing down common queries

Bug #2048396 reported by Christian Rohmann
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Cinder
New
Undecided
Eric Harney

Bug Description

I observed strangely long duration for MariaDB queries to the backups table which is used by cinder-backup. This is very likely due to the fact that this table has no indexes helping with common queries:

```
EXPLAIN SELECT * FROM backups WHERE project_id="myproject";
+------+-------------+---------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+---------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | backups | ALL | NULL | NULL | NULL | NULL | 431890 | Using where |
+------+-------------+---------+------+---------------+------+---------+------+--------+-------------+

EXPLAIN SELECT * FROM backups WHERE volume_id="myproject" AND deleted=0;
+------+-------------+---------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+---------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | backups | ALL | NULL | NULL | NULL | NULL | 431890 | Using where |
+------+-------------+---------+------+---------------+------+---------+------+--------+-------------+
```

```
SHOW indexes FROM backups;
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| backups | 0 | PRIMARY | 1 | id | A | 431646 | NULL | NULL | | BTREE | | | NO |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+

```

As you can see I have >500k entries in this table, but most of them are (marked as) deleted:

```
SELECT deleted, count(deleted) FROM backups GROUP BY deleted;
+---------+----------------+
| deleted | count(deleted) |
+---------+----------------+
| 0 | 35689 |
| 1 | 482261 |
+---------+----------------+

```

I know about the `cinder-manage db purge` [2] tool. But on larger clouds a few (ten)thousand rows of deleted volumes / backups / ... do accumulate before they are purged away depending on retention time.

And so very common queries for e.g. backups for a certain project_id or volume_id will end up as sequential scans using where.

You might argue that the table is simple and likely cached. But it's not about a single query for a list of backups for a certain project. It's rather about tooling making more intensive use of the API being slowed down on each request unnecessarily.

The volumes table in comparison certainly has some indexes:

```
SHOW indexes FROM volumes;
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| volumes | 0 | PRIMARY | 1 | id | A | 78598 | NULL | NULL | | BTREE | | | NO |
| volumes | 1 | ix_volumes_consistencygroup_id | 1 | consistencygroup_id | A | 2 | NULL | NULL | YES | BTREE | | | NO |
| volumes | 1 | ix_volumes_group_id | 1 | group_id | A | 2 | NULL | NULL | YES | BTREE | | | NO |
| volumes | 1 | volumes_service_uuid_idx | 1 | service_uuid | A | 2 | NULL | NULL | YES | BTREE | | | NO |
| volumes | 1 | volumes_service_uuid_idx | 2 | deleted | A | 4 | NULL | NULL | YES | BTREE | | | NO |
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
```

and volumes_service_uuid_idx also covering the deleted column, One might argue that NOT having the deleted column as first in the compound index [1] makes looking up non-deleted volumes without service_uuid inefficient as the index will not be used:

```
EXPLAIN * FROM volumes WHERE deleted=0;
+------+-------------+---------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+---------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | volumes | ALL | NULL | NULL | NULL | NULL | 78598 | Using where |
+------+-------------+---------+------+---------------+------+---------+------+-------+-------------+
```

Would it not make sense to have indexes on volume_id, project_id and also the deleted column to cover the most common selections?

[1] https://mariadb.com/kb/en/compound-composite-indexes/
[2] https://docs.openstack.org/cinder/latest/cli/cinder-manage.html

summary: - backups db table lacks indexes to speed up common queries
+ backups db table lacks indexes which is slowing down common queries
description: updated
description: updated
Eric Harney (eharney)
Changed in cinder:
assignee: nobody → Eric Harney (eharney)
Revision history for this message
Christian Rohmann (christian-rohmann) wrote : Re: backups db table lacks indexes which is slowing down common queries

Thanks Eric for tanking care of this one.

There was another bug filed a while ago with similar observations:
 https://bugs.launchpad.net/cinder/+bug/1694935

Revision history for this message
Christian Rohmann (christian-rohmann) wrote :

Eric is there anything that would help you with this one?
Since schema changes only comes with releases (I believe), it would awesome to have some indexes with Caracal.

Revision history for this message
Christian Rohmann (christian-rohmann) wrote :

If we could discuss which columns should receive an index and where it make sense to use composite indices I gladly get a patch ready.

summary: - backups db table lacks indexes which is slowing down common queries
+ backups db table lacks any indices which is slowing down common queries
Revision history for this message
Christian Rohmann (christian-rohmann) wrote (last edit ):

While there is patch https://review.opendev.org/c/openstack/cinder/+/819669 which added an index on the deleted column, there are are not indices on any other columns.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.