In a load test where a nova environment w/ networking enabled was set up to have ~250K instances, attempting to launch 50 instances would cause many to time out, with the error "Timeout while waiting on RPC response - topic: "network", RPC method: "allocate_for_instance". The tester isolated the latency here to queries against the virtual_interfaces table, which in this test is executed some 500 times, spending ~.5 seconds per query for a total of 200 seconds. An example query looks like:
SELECT virtual_interfaces.created_at , virtual_interfaces.updated_at , virtual_interfaces.deleted_at , virtual_interfaces.deleted , virtual_interfaces.id , virtual_interfaces.address , virtual_interfaces.network_id , virtual_interfaces.instance_uuid , virtual_interfaces.uuid FROM virtual_interfaces WHERE virtual_interfaces.deleted = 0 AND virtual_interfaces.uuid = '9774e729-7695-4e2b-a9b2-a104a4b020d0'
LIMIT 1;
Query profiling against this table /query directly proceeded as follows:
I scripted up direct DB access to get 250K rows in a blank database:
MariaDB [nova]> select count(*) from virtual_interfaces;
+----------+
| count(*) |
+----------+
| 250000 |
+----------+
1 row in set (0.09 sec)
emitting the query when the row is found, on this particular system is returning in .03 sec:
MariaDB [nova]> SELECT virtual_interfaces.created_at , virtual_interfaces.updated_at , virtual_interfaces.deleted_at , virtual_interfaces.deleted , virtual_interfaces.id , virtual_interfaces.address , virtual_interfaces.network_id , virtual_interfaces.instance_uuid , virtual_interfaces.uuid FROM virtual_interfaces WHERE virtual_interfaces.deleted = 0 AND virtual_interfaces.uuid = '0a269012-cbc7-4093-9602-35f003a766c4' LIMIT 1;
+---------------------+------------+------------+---------+-------+---------------+------------+--------------------------------------+--------------------------------------+
| created_at | updated_at | deleted_at | deleted | id | address | network_id | instance_uuid | uuid |
+---------------------+------------+------------+---------+-------+---------------+------------+--------------------------------------+--------------------------------------+
| 2014-08-12 22:22:14 | NULL | NULL | 0 | 58393 | address_58393 | 22 | 41f1b859-8c5d-4c27-a52e-3e97652dfe7a | 0a269012-cbc7-4093-9602-35f003a766c4 |
+---------------------+------------+------------+---------+-------+---------------+------------+--------------------------------------+--------------------------------------+
1 row in set (0.03 sec)
we can see that for a row not found, where it has to scan the whole table, it's 10x longer:
MariaDB [nova]> SELECT virtual_interfaces.created_at , virtual_interfaces.updated_at , virtual_interfaces.deleted_at , virtual_interfaces.deleted , virtual_interfaces.id , virtual_interfaces.address , virtual_interfaces.network_id , virtual_interfaces.instance_uuid , virtual_interfaces.uuid FROM virtual_interfaces WHERE virtual_interfaces.deleted = 0 AND virtual_interfaces.uuid = '0a269012-cbc7-4093-9602-35f003a766c5' LIMIT 1;
Empty set (0.14 sec)
There's nothing mysterious going on here as an EXPLAIN shows plainly that we are doing a full table scan:
MariaDB [nova]> EXPLAIN SELECT virtual_interfaces.created_at , virtual_interfaces.updated_at , virtual_interfaces.deleted_at , virtual_interfaces.deleted , virtual_interfaces.id , virtual_interfaces.address , virtual_interfaces.network_id , virtual_interfaces.instance_uuid , virtual_interfaces.uuid FROM virtual_interfaces WHERE virtual_interfaces.deleted = 0 AND virtual_interfaces.uuid = '0a269012-cbc7-4093-9602-35f003a766c4' LIMIT 1;
+------+-------------+--------------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------------------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | virtual_interfaces | ALL | NULL | NULL | NULL | NULL | 250170 | Using where |
+------+-------------+--------------------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)
adding an index on the uuid field via "create index vuidx on virtual_interfaces(uuid)", the EXPLAIN now shows the index used:
MariaDB [nova]> EXPLAIN SELECT virtual_interfaces.created_at , virtual_interfaces.updated_at , virtual_interfaces.deleted_at , virtual_interfaces.deleted , virtual_interfaces.id , virtual_interfaces.address , virtual_interfaces.network_id , virtual_interfaces.instance_uuid , virtual_interfaces.uuid FROM virtual_interfaces WHERE virtual_interfaces.deleted = 0 AND virtual_interfaces.uuid = '0a269012-cbc7-4093-9602-35f003a766c4' LIMIT 1;
+------+-------------+--------------------+------+---------------+-------+---------+-------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------------------+------+---------------+-------+---------+-------+------+------------------------------------+
| 1 | SIMPLE | virtual_interfaces | ref | vuidx | vuidx | 111 | const | 1 | Using index condition; Using where |
+------+-------------+--------------------+------+---------------+-------+---------+-------+------+------------------------------------+
1 row in set (0.00 sec)
and we get 0.00 response time for both queries:
MariaDB [nova]> SELECT virtual_interfaces.created_at , virtual_interfaces.updated_at , virtual_interfaces.deleted_at , virtual_interfaces.deleted , virtual_interfaces.id , virtual_interfaces.address , virtual_interfaces.network_id , virtual_interfaces.instance_uuid , virtual_interfaces.uuid FROM virtual_interfaces WHERE virtual_interfaces.deleted = 0 AND virtual_interfaces.uuid = '0a269012-cbc7-4093-9602-35f003a766c5' LIMIT 1;
Empty set (0.00 sec)
MariaDB [nova]> SELECT virtual_interfaces.created_at , virtual_interfaces.updated_at , virtual_interfaces.deleted_at , virtual_interfaces.deleted , virtual_interfaces.id , virtual_interfaces.address , virtual_interfaces.network_id , virtual_interfaces.instance_uuid , virtual_interfaces.uuid FROM virtual_interfaces WHERE virtual_interfaces.deleted = 0 AND virtual_interfaces.uuid = '0a269012-cbc7-4093-9602-35f003a766c4' LIMIT 1;
+---------------------+------------+------------+---------+-------+---------------+------------+--------------------------------------+--------------------------------------+
| created_at | updated_at | deleted_at | deleted | id | address | network_id | instance_uuid | uuid |
+---------------------+------------+------------+---------+-------+---------------+------------+--------------------------------------+--------------------------------------+
| 2014-08-12 22:22:14 | NULL | NULL | 0 | 58393 | address_58393 | 22 | 41f1b859-8c5d-4c27-a52e-3e97652dfe7a | 0a269012-cbc7-4093-9602-35f003a766c4 |
+---------------------+------------+------------+---------+-------+---------------+------------+--------------------------------------+--------------------------------------+
1 row in set (0.00 sec)
whether or not the index includes "deleted" doesn't really matter. If we're searching for UUIDs, we get that UUID row first, then the "deleted=0" is checked, not a big deal.
For an immediate fix, I propose to add the aforementioned index to the virtual_interfaces.uuid column.
review at https:/ /review. openstack. org/#/c/ 154243/