ceilometer sample-list with limit does a table scan and takes long time to complete, needs optimization

Bug #1400476 reported by Rohit Jaiswal
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ceilometer
Incomplete
Undecided
Rohit Jaiswal

Bug Description

ceilometer sample-list -m instance -l 20 results in the following sql query -

SELECT sample.id AS id, meter.name AS counter_name, meter.type AS counter_type, meter.unit AS counter_unit, sample.volume AS counter_volume, resource.resource_id AS resource_id, resource.source_id AS source_id, resource.user_id AS user_id, resource.project_id AS project_id, resource.resource_metadata AS resource_metadata, resource.internal_id AS internal_id, sample.timestamp AS timestamp, sample.message_id AS message_id, sample.message_signature AS message_signature, sample.recorded_at AS recorded_at FROM sample INNER JOIN meter ON sample.meter_id = meter.id INNER JOIN resource ON sample.resource_id = resource.internal_id WHERE meter.name = 'instance' ORDER BY sample.timestamp DESC limit 20;

Running explain plan on it generates the following plan -

+----+-------------+----------+--------+-------------------------------------------------------------------------+--------------------+---------+-------------------------------+---------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+--------+-------------------------------------------------------------------------+--------------------+---------+-------------------------------+---------+-----------------------------------------------------------+
| 1 | SIMPLE | meter | ref | PRIMARY,def_unique,ix_meter_name | def_unique | 767 | const | 1 | Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | sample | ref | ix_sample_meter_id,ix_sample_resource_id,ix_sample_meter_id_resource_id | ix_sample_meter_id | 5 | ceilometer.meter.id | 3363398 | Using where |
| 1 | SIMPLE | resource | eq_ref | PRIMARY | PRIMARY | 4 | ceilometer.sample.resource_id | 1 | |
+----+-------------+----------+--------+-------------------------------------------------------------------------+--------------------+---------+-------------------------------+---------+-----------------------------------------------------------+

The number 3363398 denotes all the rows being scanned by the join in the sample table even when only 20 results are expected.

There should be a better way to perform this query so that the number of rows scanned is minimized and use the indexes in a better way to achieve that.

description: updated
Changed in ceilometer:
status: New → In Progress
Changed in ceilometer:
assignee: nobody → Rohit Jaiswal (rohit-jaiswal-3)
Revision history for this message
Rohit Jaiswal (rohit-jaiswal-3) wrote :
Download full text (5.2 KiB)

With the fix in place, ceilometer sample-list -m instance -l 20 generates the following query -

SELECT sample.id AS id, meter.name AS counter_name, meter.type AS counter_type, meter.unit AS counter_unit, sample.volume AS counter_volume, resource.resource_id AS resource_id, resource.source_id AS source_id, resource.user_id AS user_id, resource.project_id AS project_id, resource.resource_metadata AS resource_metadata, resource.internal_id AS internal_id, sample.timestamp AS timestamp, sample.message_id AS message_id, sample.message_signature AS message_signature, sample.recorded_at AS recorded_at FROM sample INNER JOIN (select * from meter where meter.name = 'instance') as meter ON sample.meter_id = meter.id INNER JOIN resource ON sample.resource_id = resource.internal_id ORDER BY sample.timestamp DESC limit 20;

Running explain on it generates the following plan -

+----+-------------+------------+--------+-------------------------------------------------------------------------+---------------------+---------+-------------------------------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+-------------------------------------------------------------------------+---------------------+---------+-------------------------------+-------+--------------------------+
| 1 | PRIMARY | <derived2> | system | NULL | NULL | NULL | NULL | 1 | |
| 1 | PRIMARY | sample | index | ix_sample_meter_id,ix_sample_resource_id,ix_sample_meter_id_resource_id | ix_sample_timestamp | 11 | NULL | 15927 | Using where |
| 1 | PRIMARY | resource | eq_ref | PRIMARY | PRIMARY | 4 | ceilometer.sample.resource_id | 1 | |
| 2 | DERIVED | meter | ref | def_unique,ix_meter_name | def_unique | 767 | | 1 | Using where; Using index |
+----+-------------+------------+--------+-------------------------------------------------------------------------+---------------------+---------+-------------------------------+-------+--------------------------+

The number of rows scanned in the sample table has gone down to 15,927.
From the plan, it also seems that the file sort and temporary table are not needed any more.

Querying with meter as image gives the following query -

SELECT sample.id AS id, meter.name AS counter_name, meter.type AS counter_type, meter.unit AS counter_unit, sample.volume AS counter_volume, resource.resource_id AS resource_id, resource.source_id AS source_id, resource.user_id AS user_id, resource.project_id AS project_id, resource.resource_metadata AS resource_metadata, resource.internal_id AS internal_id, samp...

Read more...

Revision history for this message
Rohit Jaiswal (rohit-jaiswal-3) wrote :
Download full text (4.9 KiB)

Increasing the limit increases the number of rows scanned proportionately -

explain SELECT sample.id AS id, meter.name AS counter_name, meter.type AS counter_type, meter.unit AS counter_unit, sample.volume AS counter_volume, resource.resource_id AS resource_id, resource.source_id AS source_id, resource.user_id AS user_id, resource.project_id AS project_id, resource.resource_metadata AS resource_metadata, resource.internal_id AS internal_id, sample.timestamp AS timestamp, sample.message_id AS message_id, sample.message_signature AS message_signature, sample.recorded_at AS recorded_at FROM sample INNER JOIN (select * from meter where meter.name = 'image') as meter ON sample.meter_id = meter.id INNER JOIN resource ON sample.resource_id = resource.internal_id ORDER BY sample.timestamp DESC LIMIT 60;
+----+-------------+------------+--------+-------------------------------------------------------------------------+---------------------+---------+-------------------------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+-------------------------------------------------------------------------+---------------------+---------+-------------------------------+------+--------------------------+
| 1 | PRIMARY | <derived2> | system | NULL | NULL | NULL | NULL | 1 | |
| 1 | PRIMARY | sample | index | ix_sample_meter_id,ix_sample_resource_id,ix_sample_meter_id_resource_id | ix_sample_timestamp | 11 | NULL | 120 | Using where |
| 1 | PRIMARY | resource | eq_ref | PRIMARY | PRIMARY | 4 | ceilometer.sample.resource_id | 1 | |
| 2 | DERIVED | meter | ref | def_unique,ix_meter_name | def_unique | 767 | | 1 | Using where; Using index |
+----+-------------+------------+--------+-------------------------------------------------------------------------+---------------------+---------+-------------------------------+------+--------------------------+

 explain SELECT sample.id AS id, meter.name AS counter_name, meter.type AS counter_type, meter.unit AS counter_unit, sample.volume AS counter_volume, resource.resource_id AS resource_id, resource.source_id AS source_id, resource.user_id AS user_id, resource.project_id AS project_id, resource.resource_metadata AS resource_metadata, resource.internal_id AS internal_id, sample.timestamp AS timestamp, sample.message_id AS message_id, sample.message_signature AS message_signature, sample.recorded_at AS recorded_at FROM sample INNER JOIN (select * from meter where meter.name = 'instance') as meter ON sample.meter_id = meter.id INNER JOIN resource ON sample.resource_...

Read more...

Revision history for this message
Chris Dent (cdent) wrote :

Which release of ceilometer are you using? If you are in icehouse there have been a significant number of changes to the sql storage engine in juno.

description: updated
summary: - ceilometer sample-list does a table scan and takes long time to
- complete, needs optimization
+ ceilometer sample-list with limit does a table scan and takes long time
+ to complete, needs optimization
Revision history for this message
Rohit Jaiswal (rohit-jaiswal-3) wrote :

@chdent: We are using juno

Revision history for this message
gordon chung (chungg) wrote :

is this mysql? because i've found a mysql bug[1]. basically mysql sucks.

[1] https://bugs.launchpad.net/ceilometer/+bug/1492303

Changed in ceilometer:
status: In Progress → Incomplete
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on ceilometer (master)

Change abandoned by gordon chung (<email address hidden>) on branch: master
Review: https://review.openstack.org/139677
Reason: clean up

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.