Comment 0 for bug 921773

Revision history for this message
Igor Babaev (igorb-seattle) wrote :

The optimizer of maridb-5.3 chooses a suboptimal execution plan for MyISAM DBT-3 database of scale factor 10
if a join buffer is employed.

With the settings:

set tmp_table_size=1024*1024*64;
set max_heap_table_size=1024*1024*64;
set sort_buffer_size=1024*1024*64;
set optimizer_switch='semijoin=on';
set optimizer_switch='materialization=on';
set optimizer_switch='mrr=on';
set join_buffer_space_limit=1024*1024*128;
set join_buffer_size=1024*1024*32;
set optimizer_switch='mrr_sort_keys=on';
set join_cache_level=6;

the optimizer of 5.3 chooses the following execution plan:

MariaDB [dbt3x10_myisam_56]> explain
    -> select sql_calc_found_rows
    -> p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
    -> from partsupp, part
    -> where p_partkey = ps_partkey and p_brand <> 'Brand#45'
    -> and p_type not like 'MEDIUM POLISHED%'and p_size in
    -> (49, 14, 23, 45, 19, 3, 36, 9)
    -> and ps_suppkey not in (select s_suppkey from supplier
    -> where s_comment like '%Customer%Complaints%')
    -> group by p_brand, p_type, p_size
    -> order by supplier_cnt desc, p_brand, p_type, p_size
    -> limit 10;
+----+--------------+----------+--------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+----------+--------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+
| 1 | PRIMARY | partsupp | index | PRIMARY,i_ps_partkey | PRIMARY | 8 | NULL | 8000000 | Using where; Using index; Using temporary; Using filesort |
| 1 | PRIMARY | part | eq_ref | PRIMARY | PRIMARY | 4 | dbt3x10_myisam_56.partsupp.ps_partkey | 1 | Using where; Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan |
| 2 | MATERIALIZED | supplier | ALL | PRIMARY | NULL | NULL | NULL | 100000 | Using where |
+----+--------------+----------+--------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+

The execution by this plan on a cold server took me 1 min 29.81 sec.

However, when with the same settings the driving table is the table part and the execution plan is:

MariaDB [dbt3x10_myisam_56]> explain
    -> select sql_calc_found_rows
    -> p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
    -> from partsupp, part use index()
    -> where p_partkey = ps_partkey and p_brand <> 'Brand#45'
    -> and p_type not like 'MEDIUM POLISHED%'and p_size in
    -> (49, 14, 23, 45, 19, 3, 36, 9)
    -> and ps_suppkey not in (select s_suppkey from supplier
    -> where s_comment like '%Customer%Complaints%')
    -> group by p_brand, p_type, p_size
    -> order by supplier_cnt desc, p_brand, p_type, p_size
    -> limit 10;
+----+--------------+----------+------+----------------------+--------------+---------+----------------------------------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+----------+------+----------------------+--------------+---------+----------------------------------+---------+----------------------------------------------+
| 1 | PRIMARY | part | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | partsupp | ref | PRIMARY,i_ps_partkey | i_ps_partkey | 4 | dbt3x10_myisam_56.part.p_partkey | 4 | Using where |
| 2 | MATERIALIZED | supplier | ALL | PRIMARY | NULL | NULL | NULL | 100000 | Using where |
+----+--------------+----------+------+----------------------+--------------+---------+----------------------------------+---------+----------------------------------------------+

an execution of the query takes significantly less time.
The execution by this plan on a cold server took me 38.42 sec .

A similar performance difference can be observed if to set the materialization flag of the optimizer switch to 'off'.

The execution by the plan:

MariaDB [dbt3x10_myisam_56]> explain
    -> select sql_calc_found_rows
    -> p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
    -> from partsupp, part
    -> where p_partkey = ps_partkey and p_brand <> 'Brand#45'
    -> and p_type not like 'MEDIUM POLISHED%'and p_size in
    -> (49, 14, 23, 45, 19, 3, 36, 9)
    -> and ps_suppkey not in (select s_suppkey from supplier
    -> where s_comment like '%Customer%Complaints%')
    -> group by p_brand, p_type, p_size
    -> order by supplier_cnt desc, p_brand, p_type, p_size
    -> limit 10;
+----+--------------------+----------+-----------------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+-----------------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+
| 1 | PRIMARY | partsupp | index | PRIMARY,i_ps_partkey | PRIMARY | 8 | NULL | 8000000 | Using where; Using index; Using temporary; Using filesort |
| 1 | PRIMARY | part | eq_ref | PRIMARY | PRIMARY | 4 | dbt3x10_myisam_56.partsupp.ps_partkey | 1 | Using where; Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan |
| 2 | DEPENDENT SUBQUERY | supplier | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using where |
+----+--------------------+----------+-----------------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+

took me 3 min 8.70 sec,
while the execution by this plan:

MariaDB [dbt3x10_myisam_56]> explain
    -> select sql_calc_found_rows
    -> p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
    -> from partsupp, part use index()
    -> where p_partkey = ps_partkey and p_brand <> 'Brand#45'
    -> and p_type not like 'MEDIUM POLISHED%'and p_size in
    -> (49, 14, 23, 45, 19, 3, 36, 9)
    -> and ps_suppkey not in (select s_suppkey from supplier
    -> where s_comment like '%Customer%Complaints%')
    -> group by p_brand, p_type, p_size
    -> order by supplier_cnt desc, p_brand, p_type, p_size
    -> limit 10;
+----+--------------------+----------+-----------------+----------------------+--------------+---------+----------------------------------+---------+---------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+-----------------+----------------------+--------------+---------+----------------------------------+---------+---------------------------------------------------------------------------------+
| 1 | PRIMARY | part | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | partsupp | ref | PRIMARY,i_ps_partkey | i_ps_partkey | 4 | dbt3x10_myisam_56.part.p_partkey | 4 | Using where; Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan |
| 2 | DEPENDENT SUBQUERY | supplier | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using where |
+----+--------------------+----------+-----------------+----------------------+--------------+---------+----------------------------------+---------+---------------------------------------------------------------------------------+

took me 50.25 sec.

The current mysql 5.6 does not support materialization of subqueries, but it chooses the faster plan with the table part as the driving table.