Suboptimal plan chosen for Q16 of a MyISAM DBT-3 database

Bug #921773 reported by Igor Babaev
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
In Progress
Low
Igor Babaev

Bug Description

The optimizer of maridb-5.3 chooses a suboptimal execution plan for Q16 over MyISAM DBT-3 database of scale factor 10
if a join buffer is employed.

With the settings:

set tmp_table_size=1024*1024*64;
set max_heap_table_size=1024*1024*64;
set sort_buffer_size=1024*1024*64;
set optimizer_switch='semijoin=on';
set optimizer_switch='materialization=on';
set optimizer_switch='mrr=on';
set join_buffer_space_limit=1024*1024*128;
set join_buffer_size=1024*1024*32;
set optimizer_switch='mrr_sort_keys=on';
set join_cache_level=6;

the optimizer of 5.3 chooses the following execution plan:

MariaDB [dbt3x10_myisam_56]> explain
    -> select sql_calc_found_rows
    -> p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
    -> from partsupp, part
    -> where p_partkey = ps_partkey and p_brand <> 'Brand#45'
    -> and p_type not like 'MEDIUM POLISHED%'and p_size in
    -> (49, 14, 23, 45, 19, 3, 36, 9)
    -> and ps_suppkey not in (select s_suppkey from supplier
    -> where s_comment like '%Customer%Complaints%')
    -> group by p_brand, p_type, p_size
    -> order by supplier_cnt desc, p_brand, p_type, p_size
    -> limit 10;
+----+--------------+----------+--------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+----------+--------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+
| 1 | PRIMARY | partsupp | index | PRIMARY,i_ps_partkey | PRIMARY | 8 | NULL | 8000000 | Using where; Using index; Using temporary; Using filesort |
| 1 | PRIMARY | part | eq_ref | PRIMARY | PRIMARY | 4 | dbt3x10_myisam_56.partsupp.ps_partkey | 1 | Using where; Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan |
| 2 | MATERIALIZED | supplier | ALL | PRIMARY | NULL | NULL | NULL | 100000 | Using where |
+----+--------------+----------+--------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+

The execution by this plan on a cold server took me 1 min 29.81 sec.

However, when with the same settings the driving table is the table part and the execution plan is:

MariaDB [dbt3x10_myisam_56]> explain
    -> select sql_calc_found_rows
    -> p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
    -> from partsupp, part use index()
    -> where p_partkey = ps_partkey and p_brand <> 'Brand#45'
    -> and p_type not like 'MEDIUM POLISHED%'and p_size in
    -> (49, 14, 23, 45, 19, 3, 36, 9)
    -> and ps_suppkey not in (select s_suppkey from supplier
    -> where s_comment like '%Customer%Complaints%')
    -> group by p_brand, p_type, p_size
    -> order by supplier_cnt desc, p_brand, p_type, p_size
    -> limit 10;
+----+--------------+----------+------+----------------------+--------------+---------+----------------------------------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+----------+------+----------------------+--------------+---------+----------------------------------+---------+----------------------------------------------+
| 1 | PRIMARY | part | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | partsupp | ref | PRIMARY,i_ps_partkey | i_ps_partkey | 4 | dbt3x10_myisam_56.part.p_partkey | 4 | Using where |
| 2 | MATERIALIZED | supplier | ALL | PRIMARY | NULL | NULL | NULL | 100000 | Using where |
+----+--------------+----------+------+----------------------+--------------+---------+----------------------------------+---------+----------------------------------------------+

an execution of the query takes significantly less time.
The execution by this plan on a cold server took me 38.42 sec .

A similar performance difference can be observed if to set the materialization flag of the optimizer switch to 'off'.

The execution by the plan:

MariaDB [dbt3x10_myisam_56]> explain
    -> select sql_calc_found_rows
    -> p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
    -> from partsupp, part
    -> where p_partkey = ps_partkey and p_brand <> 'Brand#45'
    -> and p_type not like 'MEDIUM POLISHED%'and p_size in
    -> (49, 14, 23, 45, 19, 3, 36, 9)
    -> and ps_suppkey not in (select s_suppkey from supplier
    -> where s_comment like '%Customer%Complaints%')
    -> group by p_brand, p_type, p_size
    -> order by supplier_cnt desc, p_brand, p_type, p_size
    -> limit 10;
+----+--------------------+----------+-----------------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+-----------------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+
| 1 | PRIMARY | partsupp | index | PRIMARY,i_ps_partkey | PRIMARY | 8 | NULL | 8000000 | Using where; Using index; Using temporary; Using filesort |
| 1 | PRIMARY | part | eq_ref | PRIMARY | PRIMARY | 4 | dbt3x10_myisam_56.partsupp.ps_partkey | 1 | Using where; Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan |
| 2 | DEPENDENT SUBQUERY | supplier | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using where |
+----+--------------------+----------+-----------------+----------------------+---------+---------+---------------------------------------+---------+---------------------------------------------------------------------------------+

took me 3 min 8.70 sec,
while the execution by this plan:

MariaDB [dbt3x10_myisam_56]> explain
    -> select sql_calc_found_rows
    -> p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
    -> from partsupp, part use index()
    -> where p_partkey = ps_partkey and p_brand <> 'Brand#45'
    -> and p_type not like 'MEDIUM POLISHED%'and p_size in
    -> (49, 14, 23, 45, 19, 3, 36, 9)
    -> and ps_suppkey not in (select s_suppkey from supplier
    -> where s_comment like '%Customer%Complaints%')
    -> group by p_brand, p_type, p_size
    -> order by supplier_cnt desc, p_brand, p_type, p_size
    -> limit 10;
+----+--------------------+----------+-----------------+----------------------+--------------+---------+----------------------------------+---------+---------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+-----------------+----------------------+--------------+---------+----------------------------------+---------+---------------------------------------------------------------------------------+
| 1 | PRIMARY | part | ALL | NULL | NULL | NULL | NULL | 2000000 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | partsupp | ref | PRIMARY,i_ps_partkey | i_ps_partkey | 4 | dbt3x10_myisam_56.part.p_partkey | 4 | Using where; Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan |
| 2 | DEPENDENT SUBQUERY | supplier | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using where |
+----+--------------------+----------+-----------------+----------------------+--------------+---------+----------------------------------+---------+---------------------------------------------------------------------------------+

took me 50.25 sec.

The current mysql 5.6 does not support materialization of subqueries, but it chooses the faster plan with the table part as the driving table.

Changed in maria:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Igor Babaev (igorb-seattle)
milestone: none → 5.3
description: updated
Changed in maria:
status: Confirmed → In Progress
Michael Widenius (monty)
Changed in maria:
importance: High → Low
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers