Optimizer chooses a suboptimal excution plan for Q18 from DBT-3
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MariaDB |
Fix Released
|
High
|
Sergey Petrunia |
Bug Description
When working with DBT-3 database (scale factor 10) created for MyISAM the optimizer of 5.3.4 chooses a suboptimal execution plan if all optimizer flags are set by default:
MariaDB [dbt3x10_myisam]> explain
-> select sql_calc_found_rows
-> c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
-> from customer, orders, lineitem
-> where o_orderkey in (select l_orderkey from lineitem
-> group by l_orderkey having sum(l_quantity) > 300)
-> and c_custkey = o_custkey and o_orderkey = l_orderkey
-> group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
-> order by o_totalprice desc, o_orderdate
-> limit 10;
+----+-
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-
| 1 | PRIMARY | orders | ALL | PRIMARY,i_o_custkey | NULL | NULL | NULL | 15000000 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | customer | eq_ref | PRIMARY | PRIMARY | 4 | dbt3x10_
| 1 | PRIMARY | <subquery2> | eq_ref | distinct_key | distinct_key | 4 | dbt3x10_
| 1 | PRIMARY | lineitem | ref | PRIMARY,
| 2 | MATERIALIZED | lineitem | index | NULL | i_l_orderkey_
+----+-
5 rows in set (0.00 sec)
MariaDB [dbt3x10_myisam]> select sql_calc_found_rows
-> c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
-> from customer, orders, lineitem
-> where o_orderkey in (select l_orderkey from lineitem
-> group by l_orderkey having sum(l_quantity) > 300)
-> and c_custkey = o_custkey and o_orderkey = l_orderkey
-> group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
-> order by o_totalprice desc, o_orderdate
-> limit 10;
+------
| c_name | c_custkey | o_orderkey | o_orderdate | o_totalprice | sum(l_quantity) |
+------
| Customer#001287812 | 1287812 | 42290181 | 1997-11-26 | 558289.17 | 318 |
| Customer#000644812 | 644812 | 2745894 | 1996-07-04 | 557664.53 | 304 |
| Customer#001172513 | 1172513 | 36667107 | 1997-06-06 | 550142.18 | 322 |
| Customer#000399481 | 399481 | 43906817 | 1995-04-06 | 549431.65 | 312 |
| Customer#000571654 | 571654 | 21213895 | 1992-01-03 | 549380.08 | 327 |
| Customer#000667882 | 667882 | 2199712 | 1996-09-30 | 542154.01 | 327 |
| Customer#001492954 | 1492954 | 30332516 | 1996-03-10 | 541181.8 | 310 |
| Customer#001471966 | 1471966 | 1263015 | 1997-02-02 | 540476.8 | 320 |
| Customer#001082018 | 1082018 | 31018979 | 1995-12-06 | 537993.05 | 304 |
| Customer#001114039 | 1114039 | 30417318 | 1995-10-31 | 536420.39 | 305 |
+------
10 rows in set (11 min 22.22 sec)
If the 'semijoin' flag of the optimizer switch is set to 'off' the optimizer chooses a better execution plan:
MariaDB [dbt3x10_myisam]> set optimizer_
Query OK, 0 rows affected (0.00 sec)
MariaDB [dbt3x10_myisam]> explain
-> select sql_calc_found_rows
-> c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
-> from customer, orders, lineitem
-> where o_orderkey in (select l_orderkey from lineitem
-> group by l_orderkey having sum(l_quantity) > 300)
-> and c_custkey = o_custkey and o_orderkey = l_orderkey
-> group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
-> order by o_totalprice desc, o_orderdate
-> limit 10;
+----+-
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-
| 1 | PRIMARY | orders | ALL | PRIMARY,i_o_custkey | NULL | NULL | NULL | 15000000 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | customer | eq_ref | PRIMARY | PRIMARY | 4 | dbt3x10_
| 1 | PRIMARY | lineitem | ref | PRIMARY,
| 2 | MATERIALIZED | lineitem | index | NULL | i_l_orderkey_
+----+-
4 rows in set (0.00 sec)
MariaDB [dbt3x10_myisam]> select sql_calc_found_rows
-> c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
-> from customer, orders, lineitem
-> where o_orderkey in (select l_orderkey from lineitem
-> group by l_orderkey having sum(l_quantity) > 300)
-> and c_custkey = o_custkey and o_orderkey = l_orderkey
-> group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
-> order by o_totalprice desc, o_orderdate
-> limit 10;
+------
| c_name | c_custkey | o_orderkey | o_orderdate | o_totalprice | sum(l_quantity) |
+------
| Customer#001287812 | 1287812 | 42290181 | 1997-11-26 | 558289.17 | 318 |
| Customer#000644812 | 644812 | 2745894 | 1996-07-04 | 557664.53 | 304 |
| Customer#001172513 | 1172513 | 36667107 | 1997-06-06 | 550142.18 | 322 |
| Customer#000399481 | 399481 | 43906817 | 1995-04-06 | 549431.65 | 312 |
| Customer#000571654 | 571654 | 21213895 | 1992-01-03 | 549380.08 | 327 |
| Customer#000667882 | 667882 | 2199712 | 1996-09-30 | 542154.01 | 327 |
| Customer#001492954 | 1492954 | 30332516 | 1996-03-10 | 541181.8 | 310 |
| Customer#001471966 | 1471966 | 1263015 | 1997-02-02 | 540476.8 | 320 |
| Customer#001082018 | 1082018 | 31018979 | 1995-12-06 | 537993.05 | 304 |
| Customer#001114039 | 1114039 | 30417318 | 1995-10-31 | 536420.39 | 305 |
+------
10 rows in set (2 min 31.82 sec)
The second plan is better because the materialized table is joined right after the first table orders.
summary: |
- Optimizer chooses suboptimal excution plan for Q18 from DBT-3 + Optimizer chooses a suboptimal excution plan for Q18 from DBT-3 |
Changed in maria: | |
status: | New → Confirmed |
importance: | Undecided → Critical |
assignee: | nobody → Sergey Petrunia (sergefp) |
Changed in maria: | |
milestone: | none → 5.3 |
Changed in maria: | |
importance: | Critical → High |
Changed in maria: | |
status: | Confirmed → Fix Committed |
The first attempt was this patch:
http:// lists.askmonty. org/pipermail/ commits/ 2012-January/ 002893. html
BUG#913030: Optimizer chooses a suboptimal excution plan for Q18 from DBT-3 ION_LOOKUP_ MATCH_RATIO= 0.75, made the optimizer assume
- Added SJ_MATERIALIZAT
that this is the probability that a lookup in semi-join materialized table will
produce a match.
.. decided not to push it.