The optimizer does not use ref access to the tables of a dependent subquery if ref is built over outer fields.

Bug #927458 reported by Igor Babaev
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Confirmed
Low
Unassigned

Bug Description

This is a performance issue.

Let's create and populate tables t1, t2,t3 with the following commands:

CREATE TABLE t1 (
  pk int NOT NULL,
  col_int_nokey INT NOT NULL,
  col_int_key INT NOT NULL,
  PRIMARY KEY (pk),
  KEY col_int_key (col_int_key)
) ENGINE=MyISAM;

INSERT INTO t1 VALUES
(1,4,0),
(2,6,8),
(3,3,1),
(7,2,6),
(8,9,1),
(9,3,6),
(10,8,2),
(11,1,4),
(12,8,8),
(13,8,4),
(14,5,4);

CREATE TABLE t2 (
  pk int NOT NULL,
  col_int_nokey int NOT NULL,
  col_int_key int NOT NULL,
  PRIMARY KEY (pk),
  KEY col_int_key (col_int_key)
) ENGINE=MyISAM;

INSERT INTO t2 VALUES
(10,8,7);

CREATE TABLE t3
SELECT grandparent1.col_int_nokey AS g1
FROM t1 AS grandparent1
WHERE (grandparent1.col_int_nokey, grandparent1.col_int_key) IN
   (SELECT parent1.col_int_key AS p1,
           parent1.col_int_key AS p2
    FROM t1 AS parent1
      LEFT JOIN t2 AS parent2
      ON parent1.col_int_nokey = parent2.col_int_key
    )
  AND grandparent1.col_int_key <> 3
;

Then for the query

SELECT * FROM t3
WHERE g1 NOT IN
   (SELECT grandparent1.col_int_nokey AS g1
    FROM t1 AS grandparent1
    WHERE (grandparent1.col_int_nokey, grandparent1.col_int_key) IN
       (SELECT parent1.col_int_key AS p1,
               parent1.col_int_key AS p2
        FROM t1 AS parent1
        LEFT JOIN t2 AS parent2
        ON parent1.col_int_nokey = parent2.col_int_key
        )
      AND grandparent1.col_int_key <> 3
);

mysql-5.6 chooses the plan:

+----+--------------------+--------------+-------+---------------+-------------+---------+------+------+-----------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------+-------+---------------+-------------+---------+------+------+-----------------------------------------------------------------+
| 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | parent1 | ref | col_int_key | col_int_key | 4 | func | 2 | Using index condition; Start temporary |
| 2 | DEPENDENT SUBQUERY | parent2 | index | col_int_key | col_int_key | 4 | NULL | 1 | Using where; Using index; Using join buffer (Block Nested Loop) |
| 2 | DEPENDENT SUBQUERY | grandparent1 | ref | col_int_key | col_int_key | 4 | func | 2 | Using index condition; Using where; End temporary |
+----+--------------------+--------------+-------+---------------+-------------+---------+------+------+-----------------------------------------------------------------+
4 rows in set (0.03 sec)

while maria db-5.3 chooses the plan:

+----+--------------------+--------------+--------+---------------+-------------+---------+----------------------------+------+----------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------+--------+---------------+-------------+---------+----------------------------+------+----------------------------------------------------------------+
| 1 | PRIMARY | t3 | system | NULL | NULL | NULL | NULL | 1 | |
| 2 | DEPENDENT SUBQUERY | parent1 | ALL | col_int_key | NULL | NULL | NULL | 11 | Start temporary |
| 2 | DEPENDENT SUBQUERY | parent2 | ref | col_int_key | col_int_key | 4 | test.parent1.col_int_nokey | 2 | Using index |
| 2 | DEPENDENT SUBQUERY | grandparent1 | ALL | col_int_key | NULL | NULL | NULL | 11 | Using where; End temporary; Using join buffer (flat, BNL join) |
+----+--------------------+--------------+--------+---------------+-------------+---------+----------------------------+------+----------------------------------------------------------------+

once the materialization flag of the optimizer switch is set to 'off'.

The first plan looks more efficient as table parent1 is accessed with an index look-up.

After an execution of the query in mysql-5.6 we have the following values of the read handler counters:

mysql> show status like '%Handler_read%';
+-----------------------+-------+
| Variable_name | Value |
+-----------------------+-------+
| Handler_read_first | 2 |
| Handler_read_key | 4 |
| Handler_read_last | 0 |
| Handler_read_next | 4 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 2 |
+-----------------------+-------+

while after an execution of the query in mariadb-5.3 we have:

MariaDB [test]> show status like '%Handler_read%';
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| Handler_read_first | 0 |
| Handler_read_key | 12 |
| Handler_read_next | 0 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 22 |
+--------------------------+-------+

Tags: optimizer
Changed in maria:
status: New → Confirmed
importance: Undecided → Medium
milestone: none → 5.3
Michael Widenius (monty)
Changed in maria:
milestone: 5.3 → 5.5
importance: Medium → Low
Revision history for this message
Elena Stepanova (elenst) wrote :

Also filed in JIRA as MDEV-195

Elena Stepanova (elenst)
tags: added: optimizer
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.