Wrong result (missing rows, wrong values) with materialization+semijoin+join_cache_hashed, join_cache_level>2, RIGHT JOIN, IN subquery
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MariaDB |
Fix Released
|
Critical
|
Igor Babaev |
Bug Description
The following query
SELECT *
FROM t1 RIGHT JOIN t2 ON ( a = c )
WHERE ( b, c ) IN
( SELECT b, c FROM t2 )
on the test data returns wrong result (less rows and in one row a NULL instead of a value) with join_cache_level>2, and correct result otherwise.
bzr version-info
revision-id: <email address hidden>
date: 2012-02-16 20:15:57 +0400
build-date: 2012-02-17 21:10:34 +0400
revno: 3424
EXPLAIN with join_cache_level=3 (wrong result):
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY t2 ALL b NULL NULL NULL 3 100.00
1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 10 func,func 1 100.00
1 PRIMARY t1 hash_index a #hash#a:a 4:4 test.t2.c 4 50.00 Using where; Using join buffer (flat, BNLH join)
2 MATERIALIZED t2 ALL b NULL NULL NULL 3 100.00
Warnings:
Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t2`.`c` AS `c` from `test`.`t2` semi join (`test`.`t2`) left join `test`.`t1` on(((`test`
EXPLAIN with join_cache_level=2 (correct result):
d select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY t2 ALL b NULL NULL NULL 3 100.00
1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 10 func,func 1 100.00
1 PRIMARY t1 ref a a 4 test.t2.c 2 100.00 Using where; Using index
2 MATERIALIZED t2 ALL b NULL NULL NULL 3 100.00
Warnings:
Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t2`.`b` AS `b`,`test`.`t2`.`c` AS `c` from `test`.`t2` semi join (`test`.`t2`) left join `test`.`t1` on(((`test`
Minimal optimizer_switch: materialization
(and join_cache_
Full optimizer_switch: index_merge=
Test case:
SET optimizer_switch = 'materializatio
# join_cache_level >= 3
SET SESSION join_cache_level = 3;
CREATE TABLE t1 ( a VARCHAR(1), KEY(a) );
INSERT INTO t1 VALUES ('v'),(
CREATE TABLE t2 ( b VARCHAR(1), c VARCHAR(1), KEY(b) );
INSERT INTO t2 VALUES ('v','v'
SELECT *
FROM t1 RIGHT JOIN t2 ON ( a = c )
WHERE ( b, c ) IN
( SELECT b, c FROM t2 );
# End of test case
# Expected result:
# a b c
# -----------------
# v v v
# v v v
# NULL w w
# NULL t t
# Actual result:
# a b c
# -----------------
# NULL v v
# NULL w w
# NULL t t
Changed in maria: | |
status: | New → Confirmed |
importance: | Undecided → Critical |
Changed in maria: | |
assignee: | Sergey Petrunia (sergefp) → Igor Babaev (igorb-seattle) |
status: | Confirmed → In Progress |
Changed in maria: | |
status: | In Progress → Fix Committed |
The bug is reproducible with LEFT JOIN as well.
With Elena's settings and
set join_cache_level = 6
I get the following execution plan:
MariaDB [test]> EXPLAIN EXTENDED SELECT * FROM t2 LEFT JOIN t1 ON ( a = c ) WHERE ( b, c ) IN ( SELECT b, c FROM t2 t ); ------- ------+ ------- ------+ ------- -+----- ------- ---+--- ------- ----+-- ------- +------ -----+- -----+- ------- --+---- ------- ------- ------- -+ ------- ------+ ------- ------+ ------- -+----- ------- ---+--- ------- ----+-- ------- +------ -----+- -----+- ------- --+---- ------- ------- ------- -+ ------- ------+ ------- ------+ ------- -+----- ------- ---+--- ------- ----+-- ------- +------ -----+- -----+- ------- --+---- ------- ------- ------- -+
+----+-
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-
| 1 | PRIMARY | t2 | ALL | b | NULL | NULL | NULL | 3 | 100.00 | |
| 1 | PRIMARY | <subquery2> | eq_ref | distinct_key | distinct_key | 10 | func,func | 1 | 100.00 | |
| 1 | PRIMARY | t1 | ref | a | a | 4 | test.t.c | 2 | 100.00 | Using where; Using index |
| 2 | MATERIALIZED | t | ALL | b | NULL | NULL | NULL | 3 | 100.00 | |
+----+-
I get a wrong result set when executing by this plan.
If I use
set join_cache_level = 0;
I get the execution plan:
MariaDB [test]> EXPLAIN EXTENDED SELECT * FROM t2 LEFT JOIN t1 ON ( a = c ) WHERE ( b, c ) IN ( SELECT b, c FROM t2 t ); ------- ------+ ------- ------+ ------- -+----- ------- ---+--- ------- ----+-- ------- +------ -----+- -----+- ------- --+---- ------- ------- ------- -+ ------- ------+ ------- ------+ ------- -+----- ------- ---+--- ------- ----+-- ------- +------ -----+- -----+- ------- --+---- ------- ------- ------- -+ ------- ------+ ------- ------+ ------- -+----- ------- ---+--- ------- ----+-- ------- +------ -----+- -----+- ------- --+---- ------- ------- ------- -+
+----+-
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-
| 1 | PRIMARY | t2 | ALL | b | NULL | NULL | NULL | 3 | 100.00 | |
| 1 | PRIMARY | <subquery2> | eq_ref | distinct_key | distinct_key | 10 | func,func | 1 | 100.00 | |
| 1 | PRIMARY | t1 | ref | a | a | 4 | test.t2.c | 2 | 100.00 | Using where; Using index |
| 2 | MATERIALIZED | t | ALL | b | NULL | NULL | NULL | 3 | 100.00 | |
+----+-
4 rows in set, 1 warning (0.00 sec)
I get the right result set when executing by this plan.
The plans differ only in line 3:
the first plan uses test.t.c to build the key to access table t1,
while the second plan uses test.t2.c for this purpose.