Wrong result (missing rows) with materialization+semijoin+join_cache_hashed, join_cache_level>2, HAVING
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MariaDB |
Fix Released
|
Critical
|
Igor Babaev |
Bug Description
The following query
SELECT a FROM t1
WHERE a IN (
SELECT b FROM t1, t2
WHERE b = a
)
GROUP BY a
HAVING a != 'z'
on the test data returns only one row if it's run with materialization, semijoin, join_cache_hashed and join_cache_level>2, and two rows otherwise. The latter is correct.
bzr version-info
revision-id: <email address hidden>
date: 2012-02-16 20:15:57 +0400
build-date: 2012-02-17 21:19:42 +0400
revno: 3424
EXPLAIN with join_cache_level = 3 (wrong result):
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 2 100.00 Using temporary; Using filesort
1 PRIMARY t1 hash_index a #hash#a:a 4:4 test.t1.a 3 66.67 Using where; Using join buffer (flat, BNLH join)
2 MATERIALIZED t2 ALL NULL NULL NULL NULL 2 100.00 Using where
2 MATERIALIZED t1 hash_index a #hash#a:a 4:4 test.t2.b 3 66.67 Using where; Using join buffer (flat, BNLH join)
Warnings:
Note 1003 select `test`.`t1`.`a` AS `a` from `test`.`t1` semi join (`test`.`t1` join `test`.`t2`) where ((`test`.`t1`.`a` = `test`.`t2`.`b`) and (`test`.`t1`.`a` = `test`.`t2`.`b`)) group by `test`.`t1`.`a` having (`test`.`t1`.`a` <> 'z')
EXPLAIN with join_cache_level =2 (correct result):
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 2 100.00 Using temporary; Using filesort
1 PRIMARY t1 ref a a 4 test.t2.b 2 100.00 Using index
2 MATERIALIZED t2 ALL NULL NULL NULL NULL 2 100.00 Using where
2 MATERIALIZED t1 ref a a 4 test.t2.b 2 100.00 Using index
Warnings:
Note 1003 select `test`.`t1`.`a` AS `a` from `test`.`t1` semi join (`test`.`t1` join `test`.`t2`) where ((`test`.`t1`.`a` = `test`.`t2`.`b`) and (`test`.`t1`.`a` = `test`.`t2`.`b`)) group by `test`.`t1`.`a` having (`test`.`t1`.`a` <> 'z')
Minimal optimizer_switch: materialization
(and join_cache_
Full optimizer_switch: index_merge=
# Test case:
SET optimizer_switch = 'materializatio
# join_cache_level >= 3
SET join_cache_level = 3;
# MyISAM or Aria
CREATE TABLE t1 ( a VARCHAR(1), KEY(a) ) ENGINE=MyISAM;
INSERT INTO t1 VALUES ('c'),('v'),('c');
CREATE TABLE t2 ( b VARCHAR(1) ) ENGINE=MyISAM;
INSERT INTO t2 VALUES ('v'),('c');
SELECT a FROM t1
WHERE a IN (
SELECT b FROM t1, t2
WHERE b = a
)
GROUP BY a
HAVING a != 'z';
# End of test case
# Expected result:
# a
# ---
# c
# v
# Actual result:
# a
# ---
# c
# There were 6 other similar cases during the test run.
Changed in maria: | |
status: | New → Confirmed |
importance: | Undecided → Critical |
Changed in maria: | |
assignee: | Sergey Petrunia (sergefp) → Igor Babaev (igorb-seattle) |
Changed in maria: | |
status: | Confirmed → Fix Committed |
The problem is seen already in the execution plan:
MariaDB [test]> EXPLAIN SELECT a FROM t1 t WHERE t.a IN ( SELECT b FROM t1, t2 WHERE b = a ) GROUP BY t.a HAVING t.a != 'z'; ------- ------+ ------- ------+ ------+ ------- ------- -+----- -+----- ----+-- ------- --+---- --+---- ------+ ------- ------- ------- ------- -----+ ------- ------+ ------- ------+ ------+ ------- ------- -+----- -+----- ----+-- ------- --+---- --+---- ------+ ------- ------- ------- ------- -----+ ------- ------+ ------- ------+ ------+ ------- ------- -+----- -+----- ----+-- ------- --+---- --+---- ------+ ------- ------- -----
+----+-
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-
| 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 2 | 100.00 | Using temporary; Using filesort |
| 1 | PRIMARY | t | ref | a | a | 4 | test.t1.a | 2 | 100.00 | Using where; Using index |
| 2 | MATERIALIZED | t2 | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Using where |
| 2 | MATERIALIZED | t1 | ref | a | a | 4 | test.t2.b | 2 | 100.00 | Using where; Using index |
+----+-
The key to access t is built oveer column a of the materialized table. Yet there is no such column there.