Wrong result (extra rows) with derived_with_keys+loosescan+semijoin=ON, materialization=OFF, IN subquery from a temptable view
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MariaDB |
Fix Released
|
High
|
Sergey Petrunia |
Bug Description
The following query
SELECT * FROM t1 AS t1_1, t1 AS t1_2
WHERE (t1_1.a, t1_2.a) IN ( SELECT a, b FROM v );
on the test data produces 13 rows with semijoin+
bzr version-info
revision-id: <email address hidden>
date: 2012-04-07 17:27:00 -0700
build-date: 2012-04-11 04:54:24 +0400
revno: 3500
Notes:
Reproducible on MariaDB 5.5 revno 3364.
Could not reproduce on MySQL trunk, although maybe I was trying wrong optimizer switches.
Could not reproduce by replacing the view with a table.
Minimal optimizer_switch: derived_
Full optimizer_switch: index_merge=
EXPLAIN with the minimal optimizer switch (semijoin+
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY t1_1 ALL NULL NULL NULL NULL 11 100.00 Using where
1 PRIMARY <derived3> ref key0 key0 5 test.t1_1.a 2 100.00 Start temporary
1 PRIMARY t1_2 ALL NULL NULL NULL NULL 11 100.00 Using where; End temporary; Using join buffer (flat, BNL join)
3 DERIVED t1 ALL NULL NULL NULL NULL 11 100.00
Warnings:
Note 1003 select count(0) AS `COUNT(*)` from `test`.`t1` `t1_1` semi join (`test`.`v`) join `test`.`t1` `t1_2` where ((`test`.`t1_2`.`a` = `v`.`b`) and (`v`.`a` = `test`.`t1_1`.`a`))
Test case:
SET optimizer_switch = 'derived_
CREATE TABLE t1 ( a INT, b INT );
INSERT INTO t1 VALUES
(4,0)
(2,6)
CREATE ALGORITHM=TEMPTABLE
VIEW v AS SELECT * FROM t1;
SELECT * FROM t1 AS t1_1, t1 AS t1_2
WHERE (t1_1.a, t1_2.a) IN ( SELECT a, b FROM v );
# End of test case
# Expected result:
# a b a b
# -------------------
# 5 8 4 0
# 2 4 4 0
# 2 6 4 0
# 5 4 4 0
# 5 4 4 0
# 2 4 6 8
# 2 6 6 8
# 3 1 9 1
# 3 9 9 1
# 7 7 7 7
# Actual result:
# a b a b
# -------------------
# 5 8 4 0
# 5 8 4 0
# 2 4 4 0
# 2 6 4 0
# 5 4 4 0
# 5 4 4 0
# 5 4 4 0
# 5 4 4 0
# 2 4 6 8
# 2 6 6 8
# 3 1 9 1
# 3 9 9 1
# 7 7 7 7
Changed in maria: | |
importance: | Undecided → High |
assignee: | nobody → Sergey Petrunia (sergefp) |
milestone: | none → 5.3 |
tags: | added: optimizer wrong-result |
Changed in maria: | |
status: | In Progress → Fix Committed |
* The problem can be repeated with set @@join_ cache_level= 0;
* There seems to be a difference between EXPLAIN and actual execution: EXPLAIN statement will pick Duplicate-Weedout strategy, while the SELECT itself will use LooseScan.
* If it's using Loose Scan, how is it using it for derived table? (which is expected to be of type=heap and have a HASH index? I didn't check this, I'll need to do)
* The produced dataset, 13 rows - is a join (i.e. duplicates are not removed).
* different optimization of EXPLAIN SELECT and SELECT is also a bug, I'll need to investigate.