Wrong result (extra rows) with semijoin+materialization, IN subqueries, join_cache_level>0
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MariaDB |
Fix Released
|
High
|
Sergey Petrunia |
Bug Description
The following query
SELECT * FROM A, B
WHERE b1 IN ( SELECT b2 FROM B WHERE b1 > 'o' )
AND (
b1 < 'l' OR
a1 IN ( SELECT c1 FROM C )
)
on the test data produces 6 rows when executed with semijoin+
bzr version-info
revision-id: <email address hidden>
date: 2012-05-15 08:31:07 +0300
revno: 3523
Also reproducible on maria/5.5 revno 3403.
With the provided test case the problem is reproducible with MyISAM and InnoDB, but not Aria, (with Aria the plan is slightly different).
Minimal optimizer_switch: materialization
Full optimizer_switch (default): index_merge=
EXPLAIN (with minimal optimizer_switch and join_cache_level=2 which is current default):
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY A ALL NULL NULL NULL NULL 2 100.00
1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 4 100.00
1 PRIMARY B ALL b1 NULL NULL NULL 6 83.33 Using where; Using join buffer (flat, BNL join)
2 MATERIALIZED B ALL b1 NULL NULL NULL 6 66.67 Using where
3 MATERIALIZED C ALL NULL NULL NULL NULL 2 100.00
Warnings:
Note 1003 select `test`.`A`.`a1` AS `a1`,`test`
# Test case
SET optimizer_switch = 'materializatio
CREATE TABLE A (a1 VARCHAR(1), a2 VARCHAR(1))
ENGINE=MyISAM;
INSERT INTO A VALUES ('b','b'
CREATE TABLE B (b1 VARCHAR(1), b2 VARCHAR(1), KEY(b1))
ENGINE=MyISAM;
INSERT INTO B VALUES
('v',
('y',
CREATE TABLE C (c1 VARCHAR(1)) ENGINE=MyISAM;
INSERT INTO C VALUES ('b'),('c');
SELECT * FROM A, B
WHERE b1 IN ( SELECT b2 FROM B WHERE b1 > 'o' )
AND (
b1 < 'l' OR
a1 IN ( SELECT c1 FROM C )
);
# End of test case
# Expected result:
# a1 a2 b1 b2
# -------
# b b v v
# b b s s
# b b y y
# Actual result:
# a1 a2 b1 b2
# -------
# b b v v
# e e v v
# b b s s
# e e s s
# b b y y
# e e y y
Changed in maria: | |
status: | Confirmed → In Progress |
Changed in maria: | |
status: | In Progress → Fix Committed |
The second subquery is redundant, one can replace it with list of constants
SELECT * FROM A, B WHERE b1 IN ( SELECT b2 FROM B WHERE b1 > 'o' ) AND ( b1 < 'l' OR a1 IN ('b','c') );
and the bug is still visible.