Wrong result (missing rows) with exists_to_in=on, semijoin+firstmatch, MERGE views or derived_merge, MyISAM or Aria

Bug #912795 reported by Elena Stepanova on 2012-01-06
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Invalid
High
Oleksandr "Sanja" Byelkin

Bug Description

The following query

SELECT * FROM v1, v2
WHERE v1.a = v2.a
  AND EXISTS (
        SELECT c FROM t1, t2
        WHERE b = d
          AND d = v2.c
    );

on the test data returns 2 rows if it's run with exists_to_in=ON, and 4 rows otherwise. The latter is correct.
A query with subselects instead of views return the same wrong result, but it also requires derived_merge=ON (current default in 5.3).

I could not convert this test into an 'IN' subquery, because it is important for the test that the column `c` is selected in subselect, while the outer column is matched to `d`. So possibly the problem is related to exists2in.

bzr version-info
revision-id: <email address hidden>
date: 2011-12-26 10:08:40 +0200
build-date: 2012-01-06 19:10:00 +0400
revno: 3370
branch-nick: lp-5.3-exists2in

EXPLAIN with exists_to_in=ON (wrong result):

id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY alias2 ALL a NULL NULL NULL 7 71.43 Using where
1 PRIMARY t2 ref d d 5 test.alias2.c 2 100.00 Using index
1 PRIMARY t1 ref a a 5 test.alias2.a 2 100.00
1 PRIMARY t1 ALL NULL NULL NULL NULL 7 100.00 Using where; FirstMatch(t2)
1 PRIMARY t2 index d d 5 NULL 3 100.00 Using where; Using index; Using join buffer (flat, BNL join)
1 PRIMARY alias1 ALL NULL NULL NULL NULL 7 100.00 Using where; Using join buffer (incremental, BNL join)
Warnings:
Note 1276 Field or reference 'v2.c' of SELECT #2 was resolved in SELECT #1
Note 1003 select `t1`.`a` AS `a`,`t1`.`b` AS `b`,`t1`.`c` AS `c`,`test`.`t2`.`d` AS `d`,`test`.`alias2`.`a` AS `a`,`test`.`alias2`.`b` AS `b`,`test`.`alias2`.`c` AS `c` from `test`.`t1` semi join (`test`.`t1` join `test`.`t2`) join `test`.`t2` join `test`.`t1` `alias1` join `test`.`t1` `alias2` where ((`test`.`t2`.`d` = `t1`.`b`) and (`alias1`.`b` = `test`.`alias2`.`b`) and (`test`.`t2`.`d` = `test`.`alias2`.`c`) and (`test`.`t1`.`b` = `test`.`alias2`.`c`) and (`t1`.`a` = `test`.`alias2`.`a`) and (`test`.`alias2`.`a` < 6))

EXPLAIN with exists_to_in=OFF (correct result):

id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY t2 index d d 5 NULL 3 100.00 Using index
1 PRIMARY t1 ALL a NULL NULL NULL 7 71.43 Using where; Using join buffer (flat, BNL join)
1 PRIMARY alias2 ALL a NULL NULL NULL 7 71.43 Using where; Using join buffer (incremental, BNL join)
1 PRIMARY alias1 ALL NULL NULL NULL NULL 7 100.00 Using where; Using join buffer (incremental, BNL join)
2 DEPENDENT SUBQUERY t2 ref d d 5 test.alias2.c 2 100.00 Using index
2 DEPENDENT SUBQUERY t1 ALL NULL NULL NULL NULL 7100.00 Using where; Using join buffer (flat, BNL join)
Warnings:
Note 1276 Field or reference 'v2.c' of SELECT #2 was resolved in SELECT #1
Note 1003 select `t1`.`a` AS `a`,`t1`.`b` AS `b`,`t1`.`c` AS `c`,`test`.`t2`.`d` AS `d`,`test`.`alias2`.`a` AS `a`,`test`.`alias2`.`b` AS `b`,`test`.`alias2`.`c` AS `c` from `test`.`t1` join `test`.`t2` join `test`.`t1` `alias1` join `test`.`t1` `alias2` where ((`t1`.`b` = `test`.`t2`.`d`) and (`alias1`.`b` = `test`.`alias2`.`b`) and (`test`.`alias2`.`a` = `t1`.`a`) and <expr_cache><`test`.`alias2`.`c`>(exists(select `test`.`t1`.`c` from `test`.`t1` join `test`.`t2` where ((`test`.`t1`.`b` = `test`.`t2`.`d`) and (`test`.`t2`.`d` = `test`.`alias2`.`c`)))) and (`t1`.`a` < 6))

Minimal optimizer_switch: in_to_exists=on,semijoin=on,firstmatch=on,exists_to_in=on
(in_to_exists or materialization are required, otherwise the query does not run)

Full optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,exists_to_in=on

Test case:

# in_to_exists or materialization are required,
# otherwise the query does not run
SET optimizer_switch = 'in_to_exists=on,semijoin=on,firstmatch=on,exists_to_in=on';

# MyISAM or Aria, but not InnoDB

CREATE TABLE t1 ( a INT, b INT, c INT, KEY(a) ) ENGINE=MyISAM;
INSERT INTO t1 VALUES (9,1,1);
INSERT INTO t1 VALUES (3,2,2);
INSERT INTO t1 VALUES (2,3,3);
INSERT INTO t1 VALUES (5,4,4);
INSERT INTO t1 VALUES (3,5,5);
INSERT INTO t1 VALUES (1,6,6);
INSERT INTO t1 VALUES (3,7,7);

CREATE TABLE t2 ( d INT, KEY(d) );
INSERT INTO t2 VALUES (9),(7),(2);

CREATE ALGORITHM=MERGE VIEW v1 AS
  SELECT * FROM t1, t2
    WHERE b = d;

CREATE ALGORITHM=MERGE VIEW v2 AS
  SELECT alias2.* FROM t1 AS alias1, t1 AS alias2
    WHERE alias1.b = alias2.b AND alias2.a < 6;

SELECT * FROM v1, v2
WHERE v1.a = v2.a
  AND EXISTS (
        SELECT c FROM t1, t2
        WHERE b = d
          AND d = v2.c
    );

# End of test case

# Expected result:

a b c d a b c
3 2 2 2 3 2 2
3 7 7 7 3 2 2
3 2 2 2 3 7 7
3 7 7 7 3 7 7

# Actual result:

a b c d a b c
3 2 2 2 3 2 2
3 2 2 2 3 7 7

Elena Stepanova (elenst) wrote :

A link to this bug is added to JIRA as MDEV-75

description: updated

Now fixed:

SET optimizer_switch = 'in_to_exists=on,semijoin=on,firstmatch=on,exists_to_in=on';
CREATE TABLE t1 ( a INT, b INT, c INT, KEY(a) ) ENGINE=MyISAM;
INSERT INTO t1 VALUES (9,1,1);
INSERT INTO t1 VALUES (3,2,2);
INSERT INTO t1 VALUES (2,3,3);
INSERT INTO t1 VALUES (5,4,4);
INSERT INTO t1 VALUES (3,5,5);
INSERT INTO t1 VALUES (1,6,6);
INSERT INTO t1 VALUES (3,7,7);
CREATE TABLE t2 ( d INT, KEY(d) );
INSERT INTO t2 VALUES (9),(7),(2);
CREATE ALGORITHM=MERGE VIEW v1 AS
SELECT * FROM t1, t2
WHERE b = d;
CREATE ALGORITHM=MERGE VIEW v2 AS
SELECT alias2.* FROM t1 AS alias1, t1 AS alias2
WHERE alias1.b = alias2.b AND alias2.a < 6;
SELECT * FROM v1, v2
WHERE v1.a = v2.a
AND EXISTS (
SELECT c FROM t1, t2
WHERE b = d
AND d = v2.c
);
a b c d a b c
3 7 7 7 3 2 2
3 2 2 2 3 2 2
3 7 7 7 3 7 7
3 2 2 2 3 7 7
drop view v1,v2;
drop table t1,t2;

Changed in maria:
status: New → In Progress
importance: Undecided → High
Changed in maria:
status: In Progress → Invalid
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers