Wrong result (extra row) with semijoin=ON, materialization=OFF, optimizer_prune_level=0

Bug #901399 reported by Elena Stepanova
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
High
Sergey Petrunia

Bug Description

The following query

SELECT *
FROM t1 AS alias1, t1 AS alias2
WHERE ( alias1.c, alias2.c )
  IN (
      SELECT alias3.a, alias3.a
        FROM t2 AS alias3, t2 alias4
        WHERE alias3.b = alias4.b
     );

produces an extra row on test data if it's run with materialization=OFF and otherwise default optimizer_switch values, while it returns the correct result set with materialization=ON (or with semijoin=OFF, or with loosescan=OFF).

EXPLAIN with materialization=OFF (wrong result):

1 PRIMARY alias2 ALL NULL NULL NULL NULL 13 100.00
1 PRIMARY alias3 ALL a NULL NULL NULL 3 66.67 Using where; LooseScan
1 PRIMARY alias4 ALL NULL NULL NULL NULL 3 100.00 Using where; FirstMatch(alias3)
1 PRIMARY alias1 ALL NULL NULL NULL NULL 13 100.00 Using where; Using join buffer (flat, BNL join)

select `test`.`alias1`.`c` AS `c`,`test`.`alias2`.`c` AS `c` from `test`.`t1` `alias1` semi join (`test`.`t2` `alias3` join `test`.`t2` `alias4`) join `test`.`t1` `alias2` where ((`test`.`alias3`.`a` = `test`.`alias2`.`c`) and (`test`.`alias1`.`c` = `test`.`alias2`.`c`) and (`test`.`alias4`.`b` = `test`.`alias3`.`b`))

EXPLAIN with materialization=ON (correct result):

1 PRIMARY alias1 ALL NULL NULL NULL NULL 13 100.00
1 PRIMARY alias2 ALL NULL NULL NULL NULL 13 100.00 Using where; Using join buffer (flat, BNL join)
1 PRIMARY <subquery2> eq_ref distinct_key distinct_key 10 func,func 1 100.00
2 MATERIALIZED alias3 ALL a NULL NULL NULL 3 100.00
2 MATERIALIZED alias4 ALL NULL NULL NULL NULL 3 100.00 Using where

select `test`.`alias1`.`c` AS `c`,`test`.`alias2`.`c` AS `c` from `test`.`t1` `alias1` semi join (`test`.`t2` `alias3` join `test`.`t2` `alias4`) join `test`.`t1` `alias2` where ((`test`.`alias2`.`c` = `test`.`alias1`.`c`) and (`test`.`alias4`.`b` = `test`.`alias3`.`b`))

Minimal optimizer_switch: materialization=off

Full optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=off,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

Reproducible with MyISAM and InnoDB, but not Aria.
Further reducing the test data eliminates the problem as the plan is not triggered anymore.

Test case:

SET optimizer_prune_level=0;
SET optimizer_switch = 'materialization=off';

CREATE TABLE t1 ( c INT ) ENGINE=MyISAM;
INSERT INTO t1 VALUES
  (0),(1),(2),(3),(4),(5),
  (6),(7),(8),(9),(10),(11),(12);
CREATE TABLE t2 ( a INT, b INT, KEY(a)) ENGINE=MyISAM;
INSERT INTO t2 VALUES (3,20),(2,21),(3,22);

SELECT *
FROM t1 AS alias1, t1 AS alias2
WHERE ( alias1.c, alias2.c )
  IN (
      SELECT alias3.a, alias3.a
        FROM t2 AS alias3, t2 alias4
        WHERE alias3.b = alias4.b
     );

# End of test case

# Notes on result:

# The correct result is 2 rows
# c c
# 2 2
# 3 3

# The test produces 3 rows
# c c
# 2 2
# 3 3
# 3 3

bzr version-info
revision-id: <email address hidden>
date: 2011-12-06 13:42:18 -0800
build-date: 2011-12-08 00:34:55 +0300
revno: 3334
branch-nick: maria-5.3

Also reproducible on revno 3315

Elena Stepanova (elenst)
Changed in maria:
milestone: none → 5.3
assignee: nobody → Sergey Petrunia (sergefp)
Elena Stepanova (elenst)
description: updated
Revision history for this message
Sergey Petrunia (sergefp) wrote :

The EXPLAIN for wrong result mentions use of join buffer, but the bug can be observed without join buffer, too:
MariaDB [j2]> set join_cache_level=0;
MariaDB [j2]> SELECT * FROM t1 AS alias1, t1 AS alias2 WHERE ( alias1.c, alias2.c ) IN ( SELECT alias3.a, alias3.a FROM t2 AS alias3, t2 alias4 WHERE alias3.b = alias4.b );
+------+------+
| c | c |
+------+------+
| 2 | 2 |
| 3 | 3 |
| 3 | 3 |
+------+------+
3 rows in set (0.00 sec)

Elena Stepanova (elenst)
description: updated
Changed in maria:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Sergey Petrunia (sergefp) wrote :

> EXPLAIN with materialization=OFF (wrong result):
>
> 1 PRIMARY alias2 ALL NULL NULL NULL NULL 13 100.00
> 1 PRIMARY alias3 ALL a NULL NULL NULL 3 66.67 Using where; LooseScan
> 1 PRIMARY alias4 ALL NULL NULL NULL NULL 3 100.00 Using where; FirstMatch(alias3)
> 1 PRIMARY alias1 ALL NULL NULL NULL NULL 13 100.00 Using where; Using join buffer (flat, BNL join)

Here one can see the problem: table alias3 uses 'ALL' scan and LooseScan. This
is incorrect: LooseScan needs index-ordered read.

Changed in maria:
status: Confirmed → Fix Committed
Revision history for this message
Elena Stepanova (elenst) wrote :

Fix released with 5.3.3-rc.

Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.