Wrong result (extra rows) with derived_with_keys+loosescan+semijoin=ON, materialization=OFF, IN subquery from a temptable view

Reported by Elena Stepanova on 2012-04-11
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Maria
Fix Released
High
Sergey Petrunia

Bug Description

The following query

SELECT * FROM t1 AS t1_1, t1 AS t1_2
WHERE (t1_1.a, t1_2.a) IN ( SELECT a, b FROM v );

on the test data produces 13 rows with semijoin+loosescan+derived_with_keys and v being a temptable view, and 10 rows when not all of these conditions are met. 10 rows is correct.

 bzr version-info
revision-id: <email address hidden>
date: 2012-04-07 17:27:00 -0700
build-date: 2012-04-11 04:54:24 +0400
revno: 3500

Notes:
Reproducible on MariaDB 5.5 revno 3364.
Could not reproduce on MySQL trunk, although maybe I was trying wrong optimizer switches.
Could not reproduce by replacing the view with a table.

Minimal optimizer_switch: derived_with_keys=on,loosescan=on,semijoin=on,materialization=off
Full optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=off,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

EXPLAIN with the minimal optimizer switch (semijoin+loosescan+derived_with_keys=ON, everything else OFF):

id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY t1_1 ALL NULL NULL NULL NULL 11 100.00 Using where
1 PRIMARY <derived3> ref key0 key0 5 test.t1_1.a 2 100.00 Start temporary
1 PRIMARY t1_2 ALL NULL NULL NULL NULL 11 100.00 Using where; End temporary; Using join buffer (flat, BNL join)
3 DERIVED t1 ALL NULL NULL NULL NULL 11 100.00
Warnings:
Note 1003 select count(0) AS `COUNT(*)` from `test`.`t1` `t1_1` semi join (`test`.`v`) join `test`.`t1` `t1_2` where ((`test`.`t1_2`.`a` = `v`.`b`) and (`v`.`a` = `test`.`t1_1`.`a`))

Test case:

SET optimizer_switch = 'derived_with_keys=on,loosescan=on,semijoin=on,materialization=off';

CREATE TABLE t1 ( a INT, b INT );
INSERT INTO t1 VALUES
  (4,0),(6,8),(3,1),(5,8),(3,9),(2,4),
  (2,6),(9,1),(5,4),(7,7),(5,4);

CREATE ALGORITHM=TEMPTABLE
  VIEW v AS SELECT * FROM t1;

SELECT * FROM t1 AS t1_1, t1 AS t1_2
  WHERE (t1_1.a, t1_2.a) IN ( SELECT a, b FROM v );

# End of test case

# Expected result:
# a b a b
# -------------------
# 5 8 4 0
# 2 4 4 0
# 2 6 4 0
# 5 4 4 0
# 5 4 4 0
# 2 4 6 8
# 2 6 6 8
# 3 1 9 1
# 3 9 9 1
# 7 7 7 7

# Actual result:
# a b a b
# -------------------
# 5 8 4 0
# 5 8 4 0
# 2 4 4 0
# 2 6 4 0
# 5 4 4 0
# 5 4 4 0
# 5 4 4 0
# 5 4 4 0
# 2 4 6 8
# 2 6 6 8
# 3 1 9 1
# 3 9 9 1
# 7 7 7 7

Elena Stepanova (elenst) on 2012-04-11
Changed in maria:
importance: Undecided → High
assignee: nobody → Sergey Petrunia (sergefp)
milestone: none → 5.3
Elena Stepanova (elenst) on 2012-04-11
tags: added: optimizer wrong-result
Sergey Petrunia (sergefp) wrote :

* The problem can be repeated with set @@join_cache_level=0;
* There seems to be a difference between EXPLAIN and actual execution: EXPLAIN statement will pick Duplicate-Weedout strategy, while the SELECT itself will use LooseScan.
* If it's using Loose Scan, how is it using it for derived table? (which is expected to be of type=heap and have a HASH index? I didn't check this, I'll need to do)
* The produced dataset, 13 rows - is a join (i.e. duplicates are not removed).

* different optimization of EXPLAIN SELECT and SELECT is also a bug, I'll need to investigate.

Changed in maria:
status: New → In Progress
Changed in maria:
status: In Progress → Fix Committed
Elena Stepanova (elenst) wrote :

Fix released in 5.3.7 and 5.5.24

Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers