Wrong result (extra rows) with derived_with_keys+loosescan+semijoin=ON, materialization=OFF, IN subquery from a temptable view

Bug #978479 reported by Elena Stepanova
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
High
Sergey Petrunia

Bug Description

The following query

SELECT * FROM t1 AS t1_1, t1 AS t1_2
WHERE (t1_1.a, t1_2.a) IN ( SELECT a, b FROM v );

on the test data produces 13 rows with semijoin+loosescan+derived_with_keys and v being a temptable view, and 10 rows when not all of these conditions are met. 10 rows is correct.

 bzr version-info
revision-id: <email address hidden>
date: 2012-04-07 17:27:00 -0700
build-date: 2012-04-11 04:54:24 +0400
revno: 3500

Notes:
Reproducible on MariaDB 5.5 revno 3364.
Could not reproduce on MySQL trunk, although maybe I was trying wrong optimizer switches.
Could not reproduce by replacing the view with a table.

Minimal optimizer_switch: derived_with_keys=on,loosescan=on,semijoin=on,materialization=off
Full optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=off,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

EXPLAIN with the minimal optimizer switch (semijoin+loosescan+derived_with_keys=ON, everything else OFF):

id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY t1_1 ALL NULL NULL NULL NULL 11 100.00 Using where
1 PRIMARY <derived3> ref key0 key0 5 test.t1_1.a 2 100.00 Start temporary
1 PRIMARY t1_2 ALL NULL NULL NULL NULL 11 100.00 Using where; End temporary; Using join buffer (flat, BNL join)
3 DERIVED t1 ALL NULL NULL NULL NULL 11 100.00
Warnings:
Note 1003 select count(0) AS `COUNT(*)` from `test`.`t1` `t1_1` semi join (`test`.`v`) join `test`.`t1` `t1_2` where ((`test`.`t1_2`.`a` = `v`.`b`) and (`v`.`a` = `test`.`t1_1`.`a`))

Test case:

SET optimizer_switch = 'derived_with_keys=on,loosescan=on,semijoin=on,materialization=off';

CREATE TABLE t1 ( a INT, b INT );
INSERT INTO t1 VALUES
  (4,0),(6,8),(3,1),(5,8),(3,9),(2,4),
  (2,6),(9,1),(5,4),(7,7),(5,4);

CREATE ALGORITHM=TEMPTABLE
  VIEW v AS SELECT * FROM t1;

SELECT * FROM t1 AS t1_1, t1 AS t1_2
  WHERE (t1_1.a, t1_2.a) IN ( SELECT a, b FROM v );

# End of test case

# Expected result:
# a b a b
# -------------------
# 5 8 4 0
# 2 4 4 0
# 2 6 4 0
# 5 4 4 0
# 5 4 4 0
# 2 4 6 8
# 2 6 6 8
# 3 1 9 1
# 3 9 9 1
# 7 7 7 7

# Actual result:
# a b a b
# -------------------
# 5 8 4 0
# 5 8 4 0
# 2 4 4 0
# 2 6 4 0
# 5 4 4 0
# 5 4 4 0
# 5 4 4 0
# 5 4 4 0
# 2 4 6 8
# 2 6 6 8
# 3 1 9 1
# 3 9 9 1
# 7 7 7 7

Elena Stepanova (elenst)
Changed in maria:
importance: Undecided → High
assignee: nobody → Sergey Petrunia (sergefp)
milestone: none → 5.3
Elena Stepanova (elenst)
tags: added: optimizer wrong-result
Revision history for this message
Sergey Petrunia (sergefp) wrote :

* The problem can be repeated with set @@join_cache_level=0;
* There seems to be a difference between EXPLAIN and actual execution: EXPLAIN statement will pick Duplicate-Weedout strategy, while the SELECT itself will use LooseScan.
* If it's using Loose Scan, how is it using it for derived table? (which is expected to be of type=heap and have a HASH index? I didn't check this, I'll need to do)
* The produced dataset, 13 rows - is a join (i.e. duplicates are not removed).

* different optimization of EXPLAIN SELECT and SELECT is also a bug, I'll need to investigate.

Changed in maria:
status: New → In Progress
Changed in maria:
status: In Progress → Fix Committed
Revision history for this message
Elena Stepanova (elenst) wrote :

Fix released in 5.3.7 and 5.5.24

Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.