Wrong result with in_to_exists=ON and NOT IN

Bug #869036 reported by Philip Stoev
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
Critical
Timour Katchaounov

Bug Description

This query:

SELECT *
FROM t1 , t2
WHERE ( t1.f2 , t2.f3 ) NOT IN (
 SELECT t3.f3 , t4.f3
 FROM t3 STRAIGHT_JOIN t4
 ON ( t4.f1 = t3.f1 )
);

returns 1 row when run with in_to_exists:

| g | c |

and 2 rows otherwise:

| NULL | c |
| g | c |

explain:

| 1 | PRIMARY | t2 | system | NULL | NULL | NULL | NULL | 1 | |
| 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 2 | Using where |
| 2 | DEPENDENT SUBQUERY | t3 | ALL | NULL | NULL | NULL | NULL | 17 | Using where |
| 2 | DEPENDENT SUBQUERY | t4 | ALL | NULL | NULL | NULL | NULL | 43 | Using where |

minimal optimizer switch: in_to_Exists=on,materialization=off
full switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=off,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=off,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

also affects mysql 5.5

test case:

CREATE TABLE t1 ( f2 varchar(1) ) ;
INSERT INTO t1 VALUES (NULL),('g');

CREATE TABLE t2 ( f3 varchar(1) ) ;
INSERT INTO t2 VALUES ('c');

CREATE TABLE t3 ( f1 int(11), f3 varchar(1) ) ;
INSERT INTO t3 VALUES (4,'v'),(7,'c'),(0,'x'),(7,'i'),(7,'e'),(1,'p'),(7,'s'),(1,'j'),(5,'z'),(2,'c'),(0,'a'),(1,'q'),(8,'y'),(1,'r'),(9,'v'),(1,NULL),(5,'r');

DROP TABLE IF EXISTS t4;
CREATE TABLE t4 ( f1 int(11), f3 varchar(1) ) ;
INSERT INTO t4 VALUES (1,'x'),(4,'l'),(1,'i'),(0,'i'),(0,'e'),(9,'h'),(5,'f'),(9,NULL),(0,'p'),(2,'n'),(5,'x'),(1,'d'),(4,'t'),(5,'u'),(5,'p'),(1,'o'),(7,'v'),(8,'x'),(2,'b'),(7,'m'),(4,'v'),(0,'j'),(8,NULL),(5,NULL),(8,'h'),(2,'k'),(9,'k'),(7,NULL),(5,'n'),(7,'e'),(0,'s'),(4,'w'),(1,'z'),(0,'b'),(5,'i'),(1,'t'),(2,'g'),(1,'q'),(7,'l'),(1,'n'),(9,'z'),(9,'n'),(8,'r');

SET SESSION optimizer_switch='in_to_exists=on,materialization=off';

SELECT *
FROM t1 , t2
WHERE ( t1.f2 , t2.f3 ) NOT IN (
 SELECT t3.f3 , t4.f3
 FROM t3 STRAIGHT_JOIN t4
 ON ( t4.f1 = t3.f1 )
);

bzr version-info:

revision-id: <email address hidden>
date: 2011-10-06 01:21:15 +0400
build-date: 2011-10-06 13:02:50 +0300
revno: 3213
branch-nick: maria-5.3

Related branches

Changed in maria:
milestone: none → 5.3
assignee: nobody → Timour Katchaounov (timour)
Changed in maria:
status: New → In Progress
importance: Undecided → Critical
Revision history for this message
Timour Katchaounov (timour) wrote :

The problem can be demonstrate with the following very simple test case:

create table outer_sq (f1 char(1), f2 char(1));
insert into outer_sq values (NULL, 'c'), ('g', 'c');

create table inner_sq (f3 char(1), f4 char(1));
insert into inner_sq values(null, 'i'), ('v', null);

set @@optimizer_switch='in_to_exists=off,materialization=on,partial_match_rowid_merge=on,partial_match_table_scan=off';
SELECT outer_sq.*, (f1, f2) IN (select * from inner_sq) from outer_sq;
SELECT * from outer_sq where (f1, f2) NOT IN (select * from inner_sq);

set @@optimizer_switch='in_to_exists=off,materialization=on,partial_match_rowid_merge=off,partial_match_table_scan=on';
SELECT outer_sq.*, (f1, f2) IN (select * from inner_sq) from outer_sq;
SELECT * from outer_sq where (f1, f2) NOT IN (select * from inner_sq);

set @@optimizer_switch='in_to_exists=on,materialization=off';
SELECT outer_sq.*, (f1, f2) IN (select * from inner_sq) from outer_sq;
SELECT * from outer_sq where (f1, f2) NOT IN (select * from inner_sq);

Revision history for this message
Timour Katchaounov (timour) wrote :

There is a partial match between the outer row (NULL, 'c') and the inner row ('v', null).
Therefore the correct result of the IN predicate is NULL, and the strategy that produces
an incorrect result is partial_match_rowid_merge.

Changed in maria:
status: In Progress → Fix Committed
Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.