Wrong result with semijoin and a specific query plan

Bug #833572 reported by Philip Stoev
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Incomplete
High
Sergey Petrunia

Bug Description

The following query:

SELECT alias1.col_varchar_key FROM t2 AS alias1 , t3 AS alias2 , t1 AS alias3 WHERE alias3.col_int_nokey IN ( SELECT col_int_key FROM v1 AS SQ1_alias1 WHERE SQ1_alias1.col_varchar_key < alias3.col_varchar_nokey OR SQ1_alias1.col_int_key < alias2.pk );

returns 1 extra row when using the following plan:

1 PRIMARY C ref col_int_key,col_varchar_key col_int_key 4 test.alias3.col_int_nokey 3
1 PRIMARY alias1 index NULL col_varchar_key 7 NULL 20 Using index; Using join buffer (flat, BNL join)
1 PRIMARY alias2 index PRIMARY PRIMARY 4 NULL 100 Using where; Using index; End temporary; Using join buffer (f
1 PRIMARY alias3 ALL NULL NULL NULL NULL 20 Start temporary

Usually, some other plan is chosen and there is no way to force this particular plan.

Dataset:

CREATE TABLE t1 ( pk int(11) NOT NULL AUTO_INCREMENT, col_int_nokey int(11) NOT NULL, col_int_key int(11) NOT NULL, col_varchar_key varchar(1) NOT NULL,col_varchar_nokey varchar(1) NOT NULL, PRIMARY KEY (pk), KEY col_int_key (col_int_key), KEY col_varchar_key (col_varchar_key,col_int_key)) ENGINE=MyISAM AUTO_INCREMENT=30 DEFAULT CHARSET=latin1;
CREATE OR REPLACE VIEW v1 AS SELECT * FROM t1;
INSERT INTO t1 VALUES (10,1,7,'v','v'),(11,7,0,'s','s'),(12,4,9,'l','l'),(13,7,3,'y','y'),(14,0,4,'c','c'),(15,2,2,'i','i'),(16,9,5,'h','h'),(17,4,3,'q','q'),(18,0,1,'a','a'),(19,9,3,'v','v'),(20,1,6,'u','u'),(21,3,7,'s','s'),(22,8,5,'y','y'),(23,8,1,'z','z'),(24,18,204,'h','h'),(25,84,224,'p','p'),(26,6,9,'e','e'),(27,3,5,'i','i'),(28,6,0,'y','y'),(29,6,3,'w','w');
CREATE TABLE t2 ( pk int(11) NOT NULL AUTO_INCREMENT, col_int_nokey int(11) NOT NULL, col_int_key int(11) NOT NULL, col_varchar_key varchar(1) NOT NULL,col_varchar_nokey varchar(1) NOT NULL, PRIMARY KEY (pk), KEY col_int_key (col_int_key), KEY col_varchar_key (col_varchar_key,col_int_key)) ENGINE=MyISAM AUTO_INCREMENT=21 DEFAULT CHARSET=latin1;
INSERT INTO t2 VALUES (1,4,0,'j','j'),(2,6,8,'v','v'),(3,3,1,'c','c'),(4,5,8,'m','m'),(5,3,9,'d','d'),(6,246,24,'d','d'),(7,2,6,'y','y'),(8,9,1,'t','t'),(9,3,6,'d','d'),(10,8,2,'s','s'),(11,1,4,'r','r'),(12,8,8,'m','m'),(13,8,4,'b','b'),(14,5,4,'x','x'),(15,7,7,'g','g'),(16,5,4,'p','p'),(17,1,1,'q','q'),(18,6,9,'w','w'),(19,2,4,'d','d'),(20,9,8,'e','e');
CREATE TABLE t3 ( pk int(11) NOT NULL AUTO_INCREMENT, col_int_nokey int(11) NOT NULL, col_int_key int(11) NOT NULL, col_varchar_key varchar(1) NOT NULL,col_varchar_nokey varchar(1) NOT NULL, PRIMARY KEY (pk), KEY col_int_key (col_int_key), KEY col_varchar_key (col_varchar_key,col_int_key)) ENGINE=MyISAM AUTO_INCREMENT=101 DEFAULT CHARSET=latin1;
INSERT INTO t3 VALUES (1,4,4,'b','b'),(2,5,8,'y','y'),(3,2,0,'p','p'),(4,5,0,'f','f'),(5,1,0,'p','p'),(6,1,7,'d','d'),(7,2,7,'f','f'),(8,2,5,'j','j'),(9,2,3,'e','e'),(10,197,188,'u','u'),(11,4,4,'v','v'),(12,5,9,'u','u'),(13,3,6,'i','i'),(14,1,1,'x','x'),(15,2,5,'l','l'),(16,7,6,'q','q'),(17,6,2,'n','n'),(18,1,4,'r','r'),(19,156,231,'c','c'),(20,8,4,'h','h'),(21,7,3,'k','k'),(22,2,3,'t','t'),(23,6,7,'t','t'),(24,2,6,'k','k'),(25,1,7,'g','g'),(26,0,9,'z','z'),(27,7,4,'n','n'),(28,5,4,'j','j'),(29,7,2,'l','l'),(30,2,1,'d','d'),(31,1,2,'t','t'),(32,80,194,'y','y'),(33,3,2,'i','i'),(34,8,3,'j','j'),(35,5,8,'r','r'),(36,0,4,'b','b'),(37,9,9,'o','o'),(38,9,4,'k','k'),(39,7,5,'a','a'),(40,0,5,'f','f'),(41,5,9,'t','t'),(42,6,3,'c','c'),(43,9,8,'c','c'),(44,3,0,'r','r'),(45,91,98,'k','k'),(46,6,3,'l','l'),(47,7,1,'o','o'),(48,3,0,'t','t'),(49,161,189,'v','v'),(50,7,8,'x','x'),(51,7,3,'j','j'),(52,6,3,'x','x'),(53,5,9,'k','k'),(54,8,6,'o','o'),(55,7,8,'z','z'),(56,2,3,'n','n'),(57,1,9,'c','c'),(58,3,5,'d','d'),(59,6,9,'s','s'),(60,6,2,'j','j'),(61,5,2,'w','w'),(62,0,5,'f','f'),(63,7,8,'p','p'),(64,7,6,'o','o'),(65,6,9,'f','f'),(66,0,0,'x','x'),(67,0,3,'q','q'),(68,8,6,'g','g'),(69,0,5,'x','x'),(70,4,8,'p','p'),(71,0,2,'q','q'),(72,213,120,'q','q'),(73,248,25,'v','v'),(74,1,1,'g','g'),(75,6,3,'l','l'),(76,6,1,'w','w'),(77,3,3,'h','h'),(78,140,153,'c','c'),(79,0,5,'o','o'),(80,7,9,'o','o'),(81,6,1,'v','v'),(82,6,8,'y','y'),(83,8,7,'d','d'),(84,5,6,'p','p'),(85,8,2,'z','z'),(86,7,4,'t','t'),(87,3,7,'b','b'),(88,7,3,'y','y'),(89,3,8,'k','k'),(90,8,4,'c','c'),(91,0,6,'z','z'),(92,1,1,'t','t'),(93,3,7,'o','o'),(94,6,1,'u','u'),(95,8,0,'t','t'),(96,1,2,'k','k'),(97,1,7,'u','u'),(98,9,2,'b','b'),(99,0,1,'m','m'),(100,6,5,'o','o');

bzr version-info:

revision-id: <email address hidden>
date: 2011-08-23 15:51:47 +0300
build-date: 2011-08-25 10:39:21 +0300
revno: 3166
branch-nick: maria-5.3

optimizer switch:

index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=off,derived_merge=off,derived_with_keys=off,firstmatch=off,loosescan=off,materialization=off,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on

Changed in maria:
milestone: none → 5.3
assignee: nobody → Sergey Petrunia (sergefp)
Changed in maria:
importance: Undecided → High
Revision history for this message
Sergey Petrunia (sergefp) wrote :

The posted query plan :

1 PRIMARY C ref col_int_key,col_varchar_key col_int_key 4 test.alias3.col_int_nokey 3
1 PRIMARY alias1 index NULL col_varchar_key 7 NULL 20 Using index; Using join buffer (flat, BNL join)
1 PRIMARY alias2 index PRIMARY PRIMARY 4 NULL 100 Using where; Using index; End temporary; Using join buffer (f
1 PRIMARY alias3 ALL NULL NULL NULL NULL 20 Start temporary

has a pecularity:

line#3: ... alias2 ... End temporary;
line#4: ... alias3 ... Start temporary

Did the query plan really have "End temporary" before the "Start temporary" ?

Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

I looked at the original log and yes, start temporary appears after end temporary. I am attaching the log. The issue described starts on line 317.

If it is not possible to reproduce this join order in the debugger, I am OK with deferring this bug until it happens again.

If it is incorrect for "End temporary" to come before "Start temporary", please add an assertion that detects this situation.

Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

The query for the plan above is:
SELECT alias1 .`col_varchar_key` FROM C AS alias1 , D AS alias2 , CC AS alias3 WHERE alias3 .`col_int_nokey` IN ( SELECT `col_int_key` FROM view_C AS SQ1_alias1 WHERE SQ1_alias1 .`col_varchar_key` < alias3 .`col_varchar_nokey` OR SQ1_alias1 .`col_int_key` < alias2 .`pk` ) ;

Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

Original dataset

Revision history for this message
Sergey Petrunia (sergefp) wrote :

I get 31640 records both with and without semi-join.
I couldn't get the optimizer to put table C on the first place in the join order (like it is shown in the posted EXPLAIN).

Revision history for this message
Sergey Petrunia (sergefp) wrote :

Due to above, changing status to Incomplete.

Changed in maria:
status: New → Incomplete
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.