Wrong result (missing rows) with simple join and WHERE with = and ANDs, no subqueries, MyISAM or Aria

Bug #954900 reported by Peter (Stig) Edwards
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
Critical
Igor Babaev

Bug Description

Thank you for MariaDB 5.3.5-ga

I will try and attach the reproducer, it is 34MB.

MariaDB 5.3.5-ga from mariadb-5.3.5-ga-Linux-x86_64.tar.gz on Linux 2.6.32-220.el6.x86_64 x86_64 x86_64 x86_64 GNU/Linux

There are 5 tables being joined in the reproducer, MariaDB 5.3.5-ga has an EXPLAIN plan that differs from MySQL 5.1.48 and 5.5.17, and Percona Server 5.5.16-rel22.0. MariabDB 5.3.5-ga returns 0 rows, 26 are expected. In particular MariaDB 5.3.5-ga seems to be referencing different tables in the last 2 joins.

Cheers

Revision history for this message
Peter (Stig) Edwards (thatsafunnyname) wrote :
Revision history for this message
Elena Stepanova (elenst) wrote :
Download full text (8.2 KiB)

Reduced test case (from the original SQL) below.

bzr version-info
revision-id: <email address hidden>
date: 2012-03-13 13:49:18 -0700
build-date: 2012-03-15 03:12:31 +0400
revno: 3459

Also reproducible on MariaDB 5.5 (revno 3319).
Not reproducible on MariaDB 5.2, MySQL 5.1.61, 5.5.21, trunk.

Reproducible with the default optimizer_switch, as well as with all OFF values, MyISAM or Aria tables.

EXPLAIN with all OFFs (wrong result):

id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE DU system dog_id NULL NULL NULL 1 100.00
1 SIMPLE D system PRIMARY NULL NULL NULL 1 100.00
1 SIMPLE DSAR system NULL NULL NULL NULL 1 100.00
1 SIMPLE DT range t_id t_id 2 NULL 2 100.00 Using where
1 SIMPLE DSA ref PRIMARY PRIMARY 8 const,db4.DT.t_id,db4.D.birthday 1 100.00Using index
Warnings:
Note 1003 select 5918 AS `dog_id`,5918 AS `dog_id`,'2004-07-22' AS `birthday`,`db4`.`DT`.`dog_id` AS `dog_id`,`db4`.`DT`.`t_id` AS `t_id`,`db4`.`DT`.`birthday` AS `birthday`,`db4`.`DSA`.`dog_id` AS `dog_id`,`db4`.`DSA`.`t_id` AS `t_id`,`db4`.`DSA`.`birthday` AS `birthday`,`db4`.`DSA`.`a_id` AS `a_id`,5918 AS `dog_id` from `db4`.`t5` `DU` join `db4`.`t1` `D` join `db4`.`t4` `DT` join `db4`.`t2` `DSA` join `db4`.`t3` `DSAR` where ((`db4`.`DSA`.`dog_id` = 5918) and (`db4`.`DSA`.`t_id` = `db4`.`DT`.`t_id`) and (`db4`.`DT`.`birthday` = '2004-07-22') and (`db4`.`DSA`.`birthday` = '2004-07-22') and (`db4`.`DT`.`dog_id` = 5918) and (5918 = 5918))
SELECT * FROM t5 DU, t1 D, t4 DT, t2 DSA, t3 DSAR

EXPLAIN with the default optimizer_switch (wrong result, too):

id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE DU system dog_id NULL NULL NULL 1 100.00
1 SIMPLE D system PRIMARY NULL NULL NULL 1 100.00
1 SIMPLE DSAR system NULL NULL NULL NULL 1 100.00
1 SIMPLE DT range t_id t_id 2 NULL 2 100.00 Using index condition; Using where
1 SIMPLE DSA ref PRIMARY PRIMARY 8 const,db4.DT.t_id,db4.D.birthday 1 100.00Using index
Warnings:
Note 1003 select 5918 AS `dog_id`,5918 AS `dog_id`,'2004-07-22' AS `birthday`,`db4`.`DT`.`dog_id` AS `dog_id`,`db4`.`DT`.`t_id` AS `t_id`,`db4`.`DT`.`birthday` AS `birthday`,`db4`.`DSA`.`dog_id` AS `dog_id`,`db4`.`DSA`.`t_id` AS `t_id`,`db4`.`DSA`.`birthday` AS `birthday`,`db4`.`DSA`.`a_id` AS `a_id`,5918 AS `dog_id` from `db4`.`t5` `DU` join `db4`.`t1` `D` join `db4`.`t4` `DT` join `db4`.`t2` `DSA` join `db4`.`t3` `DSAR` where ((`db4`.`DSA`.`dog_id` = 5918) and (`db4`.`DSA`.`t_id` = `db4`.`DT`.`t_id`) and (`db4`.`DT`.`birthday` = '2004-07-22') and (`db4`.`DSA`.`birthday` = '2004-07-22') and (`db4`.`DT`.`dog_id` = 5918) and (5918 = 5918))
SELECT * FROM t5 DU, t1 D, t4 DT, t2 DSA, t3 DSAR

# Test case:

DROP DATABASE IF EXISTS db4;
CREATE DATABASE db4;
USE db4;
CREATE TABLE t1 (
  dog_...

Read more...

Changed in maria:
milestone: none → 5.3
assignee: nobody → Timour Katchaounov (timour)
Elena Stepanova (elenst)
summary: - Wrong result (34MB reproducer)
+ Wrong result (missing rows) with simple join and WHERE with = and ANDs,
+ no subqueries, MyISAM or Aria
Changed in maria:
status: New → In Progress
Changed in maria:
importance: Undecided → High
Michael Widenius (monty)
Changed in maria:
importance: High → Critical
Revision history for this message
Timour Katchaounov (timour) wrote :
Download full text (3.9 KiB)

The following query allows to show the problem in a more stable way across different server versions:

set @@optimizer_switch='index_condition_pushdown=off';

explain extended
SELECT count(*) FROM t5 straight_join t1 straight_join t3 straight_join t4 straight_join t2
WHERE t5.dog_id = t1.dog_id
  AND t1.dog_id = t4.dog_id
  AND t1.birthday = t4.birthday
  AND t4.t_id = t2.t_id
  AND t4.birthday = t2.birthday
  AND t2.dog_id = t3.dog_id;

ICP is switched off to make comparison easier with 5.2, as well as to rule it out as a possible
cause and simplify the query plan. I will use the above query in the rest of the analysis.

MySQL 5.6.4 shows the closest query plan that works correctly compared to 5.3.

In 5.3 the plan is:

+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+----------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+----------+------------------------------------+
| 1 | SIMPLE | t5 | system | dog_id | NULL | NULL | NULL | 1 | 100.00 | |
| 1 | SIMPLE | t1 | system | PRIMARY | NULL | NULL | NULL | 1 | 100.00 | |
| 1 | SIMPLE | t3 | system | NULL | NULL | NULL | NULL | 1 | 100.00 | |
| 1 | SIMPLE | t4 | range | t_id | t_id | 2 | NULL | 2 | 100.00 | Using index condition; Using where |
| 1 | SIMPLE | t2 | ref | PRIMARY | PRIMARY | 8 | const,db4.t4.t_id,db4.t1.birthday | 1 | 100.00 | Using index |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+----------+------------------------------------+

While in MySQL 5.6.4 the query plan is:

+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+----------+--------------------------+
| 1 | SIMPLE | t5 | system | dog_id | NULL | NULL | NULL | 1 | 100.00 | |
| 1 | SIMPLE | t1 | system | PRIMARY | NULL | NULL | NULL | 1 | 100.00 | |
| 1 | SIMPLE | t3 | system | NULL | NULL | NULL | NULL | 1 | 100.00 | |
| 1 | SIMP...

Read more...

Revision history for this message
Timour Katchaounov (timour) wrote :

Parallel debugging that compares execution of MariaDB 5.3 and MySQL 5.6.4
shows the following:

= The second call to sub_select() accesses table 't2'.

= The call:
  error= (*join_tab->read_first_record)(join_tab);
returns 0 in MySQL (key found), but 120 in MariaDB (key not found)

= The cause is result of incorrect contents of the key buffer. I think
  so because if we manually replace ref.key_buff[4] with the contents
  it has in MySQL, then the key is found correctly.

= The key buffer of the ref access method is filled by the following
  call chain:

#0 store_key_field::copy_inner
#1 store_key::copy
#2 cp_buffer_from_ref
#3 join_read_always_key
#4 sub_select
...

= Investigation of store_key_field::copy_inner shows that there is a
  difference in the second field of the key.
- MySQL copies t4.t_id into t2.t_id.
- MariaDB copies t1.birthday into t2.t_id.

This is clearly wrong. The rest of the analysis will figure out why
store_key_field.copy_field->from_field is incorrect (and possibly the
rest of the store_key objects contained in the join_tab->ref->key_copy
array.

Changed in maria:
assignee: Timour Katchaounov (timour) → Igor Babaev (igorb-seattle)
Changed in maria:
status: In Progress → Fix Committed
Revision history for this message
Peter (Stig) Edwards (thatsafunnyname) wrote :

Thank you very much. I can confirm the fix stops the problem using the other (large) reproducers I have when building revision 3467.

Myself and others shall continue testing MariaDB 5.3 using:

http://terrier.askmonty.org/archive/pack/5.3/build-1924/kvm-bintar-hardy-amd64/mariadb-5.3.5-ga-Linux-x86_64.tar.gz

from:

http://buildbot.askmonty.org/buildbot/builders/kvm-tarbake-jaunty-x86/builds/1924

Revision history for this message
Elena Stepanova (elenst) wrote :

Fix released in 5.3.6

Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.