key_infix_len can be overwritten causing some group by queries to return no rows

Bug #309547 reported by Eric Bergen
4
Affects Status Importance Assigned to Milestone
Drizzle
Fix Released
Medium
Joe Daly

Bug Description

Description:
select distinct foo from table where bar=N can return no rows found where select foo from
table where bar=N returns rows.

In opt_range.cc:get_best_group_min_max() the variable key_infix_len is passed by ref to
get_constant_key_infix(). It's initialized to 0 in get_constant_key_infix. When choosing
an index if two indexes are tested a, and b and a is the index that gets chosen
key_infix_len can be set to zero by the test for index b. This causes the resulting query
to return zero results.

How to repeat:
DROP TABLE IF EXISTS `distinct_fail`;
CREATE TABLE `distinct_fail` (
  `a` int NOT NULL,
  `b` int NOT NULL,
  `c` int NOT NULL,
  `d` int NOT NULL,
  KEY `foo` (`c`,`d`,`a`,`b`),
  KEY `bar` (`c`,`a`,`b`,`d`)
) ENGINE=InnoDB;

INSERT INTO `distinct_fail` VALUES
  (0,0,0,3),
  (0,0,0,4),
  (0,0,0,4),
  (0,0,0,5),
  (0,0,1,0),
  (0,0,1,0),
  (0,0,1,1),
  (0,0,1,1),
  (0,0,1,2),
  (0,0,1,3),
  (0,0,1,4),
  (0,0,1,5),
  (0,0,2,1),
  (0,0,2,2),
  (0,0,2,2),
  (0,0,2,3),
  (0,0,2,3),
  (0,0,2,4),
  (0,0,2,4),
  (0,0,3,0),
  (0,0,3,0),
  (0,0,3,1),
  (0,0,3,1),
  (0,0,3,1),
  (0,0,3,1),
  (0,0,3,3),
  (0,0,3,3),
  (0,0,3,3),
  (0,0,3,4),
  (0,0,3,4),
  (0,0,3,4),
  (0,0,3,4),
  (0,0,3,4),
  (0,0,3,5),
  (0,0,4,0),
  (0,0,4,0),
  (0,0,4,0),
  (0,0,4,1),
  (0,0,4,1),
  (0,0,4,3),
  (0,0,4,3),
  (0,0,4,3),
  (0,0,4,3),
  (0,0,4,3),
  (0,0,4,4),
  (0,0,4,4),
  (0,0,5,0),
  (0,0,5,0),
  (0,0,5,1),
  (0,0,5,3);

analyze table distinct_fail;
select 'This next query should return rows' as message;
select c from distinct_fail where d=4;

select 'If the bug exists this next query will be an empty set' as message;
select distinct c from distinct_fail where d=4;

select 'This is what the result should be' as message;
select distinct c from distinct_fail where d=4 order by a;

Suggested fix:
I have a hacky patch that saves the last known good key_infix_len and uses it if it's
available. I think the code should be refactored not to pass the key_infix_len by
reference. The patch is available here:

http://ebergen.net/patches/distinct_hacky_fix.patch

Revision history for this message
Padraig O'Sullivan (posulliv) wrote :

I was able to reproduce this behavior:

drizzle> create table dis (
    -> a int not null,
    -> b int not null,
    -> c int not null,
    -> d int not null,
    -> key foo (c, d, a, b),
    -> key bar (c, a, b, d)
    -> ) engine=InnoDB;
Query OK, 0 rows affected (0.01 sec)

drizzle> insert into dis values
    -> (0,0,0,3),
    -> (0,0,0,4),
    -> (0,0,0,4),
    -> (0,0,0,5),
    -> (0,0,1,0),
    -> (0,0,1,0),
    -> (0,0,1,1),
    -> (0,0,1,1),
    -> (0,0,1,2),
    -> (0,0,1,3),
    -> (0,0,1,4),
    -> (0,0,1,5),
    -> (0,0,2,1),
    -> (0,0,2,2),
    -> (0,0,2,2),
    -> (0,0,2,3),
    -> (0,0,2,3),
    -> (0,0,2,4),
    -> (0,0,2,4),
    -> (0,0,3,0),
    -> (0,0,3,0),
    -> (0,0,3,1),
    -> (0,0,3,1),
    -> (0,0,3,1),
    -> (0,0,3,1),
    -> (0,0,3,3),
    -> (0,0,3,3),
    -> (0,0,3,3),
    -> (0,0,3,4),
    -> (0,0,3,4),
    -> (0,0,3,4),
    -> (0,0,3,4),
    -> (0,0,3,4),
    -> (0,0,3,5),
    -> (0,0,4,0),
    -> (0,0,4,0),
    -> (0,0,4,0),
    -> (0,0,4,1),
    -> (0,0,4,1),
    -> (0,0,4,3),
    -> (0,0,4,3),
    -> (0,0,4,3),
    -> (0,0,4,3),
    -> (0,0,4,3),
    -> (0,0,4,4),
    -> (0,0,4,4),
    -> (0,0,5,0),
    -> (0,0,5,0),
    -> (0,0,5,1),
    -> (0,0,5,3);
Query OK, 50 rows affected (0.01 sec)
Records: 50 Duplicates: 0 Warnings: 0

drizzle> analyze table dis;
+----------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+----------+---------+----------+----------+
| test.dis | analyze | status | OK |
+----------+---------+----------+----------+
1 row in set (0.00 sec)

drizzle> select c from dis where d = 4;
+---+
| c |
+---+
| 0 |
| 0 |
| 1 |
| 2 |
| 2 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 4 |
| 4 |
+---+
12 rows in set (0.00 sec)

drizzle> select distinct c from dis where d = 4;
Empty set (0.00 sec)

drizzle> select distinct c from dis where d = 4 order by a;
+---+
| c |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
+---+
5 rows in set (0.00 sec)

drizzle>

Thanks, Padraig.

Changed in drizzle:
status: New → Confirmed
Changed in drizzle:
importance: Undecided → Medium
milestone: none → aloha
Revision history for this message
Joe Daly (skinny.moey) wrote :

Ill take a look at this

Changed in drizzle:
assignee: nobody → Joe Daly (skinny.moey)
Revision history for this message
Joe Daly (skinny.moey) wrote :

This looks like it was fixed in the mysql stream back in Feb 09 http://lists.mysql.org/commits/67887?f=plain Ill go ahead and port over the applicable changes to get_best_group_min_max() in opt_range.cc

Revision history for this message
Joe Daly (skinny.moey) wrote :

code in main trunk now

Changed in drizzle:
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.