Wrong result with materialization and semijoin, and valgrind warnings in Protocol::net_store_data with materialization for implicit grouping

Bug #613029 reported by Philip Stoev
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
High
Sergey Petrunia

Bug Description

Queries such as

SELECT table1 .`col_varchar_nokey` , MAX( table1 .`col_int_key` )
FROM C table1 JOIN (
SELECT *
FROM B ) table2 ON table2 .`pk` = table1 .`pk`
WHERE ( table1 .`col_int_nokey` , table2 .`pk` ) IN (
SELECT `col_int_nokey` , `col_int_key`
FROM C ) ;

SELECT *
FROM (
SELECT table1 .`col_varchar_nokey` , MAX( table1 .`col_int_key` )
FROM C table1 JOIN (
SELECT *
FROM B ) table2 ON table2 .`pk` = table1 .`pk`
WHERE ( table1 .`col_int_nokey` , table2 .`pk` ) IN (
SELECT `col_int_nokey` , `col_int_key`
FROM C ) ) AS
FROM_SUBQUERY /* TRANSFORM_OUTCOME_UNORDERED_MATCH */;

cause valgrind warnings when run with semijoin=off,partial_match_rowid_merge=off. With a suitable client, one can also observe garbage data being sent to the client.

Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :
Download full text (6.3 KiB)

Warnings:

==13550== Conditional jump or move depends on uninitialised value(s)
==13550== at 0x40074E7: memcpy (mc_replace_strmem.c:77)
==13550== by 0x823F3F6: String::append(char const*, unsigned int, unsigned int) (sql_string.h:370)
==13550== by 0x827440B: Protocol_text::store_null() (protocol.cc:849)
==13550== by 0x8274CE1: Protocol_text::store(Field*) (protocol.cc:1010)
==13550== by 0x81C619C: Item_field::send(Protocol*, String*) (item.cc:5807)
==13550== by 0x8267CF5: select_send::send_data(List<Item>&) (sql_class.cc:1871)
==13550== by 0x831F394: end_send(JOIN*, st_join_table*, bool) (sql_select.cc:14063)
==13550== by 0x831C4DC: do_select(JOIN*, List<Item>*, st_table*, Procedure*) (sql_select.cc:12610)
==13550== by 0x83029F2: JOIN::exec() (sql_select.cc:2355)
==13550== by 0x8303128: mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:2556)
==13550== by 0x82FB71A: handle_select(THD*, st_lex*, select_result*, unsigned long) (sql_select.cc:276)
==13550== by 0x829A163: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:5081)
==13550== by 0x8290B3B: mysql_execute_command(THD*) (sql_parse.cc:2265)
==13550== by 0x829C324: mysql_parse(THD*, char const*, unsigned int, char const**) (sql_parse.cc:6027)
==13550== by 0x828E5F1: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1184)
==13550== by 0x828DADF: do_command(THD*) (sql_parse.cc:890)

==13550== Conditional jump or move depends on uninitialised value(s)
==13550== at 0x827102B: my_net_write (net_serv.cc:386)
==13550== by 0x8274227: Protocol::write() (protocol.cc:772)
==13550== by 0x8267E1C: select_send::send_data(List<Item>&) (sql_class.cc:1890)
==13550== by 0x831F394: end_send(JOIN*, st_join_table*, bool) (sql_select.cc:14063)
==13550== by 0x831C4DC: do_select(JOIN*, List<Item>*, st_table*, Procedure*) (sql_select.cc:12610)
==13550== by 0x83029F2: JOIN::exec() (sql_select.cc:2355)
==13550== by 0x8303128: mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:2556)
==13550== by 0x82FB71A: handle_select(THD*, st_lex*, select_result*, unsigned long) (sql_select.cc:276)
==13550== by 0x829A163: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:5081)
==13550== by 0x8290B3B: mysql_execute_command(THD*) (sql_parse.cc:2265)
==13550== by 0x829C324: mysql_parse(THD*, char const*, unsigned int, char const**) (sql_parse.cc:6027)
==13550== by 0x828E5F1: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1184)
==13550== by 0x828DADF: do_command(THD*) (sql_parse.cc:890)
==13550== by 0x828AC77: handle_one_connection (sql_connect.cc:1153)
==13550== by 0xA08918: start_thread (in /lib/libpthread-2.12.so)
==13550== by 0x951E5D: clone (in /lib/libc-2.12.so)

==13550== Syscall param write(buf) points to uninitialised byte(s)
==13550== ...

Read more...

Changed in maria:
assignee: nobody → Timour Katchaounov (timour)
milestone: none → 5.3
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :
Download full text (8.6 KiB)

Test case

--disable_abort_on_error
SET SESSION optimizer_switch = 'semijoin=off,partial_match_rowid_merge=off';
--enable_abort_on_error

--disable_warnings
DROP TABLE /*! IF EXISTS */ C;
DROP TABLE /*! IF EXISTS */ B;
--enable_warnings

CREATE TABLE `C` (
  `pk` int(11) NOT NULL AUTO_INCREMENT,
  `col_int_nokey` int(11) NOT NULL,
  `col_int_key` int(11) NOT NULL,
  `col_varchar_key` varchar(1) NOT NULL,
  `col_varchar_nokey` varchar(1) NOT NULL,
  PRIMARY KEY (`pk`),
  KEY `col_int_key` (`col_int_key`),
  KEY `col_varchar_key` (`col_varchar_key`,`col_int_key`)
) ENGINE=MARIA AUTO_INCREMENT=21 DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1;
INSERT INTO `C` VALUES (1,2,9,'x','x');
INSERT INTO `C` VALUES (2,9,5,'g','g');
INSERT INTO `C` VALUES (3,6,1,'o','o');
INSERT INTO `C` VALUES (4,7,0,'g','g');
INSERT INTO `C` VALUES (5,0,1,'v','v');
INSERT INTO `C` VALUES (6,97,190,'m','m');
INSERT INTO `C` VALUES (7,3,6,'x','x');
INSERT INTO `C` VALUES (8,3,3,'c','c');
INSERT INTO `C` VALUES (9,4,4,'z','z');
INSERT INTO `C` VALUES (10,9,3,'i','i');
INSERT INTO `C` VALUES (11,101,186,'x','x');
INSERT INTO `C` VALUES (12,0,1,'g','g');
INSERT INTO `C` VALUES (13,8,8,'q','q');
INSERT INTO `C` VALUES (14,194,226,'m','m');
INSERT INTO `C` VALUES (15,148,133,'p','p');
INSERT INTO `C` VALUES (16,9,6,'e','e');
INSERT INTO `C` VALUES (17,9,3,'t','t');
INSERT INTO `C` VALUES (18,1,8,'j','j');
INSERT INTO `C` VALUES (19,1,5,'h','h');
INSERT INTO `C` VALUES (20,5,7,'w','w');
CREATE TABLE `B` (
  `pk` int(11) NOT NULL AUTO_INCREMENT,
  `col_int_nokey` int(11) NOT NULL,
  `col_int_key` int(11) NOT NULL,
  `col_varchar_key` varchar(1) NOT NULL,
  `col_varchar_nokey` varchar(1) NOT NULL,
  PRIMARY KEY (`pk`),
  KEY `col_int_key` (`col_int_key`),
  KEY `col_varchar_key` (`col_varchar_key`,`col_int_key`)
) ENGINE=MARIA AUTO_INCREMENT=2 DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1;
INSERT INTO `B` VALUES (1,1,7,'p','p');

SELECT table1 .`col_varchar_nokey` , MAX( table1 .`col_int_key` )
FROM C table1 JOIN (
SELECT *
FROM B ) table2 ON table2 .`pk` = table1 .`pk`
WHERE ( table1 .`col_int_nokey` , table2 .`pk` ) IN (
SELECT `col_int_nokey` , `col_int_key`
FROM C ) ;

DROP TABLE C;
DROP TABLE B;

--disable_warnings
DROP TABLE /*! IF EXISTS */ C;
DROP TABLE /*! IF EXISTS */ B;
--enable_warnings

CREATE TABLE `C` (
  `pk` int(11) NOT NULL AUTO_INCREMENT,
  `col_int_nokey` int(11) NOT NULL,
  `col_int_key` int(11) NOT NULL,
  `col_varchar_key` varchar(1) NOT NULL,
  `col_varchar_nokey` varchar(1) NOT NULL,
  PRIMARY KEY (`pk`),
  KEY `col_int_key` (`col_int_key`),
  KEY `col_varchar_key` (`col_varchar_key`,`col_int_key`)
) ENGINE=MARIA AUTO_INCREMENT=21 DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1;
INSERT INTO `C` VALUES (1,2,9,'x','x');
INSERT INTO `C` VALUES (2,9,5,'g','g');
INSERT INTO `C` VALUES (3,6,1,'o','o');
INSERT INTO `C` VALUES (4,7,0,'g','g');
INSERT INTO `C` VALUES (5,0,1,'v','v');
INSERT INTO `C` VALUES (6,97,190,'m','m');
INSERT INTO `C` VALUES (7,3,6,'x','x');
INSERT INTO `C` VALUES (8,3,3,'c','c');
INSERT INTO `C` VALUES (9,4,4,'z','z');
INSERT INTO `C` VALUES (10,9,3,'i','i');
INSERT INTO `C` VALUES (11,101,186,'x','x');
INSERT INTO `C` VALUES (12,0,1,'g','g');
INS...

Read more...

summary: - Valgrind warnings with semijoin=off,partial_match_rowid_merge=off
+ Valgrind warnings with semijoin=off
Changed in maria:
assignee: Timour Katchaounov (timour) → nobody
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote : Re: Valgrind warnings with semijoin=off

Bug is not observed with materialization=off, so I am assigning it to Timour.

Changed in maria:
assignee: nobody → Timour Katchaounov (timour)
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

Reproducible in maria-5.3-mwl90 as well.

summary: - Valgrind warnings with semijoin=off
+ Valgrind warnings in String::append with materialization
Changed in maria:
status: New → Confirmed
Revision history for this message
Timour Katchaounov (timour) wrote : Re: Valgrind warnings in String::append with materialization

These are the steps how I simplified the last test case:

DROP TABLE C;
DROP TABLE B;

CREATE TABLE C (
  pk int(11) NOT NULL AUTO_INCREMENT,
  col_int_nokey int(11) NOT NULL,
  col_int_key int(11) NOT NULL,
  col_varchar_key varchar(1) NOT NULL,
  col_varchar_nokey varchar(1) NOT NULL,
  PRIMARY KEY (pk),
  KEY col_int_key (col_int_key),
  KEY col_varchar_key (col_varchar_key,col_int_key)
) ENGINE=MARIA AUTO_INCREMENT=21 DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1;

INSERT INTO C VALUES (1,2,9,'x','x');
INSERT INTO C VALUES (2,9,5,'g','g');

CREATE TABLE B (
  pk int(11) NOT NULL AUTO_INCREMENT,
  col_int_nokey int(11) NOT NULL,
  col_int_key int(11) NOT NULL,
  col_varchar_key varchar(1) NOT NULL,
  col_varchar_nokey varchar(1) NOT NULL,
  PRIMARY KEY (pk),
  KEY col_int_key (col_int_key),
  KEY col_varchar_key (col_varchar_key,col_int_key)
) ENGINE=MARIA AUTO_INCREMENT=2 DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1;

INSERT INTO B VALUES (1,1,7,'p','p');

set @@optimizer_switch='materialization=on,semijoin=off';

-- original
SELECT *
FROM (SELECT table1.col_varchar_nokey, MAX(table1.col_int_key)
      FROM C table1 JOIN (SELECT * FROM B) table2 ON table2.pk = table1.pk
      WHERE (table1.col_int_nokey, table2.pk) IN (SELECT col_int_nokey, col_int_key FROM C))
      AS FROM_SUBQUERY;

-- v1
SELECT *
FROM (SELECT table1.col_varchar_nokey, MAX(table1.col_int_key)
      FROM C table1 JOIN B ON B.pk = table1.pk
      WHERE (table1.col_int_nokey, B.pk) IN (SELECT col_int_nokey, col_int_key FROM C))
      AS FROM_SUBQUERY;

-- v2 - extract the derived table
SELECT C.col_varchar_nokey, MAX(C.col_int_key)
FROM C JOIN B ON B.pk = C.pk
WHERE (C.col_int_nokey, B.pk) IN (SELECT col_int_nokey, col_int_key FROM C);

-- v3 - single column IN
SELECT C.col_varchar_nokey, MAX(C.col_int_key)
FROM C JOIN B ON B.pk = C.pk
WHERE B.pk IN (SELECT col_int_key FROM C);

-- v4 - no ON clause
SELECT C.col_varchar_nokey, MAX(C.col_int_key)
FROM C, B
WHERE (B.pk = C.pk) AND B.pk IN (SELECT col_int_key FROM C);

Revision history for this message
Timour Katchaounov (timour) wrote :

Simplified test case:

CREATE TABLE t1 (
  pk int(11) NOT NULL AUTO_INCREMENT,
  f2 int(11) NOT NULL,
  f3 varchar(1) NOT NULL,
  PRIMARY KEY (pk),
  KEY f2 (f2));

INSERT INTO t1 VALUES (1,9,'x');
INSERT INTO t1 VALUES (2,5,'g');

CREATE TABLE t2 (
  pk int(11) NOT NULL AUTO_INCREMENT,
  f2 int(11) NOT NULL,
  f3 varchar(1) NOT NULL,
  PRIMARY KEY (pk),
  KEY f2 (f2));

INSERT INTO t2 VALUES (1,7,'p');

set @@optimizer_switch='materialization=on,semijoin=off';

SELECT t1.f3, MAX(t1.f2)
FROM t1, t2
WHERE (t2.pk = t1.pk) AND t2.pk IN (SELECT f2 FROM t1);

Revision history for this message
Timour Katchaounov (timour) wrote :

Valgrind reports "Conditional jump or move depends on uninitialised value(s)"
for the following line: in Protocol::net_store_data at protocol.cc:51:

51 if (packet_length+9+length > packet->alloced_length() &&

called from:

#0 0x000000000069c92c in Protocol::net_store_data (this=0x2a642108, from=0x17639349 "", length=0) at protocol.cc:51
#1 0x000000000069e892 in Protocol::store_string_aux (this=0x2a642108, from=0x17639349 "", length=0, fromcs=0x1306340, tocs=0x1306340) at protocol.cc:881
#2 0x000000000069f3c5 in Protocol_text::store (this=0x2a642108, field=0x2ffa1b48) at protocol.cc:1040
#3 0x00000000005c273f in Item_field::send (this=0x176391f8, protocol=0x2a642108, buffer=0x40d34f0) at item.cc:5815
#4 0x000000000068f0b3 in select_send::send_data (this=0x1762ead8, items=...) at sql_class.cc:1911
#5 0x000000000075e41e in do_select (join=0x2ff99088, fields=0x2ff9ed48, table=0x0, procedure=0x0) at sql_select.cc:13389
#6 0x0000000000742cbf in JOIN::exec (this=0x2ff99088) at sql_select.cc:2437
#7 0x00000000007434ff in mysql_select (thd=0x2a641d30, rref_pointer_array=0x2a6447f0, tables=0x1762b178, wild_num=0, fields=..., conds=0x1762c4e8,
    og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147764736, result=0x1762ead8, unit=0x2a644150, select_lex=0x2a6445d8)
    at sql_select.cc:2654
#8 0x000000000073b0c7 in handle_select (thd=0x2a641d30, lex=0x2a6440b0, result=0x1762ead8, setup_tables_done_option=0) at sql_select.cc:282
#9 0x00000000006c8871 in execute_sqlcom_select (thd=0x2a641d30, all_tables=0x1762b178) at sql_parse.cc:5076
#10 0x00000000006bf760 in mysql_execute_command (thd=0x2a641d30) at sql_parse.cc:2235
#11 0x00000000006cb2b2 in mysql_parse (thd=0x2a641d30,
    rawbuf=0x1762abd8 "SELECT t1.f3, MAX(t1.f2)\nFROM t1, t2\nWHERE (t2.pk = t1.pk) AND t2.pk IN (SELECT f2 FROM t1)", length=91, found_semicolon=0x40d4b00)
    at sql_parse.cc:6083

where Item_field represents the selected "t1.f3".

The result of the query is:
1) with materialization:
+----+------------+
| f3 | MAX(t1.f2) |
+----+------------+
| | NULL |
+----+------------+

2) with in-to-exists:
+----+------------+
| f3 | MAX(t1.f2) |
+----+------------+
| NULL | NULL |
+----+------------+

3) with semijoin
+----+------------+
| f3 | MAX(t1.f2) |
+----+------------+
| x | NULL |
+----+------------+

All three are different for "f3", and the correct one is
produced by in-to-exists, because the result of the whole
query is empty.

Revision history for this message
Timour Katchaounov (timour) wrote :

Interestingly, in 5.3-mwl89 both materialization and in-to-exists
produce the correct result, while semijoin still produces the same
wrong result.

summary: - Valgrind warnings in String::append with materialization
+ Wrong result with materialization and semijoin, and valgrind warnings in
+ Protocol::net_store_data with materialization for implicit grouping
Changed in maria:
status: Confirmed → In Progress
importance: Undecided → Low
Changed in maria:
status: In Progress → Fix Committed
Revision history for this message
Timour Katchaounov (timour) wrote :

Assigned to Sergey because there is a wrong result with a semijoin plan.

Changed in maria:
assignee: Timour Katchaounov (timour) → Sergey Petrunia (sergefp)
Changed in maria:
status: Fix Committed → In Progress
Changed in maria:
status: In Progress → Confirmed
Changed in maria:
status: Confirmed → In Progress
Changed in maria:
importance: Low → Medium
Revision history for this message
Sergey Petrunia (sergefp) wrote :

This seems to be fixed by recent implicit-grouping fixes. With current 5.3, I get:

MariaDB [j34]> set @@optimizer_switch='materialization=on,semijoin=off';
Query OK, 0 rows affected (0.00 sec)

MariaDB [j34]>
MariaDB [j34]> SELECT t1.f3, MAX(t1.f2)
    -> FROM t1, t2
    -> WHERE (t2.pk = t1.pk) AND t2.pk IN (SELECT f2 FROM t1);
+----+------------+
| f3 | MAX(t1.f2) |
+----+------------+
| NULL | NULL |
+----+------------+
1 row in set (0.00 sec)

Changed in maria:
importance: Medium → High
Revision history for this message
Sergey Petrunia (sergefp) wrote :

> Assigned to Sergey because there is a wrong result with a semijoin plan.

The problem has nothing to do with semi-joins, or subqueries:

create table t11 (a int primary key, b int);
insert into t11 values (1,1),(2,2);

create table t10 (a int, b int, c int);
insert into t10 values (10, NULL, NULL), (10, NULL, NULL);

MariaDB [j12]> explain select t11.b, max(t10.b) from t10, t11 where t10.c<1 and t11.a=2;
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | SIMPLE | t11 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | t10 | ALL | NULL | NULL | NULL | NULL | 2 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
2 rows in set (2.63 sec)

MariaDB [j12]> select t11.b, max(t10.b) from t10, t11 where t10.c<1 and t11.a=2;
+------+------------+
| b | max(t10.b) |
+------+------------+
| 2 | NULL |
+------+------------+
1 row in set (3.31 sec)

MariaDB [j12]> alter table t11 drop primary key;
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0

MariaDB [j12]> select t11.b, max(t10.b) from t10, t11 where t10.c<1 and t11.a=2;
+------+------------+
| b | max(t10.b) |
+------+------------+
| NULL | NULL |
+------+------------+
1 row in set (3.66 sec)

Revision history for this message
Sergey Petrunia (sergefp) wrote :

The real problem is: when the query has implicit grouping, and it produces no rows, all columns of non-const tables get NULL values. However, columns of const tables remain non-NULL, and that is the cause of incorrect results.

Revision history for this message
Sergey Petrunia (sergefp) wrote :

.. and this can be repeated on the current MySQL 5.1

Revision history for this message
Sergey Petrunia (sergefp) wrote :

Moving off the last issue to separate bug, bug #844997

Changed in maria:
status: In Progress → Fix Committed
Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.