Percona XtraDB Cluster - HA scalable solution for MySQL

CTAS fails with explicit TEMPORARY tables, binlogging enabled and autocommit=0

Reported by Francesco on 2013-10-15
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Undecided
Unassigned
Percona XtraDB Cluster
Status tracked in Trunk
5.6
Undecided
Unassigned
Trunk
Undecided
Unassigned

Bug Description

I have a cluster of 3 nodes all running Percona-XtraDB-Cluster-server.x86_64 1:5.5.33-23.7.6.495.rhel6.

When I run this SQL:

USE test;

DROP TABLE IF EXISTS test_bug_source;
CREATE TABLE test_bug_source(`id` int);

INSERT INTO
  test_bug_source
VALUES
  (1)
;
COMMIT;

START TRANSACTION;
DROP TEMPORARY TABLE IF EXISTS test_bug;
CREATE TEMPORARY TABLE test_bug AS
  SELECT
    *
  FROM
    test_bug_source
;
COMMIT;

I am getting the following error:

Lookup Error - MySQL Database Error: Deadlock found when trying to get lock; try restarting transaction

I don't know what's going on! This scripts works fine on a previous version (MySQL Percona 5.5.31).

Here is the mysql configuration file:

[mysql]
# CLIENT #
port = 3306
socket = /data/mysql/mysql.sock
default-character-set=utf8

[client]
socket = /data/mysql/mysql.sock
default-character-set=utf8

[mysqld]
# GENERAL #
user = mysql
default_storage_engine = InnoDB
socket = /data/mysql/mysql.sock
pid_file = /data/mysql/mysql.pid

# MyISAM #
key_buffer_size = 32M
myisam_recover = FORCE,BACKUP

# SAFETY #
max_allowed_packet = 16M
max_connect_errors = 1000000
skip_name_resolve
sql_mode = STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
sysdate_is_now = 1
innodb = FORCE
innodb_strict_mode = 1

# DATA STORAGE #
datadir = /data/mysql/

# BINARY LOGGING #
log_bin = /data/mysql/mysql-bin
expire_logs_days = 14
sync_binlog = 1

# CACHES AND LIMITS #
tmp_table_size = 32M
max_heap_table_size = 32M
query_cache_type = 0
query_cache_size = 0
max_connections = 500
thread_cache_size = 50
open_files_limit = 65535
table_definition_cache = 1024
table_open_cache = 100

# INNODB #
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 64M
innodb_flush_log_at_trx_commit = 1
innodb_file_per_table = 1
innodb_buffer_pool_size = 512M

# LOGGING #
log_error = /data/mysql/mysql-error.log
log_queries_not_using_indexes = 1
slow_query_log = 1
slow_query_log_file = /data/mysql/mysql-slow.log

# UTF8 #
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8

#### Cluster config ####
# Path to Galera library
wsrep_provider = /usr/lib64/libgalera_smm.so

# Cluster connection URL contains the IPs of all the nodes
wsrep_cluster_address = gcomm://<cluster_ip_addresses>

# In order for Galera to work correctly binlog format should be ROW
binlog_format = ROW

# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog = 1

# This changes how InnoDB auto-increment locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode = 2

# IP address of this node
wsrep_node_address = <node_ip_address>

# SST method
wsrep_sst_method = xtrabackup

# Cluster name
wsrep_cluster_name = data_cluster

# Authentication for SST method
wsrep_sst_auth = "<user>:<password>"

Any ideas?

no longer affects: percona-server
summary: - Deadlock when dropping and creating temporary table on MySQL Percona
- 5.5.33
+ Deadlock when dropping and creating temporary table on Percona XtraDB
+ Cluster 5.5.33

Update, I have done also this test:

1) create a machine with Percona XtraDB Cluster 5.5.31 and 3 nodes.
2) run the script, no errors.
3) upgrade the same machine to Percona XtraDB Cluster 5.5.33
4) run the same script, Deadlock error.

Francesco (francesco-l) wrote :

Clarification: in the previous comment when I said "update the same machine" I meant all the machine in the cluster.
I started with a cluster of 3 nodes, all with Percona XtraDB Cluster 5.5.31 (CentOS 6).
Run the script, no Deadlock error.
Upgrade all the nodes to Percona XtraDB Cluster 5.5.33.
Run the script, yes Deadlock error.

If I take the DROP TEMPORARY TABLE out of the transaction, no error occurs (weird?).

I've got a few other weird deadlock examples with temporary tables and store procs, if anyone is interested I can post them.

Thanks for reporting this.

I am able to reproduce this.

Mainly with wsrep_debug I see the following:

==============
131022 10:37:40 [Note] WSREP: skipping tmp table DML: THD: 3 tmp: 2 SQL: CREATE TEMPORARY TABLE test_bug AS
  SELECT
    *
  FROM
    test_bug_source
131022 10:37:42 [Warning] WSREP: SQL statement was ineffective, THD: 3, buf: 139
QUERY: COMMIT
 => Skipping replication
131022 10:37:42 [Note] WSREP: commit failed for reason: 3
131022 10:37:42 [Note] WSREP: conflict state: 0
131022 10:37:42 [Note] WSREP: cluster conflict due to certification failure for threads:
131022 10:37:42 [Note] WSREP: Victim thread:
   THD: 3, mode: local, state: executing, conflict: cert failure, seqno: 49
   SQL: COMMIT
131022 10:37:42 [Note] WSREP: COMMIT failed, MDL released: 3
131022 10:37:42 [Note] WSREP: cert failure, thd: 3 is_AC: 0, retry: 0 - 1 SQL: COMMIT

=============================

Download full text (3.2 KiB)

As error indicates, the DML is skipping but DDL goes ahead and temporary table is created.

Also, following succeeds (also a workaround).

MySQL [test]> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
MySQL [test]> show create table test_bug;
+----------+------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+----------+------------------------------------------------------------------------------------------------------+
| test_bug | CREATE TEMPORARY TABLE `test_bug` (
  `id` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+----------+------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MySQL [test]> start transaction;
Query OK, 0 rows affected (0.01 sec)

MySQL [test]> insert into test_bug select * from test_bug_source;
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0

MySQL [test]> commit;
Query OK, 0 rows affected (0.00 sec)

========================================

So the problem seems to occur when:

a) there is an explicit TEMPORARY TABLE

b) DML and DDL occur in same statement

Hence, following succeeds as well
========================================
MySQL [(none)]> drop database test;
Query OK, 1 row affected (0.01 sec)

MySQL [(none)]> create database test;
Query OK, 1 row affected (0.00 sec)

MySQL [(none)]> use test;
Database changed
MySQL [test]> DROP TABLE IF EXISTS test_bug_source;
Query OK, 0 rows affected, 1 warning (0.00 sec)

MySQL [test]> CREATE TABLE test_bug_source(`id` int);
Query OK, 0 rows affected (0.01 sec)

MySQL [test]>
MySQL [test]> INSERT INTO
    -> test_bug_source
    -> VALUES
    -> (1)
    -> ;
Query OK, 1 row affected (0.00 sec)

MySQL [test]> COMMIT;
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> DROP TEMPORARY TABLE IF EXISTS test_bug;
Query OK, 0 rows affected, 1 warning (0.00 sec)

MySQL [test]> CREATE TEMPORARY TABLE test_bug like test_bug_source;
Query OK, 0 rows affected (0.01 sec)

MySQL [test]> commit;
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> show create table test_bug;
+----------+------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+----------+------------------------------------------------------------------------------------------------------+
| test_bug | CREATE TEMPORARY TABLE `test_bug` (
  `id` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+----------+------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MySQL [test]> start transaction;
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> insert into test_bug select * from test_bug_source;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0

MySQL [t...

Read more...

In fact, when transactions are not used (ie. with autocommit) it crashes, will open separate issue for that.

summary: Deadlock when dropping and creating temporary table on Percona XtraDB
- Cluster 5.5.33
+ Cluster 5.5.33 | CTAS fails with explicit TEMPORARY tables
summary: - Deadlock when dropping and creating temporary table on Percona XtraDB
- Cluster 5.5.33 | CTAS fails with explicit TEMPORARY tables
+ CTAS fails with explicit TEMPORARY tables

The crash mentioned in #5 happens only when binlog_format is STMT, doesn't happen with ROW.

With ROW binlogging format, follow is emitted:

==============
131022 11:23:24 [Note] WSREP: skipping tmp table DML: THD: 3 tmp: 2 SQL: CREATE TEMPORARY TABLE test_bug AS
  SELECT
    *
  FROM
    test_bug_source
131022 11:23:24 [Note] WSREP: empty rbr buffer, query: CREATE TEMPORARY TABLE test_bug AS
  SELECT
    *
  FROM
    test_bug_source, affected rows: 1, changed tables: 1, sql_log_bin: 1, wsrep status (3 2 0)

=================

Actually with ROW binlog_format, I don't get the deadlock error either.

@Francesco (francesco-l)

Are you sure all nodes have binlog_format as ROW? I am not able to replicate this with ROW binlog_format.

Also, can you try with a more minimal config? like removing #SAFETY# parts etc.

Another thing is I tested with latest HEAD which has other fixes, we are doing an experimental release of this in a few hours, you can test with that as well.

tags: added: statement
summary: - CTAS fails with explicit TEMPORARY tables
+ CTAS fails with explicit TEMPORARY tables | STATEMENT binlog_format

GDB backtrace of the crash without autocommit. Note this happens only with STATEMENT binlog_format.

Francesco (francesco-l) wrote :

Raghavendra,

  thanks for you answer and for you help.

 - I have double checked and all the nodes have ROW binlog_format,
 - I have run the script taking out the #SAFETY# section on all the nodes, but I am still getting the deadlock error.

So you cannot reproduce maybe because of the conf or maybe because of the different Percona Cluster version, so:
 - can you give me a minimal configuration which I can try out? (maybe I should take out more sections than just the safety one?)
 - can you please tell me where I can find the experimental release?

Thank you!

@Francesco,

The experimental repo should have the RPM soon, but you can also download the RPMs from http://jenkins.percona.com/job/percona-xtradb-cluster-5.5-centos-param/BUILD_TYPE=release,Host=centos6-64/6/

(Note, they are not signed, hence for testing purposes only).

Steffen Boehme (boemm) wrote :

Hi,

I'm able to reproduce this on our percona cluster too ...
I run into a problem with "Deadlock found ..." while replicating from another outstanding server into my cluster (on node1).

After probing a while I did a search for this error and found this bug.

Our current setup is:

(master node) -> (cluster node1) <-> (cluster node2) <-> (cluster node3)
Means we have a 3 node percona cluster currently getting their updates only from a replication configured on node1.

While replicating data works fine for several days/weeks a few days ago I found this deadlock in the slave status output.

Restarting slave and even the cluster node did not change anything.
I also shut down all other nodes and tried to start the replication stand alone with the first node, but got the error in this case too.

After I found this bug report I tried the steps to reproduce the bug from comment #1 and was able to reproduce it.
I have installed version 5.5.34 of percona cluster.
All my nodes and the external master too runs with row based replication.

Since one told the error cannot be reproduced in version 5.5.31 I just downgraded to this version and really, there the example form #1 runs without error.

I'm just waiting for my replication to came up to the point of error to check, if the error still happens there.
Will inform you once I have checked this ...

Steffen Boehme (boemm) wrote :

Btw: I was able to reproduce with:

libmysqlclient18_1:5.5.34-rel32.0-591.precise
percona-xtradb-cluster-client-5.5_5.5.34-23.7.6-565.precise
percona-xtradb-cluster-common-5.5_5.5.34-23.7.6-565.precise
percona-xtradb-cluster-galera-2.x_162.precise
percona-xtradb-cluster-server-5.5_5.5.34-23.7.6-565.precise

and got it working with version:
libmysqlclient18_5.5.31-23.7.5-438.precise
percona-xtradb-cluster-client-5.5_5.5.31-23.7.5-438.precise
percona-xtradb-cluster-common-5.5_5.5.31-23.7.5-438.precise
percona-xtradb-cluster-galera-2.x_152.precise
percona-xtradb-cluster-server-5.5_5.5.31-23.7.5-438.precise

All on the same box with Ubuntu 12.04 server 64 bit (the packages were the amd64 versions).

If you need more info let me know!

Steffen Boehme (boemm) wrote :

After replication came up to the point were it failed before, it failed even with version 5.5.31 :-(.
So this might be another problem ...

@Steffen,

Do you have STATEMENT binlog_format enabled (as per the title)?

Francesco (francesco-l) wrote :

I tried with the new version 5.5.34 (same config as the first post) and I still get the deadlock

@Francesco,

Thanks. I will try to replicate this again on a clean VM with
your config.

In the meantime, can you attach pt-mysql-summary and "show global
variables" outputs? (latter, in case, they were changed at runtime)

Francesco (francesco-l) wrote :
Download full text (15.9 KiB)

[root@cluster1 ~]# pt-mysql-summary -password password
# Percona Toolkit MySQL Summary Report #######################
              System time | 2013-11-10 14:13:27 UTC (local TZ: GMT +0000)
# Instances ##################################################
  Port Data Directory Nice OOM Socket
  ===== ========================== ==== === ======
        /data/mysql/ 0 0 /data/mysql/mysql.sock
# MySQL Executable ###########################################
       Path to executable | /usr/sbin/mysqld
              Has symbols | No
# Report On Port 3306 ########################################
                     User | root@localhost
                     Time | 2013-11-10 14:13:26 (GMT)
                 Hostname | cluster1
                  Version | 5.5.34-log Percona XtraDB Cluster (GPL), wsrep_23.7.6.r3936
                 Built On | Linux i686
                  Started | 2013-11-10 14:03 (up 0+00:09:58)
                Databases | 4
                  Datadir | /data/mysql/
                Processes | 3 connected, 1 running
              Replication | Is not a slave, has 0 slaves connected
                  Pidfile | /data/mysql/mysql.pid (exists)
# Processlist ################################################

  Command COUNT(*) Working SUM(Time) MAX(Time)
  ------------------------------ -------- ------- --------- ---------
  Query 1 1 0 0
  Sleep 2 0 1250 600

  User COUNT(*) Working SUM(Time) MAX(Time)
  ------------------------------ -------- ------- --------- ---------
  root 1 1 0 0
  system user 2 0 0 0

  Host COUNT(*) Working SUM(Time) MAX(Time)
  ------------------------------ -------- ------- --------- ---------
                                        2 0 0 0
  localhost 1 1 0 0

  db COUNT(*) Working SUM(Time) MAX(Time)
  ------------------------------ -------- ------- --------- ---------
  NULL 3 1 0 0

  State COUNT(*) Working SUM(Time) MAX(Time)
  ------------------------------ -------- ------- --------- ---------
  NULL 1 0 0 0
  sleeping 1 1 0 0
  wsrep aborter idle 1 0 0 0

# Status Counters (Wait 10 Seconds) ##########################
Variable Per day Per second 12 secs
Aborted_connects 300
Bytes_received 900000 10 175
Bytes_sent 17500000 200 2000
Com_select 7000 1
Com_show_binlogs 300
Com_show_databases 300
Com_show_engine_status ...

Francesco (francesco-l) wrote :
Download full text (454.2 KiB)

mysql> show global
    -> variables;
+---------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+---------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------...

summary: - CTAS fails with explicit TEMPORARY tables | STATEMENT binlog_format
+ CTAS fails with explicit TEMPORARY tables | binlogging enabled
Download full text (3.4 KiB)

I can confirm this. This happens only when binlogging is enabled.

========================================

mysql> use sbtest;
Database changed
mysql>
mysql>
mysql> DROP TABLE IF EXISTS test_bug_source;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE TABLE test_bug_source(`id` int);
Query OK, 0 rows affected (0.04 sec)

mysql>
mysql> INSERT INTO
    -> test_bug_source
    -> VALUES
    -> (1)
    -> ;
Query OK, 1 row affected (0.00 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql>
mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> DROP TEMPORARY TABLE IF EXISTS test_bug;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE TEMPORARY TABLE test_bug AS
    -> SELECT
    -> *
    -> FROM
    -> test_bug_source
    -> ;
Query OK, 1 row affected (0.05 sec)
Records: 1 Duplicates: 0 Warnings: 0

mysql> COMMIT;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
==========================================

This is how binlog looks:

========================================

use `sbtest`/*!*/;
SET TIMESTAMP=1384177232/*!*/;
SET @@session.pseudo_thread_id=4/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
DROP TABLE IF EXISTS `test_bug_source` /* generated by server */
/*!*/;
# at 236
#131111 19:10:32 server id 1 end_log_pos 339 Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1384177232/*!*/;
CREATE TABLE test_bug_source(`id` int)
/*!*/;
# at 339
#131111 19:10:32 server id 1 end_log_pos 409 Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1384177232/*!*/;
BEGIN
/*!*/;
# at 409
# at 465
#131111 19:10:32 server id 1 end_log_pos 465 Table_map: `sbtest`.`test_bug_source` mapped to number 33
#131111 19:10:32 server id 1 end_log_pos 499 Write_rows: table id 33 flags: STMT_END_F

BINLOG '
UN6AUhMBAAAAOAAAANEBAAAAACEAAAAAAAEABnNidGVzdAAPdGVzdF9idWdfc291cmNlAAEDAAE=
UN6AUhcBAAAAIgAAAPMBAAAAACEAAAAAAAEAAf/+AQAAAA==
'/*!*/;
# at 499
#131111 19:10:32 server id 1 end_log_pos 526 Xid = 14
COMMIT/*!*/;
# at 526
#131111 19:10:39 server id 1 end_log_pos 596 Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1384177239/*!*/;
BEGIN
/*!*/;
# at 596
#131111 19:10:39 server id 1 end_log_pos 737 Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1384177239/*!*/;
DROP TEMPORARY TABLE IF EXISTS `sbtest`.`test_bug` /* generated by server */
/*!*/;
# at 737
#131111 19:10:39 server id 1 end_log_pos 810 Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1384177239/*!*/;
ROLLBACK/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

==========...

Read more...

Few more things:

a) Tested with PS, works fine there.

b) Binlog for that:

BEGIN
/*!*/;
# at 586
#131111 19:38:58 server id 1 end_log_pos 723 Query thread_id=1 exec_time=0 error_code=0
SET TIMESTAMP=1384178938/*!*/;
DROP TEMPORARY TABLE IF EXISTS `test`.`test_bug` /* generated by server */
/*!*/;
# at 723
#131111 19:39:00 server id 1 end_log_pos 750 Xid = 9
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

The difference in this case being a 'COMMIT' instead of
'ROLLBACK'.

This is what it looks like with wsrep-debug:

a)

==========================================================================
131111 20:12:19 [Note] WSREP: skipping tmp table DML: THD: 3 tmp: 2 SQL: CREATE TEMPORARY TABLE test_bug AS
  SELECT
    *
  FROM
    test_bug_source
131111 20:12:22 [Warning] WSREP: SQL statement was ineffective, THD: 3, buf: 141
QUERY: commit
 => Skipping replication
131111 20:12:22 [Note] WSREP: commit failed for reason: 3
131111 20:12:22 [Note] WSREP: conflict state: 0
131111 20:12:22 [Note] WSREP: cluster conflict due to certification failure for threads:
131111 20:12:22 [Note] WSREP: Victim thread:
   THD: 3, mode: local, state: executing, conflict: cert failure, seqno: -1
   SQL: commit
131111 20:12:22 [Note] WSREP: COMMIT failed, MDL released: 3
131111 20:12:22 [Note] WSREP: cert failure, thd: 3 is_AC: 0, retry: 0 - 1 SQL: commit

=========================================================================

As per "skipping tmp table DML: " , the CREATE TEMPORARY TABLE succeeds but the

summary: - CTAS fails with explicit TEMPORARY tables | binlogging enabled
+ CTAS fails with explicit TEMPORARY tables, binlogging enabled and
+ autocommit=0

As per "skipping tmp table DML: " , the CREATE TEMPORARY TABLE succeeds but the DML fails (ie. the table created will be empty).

a)

=== modified file 'Percona-Server/storage/innobase/handler/ha_innodb.cc'
--- Percona-Server/storage/innobase/handler/ha_innodb.cc 2013-10-27 16:53:20 +0000
+++ Percona-Server/storage/innobase/handler/ha_innodb.cc 2013-11-11 15:29:45 +0000
@@ -8303,8 +8303,8 @@
    bool key_appended = false;
  trx_t *trx = thd_to_trx(thd);

- if (table_share && table_share->tmp_table != NO_TMP_TABLE) {
- WSREP_DEBUG("skipping tmp table DML: THD: %lu tmp: %d SQL: %s",
+ if (table_share && table_share->tmp_table != NO_TMP_TABLE && thd->lex->sql_command != SQLCOM_CREATE_TABLE) {
+ WSREP_DEBUG("skipping tmp table DML: THD: %lu tmp: %d SQL: %s Command %d",
        wsrep_thd_thread_id(thd),
        table_share->tmp_table,
        (wsrep_thd_query(thd)) ?

fixes it.

But, the skipping DML clause was specifically added to prevent
memory leak reported in https://bugs.launchpad.net/codership-mysql/+bug/1112514

So, need to check if the regresses here.

The leak still happens, at least, the memory didn't get released after this

=============================
The bug is easy to reproduce. Here is a simplified version:

-- create source table and sample data
CREATE TABLE source (id BIGINT, data BIGINT, PRIMARY KEY(id, data), INDEX (data));
INSERT INTO source (id, data) VALUES (1, 1), (2, 2), (3, 2), (4, 2), (5, 3);

-- create temporary table and insert-select into it
CREATE TEMPORARY TABLE temp (id BIGINT, data BIGINT, PRIMARY KEY (id, data));
-- this leaks
INSERT INTO temp (id, data) SELECT id, data FROM source WHERE data = 2;

=================================================

when testing with 31 tables, the memory grew from 112 MB to 120
MB, and didn't get released. Need to check if this is due to
buffer pool or a genuine memory leak.

So, it (it being the test https://bugs.launchpad.net/codership-mysql/+bug/1112514/comments/1 ) indeeds leaks memory (tested with a smaller buffer pool), but so does Percona Server (or upstream if tested), hence, not something wsrep specific. Also, it leaks whether or not autocommit is enabled.

Actually, because of the condition - "sql_command != SQLCOM_CREATE_TABLE" - , any leaks, if present, shouldn't affect INSERT ... SELECT, hence shouldn't directly lp:1112514

CTAS in this case doesn't leak memory, tested. Hence, the fix has been committed.

@Steffen @Francesco

This is fixed now. The RPMs with fix are in experimental repo, can please test it and let us know how it goes.

Francesco (francesco-l) wrote :

Thank you Raghavendra, I'll give it a try this weekend and I'll let you know

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Bug attachments