Loss of records with LOAD DATA INFILE in 5.6 with autocommit=0 + wsrep-load-data-splitting=ON | Crashes with UNIV_DEBUG

Bug #1281810 reported by Marco Tusa on 2014-02-18
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Status tracked in 5.6
5.5
Low
Seppo Jaakola
5.6
Low
Seppo Jaakola
Percona XtraDB Cluster
Status tracked in 5.6
5.5
Undecided
Unassigned
5.6
Undecided
Unassigned

Bug Description

All,
I had the following issue at customer site.
We were using MySQL 5.6 PXC (but that apply also to other distributions)
See below for full variable report.

We have exported the data out from the original server and then try to import it in the MySQL/Galera cluster (PXC implementation) using LOAD INTO.
What happened is that the data seems to load fine, and that the "WSREP: forced trx split for LOAD" works fine, but unfortunately this is not true.

Here and there we start to get "[Warning] MySQL is closing a connection that has an active InnoDB transaction. 4007 row modifications will roll back."
Performing a quick comparison between source and destination, I identify that we were loosing records in the import process.

Checking the warning I found that it was coming from:
 ha_innodb.cc line 4395

 /*****************************************************************//**
Frees a possible InnoDB trx object associated with the current THD.
@return 0 or error number */
static
int
innobase_close_connection(
...
<snip>
  if (trx_is_started(trx) && log_warnings) {

  sql_print_warning(
   "MySQL is closing a connection that has an active "
   "InnoDB transaction. "TRX_ID_FMT" row modifications "
   "will roll back.",
   trx->undo_no);
 }
 }

 innobase_rollback_trx(trx);

 trx_free_for_mysql(trx);

 DBUG_RETURN(0);
}

<snip>

In short this function is call to clean up pending Innodb operation.
It also perform the roll back of the undo segment as for the undo_no.

Given that it was clear why we were loosing records on the way.

Next step was to identify WHY.

I was able to reproduce the same behaviour on PXC and also on MySQL/Galera (codership distribution), I have prepare a small package with a perl file and 2 files of data to test (see attachments).

Anyhow, what I identify is that issue is present on the following condition:
 autocommit = 0;
 wsrep_load_data_splitting=1;

If I push the autocomit=1; the load will complete successfully and all the records will be present.
Same if I disable wsrep_load_data_splitting=0, adjusting the wsrep_max_ws_size to match my file chunk size.

So the issue seems related to how the split is managed inside the WSREP process, and probably to a missed call to innobase_commit in the function.

I tested it on 5.6 but not on 5.5.

Marco Tusa (marcotusa) wrote :
Marco Tusa (marcotusa) wrote :
Marco Tusa (marcotusa) wrote :

Just tested on 5.5 no issue.
this problem is present only in 5.6

summary: - Loss of records with LOAD INTO using trx split
+ Loss of records with LOAD INTO using trx split in 5.6
Download full text (7.9 KiB)

2014-02-18 16:45:14 14626 [Note] WSREP: TO END: 153
2014-02-18 16:45:24 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:26 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:28 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:31 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:33 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:35 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:38 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:39 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:42 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:44 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:47 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:49 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/backups/test4_14_2_2014/tbtest1#chunk_1.csv' IGNORE INTO TABLE test4.tbtest1 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'
2014-02-18 16:45:50 14626 [Note] WSREP: forced trx split for LOAD: LOAD DATA INFILE '/home/mysql/bac...

Read more...

I can reproduce this with 5.6, in case of 5.5 UNIV_DEBUG, it crashed as http://paste.wnohang.net/37eb75, I will report it separately. Need to test with 5.5 release build.

On 5.5 release build, it loads 314007 rows fine.

With 5.6 debug build, hit this crash:

2014-02-19 16:57:06 7fe2047f8700 InnoDB: Assertion failure in thread 140608714802944 in file row0mysql.cc line 3249
InnoDB: Failing assertion: (trx)->start_file == 0
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.

with

MySQL [test4]> SELECT * FROM tbtest1 LIMIT 10004 INTO OUTFILE '/tmp/t2';
Query OK, 10004 rows affected (0.15 sec)

MySQL [test4]> truncate tbtest1;
Query OK, 0 rows affected (0.27 sec)

MySQL [test4]> load data infile '/tmp/t2' into table tbtest1;
Query OK, 10004 rows affected (2.51 sec)
Records: 10004 Deleted: 0 Skipped: 0 Warnings: 0

MySQL [test4]> select count(*) from tbtest1;
+----------+
| count(*) |
+----------+
| 10004 |
+----------+
1 row in set (0.04 sec)

MySQL [test4]> truncate tbtest1;
ERROR 2013 (HY000): Lost connection to MySQL server during query

Full session for previous crash: http://paste.wnohang.net/45c90e

Download full text (3.4 KiB)

Seeing another crash:

====================
MySQL [test4]> truncate tbtest1;
Query OK, 0 rows affected (0.03 sec)

MySQL [test4]> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)

MySQL [test4]> load data infile '/tmp/t2' into table tbtest1;
Query OK, 10004 rows affected (1.95 sec)
Records: 10004 Deleted: 0 Skipped: 0 Warnings: 0

MySQL [test4]> set autocommit=1;
Query OK, 0 rows affected (0.00 sec)

MySQL [test4]> select count(*) from tbtest1;

===============================================
(gdb) bt
#0 0x00007ffff5f5e389 in raise () from /usr/lib/libc.so.6
#1 0x00007ffff5f5f788 in abort () from /usr/lib/libc.so.6
#2 0x00007ffff5f574a6 in __assert_fail_base () from /usr/lib/libc.so.6
#3 0x00007ffff5f57552 in __assert_fail () from /usr/lib/libc.so.6
#4 0x000000000091bd47 in binlog_cache_data::finalize (this=this@entry=0x7fff5c01c840, thd=thd@entry=0x2e18230, end_event=end_event@entry=0x7fffbc5660b0)
    at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/binlog.cc:1187
#5 0x000000000092ac0b in MYSQL_BIN_LOG::commit (this=0x153a380 <mysql_bin_log>, thd=0x2e18230, all=<optimized out>) at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/binlog.cc:6419
#6 0x000000000064de53 in ha_commit_trans (thd=thd@entry=0x2e18230, all=all@entry=false, ignore_global_read_lock=ignore_global_read_lock@entry=false)
    at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/handler.cc:1521
#7 0x0000000000807718 in trans_commit_stmt (thd=thd@entry=0x2e18230) at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/transaction.cc:473
#8 0x000000000077cfe8 in mysql_execute_command (thd=thd@entry=0x2e18230) at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/sql_parse.cc:5785
#9 0x0000000000780f68 in mysql_parse (thd=thd@entry=0x2e18230, rawbuf=rawbuf@entry=0x7fff5c006d50 "select count(*) from tbtest1", length=length@entry=28, parser_state=parser_state@entry=0x7fffbc5687e0)
    at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/sql_parse.cc:7188
#10 0x000000000078161c in wsrep_mysql_parse (thd=thd@entry=0x2e18230, rawbuf=0x7fff5c006d50 "select count(*) from tbtest1", length=length@entry=28, parser_state=parser_state@entry=0x7fffbc5687e0)
    at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/sql_parse.cc:6940
#11 0x0000000000783099 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x2e18230, packet=packet@entry=0x2f3e421 "", packet_length=packet_length@entry=28)
    at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/sql_parse.cc:1630
#12 0x0000000000783833 in do_command (thd=0x2e18230) at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/sql_parse.cc:1133
#13 0x000000000074e3e2 in do_handle_one_connection (thd_arg=thd_arg@entry=0x2e18230) at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/sql_connect.cc:1557
#14 0x000000000074e520 in handle_one_connection (arg=arg@entry=0x2e18230) at /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/sql_connect.cc:1461
#15 0x0000000000bf53a3 in pfs_spawn_t...

Read more...

Also can reproduce the original issue with fewer rows - 10004

MySQL [test4]> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)

MySQL [test4]> load data infile '/tmp/t2' into table tbtest1;
Query OK, 10004 rows affected (1.82 sec)
Records: 10004 Deleted: 0 Skipped: 0 Warnings: 0

MySQL [test4]> set autocommit=1;

(exit the client)

2014-02-19 17:06:16 19006 [Note] WSREP: forced trx split for LOAD: load data infile '/tmp/t2' into table tbtest1
[New Thread 0x7fff7effd700 (LWP 19096)]
[Thread 0x7fff7effd700 (LWP 19096) exited]
2014-02-19 17:06:22 19006 [Warning] MySQL is closing a connection that has an active InnoDB transaction. 4 row modifications will roll back.

Assertion failure in #9 is due to

[New Thread 0x7fff7effd700 (LWP 18525)]
[Thread 0x7fff7effd700 (LWP 18525) exited]
mysqld: /media/Tintin/Work/code/percona-xtradb-cluster/pxc56/Percona-Server/sql/binlog.cc:1187: int binlog_cache_data::finalize(THD*, Log_event*): Assertion `!flags.finalized' failed.

With wsrep-load-data-splitting=OFF, I don't see crashes of #7 and #9.

With binlogging enabled (all earlier tests were with binlogging disabled) but wsrep-load-data-splitting=ON (the default), it doesn't crash anywhere but I still see

 2014-02-19 17:06:22 19006 [Warning] MySQL is closing a connection that has an active InnoDB transaction. 4 row modifications will roll back.

=================================================================

To conclude,

When wsrep-load-data-splitting=ON

a) Binlogging being disabled leads to crash with UNIV_DEBUG builds and lost rows.

b) With binlogging enabled, it doesn't crash but it still loses rows when connection is closed.

When wsrep-load-data-splitting=OFF

there are no issues at all.

autocommit=0 is required for one of the crashes and losing the rows.

summary: - Loss of records with LOAD INTO using trx split in 5.6
+ Loss of records with LOAD DATA INFILE in 5.6 with autocommit=0 + wsrep-
+ load-data-splitting=ON | Crashes with UNIV_DEBUG

I have disabled wsrep-load-data-splitting by default in 5.6 tree for now.

Changed in codership-mysql:
assignee: nobody → Seppo Jaakola (seppo-jaakola)
status: New → In Progress
Changed in codership-mysql:
importance: Undecided → Low
milestone: none → 5.6.16-25.3
Seppo Jaakola (seppo-jaakola) wrote :

The problem happens because after load data splitting happens, multi-statement transaction (MST) is not properly cleaned up, and new MST is not initiated.

But, otoh, load data transaction splitting should not be used inside transactions. LOAD would do do implicit commit and application is hardly prepared for that. Therefore the fix for this issue should be just to not perform load data splitting, if LOAD is issued inside of a transaction.

Seppo Jaakola (seppo-jaakola) wrote :

Here's a simple method for regression testing:

create table t (i int primary key auto_increment, s int);

insert into t values (1,0);
insert into t values (2,0);

insert into t(s) select s from t; -- repeat this as long until you get > 10K rows in t

select * from t into outfile '/tmp/t';

truncate t;

load data infile '/tmp/t' into table t; .. this should be split and replicated in two batches

truncate t;

begin;
load data infile '/tmp/t' into table t; -- this should be not split, but replicated as one transactiion
commit;

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers