Node consistency issues with load data infile

Bug #1282258 reported by Raghavendra D Prabhu
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
New
Undecided
Unassigned
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Status tracked in 5.6
5.5
Won't Fix
Critical
Unassigned
5.6
Incomplete
Critical
Unassigned

Bug Description

With load-data grammar (yy/zz in bzr+ssh://bazaar.launchpad.net/~raghavendra-prabhu/randgen/pxc/), the second node fails quickly after startup with node consistency issues.

====

# 2014-02-20T02:10:31 [13479] Started periodic reporting process...
# 2014-02-20T02:10:31 [13479] ErrorLogAlarm Reporter will monitor the log file /tmp/node1/node0/data/../mysql.err
# 2014-02-20T02:10:51 [13479] ALARM from ErrorLogAlarm reporter: Pattern '^Error:|^ERROR|\[ERROR\]|allocated at line|missing DBUG_RETURN|^safe_mutex:|Invalid.*old.*table or database|InnoDB: Warning|InnoDB: Error:|InnoDB: Operating system error|Error while setting value|\[Warning\] Invalid' was found in error log. Matching line was:
# 2014-02-20T02:10:51 [13479] 2014-02-20 02:10:46 12873 [ERROR] WSREP: Certification failed for TO isolated action: source: d0a495fd-99a5-11e3-af23-73475c49a6e2 version: 3 local: 1 state: CERTIFYING flags: 65 conn_id: 20 trx_id: -1 seqnos (l: 1065, g: 1016, s: 1009, d: -1, ts: 198596209302899)
sh: line 1: 12929 Aborted (core dumped) "/pxc56/bin/mysqld" "--no-defaults" "--basedir=/pxc56" "--datadir=/tmp/node1/node1/data" "--lc-messages-dir=/pxc56/share/mysql" "--character-sets-dir=/pxc56/share/mysql/charsets" "--tmpdir=/tmp/node1/node1/tmp" "--core-file" "--max-allowed-packet=128Mb" "--port=19301" "--socket=/tmp/node1/node1/mysql.sock" "--pid-file=/tmp/node1/node1/mysql.pid" "--general-log" "--general-log-file=/tmp/node1/node1/mysql.log" --wsrep_sst_method=rsync --innodb_autoinc_lock_mode=2 --default-storage-engine=InnoDB --binlog-format=row --wsrep_node_incoming_address=127.0.0.1 --wsrep_node_address=127.0.0.1 --wsrep_cluster_address='gcomm://127.0.0.1:4844?gmcast.listen_addr=tcp://127.0.0.1:4845&pc.ignore_sb=true' --wsrep_sst_receive_address=127.0.0.1:4848 --innodb_buffer_pool_size=500M --query_cache_type=0 --wsrep_slave_threads=8 --query_cache_size=0 --innodb_flush_log_at_trx_commit=0 --wsrep-provider=/pxc56/lib/libgalera_smm.so --innodb_flush_method=O_DIRECT --wsrep-load-data-splitting=OFF >> "/tmp/node1/node1/mysql.err" 2>&1
# 2014-02-20T02:13:02 [13485] Query: LOAD DATA INFILE '/tmp/gentest13485.tmp' INTO TABLE `table5000_innodb_default_int` failed: 13 Can't get stat of '/tmp/gentest13485.tmp' (Errcode: 2 - No such file or directory)
# 2014-02-20T02:13:18 [13483] Query: LOAD DATA INFILE '/tmp/gentest13483.tmp' INTO TABLE `table500_innodb_compressed_key_pk_parts_4_char_10` failed: 13 Can't get stat of '/tmp/gentest13483.tmp' (Errcode: 2 - No such file or directory)
^C# 2014-02-20T02:13:51 [13481] Child worker process completed successfully.
# 2014-02-20T02:13:51 [13505] Child worker process completed successfully.
# 2014-02-20T02:13:51 [13496] Child worker process completed successfully.
# 2014-02-20T02:13:51 [13507] Child worker process completed successfully.
# 2014-02-20T02:13:51 [13503] Child worker process completed successfully.
#

====================

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Complete logs with datadir of both nodes (with data files removed but logs kept).

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :
Download full text (3.8 KiB)

The second node fails as:
========================================================

 2014-02-20 02:18:31 16336 [Note] WSREP: Synchronized with group, ready for connections
 2014-02-20 02:18:31 16336 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
 2014-02-20 02:18:31 16336 [Note] /pxc56/bin/mysqld: ready for connections.
 Version: '5.6.15-25.3-log' socket: '/tmp/node1/node1/mysql.sock' port: 19301 Percona XtraDB Cluster (GPL) 5.6.15-25.3, Revision 723, wsrep_25.3.r723
 2014-02-20 02:19:03 16336 [ERROR] Slave SQL: Could not execute Update_rows event on table test.table1_innodb_int_autoinc; Can't find record in 'table1_innodb_int_autoinc', Error_code: 1032; handler error HA_ERR_K EY_NOT_FOUND; the event's master log FIRST, end_log_pos 204, Error_code: 1032
 2014-02-20 02:19:03 16336 [Warning] WSREP: RBR event 3 Update_rows apply warning: 120, 4092
 2014-02-20 02:19:03 16336 [Warning] WSREP: Failed to apply app buffer: seqno: 4092, status: 1
          at galera/src/trx_handle.cpp:apply():340
 Retrying 2th time
 2014-02-20 02:19:03 16336 [ERROR] Slave SQL: Could not execute Update_rows event on table test.table1_innodb_int_autoinc; Can't find record in 'table1_innodb_int_autoinc', Error_code: 1032; handler error HA_ERR_K EY_NOT_FOUND; the event's master log FIRST, end_log_pos 204, Error_code: 1032
 2014-02-20 02:19:03 16336 [Warning] WSREP: RBR event 3 Update_rows apply warning: 120, 4092
 2014-02-20 02:19:03 16336 [Warning] WSREP: Failed to apply app buffer: seqno: 4092, status: 1
          at galera/src/trx_handle.cpp:apply():340
 Retrying 3th time
 2014-02-20 02:19:03 16336 [ERROR] Slave SQL: Could not execute Update_rows event on table test.table1_innodb_int_autoinc; Can't find record in 'table1_innodb_int_autoinc', Error_code: 1032; handler error HA_ERR_K EY_NOT_FOUND; the event's master log FIRST, end_log_pos 204, Error_code: 1032
 2014-02-20 02:19:03 16336 [Warning] WSREP: RBR event 3 Update_rows apply warning: 120, 4092
 2014-02-20 02:19:03 16336 [Warning] WSREP: Failed to apply app buffer: seqno: 4092, status: 1
          at galera/src/trx_handle.cpp:apply():340
 Retrying 4th time
 2014-02-20 02:19:03 16336 [ERROR] Slave SQL: Could not execute Update_rows event on table test.table1_innodb_int_autoinc; Can't find record in 'table1_innodb_int_autoinc', Error_code: 1032; handler error HA_ERR_K EY_NOT_FOUND; the event's master log FIRST, end_log_pos 204, Error_code: 1032
 2014-02-20 02:19:03 16336 [Warning] WSREP: RBR event 3 Update_rows apply warning: 120, 4092
 2014-02-20 02:19:03 16336 [ERROR] WSREP: Failed to apply trx: source: 2d4b9ea9-99a7-11e3-94e6-3abed63932d9 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 30 trx_id: 132887 seqnos (l: 4143, g: 4092, s: 4081 , d: 4075, ts: 199093692186419)
 2014-02-20 02:19:03 16336 [ERROR] WSREP: Failed to apply trx 4092 4 times
 2014-02-20 02:19:03 16336 [ERROR] WSREP: Node consistency compromized, aborting...
 2014-02-20 02:19:03 16336 [Note] WSREP: Closing send monitor...
 2014-02-20 02:19:03 16336 [Note] WSREP: Closed send monitor.
 2014-02-20 02:19:03 16336 [Note] WSREP: gcomm: terminating thread
 2014-02-20 02:19:03 16336 [Note] WSREP: gcomm: joining thread
 2014...

Read more...

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Grammar files invoked as:

perl runall-new.pl --basedir=/pxc56 --vardir=/tmp/node1 --galera=ms --grammar=conf/galera/galera_stress-load-data.yy --gendata=conf/galera/galera_stress-dml.zz --mysqld=--innodb_buffer_pool_size=500M --mysqld=--query_cache_type=0 --mysqld=--wsrep_slave_threads=8 --mysqld=--query_cache_size=0 --mysqld=--innodb_flush_log_at_trx_commit=0 --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock --threads=16 --queries=30000 --mysqld=--wsrep-provider=/pxc56/lib/libgalera_smm.so --seed=time --mysqld=--innodb_flush_method=O_DIRECT --mysqld=--wsrep-load-data-splitting=OFF

Revision history for this message
Krunal Bauskar (krunal-bauskar) wrote :

Can you try this with latest 5.6 and see if issue is reproducible.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-929

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.