Node consistency issues with load data infile

Bug #1282258 reported by Raghavendra D Prabhu on 2014-02-19
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Undecided
Unassigned
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Status tracked in 5.6
5.5
Won't Fix
Critical
Unassigned
5.6
Incomplete
Critical
Unassigned

Bug Description

With load-data grammar (yy/zz in bzr+ssh://bazaar.launchpad.net/~raghavendra-prabhu/randgen/pxc/), the second node fails quickly after startup with node consistency issues.

====

# 2014-02-20T02:10:31 [13479] Started periodic reporting process...
# 2014-02-20T02:10:31 [13479] ErrorLogAlarm Reporter will monitor the log file /tmp/node1/node0/data/../mysql.err
# 2014-02-20T02:10:51 [13479] ALARM from ErrorLogAlarm reporter: Pattern '^Error:|^ERROR|\[ERROR\]|allocated at line|missing DBUG_RETURN|^safe_mutex:|Invalid.*old.*table or database|InnoDB: Warning|InnoDB: Error:|InnoDB: Operating system error|Error while setting value|\[Warning\] Invalid' was found in error log. Matching line was:
# 2014-02-20T02:10:51 [13479] 2014-02-20 02:10:46 12873 [ERROR] WSREP: Certification failed for TO isolated action: source: d0a495fd-99a5-11e3-af23-73475c49a6e2 version: 3 local: 1 state: CERTIFYING flags: 65 conn_id: 20 trx_id: -1 seqnos (l: 1065, g: 1016, s: 1009, d: -1, ts: 198596209302899)
sh: line 1: 12929 Aborted (core dumped) "/pxc56/bin/mysqld" "--no-defaults" "--basedir=/pxc56" "--datadir=/tmp/node1/node1/data" "--lc-messages-dir=/pxc56/share/mysql" "--character-sets-dir=/pxc56/share/mysql/charsets" "--tmpdir=/tmp/node1/node1/tmp" "--core-file" "--max-allowed-packet=128Mb" "--port=19301" "--socket=/tmp/node1/node1/mysql.sock" "--pid-file=/tmp/node1/node1/mysql.pid" "--general-log" "--general-log-file=/tmp/node1/node1/mysql.log" --wsrep_sst_method=rsync --innodb_autoinc_lock_mode=2 --default-storage-engine=InnoDB --binlog-format=row --wsrep_node_incoming_address=127.0.0.1 --wsrep_node_address=127.0.0.1 --wsrep_cluster_address='gcomm://127.0.0.1:4844?gmcast.listen_addr=tcp://127.0.0.1:4845&pc.ignore_sb=true' --wsrep_sst_receive_address=127.0.0.1:4848 --innodb_buffer_pool_size=500M --query_cache_type=0 --wsrep_slave_threads=8 --query_cache_size=0 --innodb_flush_log_at_trx_commit=0 --wsrep-provider=/pxc56/lib/libgalera_smm.so --innodb_flush_method=O_DIRECT --wsrep-load-data-splitting=OFF >> "/tmp/node1/node1/mysql.err" 2>&1
# 2014-02-20T02:13:02 [13485] Query: LOAD DATA INFILE '/tmp/gentest13485.tmp' INTO TABLE `table5000_innodb_default_int` failed: 13 Can't get stat of '/tmp/gentest13485.tmp' (Errcode: 2 - No such file or directory)
# 2014-02-20T02:13:18 [13483] Query: LOAD DATA INFILE '/tmp/gentest13483.tmp' INTO TABLE `table500_innodb_compressed_key_pk_parts_4_char_10` failed: 13 Can't get stat of '/tmp/gentest13483.tmp' (Errcode: 2 - No such file or directory)
^C# 2014-02-20T02:13:51 [13481] Child worker process completed successfully.
# 2014-02-20T02:13:51 [13505] Child worker process completed successfully.
# 2014-02-20T02:13:51 [13496] Child worker process completed successfully.
# 2014-02-20T02:13:51 [13507] Child worker process completed successfully.
# 2014-02-20T02:13:51 [13503] Child worker process completed successfully.
#

====================

Complete logs with datadir of both nodes (with data files removed but logs kept).

Download full text (3.8 KiB)

The second node fails as:
========================================================

 2014-02-20 02:18:31 16336 [Note] WSREP: Synchronized with group, ready for connections
 2014-02-20 02:18:31 16336 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
 2014-02-20 02:18:31 16336 [Note] /pxc56/bin/mysqld: ready for connections.
 Version: '5.6.15-25.3-log' socket: '/tmp/node1/node1/mysql.sock' port: 19301 Percona XtraDB Cluster (GPL) 5.6.15-25.3, Revision 723, wsrep_25.3.r723
 2014-02-20 02:19:03 16336 [ERROR] Slave SQL: Could not execute Update_rows event on table test.table1_innodb_int_autoinc; Can't find record in 'table1_innodb_int_autoinc', Error_code: 1032; handler error HA_ERR_K EY_NOT_FOUND; the event's master log FIRST, end_log_pos 204, Error_code: 1032
 2014-02-20 02:19:03 16336 [Warning] WSREP: RBR event 3 Update_rows apply warning: 120, 4092
 2014-02-20 02:19:03 16336 [Warning] WSREP: Failed to apply app buffer: seqno: 4092, status: 1
          at galera/src/trx_handle.cpp:apply():340
 Retrying 2th time
 2014-02-20 02:19:03 16336 [ERROR] Slave SQL: Could not execute Update_rows event on table test.table1_innodb_int_autoinc; Can't find record in 'table1_innodb_int_autoinc', Error_code: 1032; handler error HA_ERR_K EY_NOT_FOUND; the event's master log FIRST, end_log_pos 204, Error_code: 1032
 2014-02-20 02:19:03 16336 [Warning] WSREP: RBR event 3 Update_rows apply warning: 120, 4092
 2014-02-20 02:19:03 16336 [Warning] WSREP: Failed to apply app buffer: seqno: 4092, status: 1
          at galera/src/trx_handle.cpp:apply():340
 Retrying 3th time
 2014-02-20 02:19:03 16336 [ERROR] Slave SQL: Could not execute Update_rows event on table test.table1_innodb_int_autoinc; Can't find record in 'table1_innodb_int_autoinc', Error_code: 1032; handler error HA_ERR_K EY_NOT_FOUND; the event's master log FIRST, end_log_pos 204, Error_code: 1032
 2014-02-20 02:19:03 16336 [Warning] WSREP: RBR event 3 Update_rows apply warning: 120, 4092
 2014-02-20 02:19:03 16336 [Warning] WSREP: Failed to apply app buffer: seqno: 4092, status: 1
          at galera/src/trx_handle.cpp:apply():340
 Retrying 4th time
 2014-02-20 02:19:03 16336 [ERROR] Slave SQL: Could not execute Update_rows event on table test.table1_innodb_int_autoinc; Can't find record in 'table1_innodb_int_autoinc', Error_code: 1032; handler error HA_ERR_K EY_NOT_FOUND; the event's master log FIRST, end_log_pos 204, Error_code: 1032
 2014-02-20 02:19:03 16336 [Warning] WSREP: RBR event 3 Update_rows apply warning: 120, 4092
 2014-02-20 02:19:03 16336 [ERROR] WSREP: Failed to apply trx: source: 2d4b9ea9-99a7-11e3-94e6-3abed63932d9 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 30 trx_id: 132887 seqnos (l: 4143, g: 4092, s: 4081 , d: 4075, ts: 199093692186419)
 2014-02-20 02:19:03 16336 [ERROR] WSREP: Failed to apply trx 4092 4 times
 2014-02-20 02:19:03 16336 [ERROR] WSREP: Node consistency compromized, aborting...
 2014-02-20 02:19:03 16336 [Note] WSREP: Closing send monitor...
 2014-02-20 02:19:03 16336 [Note] WSREP: Closed send monitor.
 2014-02-20 02:19:03 16336 [Note] WSREP: gcomm: terminating thread
 2014-02-20 02:19:03 16336 [Note] WSREP: gcomm: joining thread
 2014...

Read more...

Grammar files invoked as:

perl runall-new.pl --basedir=/pxc56 --vardir=/tmp/node1 --galera=ms --grammar=conf/galera/galera_stress-load-data.yy --gendata=conf/galera/galera_stress-dml.zz --mysqld=--innodb_buffer_pool_size=500M --mysqld=--query_cache_type=0 --mysqld=--wsrep_slave_threads=8 --mysqld=--query_cache_size=0 --mysqld=--innodb_flush_log_at_trx_commit=0 --reporter=Shutdown,Backtrace,ErrorLog,ErrorLogAlarm,Deadlock --threads=16 --queries=30000 --mysqld=--wsrep-provider=/pxc56/lib/libgalera_smm.so --seed=time --mysqld=--innodb_flush_method=O_DIRECT --mysqld=--wsrep-load-data-splitting=OFF

Krunal Bauskar (krunal-bauskar) wrote :

Can you try this with latest 5.6 and see if issue is reproducible.

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-929

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers