Slave assertion in unpack_row with ROLLBACK TO SAVEPOINT in error handler

Bug #1438990 reported by markus_albe on 2015-04-01
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MySQL Server
Unknown
Unknown
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
High
Vlad Lesin
5.6
Fix Released
High
Vlad Lesin
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
5.5
New
Undecided
Unassigned
5.6
Fix Released
Undecided
Raghavendra D Prabhu

Bug Description

Customer has 5.6.22 25-3.9 cluster crashing every so often, all nodes crash except the designated master.

...
Retrying 3th time
2015-03-13 17:42:13 50436 [ERROR] Slave SQL: Could not execute Write_rows event on table db_1.table_1; Unknown error, Error_code: 1105; handler error HA_ERR_GENERIC; the event's master log FIRST, end_log_pos 1514, Error_code: 1105
2015-03-13 17:42:13 50436 [Warning] WSREP: RBR event 17 Write_rows apply warning: 168, 5671546
2015-03-13 17:42:13 50436 [Warning] WSREP: Failed to apply app buffer: seqno: 5671546, status: 1
at galera/src/trx_handle.cpp:apply():340
Retrying 4th time
2015-03-13 17:42:13 50436 [ERROR] Slave SQL: Could not execute Write_rows event on table db_1.table_1; Unknown error, Error_code: 1105; handler error HA_ERR_GENERIC; the event's master log FIRST, end_log_pos 1514, Error_code: 1105
2015-03-13 17:42:13 50436 [Warning] WSREP: RBR event 17 Write_rows apply warning: 168, 5671546
2015-03-13 17:42:13 50436 [ERROR] WSREP: Failed to apply trx: source: 8d873190-abe7-11e4-ae7e-7212ae965514 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 6253259 trx_id: 12769086 seqnos (l: 2006, g: 5671546, s: 5671545, d: 5671541, ts: 20149745388720581)
2015-03-13 17:42:13 50436 [ERROR] WSREP: Failed to apply trx 5671546 4 times
2015-03-13 17:42:13 50436 [ERROR] WSREP: Node consistency compromized, aborting...

Alexey suggested the problem could have been related to https://github.com/codership/galera/issues/185 and we recommended the customer to upgrade to 3.9, which they did, but problem still persists:

2015-03-20 15:10:39 43760 [ERROR] Slave SQL: Could not execute Write_rows event on table db_2.table_1; Unknown error, Error_code: 1105; handler error HA_ERR_GENERIC; the event's master log FIRST, end_log_pos 1492, Error_code: 1105
2015-03-20 15:10:39 43760 [Warning] WSREP: RBR event 17 Write_rows apply warning: 168, 6041375
2015-03-20 15:10:39 43760 [Warning] WSREP: Failed to apply app buffer: seqno: 6041375, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 2th time
2015-03-20 15:10:39 43760 [ERROR] Slave SQL: Could not execute Write_rows event on table db_2.table_1; Unknown error, Error_code: 1105; handler error HA_ERR_GENERIC; the event's master log FIRST, end_log_pos 1492, Error_code: 1105
2015-03-20 15:10:39 43760 [Warning] WSREP: RBR event 17 Write_rows apply warning: 168, 6041375
2015-03-20 15:10:39 43760 [Warning] WSREP: Failed to apply app buffer: seqno: 6041375, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 3th time
2015-03-20 15:10:39 43760 [ERROR] Slave SQL: Could not execute Write_rows event on table db_2.table_1; Unknown error, Error_code: 1105; handler error HA_ERR_GENERIC; the event's master log FIRST, end_log_pos 1492, Error_code: 1105
2015-03-20 15:10:39 43760 [Warning] WSREP: RBR event 17 Write_rows apply warning: 168, 6041375
2015-03-20 15:10:39 43760 [Warning] WSREP: Failed to apply app buffer: seqno: 6041375, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 4th time
2015-03-20 15:10:39 43760 [ERROR] Slave SQL: Could not execute Write_rows event on table db_2.table_1; Unknown error, Error_code: 1105; handler error HA_ERR_GENERIC; the event's master log FIRST, end_log_pos 1492, Error_code: 1105
2015-03-20 15:10:39 43760 [Warning] WSREP: RBR event 17 Write_rows apply warning: 168, 6041375
2015-03-20 15:10:39 43760 [ERROR] WSREP: Failed to apply trx: source: 8beb12a2-cf23-11e4-bed5-db7878247e1d version: 3 local: 0 state: APPLYING flags: 1 conn_id: 3805 trx_id: 13793615 seqnos (l: 6584, g: 6041375, s: 6041374, d: 6041361, ts: 20745451759206532)
2015-03-20 15:10:39 43760 [ERROR] WSREP: Failed to apply trx 6041375 4 times
2015-03-20 15:10:39 43760 [ERROR] WSREP: Node consistency compromized, aborting...

Different database, same table; Sveta worked with Teemu and they discussed about this bit:

#150320 21:10:39 server id 102 end_log_pos 80 Query thread_id=3805 exec_time=0 error_code=0
SET TIMESTAMP=1426878639/*!*/;
SET @@session.pseudo_thread_id=3805/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1574961152/*!*/;
SET @@session.auto_increment_increment=5, @@session.auto_increment_offset=4/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 187
#150320 21:10:39 server id 102 end_log_pos 172 Table_map: `db_2`.`table_2` mapped to number 305
# at 279
#150320 21:10:39 server id 102 end_log_pos 243 Table_map: `db_2`.`table_1` mapped to number 146
# at 350
#150320 21:10:39 server id 102 end_log_pos 317 Table_map: `db_2`.`table_3` mapped to number 272
# at 424
#150320 21:10:39 server id 102 end_log_pos 388 Table_map: `db_2`.`table_4` mapped to number 270
ERROR: Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 157, event_type: 31
WARNING: The range of printed events ends with a row event or a table map event that does not have the STMT_END_F flag set. This might be because the last statement was not fully written to the log, or because you are using a --stop-position or --stop-datetime that refers to an event in the middle of a statement. The event(s) from the partial statement have not been written to output.
DELIMITER ;
# End of log file

And Teemu asked for transaction from surviving node binlog, and here it is:

# at 3037980
#150320 16:10:39 server id 102 end_log_pos 3038064 CRC32 0x15d044e7 Query thread_id=3805 exec_time=0 error_code=0
SET TIMESTAMP=1426878639/*!*/;
BEGIN
/*!*/;
# at 3038064
# at 3038160
# at 3038235
# at 3038313
# at 3038388
#150320 16:10:39 server id 102 end_log_pos 3038160 CRC32 0x211b066d Table_map: `db_2`.`table_2` mapped to number 305
#150320 16:10:39 server id 102 end_log_pos 3038235 CRC32 0xdcf4126d Table_map: `db_2`.`table_1` mapped to number 146
#150320 16:10:39 server id 102 end_log_pos 3038313 CRC32 0xa6302dee Table_map: `db_2`.`table_3` mapped to number 272
#150320 16:10:39 server id 102 end_log_pos 3038388 CRC32 0xa9265597 Table_map: `db_2`.`table_4` mapped to number 270
#150320 16:10:39 server id 102 end_log_pos 3038549 CRC32 0x3905db2a Update_rows: table id 305 flags: STMT_END_F

BINLOG '
r3AMVRNmAAAAYAAAANBbLgAAADEBAAAAAAMACUJHUkNBbHBoYQARbWFuaWZlc3RfcmVxdWVzdHMA
EwMDAxISAxIPEg8SAQMPAwMDAwMLAAAAHgAAcBcAcBdQNgZtBhsh
r3AMVRNmAAAASwAAABtcLgAAAJIAAAAAAAMACUJHUkNBbHBoYQAKZXZlbnRfbG9ncwANAw8DAwMD
AwEBEhIDAwT9AgAAehRtEvTc
r3AMVRNmAAAATgAAAGlcLgAAABABAAAAAAMACUJHUkNBbHBoYQAMZXZlbnRfdmFsdWVzAAoDAwMP
Dw8SEgMDCP0C/QL9AgAAlALuLTCm
r3AMVRNmAAAASwAAALRcLgAAAA4BAAAAAAMACUJHUkNBbHBoYQAMZXJyb3JfbG9nZ2VyAAgDDw8P
EhIDAwj9Av0C/QIAAKCXVSap
r3AMVR9mAAAAoQAAAFVdLgAAADEBAAAAAAEAAgAT////////QDb6lgQAAJsEAAABAAAAmZWpMpOZ
lakyngEAAAAGQ1JFQVRFmZWpMpQAUVMAAAMAAAABAAAAAgAAAAA2+pYEAACbBAAAAQAAAJmVqTKT
mZWpMp4BAAAAmZWpMqcGQ1JFQVRFmZWpMpQAUVMAAAMAAAABAAAAAgAAACrbBTk=
'/*!*/;
### UPDATE `db_2`.`table_2`
### WHERE
### @1=1174
### @2=1179
### @3=1
### @4='2015-03-20 19:10:19'
### @5='2015-03-20 19:10:30'
### @6=1
### @7=NULL
### @8='CREATE'
### @9='2015-03-20 19:10:20'
### @10=NULL
### @11=NULL
### @12=0
### @13=NULL
### @14=NULL
### @15=21329
### @16=3
### @17=1
### @18=NULL
### @19=2
### SET
### @1=1174
### @2=1179
### @3=1
### @4='2015-03-20 19:10:19'
### @5='2015-03-20 19:10:30'
### @6=1
### @7='2015-03-20 19:10:39'
### @8='CREATE'
### @9='2015-03-20 19:10:20'
### @10=NULL
### @11=NULL
### @12=0
### @13=NULL
### @14=NULL
### @15=21329
### @16=3
### @17=1
### @18=NULL
### @19=2
# at 3038549
# at 3038623
#150320 16:10:39 server id 102 end_log_pos 3038623 CRC32 0x0a4e0b13 Table_map: `db_2`.`table_5` mapped to number 124
#150320 16:10:39 server id 102 end_log_pos 3038719 CRC32 0x44c43bb0 Update_rows: table id 124 flags: STMT_END_F

BINLOG '
r3AMVRNmAAAASgAAAJ9dLgAAAHwAAAAAAAEACUJHUkNBbHBoYQARbWFuaWZlc3Rfc3RhdHVzZXMA
BgMDDwMSEgQeAAAAABMLTgo=
r3AMVR9mAAAAYAAAAP9dLgAAAHwAAAAAAAEAAgAG///AGwcAAJYEAAAGQ1JFQVRFRgIAAJmVqTKU
mqj/fvvAGwcAAJYEAAAGQ1JFQVRFRgIAAJmVqTKUmZWpMqiwO8RE
'/*!*/;
### UPDATE `db_2`.`table_5`
### WHERE
### @1=1819
### @2=1174
### @3='CREATE'
### @4=582
### @5='2015-03-20 19:10:20'
### @6='2099-12-31 23:59:59'
### SET
### @1=1819
### @2=1174
### @3='CREATE'
### @4=582
### @5='2015-03-20 19:10:20'
### @6='2015-03-20 19:10:40'
# at 3038719
# at 3038793
#150320 16:10:39 server id 102 end_log_pos 3038793 CRC32 0x51a07ebd Table_map: `db_2`.`table_5` mapped to number 124
#150320 16:10:39 server id 102 end_log_pos 3038855 CRC32 0x7e9602b5 Write_rows: table id 124 flags: STMT_END_F

BINLOG '
r3AMVRNmAAAASgAAAEleLgAAAHwAAAAAAAEACUJHUkNBbHBoYQARbWFuaWZlc3Rfc3RhdHVzZXMA
BgMDDwMSEgQeAAAAAL1+oFE=
r3AMVR5mAAAAPgAAAIdeLgAAAHwAAAAAAAEAAgAG/8AlBwAAlgQAAANORVdGAgAAmZWpMqiaqP9+
+7UCln4=
'/*!*/;
### INSERT INTO `db_2`.`table_5`
### SET
### @1=1829
### @2=1174
### @3='NEW'
### @4=582
### @5='2015-03-20 19:10:40'
### @6='2099-12-31 23:59:59'
# at 3038855
# at 3038951
# at 3039026
# at 3039104
# at 3039179
# at 3039342
#150320 16:10:39 server id 102 end_log_pos 3039446 CRC32 0x8d0e6222 Query thread_id=3805 exec_time=0 error_code=0
SET TIMESTAMP=1426878639/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SAVEPOINT `save_point_name`
/*!*/;
# at 3039446
# at 3039540
#150320 16:10:39 server id 102 end_log_pos 3039540 CRC32 0x695e518a Write_rows: table id 146
#150320 16:10:39 server id 102 end_log_pos 3039613 CRC32 0xc4574a36 Write_rows: table id 272 flags: STMT_END_F

BINLOG '
r3AMVRNmAAAAYAAAAOdeLgAAADEBAAAAAAMACUJHUkNBbHBoYQARbWFuaWZlc3RfcmVxdWVzdHMA
EwMDAxISAxIPEg8SAQMPAwMDAwMLAAAAHgAAcBcAcBdQNgYSp5oZ
r3AMVRNmAAAASwAAADJfLgAAAJIAAAAAAAMACUJHUkNBbHBoYQAKZXZlbnRfbG9ncwANAw8DAwMD
AwEBEhIDAwT9AgAAehRlxMpq
r3AMVRNmAAAATgAAAIBfLgAAABABAAAAAAMACUJHUkNBbHBoYQAMZXZlbnRfdmFsdWVzAAoDAwMP
Dw8SEgMDCP0C/QL9AgAAlAKmZiGf
r3AMVRNmAAAASwAAAMtfLgAAAA4BAAAAAAMACUJHUkNBbHBoYQAMZXJyb3JfbG9nZ2VyAAgDDw8P
EhIDAwj9Av0C/QIAAKCb/IcL
r3AMVR9mAAAAowAAAG5gLgAAADEBAAAAAAAAAgAT////////ADb6lgQAAJsEAAABAAAAmZWpMpOZ
lakyngEAAACZlakypwZDUkVBVEWZlakylABRUwAAAwAAAAEAAAACAAAAADb6lgQAAJsEAAABAAAA
mZWpMpOZlakyngEAAACZlakypwNORVeZlakyqABRUwAAAwAAAAEAAAACAAAAd9WEUw==
### UPDATE `db_2`.`table_2`
### WHERE
### @1=1174
### @2=1179
### @3=1
### @4='2015-03-20 19:10:19'
### @5='2015-03-20 19:10:30'
### @6=1
### @7='2015-03-20 19:10:39'
### @8='CREATE'
### @9='2015-03-20 19:10:20'
### @10=NULL
### @11=NULL
### @12=0
### @13=NULL
### @14=NULL
### @15=21329
### @16=3
### @17=1
### @18=NULL
### @19=2
### SET
### @1=1174
### @2=1179
### @3=1
### @4='2015-03-20 19:10:19'
### @5='2015-03-20 19:10:30'
### @6=1
### @7='2015-03-20 19:10:39'
### @8='NEW'
### @9='2015-03-20 19:10:40'
### @10=NULL
### @11=NULL
### @12=0
### @13=NULL
### @14=NULL
### @15=21329
### @16=3
### @17=1
### @18=NULL
### @19=2
r3AMVR5mAAAAXgAAADRhLgAAAJIAAAAAAAAAAgAN//9A9B0AAAAXAG5ld19jbGllbnRfcmVxdWVz
dDoxMTc0AgAAAAMAAAABAAAAAQAAAAABmZWpMqcCAAAAilFeaQ==
### INSERT INTO `db_2`.`table_1`
### SET
### @1=29
### @2='new_client_request:1174'
### @3=2
### @4=3
### @5=1
### @6=1
### @7=NULL
### @8=0
### @9=1
### @10='2015-03-20 19:10:39'
### @11=NULL
### @12=2
### @13=NULL
r3AMVR5mAAAASQAAAH1hLgAAABABAAAAAAEAAgAK//+U/h0AAAAdAAAACgByZXF1ZXN0X2lkBAAx
MTc0mZWpMqcCAAAANkpXxA==
'/*!*/;
### INSERT INTO `db_2`.`table_3`
### SET
### @1=29
### @2=29
### @3=NULL
### @4='request_id'
### @5=NULL
### @6='1174'
### @7='2015-03-20 19:10:39'
### @8=NULL
### @9=2
### @10=NULL
# at 3039613
#150320 16:10:39 server id 102 end_log_pos 3039644 CRC32 0x4c709170 Xid = 6041375
COMMIT/*!*/;
# at 3039644

Teemu Ollakka: "the healthy transaction seems to report crc32 checksums but the parsed GRA file does not".

And here is the parsed GRA:

/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#140113 17:12:42 server id 3 end_log_pos 120 CRC32 0xe6f1f7fe Start: binlog v 4, server v 5.6.15-log created 140113 17:12:42 at startup
ROLLBACK/*!*/;
BINLOG '
qjrUUg8DAAAAdAAAAHgAAAAAAAQANS42LjE1LWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAACqOtRSEzgNAAgAEgAEBAQEEgAAXAAEGggAAAAICAgCAAAACgoKGRkAAf73
8eY=
'/*!*/;
# at 120
#140113 17:12:43 server id 3 end_log_pos 143 CRC32 0x26e85337 Stop
# at 143
#150320 16:10:39 server id 102 end_log_pos 80 CRC32 0x4e494745 Query thread_id=3805 exec_time=0 error_code=0
SET TIMESTAMP=1426878639/*!*/;
SET @@session.pseudo_thread_id=3805/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1574961152/*!*/;
SET @@session.auto_increment_increment=5, @@session.auto_increment_offset=4/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
B
/*!*/;
# at 223
# at 315
# at 386
# at 460
# at 531
#150320 16:10:39 server id 102 end_log_pos 172 CRC32 0x06365017 Table_map: `db_2`.`table_2` mapped to number 305
#150320 16:10:39 server id 102 end_log_pos 243 CRC32 0x147a0000 Table_map: `db_2`.`table_1` mapped to number 146
#150320 16:10:39 server id 102 end_log_pos 317 CRC32 0x02940000 Table_map: `db_2`.`table_3` mapped to number 272
#150320 16:10:39 server id 102 end_log_pos 388 CRC32 0xa0000002 Table_map: `db_2`.`table_4` mapped to number 270
#150320 16:10:39 server id 102 end_log_pos 545 CRC32 0x00000002 Update_rows: table id 305 flags: STMT_END_F

BINLOG '
r3AMVRNmAAAAXAAAAKwAAAAAADEBAAAAAAMACUJHUkNBbHBoYQARbWFuaWZlc3RfcmVxdWVzdHMA
EwMDAxISAxIPEg8SAQMPAwMDAwMLAAAAHgAAcBcAcBdQNgY=
r3AMVRNmAAAARwAAAPMAAAAAAJIAAAAAAAMACUJHUkNBbHBoYQAKZXZlbnRfbG9ncwANAw8DAwMD
AwEBEhIDAwT9AgAAehQ=
r3AMVRNmAAAASgAAAD0BAAAAABABAAAAAAMACUJHUkNBbHBoYQAMZXZlbnRfdmFsdWVzAAoDAwMP
Dw8SEgMDCP0C/QL9AgAAlAI=
r3AMVRNmAAAARwAAAIQBAAAAAA4BAAAAAAMACUJHUkNBbHBoYQAMZXJyb3JfbG9nZ2VyAAgDDw8P
EhIDAwj9Av0C/QIAAKA=
r3AMVR9mAAAAnQAAACECAAAAADEBAAAAAAEAAgAT////////QDb6lgQAAJsEAAABAAAAmZWpMpOZ
lakyngEAAAAGQ1JFQVRFmZWpMpQAUVMAAAMAAAABAAAAAgAAAAA2+pYEAACbBAAAAQAAAJmVqTKT
mZWpMp4BAAAAmZWpMqcGQ1JFQVRFmZWpMpQAUVMAAAMAAAABAAAAAgAAAA==
'/*!*/;
### UPDATE `db_2`.`table_2`
### WHERE
### @1=1174 /* INT meta=0 nullable=0 is_null=0 */
### @2=1179 /* INT meta=0 nullable=0 is_null=0 */
### @3=1 /* INT meta=0 nullable=0 is_null=0 */
### @4='2015-03-20 19:10:19' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @5='2015-03-20 19:10:30' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
### @6=1 /* INT meta=0 nullable=0 is_null=0 */
### @7=NULL /* INT meta=0 nullable=1 is_null=1 */
### @8='CREATE' /* VARSTRING(30) meta=30 nullable=0 is_null=0 */
### @9='2015-03-20 19:10:20' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @10=NULL /* DATETIME(0) meta=6000 nullable=1 is_null=1 */
### @11=NULL /* DATETIME(0) meta=0 nullable=1 is_null=1 */
### @12=0 /* TINYINT meta=0 nullable=0 is_null=0 */
### @13=NULL /* TINYINT meta=0 nullable=1 is_null=1 */
### @14=NULL /* TINYINT meta=6000 nullable=1 is_null=1 */
### @15=21329 /* INT meta=0 nullable=0 is_null=0 */
### @16=3 /* INT meta=0 nullable=0 is_null=0 */
### @17=1 /* INT meta=0 nullable=0 is_null=0 */
### @18=NULL /* INT meta=0 nullable=1 is_null=1 */
### @19=2 /* INT meta=0 nullable=1 is_null=0 */
### SET
### @1=1174 /* INT meta=0 nullable=0 is_null=0 */
### @2=1179 /* INT meta=0 nullable=0 is_null=0 */
### @3=1 /* INT meta=0 nullable=0 is_null=0 */
### @4='2015-03-20 19:10:19' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @5='2015-03-20 19:10:30' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
### @6=1 /* INT meta=0 nullable=0 is_null=0 */
### @7='2015-03-20 19:10:39' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
### @8='CREATE' /* VARSTRING(30) meta=30 nullable=0 is_null=0 */
### @9='2015-03-20 19:10:20' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @10=NULL /* DATETIME(0) meta=6000 nullable=1 is_null=1 */
### @11=NULL /* DATETIME(0) meta=0 nullable=1 is_null=1 */
### @12=0 /* TINYINT meta=0 nullable=0 is_null=0 */
### @13=NULL /* TINYINT meta=0 nullable=1 is_null=1 */
### @14=NULL /* TINYINT meta=6000 nullable=1 is_null=1 */
### @15=21329 /* INT meta=0 nullable=0 is_null=0 */
### @16=3 /* INT meta=0 nullable=0 is_null=0 */
### @17=1 /* INT meta=0 nullable=0 is_null=0 */
### @18=NULL /* INT meta=0 nullable=1 is_null=1 */
### @19=***Corrupted replication event was detected. Not printing the value***
# at 688
# at 758
#150320 16:10:39 server id 102 end_log_pos 615 CRC32 0x00000000 Table_map: `db_2`.`table_5` mapped to number 124
#150320 16:10:39 server id 102 end_log_pos 707 CRC32 0xa832a995 Update_rows: table id 124 flags: STMT_END_F

BINLOG '
r3AMVRNmAAAARgAAAGcCAAAAAHwAAAAAAAEACUJHUkNBbHBoYQARbWFuaWZlc3Rfc3RhdHVzZXMA
BgMDDwMSEgQeAAAAAA==
r3AMVR9mAAAAXAAAAMMCAAAAAHwAAAAAAAEAAgAG///AGwcAAJYEAAAGQ1JFQVRFRgIAAJmVqTKU
mqj/fvvAGwcAAJYEAAAGQ1JFQVRFRgIAAJmVqTKUmZWpMqg=
'/*!*/;
### UPDATE `db_2`.`table_5`
### WHERE
### @1=1819 /* INT meta=0 nullable=0 is_null=0 */
### @2=1174 /* INT meta=0 nullable=0 is_null=0 */
### @3='CREATE' /* VARSTRING(30) meta=30 nullable=0 is_null=0 */
### @4=582 /* INT meta=0 nullable=0 is_null=0 */
### @5='2015-03-20 19:10:20' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @6='2099-12-31 23:59:59' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### SET
### @1=1819 /* INT meta=0 nullable=0 is_null=0 */
### @2=1174 /* INT meta=0 nullable=0 is_null=0 */
### @3='CREATE' /* VARSTRING(30) meta=30 nullable=0 is_null=0 */
### @4=582 /* INT meta=0 nullable=0 is_null=0 */
### @5='2015-03-20 19:10:20' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @6=***Corrupted replication event was detected. Not printing the value***
# at 850
# at 920
#150320 16:10:39 server id 102 end_log_pos 777 CRC32 0x00000000 Table_map: `db_2`.`table_5` mapped to number 124
#150320 16:10:39 server id 102 end_log_pos 835 CRC32 0xfb7effa8 Write_rows: table id 124 flags: STMT_END_F

BINLOG '
r3AMVRNmAAAARgAAAAkDAAAAAHwAAAAAAAEACUJHUkNBbHBoYQARbWFuaWZlc3Rfc3RhdHVzZXMA
BgMDDwMSEgQeAAAAAA==
r3AMVR5mAAAAOgAAAEMDAAAAAHwAAAAAAAEAAgAG/8AlBwAAlgQAAANORVdGAgAAmZWpMqiaqP9+
+w==
'/*!*/;
### INSERT INTO `db_2`.`table_5`
### SET
### @1=1829 /* INT meta=0 nullable=0 is_null=0 */
### @2=1174 /* INT meta=0 nullable=0 is_null=0 */
### @3='NEW' /* VARSTRING(30) meta=30 nullable=0 is_null=0 */
### @4=582 /* INT meta=0 nullable=0 is_null=0 */
### @5='2015-03-20 19:10:40' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @6=***Corrupted replication event was detected. Not printing the value***
# at 978
# at 1070
# at 1141
# at 1215
# at 1286
# at 1445
#150320 16:10:39 server id 102 end_log_pos 1402 CRC32 0x60676e69 Query thread_id=3805 exec_time=0 error_code=0
SET TIMESTAMP=1426878639/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SAVEPOINT `event_logg
/*!*/;
# at 1545
# at 1635
#150320 16:10:39 server id 102 end_log_pos 1492 CRC32 0x00000002 Write_rows: table id 146
#150320 16:10:39 server id 102 end_log_pos 1561 CRC32 0x00000002 Write_rows: table id 272 flags: STMT_END_F

BINLOG '
r3AMVRNmAAAAXAAAAJ8DAAAAADEBAAAAAAMACUJHUkNBbHBoYQARbWFuaWZlc3RfcmVxdWVzdHMA
EwMDAxISAxIPEg8SAQMPAwMDAwMLAAAAHgAAcBcAcBdQNgY=
r3AMVRNmAAAARwAAAOYDAAAAAJIAAAAAAAMACUJHUkNBbHBoYQAKZXZlbnRfbG9ncwANAw8DAwMD
AwEBEhIDAwT9AgAAehQ=
r3AMVRNmAAAASgAAADAEAAAAABABAAAAAAMACUJHUkNBbHBoYQAMZXZlbnRfdmFsdWVzAAoDAwMP
Dw8SEgMDCP0C/QL9AgAAlAI=
r3AMVRNmAAAARwAAAHcEAAAAAA4BAAAAAAMACUJHUkNBbHBoYQAMZXJyb3JfbG9nZ2VyAAgDDw8P
EhIDAwj9Av0C/QIAAKA=
r3AMVR9mAAAAnwAAABYFAAAAADEBAAAAAAAAAgAT////////ADb6lgQAAJsEAAABAAAAmZWpMpOZ
lakyngEAAACZlakypwZDUkVBVEWZlakylABRUwAAAwAAAAEAAAACAAAAADb6lgQAAJsEAAABAAAA
mZWpMpOZlakyngEAAACZlakypwNORVeZlakyqABRUwAAAwAAAAEAAAACAAAA
### UPDATE `db_2`.`table_2`
### WHERE
### @1=1174 /* INT meta=0 nullable=0 is_null=0 */
### @2=1179 /* INT meta=0 nullable=0 is_null=0 */
### @3=1 /* INT meta=0 nullable=0 is_null=0 */
### @4='2015-03-20 19:10:19' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @5='2015-03-20 19:10:30' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
### @6=1 /* INT meta=0 nullable=0 is_null=0 */
### @7='2015-03-20 19:10:39' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
### @8='CREATE' /* VARSTRING(30) meta=30 nullable=0 is_null=0 */
### @9='2015-03-20 19:10:20' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @10=NULL /* DATETIME(0) meta=6000 nullable=1 is_null=1 */
### @11=NULL /* DATETIME(0) meta=0 nullable=1 is_null=1 */
### @12=0 /* TINYINT meta=0 nullable=0 is_null=0 */
### @13=NULL /* TINYINT meta=0 nullable=1 is_null=1 */
### @14=NULL /* TINYINT meta=6000 nullable=1 is_null=1 */
### @15=21329 /* INT meta=0 nullable=0 is_null=0 */
### @16=3 /* INT meta=0 nullable=0 is_null=0 */
### @17=1 /* INT meta=0 nullable=0 is_null=0 */
### @18=NULL /* INT meta=0 nullable=1 is_null=1 */
### @19=2 /* INT meta=0 nullable=1 is_null=0 */
### SET
### @1=1174 /* INT meta=0 nullable=0 is_null=0 */
### @2=1179 /* INT meta=0 nullable=0 is_null=0 */
### @3=1 /* INT meta=0 nullable=0 is_null=0 */
### @4='2015-03-20 19:10:19' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @5='2015-03-20 19:10:30' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
### @6=1 /* INT meta=0 nullable=0 is_null=0 */
### @7='2015-03-20 19:10:39' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
### @8='NEW' /* VARSTRING(30) meta=30 nullable=0 is_null=0 */
### @9='2015-03-20 19:10:40' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @10=NULL /* DATETIME(0) meta=6000 nullable=1 is_null=1 */
### @11=NULL /* DATETIME(0) meta=0 nullable=1 is_null=1 */
### @12=0 /* TINYINT meta=0 nullable=0 is_null=0 */
### @13=NULL /* TINYINT meta=0 nullable=1 is_null=1 */
### @14=NULL /* TINYINT meta=6000 nullable=1 is_null=1 */
### @15=21329 /* INT meta=0 nullable=0 is_null=0 */
### @16=3 /* INT meta=0 nullable=0 is_null=0 */
### @17=1 /* INT meta=0 nullable=0 is_null=0 */
### @18=NULL /* INT meta=0 nullable=1 is_null=1 */
### @19=***Corrupted replication event was detected. Not printing the value***
r3AMVR5mAAAAWgAAANQFAAAAAJIAAAAAAAAAAgAN//9A9B0AAAAXAG5ld19jbGllbnRfcmVxdWVz
dDoxMTc0AgAAAAMAAAABAAAAAQAAAAABmZWpMqcCAAAA
### INSERT INTO `db_2`.`table_1`
### SET
### @1=29 /* INT meta=0 nullable=0 is_null=0 */
### @2='new_client_request:1174' /* VARSTRING(765) meta=765 nullable=1 is_null=0 */
### @3=2 /* INT meta=0 nullable=0 is_null=0 */
### @4=3 /* INT meta=0 nullable=1 is_null=0 */
### @5=1 /* INT meta=0 nullable=1 is_null=0 */
### @6=1 /* INT meta=0 nullable=1 is_null=0 */
### @7=NULL /* INT meta=0 nullable=1 is_null=1 */
### @8=0 /* TINYINT meta=0 nullable=0 is_null=0 */
### @9=1 /* TINYINT meta=0 nullable=0 is_null=0 */
### @10='2015-03-20 19:10:39' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @11=NULL /* DATETIME(0) meta=0 nullable=1 is_null=1 */
### @12=***Corrupted replication event was detected. Not printing the value***
r3AMVR5mAAAARQAAABkGAAAAABABAAAAAAEAAgAK//+U/h0AAAAdAAAACgByZXF1ZXN0X2lkBAAx
MTc0mZWpMqcCAAAA
'/*!*/;
### INSERT INTO `db_2`.`table_3`
### SET
### @1=29 /* INT meta=0 nullable=0 is_null=0 */
### @2=29 /* INT meta=0 nullable=0 is_null=0 */
### @3=NULL /* INT meta=0 nullable=1 is_null=1 */
### @4='request_id' /* VARSTRING(765) meta=765 nullable=0 is_null=0 */
### @5=NULL /* VARSTRING(765) meta=765 nullable=1 is_null=1 */
### @6='1174' /* VARSTRING(765) meta=765 nullable=0 is_null=0 */
### @7='2015-03-20 19:10:39' /* DATETIME(0) meta=0 nullable=0 is_null=0 */
### @8=NULL /* DATETIME(0) meta=0 nullable=1 is_null=1 */
### @9=***Corrupted replication event was detected. Not printing the value***
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

So, the BINLOG strings of the corrupt Xid are all literally "cut" or corrupt
by 8 bytes:

In binlog:
BINLOG '
r3AMVRNmAAAASgAAAJ9dLgAAAHwAAAAAAAEACUJHUkNBbHBoYQARbWFuaWZlc3Rfc3RhdHVzZXMA
BgMDDwMSEgQeAAAAABMLTgo=
r3AMVR9mAAAAYAAAAP9dLgAAAHwAAAAAAAEAAgAG///AGwcAAJYEAAAGQ1JFQVRFRgIAAJmVqTKU
mqj/fvvAGwcAAJYEAAAGQ1JFQVRFRgIAAJmVqTKUmZWpMqiwO8RE
'/*!*/;

In GRA:
BINLOG '
r3AMVRNmAAAARgAAAGcCAAAAAHwAAAAAAAEACUJHUkNBbHBoYQARbWFuaWZlc3Rfc3RhdHVzZXMA
BgMDDwMSEgQeAAAAAA==
r3AMVR9mAAAAXAAAAMMCAAAAAHwAAAAAAAEAAgAG///AGwcAAJYEAAAGQ1JFQVRFRgIAAJmVqTKU
mqj/fvvAGwcAAJYEAAAGQ1JFQVRFRgIAAJmVqTKUmZWpMqg=
'/*!*/;

You can see the last few bytes "iwO8RE" were cut and instead we see "g="

Here's another example:

From binlog:
BINLOG '
r3AMVRNmAAAASgAAAEleLgAAAHwAAAAAAAEACUJHUkNBbHBoYQARbWFuaWZlc3Rfc3RhdHVzZXMA
BgMDDwMSEgQeAAAAAL1+oFE=
r3AMVR5mAAAAPgAAAIdeLgAAAHwAAAAAAAEAAgAG/8AlBwAAlgQAAANORVdGAgAAmZWpMqiaqP9+
+7UCln4=
'/*!*/;

From GRA:
BINLOG '
r3AMVRNmAAAARgAAAAkDAAAAAHwAAAAAAAEACUJHUkNBbHBoYQARbWFuaWZlc3Rfc3RhdHVzZXMA
BgMDDwMSEgQeAAAAAA==
r3AMVR5mAAAAOgAAAEMDAAAAAHwAAAAAAAEAAgAG/8AlBwAAlgQAAANORVdGAgAAmZWpMqiaqP9+
+w==

You can see "+7UCln4=" was transformed to "+w=="

And one more:

In binlog
r3AMVR5mAAAAXgAAADRhLgAAAJIAAAAAAAAAAgAN//9A9B0AAAAXAG5ld19jbGllbnRfcmVxdWVz
dDoxMTc0AgAAAAMAAAABAAAAAQAAAAABmZWpMqcCAAAAilFeaQ==

In GRA
r3AMVR5mAAAAWgAAANQFAAAAAJIAAAAAAAAAAgAN//9A9B0AAAAXAG5ld19jbGllbnRfcmVxdWVz
dDoxMTc0AgAAAAMAAAABAAAAAQAAAAABmZWpMqcCAAAA

Here the whole "ilFeaQ==" is missing in gra

> Teemu Ollakka: we might have asymmetric settings

I verified my.cnf from each node and only found small divergences in one of
the node's .cnf where character-sets-dir, sync_binlog and
innodb_flush_log_at_trx_commit where different than in the rest of the nodes.

And the my.cnf from one of the nodes:

[client]
port = 4301
socket = /customrootdir/bin/mysql/sockets/mysql.sock.4301
character-sets-dir=/customrootdir/bin/mysql/mysql5.5/share/charsets

[mysqld]
port = 4301
socket = /customrootdir/bin/mysql/sockets/mysql.sock.4301
datadir = /customrootdir/mysqldata1/4301/myisam
tmpdir = /customrootdir/mysqldata1/tmp
slave-load-tmpdir=/customrootdir/mysqldata1/tmp
log-bin=/customrootdir/mysqllogs1/mysql/4301/binlogs/4301_bin
relay-log=/customrootdir/mysqllogs1/mysql/4301/binlogs/4301_relay_bin
relay-log-index=/customrootdir/mysqllogs1/mysql/4301/binlogs/4301_relay_bin
max_binlog_size = 1G
binlog_format = ROW
expire_logs_days = 15
sync_binlog = 1
slow-query-log = 1
slow_query_log_file = slow_query.log
long_query_time = 2
log_warnings = 2
character-set-server=utf8
collation_server='utf8_general_ci'
skip-external-locking
back_log = 500
max_connections = 500
max_connect_errors = 500
table_open_cache = 4000
table_definition_cache=4000
open-files-limit = 24000
interactive_timeout = 3600
wait_timeout = 3600
net_read_timeout = 120
net_write_timeout = 240
max_allowed_packet = 2048M
binlog_cache_size = 8M
max_heap_table_size = 128M
tmp_table_size = 128M
read_buffer_size = 1M
read_rnd_buffer_size = 1M
sort_buffer_size = 1M
join_buffer_size = 1M
thread_cache_size = 250
large-pages
query_cache_type = 0
query_cache_size = 0
query_cache_limit = 2M
thread_pool_size=36
server-id = 202
auto_increment_offset=2
auto_increment_increment=2
key_buffer_size = 256M
bulk_insert_buffer_size = 32M
myisam_sort_buffer_size = 1M
myisam-recover = BACKUP,FORCE
default-storage-engine = InnoDB
innodb_file_per_table = 1
innodb_support_xa = 0
innodb_additional_mem_pool_size = 32M
innodb_buffer_pool_size = 20G
innodb_buffer_pool_instances = 4
innodb_data_home_dir = /customrootdir/mysqldata1/4301/innodb
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_files_in_group = 2
innodb_open_files = 16000
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size = 32M
innodb_log_file_size = 1536M
innodb_log_group_home_dir=/customrootdir/mysqldata1/4301/innodb
innodb_lock_wait_timeout = 50
innodb_io_capacity = 10000
innodb_thread_concurrency = 0
innodb_write_io_threads = 32
innodb_read_io_threads = 32
innodb_file_format = 'Barracuda'
innodb_purge_threads = 1
log-bin-trust-function-creators=1
metadata_locks_hash_instances=256
innodb_checksum_algorithm=crc32
innodb_monitor_enable=%
innodb_lru_scan_depth=2000
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_undo_directory=/customrootdir/mysqldata1/4301/innodb
innodb_undo_tablespaces=2
innodb_undo_logs=8
innodb_stats_on_metadata=0
innodb_sync_array_size=16
slave-parallel-workers=8
slave_pending_jobs_size_max=1G
wsrep_provider=/customrootdir/bin/mysql/pxc5.6/lib/libgalera_smm.so
wsrep_provider_options="gcache.size = 10G;gmcast.listen_addr=tcp://0.0.0.0:4401;ist.recv_addr=node_dc1_02:4501;gmcast.segment = 2;" wsrep_cluster_address=gcomm://node_dc1_02:4401,node_dc2_02:4401,node_dc1_03:4401,node_dc2_03:4401
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_cluster_name=mysqd2_4301
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=user:pwd
wsrep_node_address=node_dc1_02
wsrep_slave_threads=10

[mysqldump]
quick
max_allowed_packet = 64M

[mysql]
no-auto-rehash

[mysqld_safe]
core-file-size=2G

[mysqlhotcopy]
interactive-timeout

[sst]
transferfmt=nc

mysqlbinlog --verbose GRA_2_6041375.binlog |& grep sql_mode
SET @@session.sql_mode=1574961152/*!*/;
SET @@session.sql_mode=1073741824/*!*/;

Not sure if relevant, but they switch sql mode through the transaction...

Teemu Ollakka (teemu-ollakka) wrote :

It seems that the log events stored in wsrep write sets won't contain checksums since crc:s are computed only when writing IO cache to binlog (see MYSQL_BIN_LOG::do_write_cache()). This in turn means that:

* Parsing GRA files using binlog header and mysqlbinlog will display corrupted replication events if the binlog header has checksum setting on even if the events in GRA file are perfectly valid.
* The cause of slave failure is still unknown.

markus_albe (markus-albe) wrote :
Download full text (66.6 KiB)

Settings from the node acting as master:

mysql> show global variables\G
*************************** 1. row ***************************
Variable_name: auto_increment_increment
        Value: 2
*************************** 2. row ***************************
Variable_name: auto_increment_offset
        Value: 1
*************************** 3. row ***************************
Variable_name: autocommit
        Value: ON
*************************** 4. row ***************************
Variable_name: automatic_sp_privileges
        Value: ON
*************************** 5. row ***************************
Variable_name: back_log
        Value: 500
*************************** 6. row ***************************
Variable_name: basedir
        Value: /customrootdir/bin/mysql/pxc5.6
*************************** 7. row ***************************
Variable_name: big_tables
        Value: OFF
*************************** 8. row ***************************
Variable_name: bind_address
        Value: *
*************************** 9. row ***************************
Variable_name: binlog_cache_size
        Value: 8388608
*************************** 10. row ***************************
Variable_name: binlog_checksum
        Value: CRC32
*************************** 11. row ***************************
Variable_name: binlog_direct_non_transactional_updates
        Value: OFF
*************************** 12. row ***************************
Variable_name: binlog_error_action
        Value: IGNORE_ERROR
*************************** 13. row ***************************
Variable_name: binlog_format
        Value: ROW
*************************** 14. row ***************************
Variable_name: binlog_max_flush_queue_time
        Value: 0
*************************** 15. row ***************************
Variable_name: binlog_order_commits
        Value: ON
*************************** 16. row ***************************
Variable_name: binlog_row_image
        Value: FULL
*************************** 17. row ***************************
Variable_name: binlog_rows_query_log_events
        Value: OFF
*************************** 18. row ***************************
Variable_name: binlog_stmt_cache_size
        Value: 32768
*************************** 19. row ***************************
Variable_name: binlogging_impossible_mode
        Value: IGNORE_ERROR
*************************** 20. row ***************************
Variable_name: block_encryption_mode
        Value: aes-128-ecb
*************************** 21. row ***************************
Variable_name: bulk_insert_buffer_size
        Value: 33554432
*************************** 22. row ***************************
Variable_name: character_set_client
        Value: utf8
*************************** 23. row ***************************
Variable_name: character_set_connection
        Value: utf8
*************************** 24. row ***************************
Variable_name: character_set_database
        Value: utf8
*************************** 25. row ***************************
Variable_name: character_set_filesystem
        Value: binary
*************************** 26. row ***************************
Variable_name: character_set_results
      ...

Seppo Jaakola (seppo-jaakola) wrote :

Error logs in two nodes show some weird issues with nc command used in SST joiner role:

2015-03-20 14:52:39 38864 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '######' --auth '####' --datadir '/bb/mysqldata1/4301/myisam/' --defaults-file '/bb/bin/mysql/environment/5.6/mysql_4301.cnf' --parent '38864' '' '
WSREP_SST: [INFO] Streaming with xbstream (20150320 14:52:40.225)
WSREP_SST: [INFO] Using netcat as streamer (20150320 14:52:40.228)
usage: nc [-46DdhklnrStUuvzC] [-i interval] [-p source_port]
   [-s source_ip_address] [-T ToS] [-w timeout] [-X proxy_version]
   [-x proxy_address[:port]] [hostname] [port[s]]
 Command Summary:
  -4 Use IPv4
  -6 Use IPv6
  -D Enable the debug socket option
  -d Detach from stdin
  -h This help text
  -i secs Delay interval for lines sent, ports scanned
  -k Keep inbound sockets open for multiple connects
  -l Listen mode, for inbound connects
  -n Suppress name/port resolutions
  -p port Specify local port for remote connects
  -r Randomize remote ports
   -S Enable the TCP MD5 signature option
  -s addr Local source address
  -T ToS Set IP Type of Service
  -C Send CRLF as line-ending
  -t Answer TELNET negotiation
  -U Use UNIX domain socket
  -u UDP mode
  -v Verbose
  -w secs Timeout for connects and final net reads
  -X proto Proxy protocol: "4", "5" (SOCKS) or "connect"
  -x addr[:port] Specify proxy address and port
  -z Zero-I/O mode [used for scanning]
 Port numbers can be individual or ranges: lo-hi [inclusive]
WSREP_SST: [INFO] Evaluating timeout -s9 100 nc -dl 4444 | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20150320 14:52:40.256)
...

no longer affects: percona-xtradb-cluster
tags: added: i53148
tags: added: upstream
summary: - Corrupt replication events
+ Slave assertion in unpack_row with ROLLBACK TO SAVEPOINT in error
+ handler

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-888

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.