Galera Replication from 5.6 node to 5.5 node fails

Bug #1251137 reported by Raghavendra D Prabhu
This bug affects 1 person

Affects                                  Status          Importance   Assigned to     Milestone
MySQL patches by Codership               In Progress     Medium       Seppo Jaakola
Percona XtraDB Cluster                   (moved to https://jira.percona.com/projects/PXC; status tracked in 5.6)
  5.5                                    Invalid         High         Unassigned
  5.6                                    Fix Committed   Undecided    Unassigned

Bug Description

Galera replication stream from a 5.6 host to a 5.5 node fails

5.5 node:
==============================================

131114 12:28:57 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 36, event_type: 30
131114 12:28:57 [ERROR] WSREP: applier could not read binlog event, seqno: 213, len: 0
131114 12:28:57 [Warning] WSREP: Failed to apply app buffer: seqno: 213, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 2th time
131114 12:28:57 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 36, event_type: 30
131114 12:28:57 [ERROR] WSREP: applier could not read binlog event, seqno: 213, len: 0
131114 12:28:57 [Warning] WSREP: Failed to apply app buffer: seqno: 213, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 3th time
131114 12:28:57 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 36, event_type: 30
131114 12:28:57 [ERROR] WSREP: applier could not read binlog event, seqno: 213, len: 0
131114 12:28:57 [Warning] WSREP: Failed to apply app buffer: seqno: 213, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 4th time
131114 12:28:57 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 36, event_type: 30
131114 12:28:57 [ERROR] WSREP: applier could not read binlog event, seqno: 213, len: 0
131114 12:28:57 [Warning] WSREP: Failed to apply app buffer: seqno: 213, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 5th time
131114 12:28:57 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 36, event_type: 30
131114 12:28:57 [ERROR] WSREP: applier could not read binlog event, seqno: 213, len: 0
131114 12:28:57 [Warning] WSREP: Failed to apply app buffer: seqno: 213, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 6th time
131114 12:28:57 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 36, event_type: 30
131114 12:28:57 [ERROR] WSREP: applier could not read binlog event, seqno: 213, len: 0
131114 12:28:57 [Warning] WSREP: Failed to apply app buffer: seqno: 213, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 7th time
131114 12:28:57 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 36, event_type: 30
131114 12:28:57 [ERROR] WSREP: applier could not read binlog event, seqno: 213, len: 0
131114 12:28:57 [Warning] WSREP: Failed to apply app buffer: seqno: 213, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 8th time
131114 12:28:57 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 36, event_type: 30
131114 12:28:57 [ERROR] WSREP: applier could not read binlog event, seqno: 213, len: 0
131114 12:28:57 [Warning] WSREP: Failed to apply app buffer: seqno: 213, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 9th time
131114 12:28:57 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 36, event_type: 30
131114 12:28:57 [ERROR] WSREP: applier could not read binlog event, seqno: 213, len: 0
131114 12:28:57 [Warning] WSREP: Failed to apply app buffer: seqno: 213, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 10th time
131114 12:28:57 [ERROR] Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 36, event_type: 30
131114 12:28:57 [ERROR] WSREP: applier could not read binlog event, seqno: 213, len: 0
131114 12:28:57 [ERROR] WSREP: Failed to apply trx: source: e87c1c41-4cf9-11e3-bea1-52ac20638b33 version: 2 local: 0 state: APPLYING flags: 129 conn_id: 4 trx_id: 16908 seqnos (l: 4, g: 213, s: 212, d: 212, ts: 1384412336898091623)
131114 12:28:57 [ERROR] WSREP: Failed to apply trx 213 10 times
131114 12:28:57 [ERROR] WSREP: Node consistency compromized, aborting...
131114 12:28:57 [Note] WSREP: Closing send monitor...
131114 12:28:57 [Note] WSREP: Closed send monitor.
131114 12:28:57 [Note] WSREP: gcomm: terminating thread
131114 12:28:57 [Note] WSREP: gcomm: joining thread
131114 12:28:57 [Note] WSREP: gcomm: closing backend
131114 12:28:58 [Note] WSREP: view(view_id(NON_PRIM,e87c1c41-4cf9-11e3-bea1-52ac20638b33,2) memb {
        faef1ff9-4cf9-11e3-934c-b651d62adf92,
} joined {
} left {
} partitioned {
        e87c1c41-4cf9-11e3-bea1-52ac20638b33,
}

5.6 node:
=========================================================================

2013-11-14 12:28:57 7110 [Note] WSREP: New cluster view: global state: 11264cec-06e6-11e2-0800-61616b1fc754:213, view# 3: Primary, number of nodes: 1, my index: 0, protocol version 2
2013-11-14 12:28:57 7110 [Warning] WSREP: Unsupported protocol downgrade: incremental data collection disabled. Expect abort.
2013-11-14 12:28:57 7110 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2013-11-14 12:28:57 7110 [Note] WSREP: REPL Protocols: 5 (3, 1)
2013-11-14 12:28:57 7110 [Note] WSREP: Assign initial position for certification: 213, protocol version: 3
2013-11-14 12:28:57 7110 [Note] WSREP: Service thread queue flushed.
2013-11-14 12:28:57 7110 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 12:29:03 7110 [Note] WSREP: cleaning up faef1ff9-4cf9-11e3-934c-b651d62adf92 (tcp://10.0.2.154:4567)

==================================================================================================

The 5.5 node has Galera 25.2.8; the 5.6 node has Galera 25.3.1.

Note that this happens even with socket.checksum=1 set on the 5.6 host.
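
A quick way to confirm the provider option is in effect (a minimal check, not from the original report, assuming the standard wsrep_provider_options variable exposed by the wsrep patches):

-- Run on the 5.6 node; the expanded provider options string should
-- contain "socket.checksum = 1" once the Galera library has been loaded.
SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options';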

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

5.6 node is on:

rpm -qa | grep -i percona
Percona-XtraDB-Cluster-client-56-5.6.14-rel25.1.1.rhel6.x86_64
Percona-XtraDB-Cluster-galera-56-3.1-1.169.rhel6.x86_64
Percona-XtraDB-Cluster-shared-56-5.6.14-rel25.1.1.rhel6.x86_64
Percona-Server-shared-51-5.1.72-rel14.10.597.rhel6.x86_64
percona-xtrabackup-2.1.5-680.rhel6.x86_64
Percona-XtraDB-Cluster-server-56-5.6.14-rel25.1.1.rhel6.x86_64
percona-testing-0.0-1.noarch

(Server version: 5.6.14-56 Percona XtraDB Cluster (GPL), Release 25.1, Revision 557, wsrep_25.1.r4019)

5.5 node is on:

rpm -qa | grep -i percona
percona-xtrabackup-test-2.0.2-461.rhel6.x86_64
Percona-XtraDB-Cluster-client-5.5.34-25.9.575.rhel6.x86_64
Percona-Server-shared-compat-5.5.34-rel32.0.591.rhel6.x86_64
percona-toolkit-2.1.3-2.noarch
Percona-XtraDB-Cluster-galera-2.8-1.165.rhel6.x86_64
Percona-XtraDB-Cluster-shared-5.5.34-25.9.575.rhel6.x86_64
Percona-XtraDB-Cluster-server-5.5.34-25.9.575.rhel6.x86_64
Percona-XtraDB-Cluster-debuginfo-5.5.34-25.9.575.rhel6.x86_64
Percona-XtraDB-Cluster-galera-debuginfo-2.8-1.162.rhel6.x86_64
percona-testing-0.0-1.noarch
percona-xtrabackup-2.1.5-680.rhel6.x86_64
Percona-XtraDB-Cluster-test-5.5.34-25.9.575.rhel6.x86_64
Percona-XtraDB-Cluster-devel-5.5.34-25.9.575.rhel6.x86_64

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

On Galera 3.1 (the 5.6 node):

2013-11-14 12:26:38 7110 [Note] WSREP: wsrep_load(): Galera 3.1(r169) by Codership Oy <email address hidden> loaded successfully.

On Galera 2.8 (the 5.5 node):

131114 11:05:56 [Note] WSREP: wsrep_load(): Galera 2.8(r165) by Codership Oy <email address hidden> loaded successfully.

summary: - Galera Replication from 5.5 to 5.6 fails
+ Galera Replication between 5.6 and 5.5 fails
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote : Re: Galera Replication between 5.6 and 5.5 fails

5.6 config:
==========================================================================
[mysqld]
datadir=/var/lib/mysql

#log_slave_updates

server-id=341
#log_bin = /var/lib/mysql/mysql-bin.log

binlog_format = ROW
innodb_buffer_pool_size = 100M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 512M
innodb_file_per_table = 1
wsrep-node-address=10.0.2.153

wsrep_cluster_address='gcomm://Pxc1,Pxc2'
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_provider_options = "socket.checksum = 1"

wsrep_slave_threads=2
wsrep_cluster_name=PXC
wsrep_sst_method=xtrabackup-v2
wsrep_node_name=Pxc1

innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2

[client]
user=root
password=test

5.5 config:
============================================================
[mysqld]
datadir=/var/lib/mysql

server-id=248

binlog_format = ROW
thread_stack = 256K
thread_cache_size = 512
tmp_table_size = 32M
max_heap_table_size = 32M
max_connections = 10000
open-files-limit = 65535
table_open_cache = 8192
table_definition_cache = 8192
key_buffer_size = 64M
innodb_buffer_pool_size = 500M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 512M
innodb_file_per_table = 1
wsrep-node-address=10.0.2.154

loose-query_response_time_stats

wsrep_cluster_address='gcomm://Pxc1,Pxc2'
wsrep_provider=/usr/lib64/libgalera_smm.so

wsrep_slave_threads=2
wsrep_cluster_name=PXC
wsrep_sst_method=xtrabackup-v2
wsrep_node_name=Pxc2

innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2

[client]
user=root
password=test

========================================================================================

As you can see, I had binlogging enabled earlier on the nodes but disabled it; it fails even then.

Changed in percona-xtradb-cluster:
importance: Undecided → High
Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

Quick manual testing suggests that replication from a MySQL 5.5 node to a MySQL 5.6 node works, but trying to replicate from 5.6 -> 5.5 causes an immediate crash.

So it looks like migrating to a 5.6 cluster would be possible by allowing writes to the 5.5 nodes only, until all nodes have been upgraded to the 5.6 level.

Changed in codership-mysql:
assignee: nobody → Seppo Jaakola (seppo-jaakola)
Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

MySQL 5.6 -> 5.5 replication breaks if the 5.6 node uses GTIDs, binlog checksums, or the new ROW event formats. These can be prevented by configuring the 5.6 node with:

log_bin_use_v1_row_events=1
gtid_mode=0
binlog_checksum=NONE

With this configuration, at least basic 5.6 -> 5.5 replication seems to work. But more testing is needed...
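
A minimal sanity check (not part of the original comment) to verify the three settings actually took effect on the 5.6 node before re-testing:

-- gtid_mode=0 is displayed as OFF; binlog_checksum should report NONE,
-- and log_bin_use_v1_row_events should report ON.
SHOW GLOBAL VARIABLES WHERE Variable_name IN
  ('log_bin_use_v1_row_events', 'gtid_mode', 'binlog_checksum');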

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

While the configuration worked for simple statements like these:

=================
mysql> use sbtest;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> create table xyz (x int(11) auto_increment primary key);
Query OK, 0 rows affected (0.09 sec)

mysql> insert into xyz values (NULL);
Query OK, 1 row affected (0.01 sec)

mysql> insert into xyz values (NULL);
Query OK, 1 row affected (0.01 sec)

mysql> insert into xyz values (NULL);
Query OK, 1 row affected (0.01 sec)

mysql> insert into xyz values (NULL);
Query OK, 1 row affected (0.00 sec)

mysql> insert into xyz values (NULL);
Query OK, 1 row affected (0.02 sec)
========================================================

However, a sysbench workload did not go well:

sysbench --test=./oltp.lua --db-driver=mysql --mysql-engine-trx=yes --mysql-table-engine=innodb --mysql-user=root --mysql-password=test --oltp-table-size=30000 --num-threads=16 --init-rng=on --max-requests=0 --oltp-auto-inc=off --max-time=30000 --max-requests=300000 run
sysbench 0.5: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16
Random number generator seed is 0 and will be ignored

Threads started!

ALERT: failed to execute MySQL query: `INSERT INTO sbtest1 (id, k, c, pad) VALUES (14851, 14946, '24726585045-91236881311-60534147758-46321582953-08975433339-12295930364-91635364131-39593067613-88729288733-07642591607', '15735895994-49834668127-10360676632-98841189449-62687644138')`:
ALERT: Error 1062 Duplicate entry '14851' for key 'PRIMARY'
FATAL: failed to execute function `event': (null)
^C

5.6 node
==============
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21:20:32 6370 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-14 21...


Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

#3 is for 5.6 to 5.5 replication.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

a) s/#3/#6/ in previous comment.

b) Tested 5.5 -> 5.6: works even with multiple sysbench threads.

c) 5.6 -> 5.5, on the other hand, fails even with a single sysbench thread (so multi-threading is not the issue here):

^[[A131114 21:57:46 [ERROR] Slave SQL: Could not execute Update_rows event on table sbtest.sbtest1; Column 'k' cannot be null, Error_code: 1048; Duplicate entry '15073' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 174, Error_code: 1062
131114 21:57:46 [Warning] WSREP: RBR event 3 Update_rows apply warning: 121, 9970
131114 21:57:46 [Warning] WSREP: Failed to apply app buffer: seqno: 9970, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 2th time
131114 21:57:46 [ERROR] Slave SQL: Could not execute Update_rows event on table sbtest.sbtest1; Column 'k' cannot be null, Error_code: 1048; Duplicate entry '15073' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 174, Error_code: 1062
131114 21:57:46 [Warning] WSREP: RBR event 3 Update_rows apply warning: 121, 9970
131114 21:57:46 [Warning] WSREP: Failed to apply app buffer: seqno: 9970, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 3th time
131114 21:57:46 [ERROR] Slave SQL: Could not execute Update_rows event on table sbtest.sbtest1; Column 'k' cannot be null, Error_code: 1048; Duplicate entry '15073' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 174, Error_code: 1062
131114 21:57:46 [Warning] WSREP: RBR event 3 Update_rows apply warning: 121, 9970
131114 21:57:46 [Warning] WSREP: Failed to apply app buffer: seqno: 9970, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 4th time
131114 21:57:46 [ERROR] Slave SQL: Could not execute Update_rows event on table sbtest.sbtest1; Column 'k' cannot be null, Error_code: 1048; Duplicate entry '15073' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 174, Error_code: 1062
131114 21:57:46 [Warning] WSREP: RBR event 3 Update_rows apply warning: 121, 9970
131114 21:57:46 [Warning] WSREP: Failed to apply app buffer: seqno: 9970, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 5th time
131114 21:57:46 [ERROR] Slave SQL: Could not execute Update_rows event on table sbtest.sbtest1; Column 'k' cannot be null, Error_code: 1048; Duplicate entry '15073' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 174, Error_code: 1062
131114 21:57:46 [Warning] WSREP: RBR event 3 Update_rows apply warning: 121, 9970
131114 21:57:46 [Warning] WSREP: Failed to apply app buffer: seqno: 9970, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 6th time
131114 21:57:46 [ERROR] Slave SQL: Could not execute Update_rows event on table sbtest.sbtest1; Column 'k' cannot be null, Error_code: 1048; Duplicate entry '15073' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the ...


Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

sysbench --test=./oltp.lua --db-driver=mysql --mysql-engine-trx=yes --mysql-table-engine=innodb --mysql-user=root --mysql-password=test --oltp-table-size=30000 --num-threads=1 --init-rng=on --max-requests=0 --oltp-auto-inc=off --max-time=30000 --max-requests=30 run

was the command used for the 5.6 -> 5.5 replication test in #8.

Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

Replication in the 5.5 -> 5.6 direction can also crash if parallel applying is enabled; the following error appears (on the 5.6 node):

2013-11-15 12:29:34 30316 [ERROR] WSREP: Trx 27782 tries to abort slave trx 27783. This could be caused by:
        1) unsupported configuration options combination, please check documentation.
        2) a bug in the code.
        3) a database corruption.

The affected table has both a primary key and a unique key. Dependency calculation evidently goes wrong on the 5.6 node (the same load between 5.5 nodes runs fine).
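
As a rough illustration (the actual schema used by the load is not shown in this report), the table shape described above would look something like the hypothetical sketch below: both a PRIMARY KEY and a secondary UNIQUE KEY, so each write set carries two key values for certification and dependency tracking.

-- Hypothetical table shape only; the names are made up for illustration.
CREATE TABLE test.pa_repro (
  id       INT NOT NULL PRIMARY KEY,
  uniq_col INT NOT NULL,
  payload  VARCHAR(64),
  UNIQUE KEY uk_uniq (uniq_col)
) ENGINE=InnoDB;

With wsrep_slave_threads > 1, updates to such a table are applied in parallel, which is where the dependency calculation described above appears to go wrong.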

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Even with PA (parallel applying) off, I get this on the 5.5 node:

131120 21:49:33 [ERROR] Slave SQL: Could not execute Update_rows event on table sbtest.sbtest1; Column 'k' cannot be null, Error_code: 1048; Duplicate entry '14991' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 174, Error_code: 1062
131120 21:49:33 [Warning] WSREP: RBR event 3 Update_rows apply warning: 121, 3051
131120 21:49:33 [Warning] WSREP: Failed to apply app buffer: seqno: 3051, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 2th time
131120 21:49:33 [ERROR] Slave SQL: Could not execute Update_rows event on table sbtest.sbtest1; Column 'k' cannot be null, Error_code: 1048; Duplicate entry '14991' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 174, Error_code: 1062
131120 21:49:33 [Warning] WSREP: RBR event 3 Update_rows apply warning: 121, 3051
131120 21:49:33 [Warning] WSREP: Failed to apply app buffer: seqno: 3051, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 3th time
131120 21:49:33 [ERROR] Slave SQL: Could not execute Update_rows event on table sbtest.sbtest1; Column 'k' cannot be null, Error_code: 1048; Duplicate entry '14991' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 174, Error_code: 1062
131120 21:49:33 [Warning] WSREP: RBR event 3 Update_rows apply warning: 121, 3051
131120 21:49:33 [Warning] WSREP: Failed to apply app buffer: seqno: 3051, status: 1
         at galera/src/replicator_smm.cpp:apply_wscoll():57
Retrying 4th time

=================================================================================

5.6 node cnf:

[mysqld]
datadir=/var/lib/mysql
binlog_format = ROW
innodb_buffer_pool_size = 100M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 512M
innodb_file_per_table = 1

wsrep_cluster_address='gcomm://Pxc1,Pxc2'
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_provider_options = "socket.checksum=1"

wsrep_slave_threads=1
wsrep_cluster_name=PXC
wsrep_sst_method=xtrabackup-v2
wsrep_node_name=Pxc1

log_bin_use_v1_row_events=1
gtid_mode=0
binlog_checksum=NONE

innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2

[client]
user=root
password=test

5.5 node cnf:
[mysqld]
datadir=/var/lib/mysql
binlog_format = ROW
innodb_buffer_pool_size = 100M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 512M
innodb_file_per_table = 1

wsrep_cluster_address='gcomm://Pxc1,Pxc2'
wsrep_provider=/usr/lib64/libgalera_smm.so

wsrep_slave_threads=1
wsrep_cluster_name=PXC
wsrep_sst_method=xtrabackup-v2
wsrep_node_name=Pxc2

innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2

[client]
user=root
password=test

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

To add to #11:

On the 5.6 node, even with the compatibility config options and PA off, I see

2013-11-21 10:08:29 3140 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3
2013-11-21 10:08:29 3140 [Warning] WSREP: trx protocol version: 2 does not match certification protocol version: 3

So there may be some protocol-level violation at play here.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

For #12,

from galera::Certification::do_test in certification.cpp:

    if (trx->version() != version_)
    {
        log_warn << "trx protocol version: "
                 << trx->version()
                 << " does not match certification protocol version: "
                 << version_;
        return TEST_FAILED;
    }

Does this test need to be relaxed/modified for cross-version
replication?

From

"2013-11-21 10:08:29 3140 [Warning] WSREP: trx protocol version:
2 does not match certification protocol version: 3"

it looks like trx protocol 2 (of the 5.5 node) is not compatible with certification protocol 3 (of Galera 3 on the PXC 5.6 node). Is this right?
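
One way to confirm what each node has negotiated (a minimal check using standard Galera status variables, not taken from this report):

-- Run on both the 5.5 and the 5.6 node. wsrep_protocol_version is the
-- replication/certification protocol negotiated for the cluster;
-- wsrep_provider_version is the Galera library build (2.8 vs 3.1 here).
SHOW GLOBAL STATUS LIKE 'wsrep_protocol_version';
SHOW GLOBAL STATUS LIKE 'wsrep_provider_version';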

summary: - Galera Replication between 5.6 and 5.5 fails
+ Galera Replication from 5.6 node to 5.5 node fails
Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

Changed the title to:
"Galera Replication from 5.6 node to 5.5 node fails"

This bug is used to track issues with replication from a 5.6 node to a 5.5 node. There is a separate bug to track issues with replication in the opposite direction: https://bugs.launchpad.net/codership-mysql/+bug/1267494

Note that 5.6 -> 5.5 replication is not critical for migration to a 5.6 cluster. The migration can work by using one 5.5 master while upgrading all slaves to the 5.6 level, and for this process only 5.5 -> 5.6 replication is needed.

5.6 -> 5.5 replication will be needed only if the migration has to happen in multi-master mode, or if there is a need to maintain a hybrid 5.5-5.6 cluster for the long term.
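
For a migration that relies only on the 5.5 -> 5.6 direction, one possible way to keep writes on the remaining 5.5 master (a sketch, not something suggested in this report) is to make the upgraded 5.6 nodes read-only for ordinary clients:

-- On each upgraded 5.6 node: client writes are rejected, while the wsrep
-- applier threads should still be able to apply replicated write sets.
SET GLOBAL read_only = 1;

-- Once every node is on 5.6, re-enable writes:
-- SET GLOBAL read_only = 0;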

Changed in codership-mysql:
status: New → In Progress
importance: Undecided → Medium
Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

Tested 5.6 -> 5.5 replication (Galera 3.1), with the required compatibility configuration:

log_bin_use_v1_row_events=1
gtid_mode=0
binlog_checksum=NONE

I can see that the 5.5 node crashes during slave applying with:

140109 15:42:46 [ERROR] Slave SQL: Column 5 of table 'test.comm00' cannot be converted from type '<unknown type>' to type 'timestamp', Error_code: 1677

This happens with a sqlgen load, which updates a table with a timestamp column.

I tried the same load using native MySQL replication from a 5.6 node to a 5.5 node, and the same error happens there as well, so we probably have a MySQL bug to deal with. However, this may not be worth fixing if migration is the only target here.
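
A hedged minimal reproduction of the scenario (the real test.comm00 definition is not shown in the report): any table created on the 5.6 node with a TIMESTAMP column should do, because 5.6 stores such columns in its new temporal format (row-event column type MYSQL_TYPE_TIMESTAMP2), which a 5.5 applier cannot convert.

-- Hypothetical table, names made up; created and updated on the 5.6 node.
CREATE TABLE test.comm00_repro (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB;

INSERT INTO test.comm00_repro (ts) VALUES (NOW());

-- The Update_rows event for this statement is what the 5.5 applier
-- rejects with error 1677 ('<unknown type>' to 'timestamp').
UPDATE test.comm00_repro SET ts = NOW() WHERE id = 1;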

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :
Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

Yes, http://bugs.mysql.com/bug.php?id=70085 is the only remaining issue preventing replication in the 5.6 -> 5.5 direction.
Fixing it is not seen as a priority, as there is a working migration path to a 5.6-based cluster regardless of this bug.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PXC-983
