Bug #1308016 “Duplicate UK values in READ-COMMITTED (again)” : Bugs : Percona Server moved to https://jira.percona.com/projects/PS

Revision history for this message

Przemek (pmalkowski) wrote on 2014-04-15:

#1

The test case I used from upstream bug report:

drop table if exists t1;
create table t1(a tinyint not null, b tinyint not null, who int, rep_count int, trx_count int, trx_started int, primary key(b), unique key(a)) engine=innodb;

drop procedure if exists p1;
delimiter $
create procedure p1(me int)
begin
  declare continue handler for 1062 begin end;
  declare continue handler for 1213 begin end;
  set transaction isolation level repeatable read;
  set @trx_count:=0;
  set @trx_started:=0;
  set @replace_count:=0;
  set @cnt_a:=null,@cnt_b:=null,@a:=null;
  repeat
    if rand() > 0.5 then
      if @trx_started = 0 then
        set @trx_count:=(@trx_count + 1);
      end if;
      set @trx_started:=@replace_count;
      start transaction;
    end if;
    if rand() > 0.5 then
      set @replace_count:=(@replace_count + 1);
      replace into t1(a,b,who,rep_count,trx_count,trx_started)
         values(floor(3*rand()), floor(3*rand()),
         me, @replace_count, @trx_count, @trx_started);
    end if;
    select count(*) cnt_a,a into
      @cnt_a,@a from t1 group by a having cnt_a > 1 limit 1;
    select count(*) cnt_b,b into
      @cnt_b,@b from t1 group by b having cnt_b > 1 limit 1;
    if (@cnt_a is not null) || (@cnt_b is not null) then
     select * from t1;
    end if;
    set @cnt_a:=null,@cnt_b:=null,@a:=null;
    if rand() > 0.5 then
      set @trx_started:=0;
      commit;
    end if;
  until 1=2 end repeat;
end $
delimiter ;

session1 > call p1(77);
session2 > call p1(33);
session3 > call p1(55);
etc.

Przemek (pmalkowski) on 2014-04-15

tags:

added: i40914

Revision history for this message

fabe (fabe-e) wrote on 2014-04-28:

#2

The same thing happened to us today on production servers: two out of three nodes crashed at the same time, while the first node kept working!

installed versions:
Percona-Server-shared-51.x86_64 5.1.72-rel14.10.597.rhel6
Percona-XtraDB-Cluster-client.x86_64 1:5.5.34-23.7.6.565.rhel6
Percona-XtraDB-Cluster-galera.x86_64 2.8-1.162.rhel6
Percona-XtraDB-Cluster-server.x86_64 1:5.5.34-23.7.6.565.rhel6
Percona-XtraDB-Cluster-shared.x86_64 1:5.5.34-23.7.6.565.rhel6

140428 14:51:19 [ERROR] Slave SQL: Could not execute Write_rows event on table x.y; Duplicate entry '167176' for key 'subscriber_id_UNIQUE', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 311, Error_code: 1062
140428 14:51:19 [Warning] WSREP: RBR event 2 Write_rows apply warning: 121, 93581360
140428 14:51:19 [ERROR] WSREP: Failed to apply trx: source: d1d7af33-9e01-11e3-a5c8-0ef8a35689df version: 2 local: 0 state: APPLYING flags: 1 conn_id: 8418444 trx_id: 227697579 seqnos (l: 17217951, g: 93581360, s: 93581336, d: 93581359, ts: 1398685878256910366)
140428 14:51:19 [ERROR] WSREP: Failed to apply app buffer: seqno: 93581360, status: WSREP_FATAL
at galera/src/replicator_smm.cpp:apply_wscoll():52
at galera/src/replicator_smm.cpp:apply_trx_ws():118
140428 14:51:19 [ERROR] WSREP: Node consistency compromized, aborting...

Any info/help on this would be great.
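
When nodes abort like this, one quick thing to look at on the node that is still up is its own view of the cluster. A minimal sketch, assuming the usual wsrep status counters exposed by PXC; what counts as a healthy value depends on the cluster:

```sql
-- Run on the surviving node.
-- wsrep_cluster_size reports how many nodes this node currently sees;
-- the other three describe whether this node still considers itself usable.
SHOW GLOBAL STATUS
WHERE Variable_name IN ('wsrep_cluster_size',
                        'wsrep_cluster_status',
                        'wsrep_local_state_comment',
                        'wsrep_ready');
```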

Revision history for this message

Seppo Jaakola (seppo-jaakola) wrote on 2014-04-30:

#3

Note that this is related (if not a duplicate of): https://bugs.launchpad.net/codership-mysql/+bug/1299116

Revision history for this message

Nilnandan Joshi (nilnandan-joshi) wrote on 2014-07-18:

#4

Tried to reproduce with the Experimental Repo (PXC 5.6.19) under two different isolation levels (repeatable-read and read-committed), but was unable to get that error.

With repeatable-read it ran smoothly, while with read-committed it was slow, and I found the following in the InnoDB status output:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2014-07-18 13:00:47 7f9681936700
*** (1) TRANSACTION:
TRANSACTION 1067388, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
MySQL thread id 2, OS thread handle 0x7f96819b8700, query id 7453285 localhost root update
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6 page no 4 n bits 80 index `a` of table `test`.`t1` trx id 1067388 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 1067383, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 2
MySQL thread id 4, OS thread handle 0x7f9681936700, query id 7453297 localhost root update
replace into t1(a,b,who,rep_count,trx_count,trx_started)
         values(floor(3*rand()), floor(3*rand()),
         me, @replace_count, @trx_count, @trx_started)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 6 page no 4 n bits 80 index `a` of table `test`.`t1` trx id 1067383 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 1067383 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (1)
------------
TRANSACTIONS
------------
Trx id counter 1070024
Purge done for trx's n:o < 1070023 undo n:o < 0 state: running but idle
History list length 2200
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 7, OS thread handle 0x7f96818f5700, query id 7474337 localhost root init
show engine innodb status
---TRANSACTION 1070017, not started flushing log
MySQL thread id 4, OS thread handle 0x7f9681936700, query id 7474306 localhost root closing tables
---TRANSACTION 1070022, not started flushing log
mysql tables in use 1, locked 1
MySQL thread id 2, OS thread handle 0x7f96819b8700, query id 7474336 localhost root query end
----------------------------
END OF INNODB MONITOR OUTPUT
============================
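
SHOW ENGINE INNODB STATUS keeps only the latest detected deadlock. To see how often the REPLACE statements deadlock during a run, one option is to log every deadlock to the error log. This is a sketch, assuming MySQL 5.6 or later (where the innodb_print_all_deadlocks variable is available) and a user allowed to set global variables:

```sql
-- Write every InnoDB deadlock to the server error log, not only the latest one.
SET GLOBAL innodb_print_all_deadlocks = ON;

-- ... run the p1 sessions ...

-- Switch it off again once the test run is over.
SET GLOBAL innodb_print_all_deadlocks = OFF;
```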

Revision history for this message

Przemek (pmalkowski) wrote on 2014-07-18:

#5

I am able to reproduce on version: 5.6.19-67.0-56 Percona XtraDB Cluster (GPL), Release rel67.0, Revision 796, WSREP version 25.6, wsrep_25.6.r4096, Galera 3.6(r3a949e6), but so far only in READ-COMMITTED isolation level.

I am running 3 or 4 simultaneous sessions on the first PXC node, each calling the p1 procedure with a different value.
This is the error I get after a while on the rest of the nodes:

2014-07-18 10:45:08 8036 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 169, Error_code: 1062
2014-07-18 10:45:08 8036 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 219790
2014-07-18 10:45:08 8036 [Warning] WSREP: Failed to apply app buffer: seqno: 219790, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 2th time
2014-07-18 10:45:08 8036 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 169, Error_code: 1062
2014-07-18 10:45:08 8036 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 219790
2014-07-18 10:45:08 8036 [Warning] WSREP: Failed to apply app buffer: seqno: 219790, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 3th time
2014-07-18 10:45:08 8036 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 169, Error_code: 1062
2014-07-18 10:45:08 8036 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 219790
2014-07-18 10:45:08 8036 [Warning] WSREP: Failed to apply app buffer: seqno: 219790, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 4th time
2014-07-18 10:45:08 8036 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 169, Error_code: 1062
2014-07-18 10:45:08 8036 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 219790
2014-07-18 10:45:08 8036 [ERROR] WSREP: Failed to apply trx: source: f4bac56f-0e51-11e4-af0f-2e6e7974831c version: 3 local: 0 state: APPLYING flags: 1 conn_id: 26 trx_id: 1197054 seqnos (l: 222107, g: 219790, s: 219789, d: 219789, ts: 2668662170318334)
2014-07-18 10:45:08 8036 [ERROR] WSREP: Failed to apply trx 219790 4 times
2014-07-18 10:45:08 8036 [ERROR] WSREP: Node consistency compromized, aborting...

For some reason, I cannot reproduce the same on PXC 5.5.34 and Galera 2.8.

Raghavendra D Prabhu (raghavendra-prabhu) on 2014-07-18

summary:

- Duplicate UK values can be replicated under concurrent wokload
+ Duplicate UK values can be replicated under concurrent wokload with
+ READ-COMMITTED

Peter Schwaller (peter-schwaller) on 2014-07-23

summary:

- Duplicate UK values can be replicated under concurrent wokload with
+ Duplicate UK values can be replicated under concurrent workload with
READ-COMMITTED

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2014-07-23: Re: Duplicate UK values can be replicated under concurrent workload with READ-COMMITTED

#6

@Przemek

Can you test with release PXC 5.6.19? (since the one you tested with r4096 seems
to be a bit old).

Revision history for this message

Przemek (pmalkowski) wrote on 2014-07-23:

#7

Sure, upgraded all 3 nodes to:
percona30 mysql> select @@version,@@version_comment;
+----------------+---------------------------------------------------------------------------------------------------+
| @@version      | @@version_comment                                                                                 |
+----------------+---------------------------------------------------------------------------------------------------+
| 5.6.19-67.0-56 | Percona XtraDB Cluster (GPL), Release rel67.0, Revision 824, WSREP version 25.6, wsrep_25.6.r4111 |
+----------------+---------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

And did the same test - called the procedure in 4 different sessions on node1. After a while, nodes 2 and 3 shut down with:

2014-07-23 17:23:02 4187 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 219, Error_code: 1062
2014-07-23 17:23:02 4187 [Warning] WSREP: RBR event 4 Write_rows apply warning: 121, 371009
2014-07-23 17:23:02 4187 [Warning] WSREP: Failed to apply app buffer: seqno: 371009, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 2th time
2014-07-23 17:23:02 4187 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 219, Error_code: 1062
2014-07-23 17:23:02 4187 [Warning] WSREP: RBR event 4 Write_rows apply warning: 121, 371009
2014-07-23 17:23:02 4187 [Warning] WSREP: Failed to apply app buffer: seqno: 371009, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 3th time
2014-07-23 17:23:02 4187 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 219, Error_code: 1062
2014-07-23 17:23:02 4187 [Warning] WSREP: RBR event 4 Write_rows apply warning: 121, 371009
2014-07-23 17:23:02 4187 [Warning] WSREP: Failed to apply app buffer: seqno: 371009, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 4th time
2014-07-23 17:23:02 4187 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 219, Error_code: 1062
2014-07-23 17:23:02 4187 [Warning] WSREP: RBR event 4 Write_rows apply warning: 121, 371009
2014-07-23 17:23:02 4187 [ERROR] WSREP: Failed to apply trx: source: 54df79b2-127b-11e4-b7f0-c3402efb9419 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 13 trx_id: 1988541 seqnos (l: 73248, g: 371009, s: 371007, d: 371008, ts: 3124540080922232)
2014-07-23 17:23:02 4187 [ERROR] WSREP: Failed to apply trx 371009 4 times
2014-07-23 17:23:02 4187 [ERROR] WSREP: Node consistency compromized, aborting...

Revision history for this message

Przemek (pmalkowski) wrote on 2014-07-28:

#8

I cannot reproduce the same HA_ERR_FOUND_DUPP_KEY error on PXC 5.5.37 using the above stored procedure method.

server55n1 mysql> select @@version,@@version_comment;
+----------------+-----------------------------------------------------------------------------------------------------+
| @@version      | @@version_comment                                                                                   |
+----------------+-----------------------------------------------------------------------------------------------------+
| 5.5.37-35.0-55 | Percona XtraDB Cluster (GPL), Release rel35.0, Revision 756, WSREP version 25.10, wsrep_25.10.r3985 |
+----------------+-----------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2014-10-10:

#9

@Przemek,

Would it be possible to test with PXC 5.6.21 from the experimental repository? There are upstream fixes which may have fixed it.

Revision history for this message

Przemek (pmalkowski) wrote on 2014-11-10:

#10

I was able to reproduce on:
percona1 mysql> select @@version,@@version_comment\G
*************************** 1. row ***************************
@@version: 5.6.21-69.0-56
@@version_comment: Percona XtraDB Cluster (GPL), Release rel69.0, Revision 910, WSREP version 25.8, wsrep_25.8.r4126
1 row in set (0.00 sec)

by running 3 parallel sessions on writer node. Replicated nodes failed with:

2014-11-10 12:37:34 19254 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '0' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 169, Error_code: 1062
2014-11-10 12:37:34 19254 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 299546
2014-11-10 12:37:34 19254 [ERROR] WSREP: Failed to apply trx: source: 8b37e730-68cb-11e4-8de0-333e4b88f012 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 7 trx_id: 1138584 seqnos (l: 259350, g: 299546, s: 299544, d: 299545, ts: 7858353781585)
2014-11-10 12:37:34 19254 [ERROR] WSREP: Failed to apply trx 299546 4 times
2014-11-10 12:37:34 19254 [ERROR] WSREP: Node consistency compromized, aborting...

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2014-11-10:

#11

@Przemek,

Upstream bug http://bugs.mysql.com/bug.php?id=69979 has been marked 'not a bug'; however, a related bug, http://bugs.mysql.com/bug.php?id=73170, has been fixed in 5.6.21.

Now, if it is possible to repeat the SP testcase with Oracle MySQL and/or PS, can you report it upstream?

Revision history for this message

Nilnandan Joshi (nilnandan-joshi) wrote on 2014-11-17:

#12

Hi Raghu,

I tried to reproduce this with the SP testcase on PS 5.6.21 and MySQL 5.6.21 community, using 3 parallel sessions, but I was not able to reproduce it.

On PS 5.6.21,

On MySQL 5.6.21,

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2014-11-18:

#13

This bug is still alive, at least for Percona Server.

@Nil,

This bug is not present in single node version of PXC either.

I believe the upstream fix fixed it only for non-replication cases.

With PS master-master replication and writes to one node with 3 parallel
sessions of that stored procedure, I can replicate this issue.

2014-11-18 06:11:52 22338 [ERROR] Slave SQL: Could not execute Write_rows event on table sp2.t1; Duplicate entry '0' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log archie-bin.000052, end_log_pos 63335202, Error_code: 1062
2014-11-18 06:11:52 22338 [Warning] Slave: Duplicate entry '0' for key 'a' Error_code: 1062
2014-11-18 06:11:52 22338 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'archie-bin.000052' position 63334974
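
On the replica, the same failure can be read back from the replication status. A sketch, assuming the mysql command-line client (the \G terminator is a client feature, not part of the statement):

```sql
-- Last_SQL_Errno and Last_SQL_Error carry the error that stopped the SQL thread
-- (1062 in the log above); Relay_Master_Log_File and Exec_Master_Log_Pos show
-- how far the SQL thread had executed.
SHOW SLAVE STATUS\G
```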

@Valerii, @Nil,

I suggest further testing with PS and MySQL; please file an upstream bug report
if/when required.

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2014-11-18:

#14

Make sure to run the sp for about 8-10 minutes as stated in the upstream bug report.

Revision history for this message

Laurynas Biveinis (laurynas-biveinis) wrote on 2014-11-19:

#15

Raghu, if it's not 69979 you are seeing in the server (by your suggestion for further testing and possibly upstream bug reporting), then the bug should be New for server.

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2014-12-10:

#16

@Laurynas,

I had marked it as confirmed since I was able to reproduce this on 5.6 PS.

In any case, I had discussed with Valerii over further testing and upstream
reporting.

@Valerii, any updates on this?

Raghavendra D Prabhu (raghavendra-prabhu) on 2014-12-11

summary:

Duplicate UK values can be replicated under concurrent workload with
- READ-COMMITTED
+ READ-COMMITTED with multi-node configuration

summary:

Duplicate UK values can be replicated under concurrent workload with
- READ-COMMITTED with multi-node configuration
+ READ-COMMITTED

Revision history for this message

Valerii Kravchuk (valerii-kravchuk) wrote on 2014-12-12:

#17

Raghu,

As Nil tried to explain to you in a separate email thread, so far he has not been able to reproduce this with PS or upstream MySQL 5.6.21. I am not able to check it properly myself at the moment, as I am busy with things unrelated to bug processing this week.

Revision history for this message

Alexey Kopytov (akopytov) wrote on 2015-05-04:

#18

Setting http://bugs.mysql.com/bug.php?id=76927 as the upstream bug, because the currently linked bug #69979 was discussing phantom duplicate values in REPEATABLE READ and was closed as Not a Bug.

Bug #76927 is about real duplicate values with the READ COMMITTED isolation level and/or innodb_locks_unsafe_for_binlog=1.
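
To see whether a given server runs with either of these settings, something like the following can be used. It is a sketch, assuming the MySQL 5.6 variable names (tx_isolation and innodb_locks_unsafe_for_binlog); the column aliases are only illustrative:

```sql
-- Variable names as used by MySQL 5.6 (the versions discussed here);
-- later versions renamed or removed them.
SELECT @@global.tx_isolation                   AS global_isolation,
       @@session.tx_isolation                  AS session_isolation,
       @@global.innodb_locks_unsafe_for_binlog AS locks_unsafe_for_binlog;
```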

Revision history for this message

Alexey Kopytov (akopytov) wrote on 2015-05-04:

#19

I could reproduce bug #76927 with MySQL 5.6.11, 5.6.24, PS 5.6.23 and PXC 5.6.22. For MySQL and PS, UK index corruptions are silent. For PXC, the test case also breaks other cluster nodes as described in this bug report.

Revision history for this message

Alexey Kopytov (akopytov) wrote on 2015-05-04:

#20

Copying steps to reproduce from the upstream bug:

Start the server with --transaction-isolation="read-committed" and/or
--innodb-locks-unsafe-for-binlog=1

Create the following table and stored procedure:

---

drop table if exists t1;
create table t1(a int not null, b int not null, who int, primary key(b), unique key(a)) engine=innodb;

drop procedure if exists p1;
delimiter $
create procedure p1(me int)
l1:
  begin
  declare continue handler for 1062 begin end;
  declare continue handler for 1213 begin end;
  set @cnt_a:=null;
  repeat
    select count(*) cnt_a into
         @cnt_a from t1 group by a having cnt_a > 1 limit 1;
    if @cnt_a is not null then
       select * from t1;
       leave l1;
    end if;
    replace into t1(a,b,who) values(floor(3*rand()), floor(3*rand()), me);
  until 1=2 end repeat;
end$
delimiter ;

---

Then execute in 3 different sessions:

session1> call p1(1); check table t1;
session2> call p1(2); check table t1;
session3> call p1(3); check table t1;
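
Since the procedure leaves its loop as soon as it sees a duplicate, the state of t1 can then be inspected from any session. A sketch of one such check; the FORCE INDEX hint is an assumption of this example, meant to steer the read through the primary key instead of the possibly inconsistent unique index on a:

```sql
-- Any row returned is a value of `a` that appears more than once,
-- which the unique key on `a` should make impossible.
SELECT a, COUNT(*) AS cnt
FROM t1 FORCE INDEX (PRIMARY)
GROUP BY a
HAVING cnt > 1;
```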

summary:

- Duplicate UK values can be replicated under concurrent workload with
- READ-COMMITTED
+ Duplicate UK values in READ-COMMITTED (again)

Revision history for this message

Alexey Kopytov (akopytov) wrote on 2015-05-26:

#21

https://github.com/percona/percona-server/pull/83

Revision history for this message

Alexey Kopytov (akopytov) wrote on 2015-05-26:

#22

https://github.com/percona/percona-xtradb-cluster/pull/21

Revision history for this message

Shahriyar Rzayev (rzayev-sehriyar) wrote on 2018-01-18:

#23

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1102

Revision history for this message

Shahriyar Rzayev (rzayev-sehriyar) wrote on 2018-01-25:

#24

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1494

	Status	Importance	Assigned to	Milestone
MySQL Server	Unknown	Unknown	mysql-bugs #76927
MySQL patches by Codership	New	Undecided	Unassigned
Percona Server moved to https://jira.percona.com/projects/PS	Fix Released	Medium	Alexey Kopytov	Percona Server moved to https://jira.percona.com/projects/PS 5.6.25-73.0
5.1	Won't Fix	Undecided	Unassigned
5.5	Triaged	Medium	Unassigned
5.6	Fix Released	Medium	Alexey Kopytov	Percona Server moved to https://jira.percona.com/projects/PS 5.6.25-73.0
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC	Status tracked in 5.6
5.5	Triaged	Medium	Unassigned
5.6	Fix Released	Medium	Alexey Kopytov	Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC 5.6.24-25.11
