Duplicate UK values in READ-COMMITTED (again)

Bug #1308016 reported by Przemek
This bug affects 3 people
| Affects | Status | Importance | Assigned to | Milestone |
| MySQL Server | Unknown | Unknown | | |
| MySQL patches by Codership | New | Undecided | Unassigned | |
| Percona Server (moved to https://jira.percona.com/projects/PS) | Fix Released | Medium | Alexey Kopytov | |
|   5.1 | Won't Fix | Undecided | Unassigned | |
|   5.5 | Triaged | Medium | Unassigned | |
|   5.6 | Fix Released | Medium | Alexey Kopytov | |
| Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC) | Status tracked in 5.6 | | | |
|   5.5 | Triaged | Medium | Unassigned | |
|   5.6 | Fix Released | Medium | Alexey Kopytov | |

Bug Description

On a table with both a primary key (PK) and a unique key (UK) defined, it is possible to crash nodes with consistency errors or to lock the whole cluster for writes.
This is a result of InnoDB behaviour as reported in upstream MySQL bug: bugs.mysql.com/bug.php?id=69979

The good news is that it is harder to break a Galera cluster under these conditions than normal asynchronous replication. For example, I am not able to break a PXC cluster with just two concurrent sessions running the p1() procedure from Kevin Lewis' test case. But with 3 or more sessions running on the same node, PXC crashes because consistency is compromised. This happens faster in the read-committed isolation level, but it also happens with repeatable-read. The resulting error looks like this:

2014-04-15 13:19:53 9570 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 219, Error_code: 1062
2014-04-15 13:19:53 9570 [Warning] WSREP: RBR event 4 Write_rows apply warning: 121, 649369
2014-04-15 13:19:53 9570 [ERROR] WSREP: Failed to apply trx: source: fe30ab01-c48a-11e3-95e6-933a2f300241 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 5 trx_id: 2442782 seqnos (l: 302939, g: 649369, s: 649368, d: 649368, ts: 6527240877214333)
2014-04-15 13:19:53 9570 [ERROR] WSREP: Failed to apply trx 649369 4 times
2014-04-15 13:19:53 9570 [ERROR] WSREP: Node consistency compromized, aborting...
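
When triaging logs like the excerpt above, it can help to pull out the table, key, and duplicate value from each failed apply. The following is a minimal sketch in Python, assuming the error-log line format quoted in this report; the function name `parse_dup_key_errors` and the field names are illustrative and not part of any Percona or Galera tooling.

```python
import re

# Pattern modelled on the "Slave SQL" lines quoted in this report.
# Assumption: other server versions may word the message differently.
DUP_KEY_RE = re.compile(
    r"Could not execute (?P<event>\w+) event on table (?P<table>[\w.]+); "
    r"Duplicate entry '(?P<value>[^']*)' for key '(?P<key>[^']*)'"
)

def parse_dup_key_errors(lines):
    """Return one dict per duplicate-key apply failure found in `lines`."""
    hits = []
    for line in lines:
        match = DUP_KEY_RE.search(line)
        if match:
            hits.append(match.groupdict())
    return hits

sample = [
    "2014-04-15 13:19:53 9570 [ERROR] Slave SQL: Could not execute Write_rows "
    "event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; "
    "handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, "
    "end_log_pos 219, Error_code: 1062",
    "2014-04-15 13:19:53 9570 [ERROR] WSREP: Node consistency compromized, aborting...",
]

for hit in parse_dup_key_errors(sample):
    print(hit["table"], hit["key"], hit["value"])
```

Only the first sample line matches; the WSREP abort line carries no table or key information, so it is skipped.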

When I call the procedure from two different nodes in repeatable-read, the cluster gets locked for writes, and the only way to recover is to kill -9 mysqld on one of the nodes. Example of the hung state:

percona1 mysql> show processlist;
+----+-------------+----------------+------+---------+------+---------------------------+-------------------------------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+----+-------------+----------------+------+---------+------+---------------------------+-------------------------------------------+-----------+---------------+
| 1 | system user | | NULL | Sleep | 3905 | wsrep aborter idle | NULL | 0 | 0 |
| 2 | system user | | NULL | Sleep | 508 | System lock | NULL | 0 | 0 |
| 3 | system user | | NULL | Sleep | 508 | committed 356488 | NULL | 0 | 0 |
| 4 | cmon | percona5:48670 | NULL | Sleep | 79 | | NULL | 1 | 1 |
| 8 | root | localhost | test | Killed | 266 | wsrep in pre-commit stage | insert into t1 values (22,22,22,22,22,22) | 0 | 0 |
| 9 | root | localhost | test | Killed | 508 | wsrep in pre-commit stage | start transaction | 0 | 16 |
| 15 | root | localhost | test | Query | 0 | init | show processlist | 0 | 0 |
+----+-------------+----------------+------+---------+------+---------------------------+-------------------------------------------+-----------+---------------+
7 rows in set (0.00 sec)

Unfortunately, the upstream bug is marked as "not a bug", but maybe there is a way to fix this in Galera replication?

Tags: innodb i40914
Revision history for this message
Przemek (pmalkowski) wrote :

The test case I used, taken from the upstream bug report:

drop table if exists t1;
create table t1(a tinyint not null, b tinyint not null, who int, rep_count int, trx_count int, trx_started int, primary key(b), unique key(a)) engine=innodb;

drop procedure if exists p1;
delimiter $
create procedure p1(me int)
begin
  declare continue handler for 1062 begin end;
  declare continue handler for 1213 begin end;
  set transaction isolation level repeatable read;
  set @trx_count:=0;
  set @trx_started:=0;
  set @replace_count:=0;
  set @cnt_a:=null,@cnt_b:=null,@a:=null;
  repeat
    if rand() > 0.5 then
      if @trx_started = 0 then
        set @trx_count:=(@trx_count + 1);
      end if;
      set @trx_started:=@replace_count;
      start transaction;
    end if;
    if rand() > 0.5 then
      set @replace_count:=(@replace_count + 1);
      replace into t1(a,b,who,rep_count,trx_count,trx_started)
         values(floor(3*rand()), floor(3*rand()),
         me, @replace_count, @trx_count, @trx_started);
    end if;
    select count(*) cnt_a,a into
      @cnt_a,@a from t1 group by a having cnt_a > 1 limit 1;
    select count(*) cnt_b,b into
      @cnt_b,@b from t1 group by b having cnt_b > 1 limit 1;
    if (@cnt_a is not null) || (@cnt_b is not null) then
     select * from t1;
    end if;
    set @cnt_a:=null,@cnt_b:=null,@a:=null;
    if rand() > 0.5 then
      set @trx_started:=0;
      commit;
    end if;
  until 1=2 end repeat;
end $
delimiter ;

session1 > call p1(77);
session2 > call p1(33);
session3 > call p1(55);
etc.

Przemek (pmalkowski)
tags: added: i40914
Revision history for this message
fabe (fabe-e) wrote :

The same thing happened to us today on production servers: two out of three nodes crashed at the same time, while the first node kept working!

installed versions:
Percona-Server-shared-51.x86_64 5.1.72-rel14.10.597.rhel6
Percona-XtraDB-Cluster-client.x86_64 1:5.5.34-23.7.6.565.rhel6
Percona-XtraDB-Cluster-galera.x86_64 2.8-1.162.rhel6
Percona-XtraDB-Cluster-server.x86_64 1:5.5.34-23.7.6.565.rhel6
Percona-XtraDB-Cluster-shared.x86_64 1:5.5.34-23.7.6.565.rhel6

140428 14:51:19 [ERROR] Slave SQL: Could not execute Write_rows event on table x.y; Duplicate entry '167176' for key 'subscriber_id_UNIQUE', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 311, Error_code: 1062
140428 14:51:19 [Warning] WSREP: RBR event 2 Write_rows apply warning: 121, 93581360
140428 14:51:19 [ERROR] WSREP: Failed to apply trx: source: d1d7af33-9e01-11e3-a5c8-0ef8a35689df version: 2 local: 0 state: APPLYING flags: 1 conn_id: 8418444 trx_id: 227697579 seqnos (l: 17217951, g: 93581360, s: 93581336, d: 93581359, ts: 1398685878256910366)
140428 14:51:19 [ERROR] WSREP: Failed to apply app buffer: seqno: 93581360, status: WSREP_FATAL
         at galera/src/replicator_smm.cpp:apply_wscoll():52
         at galera/src/replicator_smm.cpp:apply_trx_ws():118
140428 14:51:19 [ERROR] WSREP: Node consistency compromized, aborting...

Any info/help on this would be great.

Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

Note that this is related (if not a duplicate of): https://bugs.launchpad.net/codership-mysql/+bug/1299116

Revision history for this message
Nilnandan Joshi (nilnandan-joshi) wrote :

Tried to reproduce with the Experimental Repo (PXC 5.6.19) and two different isolation levels (repeatable-read, read-committed),
but was unable to get that error.

With repeatable-read it ran smoothly, while with read-committed it was slow, and I found the following in the InnoDB status.

------------------------
LATEST DETECTED DEADLOCK
------------------------
2014-07-18 13:00:47 7f9681936700
*** (1) TRANSACTION:
TRANSACTION 1067388, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
MySQL thread id 2, OS thread handle 0x7f96819b8700, query id 7453285 localhost root update
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6 page no 4 n bits 80 index `a` of table `test`.`t1` trx id 1067388 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 1067383, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 2
MySQL thread id 4, OS thread handle 0x7f9681936700, query id 7453297 localhost root update
replace into t1(a,b,who,rep_count,trx_count,trx_started)
         values(floor(3*rand()), floor(3*rand()),
         me, @replace_count, @trx_count, @trx_started)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 6 page no 4 n bits 80 index `a` of table `test`.`t1` trx id 1067383 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 1067383 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (1)
------------
TRANSACTIONS
------------
Trx id counter 1070024
Purge done for trx's n:o < 1070023 undo n:o < 0 state: running but idle
History list length 2200
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 7, OS thread handle 0x7f96818f5700, query id 7474337 localhost root init
show engine innodb status
---TRANSACTION 1070017, not started flushing log
MySQL thread id 4, OS thread handle 0x7f9681936700, query id 7474306 localhost root closing tables
---TRANSACTION 1070022, not started flushing log
mysql tables in use 1, locked 1
MySQL thread id 2, OS thread handle 0x7f96819b8700, query id 7474336 localhost root query end
----------------------------
END OF INNODB MONITOR OUTPUT
============================

Revision history for this message
Przemek (pmalkowski) wrote :

I am able to reproduce on version: 5.6.19-67.0-56 Percona XtraDB Cluster (GPL), Release rel67.0, Revision 796, WSREP version 25.6, wsrep_25.6.r4096, Galera 3.6(r3a949e6), but so far only in READ-COMMITTED isolation level.

I am running 3 or 4 simultaneous sessions on the first PXC node, each calling the p1 procedure with a different value.
This is the error I get after a while on the rest of the nodes:

2014-07-18 10:45:08 8036 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 169, Error_code: 1062
2014-07-18 10:45:08 8036 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 219790
2014-07-18 10:45:08 8036 [Warning] WSREP: Failed to apply app buffer: seqno: 219790, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 2th time
2014-07-18 10:45:08 8036 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 169, Error_code: 1062
2014-07-18 10:45:08 8036 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 219790
2014-07-18 10:45:08 8036 [Warning] WSREP: Failed to apply app buffer: seqno: 219790, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 3th time
2014-07-18 10:45:08 8036 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 169, Error_code: 1062
2014-07-18 10:45:08 8036 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 219790
2014-07-18 10:45:08 8036 [Warning] WSREP: Failed to apply app buffer: seqno: 219790, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 4th time
2014-07-18 10:45:08 8036 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 169, Error_code: 1062
2014-07-18 10:45:08 8036 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 219790
2014-07-18 10:45:08 8036 [ERROR] WSREP: Failed to apply trx: source: f4bac56f-0e51-11e4-af0f-2e6e7974831c version: 3 local: 0 state: APPLYING flags: 1 conn_id: 26 trx_id: 1197054 seqnos (l: 222107, g: 219790, s: 219789, d: 219789, ts: 2668662170318334)
2014-07-18 10:45:08 8036 [ERROR] WSREP: Failed to apply trx 219790 4 times
2014-07-18 10:45:08 8036 [ERROR] WSREP: Node consistency compromized, aborting...

For some reason, I cannot reproduce the same on PXC 5.5.34 and Galera 2.8.
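
The excerpt above shows the same write set being retried until the node gives up. A small sketch for counting those attempts per write set follows. It is Python, written against the log format quoted in this thread; reading the two numbers after "apply warning:" as handler error code and global seqno is an inference from the matching "Failed to apply trx 219790 4 times" line, and `count_apply_failures` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed format, copied from the excerpts in this report:
#   "... WSREP: RBR event 3 Write_rows apply warning: 121, 219790"
# Interpreting the trailing numbers as (handler error, global seqno) is an
# assumption based on the neighbouring log lines, not on documentation.
APPLY_WARNING_RE = re.compile(
    r"WSREP: RBR event \d+ \w+ apply warning: (\d+), (\d+)"
)

def count_apply_failures(lines):
    """Count apply warnings per (seqno, handler_error) pair."""
    counts = Counter()
    for line in lines:
        match = APPLY_WARNING_RE.search(line)
        if match:
            handler_error, seqno = match.groups()
            counts[(int(seqno), int(handler_error))] += 1
    return counts

# Four identical warnings, as in the 2014-07-18 excerpt.
log = [
    "2014-07-18 10:45:08 8036 [Warning] WSREP: RBR event 3 Write_rows "
    "apply warning: 121, 219790"
] * 4

for (seqno, handler_error), attempts in count_apply_failures(log).items():
    print(seqno, handler_error, attempts)
```

A write set whose count reaches the retry limit seen in the logs (4 here) is the one that took the node down.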

summary: - Duplicate UK values can be replicated under concurrent wokload
+ Duplicate UK values can be replicated under concurrent wokload with
+ READ-COMMITTED
summary: - Duplicate UK values can be replicated under concurrent wokload with
+ Duplicate UK values can be replicated under concurrent workload with
READ-COMMITTED
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote : Re: Duplicate UK values can be replicated under concurrent workload with READ-COMMITTED

@Przemek

Can you test with release PXC 5.6.19? (since the one you tested with r4096 seems
to be a bit old).

Revision history for this message
Przemek (pmalkowski) wrote :

Sure, upgraded all 3 nodes to:
percona30 mysql> select @@version,@@version_comment;
+----------------+---------------------------------------------------------------------------------------------------+
| @@version | @@version_comment |
+----------------+---------------------------------------------------------------------------------------------------+
| 5.6.19-67.0-56 | Percona XtraDB Cluster (GPL), Release rel67.0, Revision 824, WSREP version 25.6, wsrep_25.6.r4111 |
+----------------+---------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

And did the same test: called the procedure in 4 different sessions on node1, and after a while nodes 2 and 3 shut down with:

2014-07-23 17:23:02 4187 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 219, Error_code: 1062
2014-07-23 17:23:02 4187 [Warning] WSREP: RBR event 4 Write_rows apply warning: 121, 371009
2014-07-23 17:23:02 4187 [Warning] WSREP: Failed to apply app buffer: seqno: 371009, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 2th time
2014-07-23 17:23:02 4187 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 219, Error_code: 1062
2014-07-23 17:23:02 4187 [Warning] WSREP: RBR event 4 Write_rows apply warning: 121, 371009
2014-07-23 17:23:02 4187 [Warning] WSREP: Failed to apply app buffer: seqno: 371009, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 3th time
2014-07-23 17:23:02 4187 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 219, Error_code: 1062
2014-07-23 17:23:02 4187 [Warning] WSREP: RBR event 4 Write_rows apply warning: 121, 371009
2014-07-23 17:23:02 4187 [Warning] WSREP: Failed to apply app buffer: seqno: 371009, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 4th time
2014-07-23 17:23:02 4187 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 219, Error_code: 1062
2014-07-23 17:23:02 4187 [Warning] WSREP: RBR event 4 Write_rows apply warning: 121, 371009
2014-07-23 17:23:02 4187 [ERROR] WSREP: Failed to apply trx: source: 54df79b2-127b-11e4-b7f0-c3402efb9419 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 13 trx_id: 1988541 seqnos (l: 73248, g: 371009, s: 371007, d: 371008, ts: 3124540080922232)
2014-07-23 17:23:02 4187 [ERROR] WSREP: Failed to apply trx 371009 4 times
2014-07-23 17:23:02 4187 [ERROR] WSREP: Node consistency compromized, aborting...
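
For comparing failures across nodes, the `seqnos (...)` tuple on the "Failed to apply trx" line is the most useful part. Below is a minimal Python sketch that extracts it, assuming the exact layout shown in this thread. The keys are kept as the log prints them (l, g, s, d, ts); this sketch does not assert what each one means, and `parse_seqnos` is an illustrative name.

```python
import re

# Layout assumed from the "Failed to apply trx" lines in this report.
SEQNOS_RE = re.compile(
    r"seqnos \(l: (-?\d+), g: (-?\d+), s: (-?\d+), d: (-?\d+), ts: (-?\d+)\)"
)

def parse_seqnos(line):
    """Return the seqnos tuple of a log line as a dict, or None if absent."""
    match = SEQNOS_RE.search(line)
    if match is None:
        return None
    return dict(zip(("l", "g", "s", "d", "ts"), map(int, match.groups())))

line = (
    "2014-07-23 17:23:02 4187 [ERROR] WSREP: Failed to apply trx: source: "
    "54df79b2-127b-11e4-b7f0-c3402efb9419 version: 3 local: 0 state: APPLYING "
    "flags: 1 conn_id: 13 trx_id: 1988541 seqnos (l: 73248, g: 371009, "
    "s: 371007, d: 371008, ts: 3124540080922232)"
)

print(parse_seqnos(line))
```

The `g` value is the number that reappears in the "Failed to apply trx 371009 4 times" line, so it can be used to join the two messages.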

Revision history for this message
Przemek (pmalkowski) wrote :

I cannot reproduce the same HA_ERR_FOUND_DUPP_KEY error on PXC 5.5.37 using the above stored procedure method.

server55n1 mysql> select @@version,@@version_comment;
+----------------+-----------------------------------------------------------------------------------------------------+
| @@version | @@version_comment |
+----------------+-----------------------------------------------------------------------------------------------------+
| 5.5.37-35.0-55 | Percona XtraDB Cluster (GPL), Release rel35.0, Revision 756, WSREP version 25.10, wsrep_25.10.r3985 |
+----------------+-----------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

server55n1 mysql> show status like 'wsrep_provider_version';
+------------------------+------------+
| Variable_name | Value |
+------------------------+------------+
| wsrep_provider_version | 2.10(r175) |
+------------------------+------------+
1 row in set (0.00 sec)

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

@Przemek,

Would it be possible to test with PXC 5.6.21 from the experimental repository? There are upstream fixes which may have fixed it.

Revision history for this message
Przemek (pmalkowski) wrote :

I was able to reproduce on:
percona1 mysql> select @@version,@@version_comment\G
*************************** 1. row ***************************
        @@version: 5.6.21-69.0-56
@@version_comment: Percona XtraDB Cluster (GPL), Release rel69.0, Revision 910, WSREP version 25.8, wsrep_25.8.r4126
1 row in set (0.00 sec)

by running 3 parallel sessions on the writer node. The replicated nodes failed with:

2014-11-10 12:37:34 19254 [ERROR] Slave SQL: Could not execute Write_rows event on table test.t1; Duplicate entry '0' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 169, Error_code: 1062
2014-11-10 12:37:34 19254 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 299546
2014-11-10 12:37:34 19254 [ERROR] WSREP: Failed to apply trx: source: 8b37e730-68cb-11e4-8de0-333e4b88f012 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 7 trx_id: 1138584 seqnos (l: 259350, g: 299546, s: 299544, d: 299545, ts: 7858353781585)
2014-11-10 12:37:34 19254 [ERROR] WSREP: Failed to apply trx 299546 4 times
2014-11-10 12:37:34 19254 [ERROR] WSREP: Node consistency compromized, aborting...

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

@Przemek,

The upstream bug http://bugs.mysql.com/bug.php?id=69979 has been marked 'not a bug'. However, a related bug, http://bugs.mysql.com/bug.php?id=73170, has been fixed in 5.6.21.

Now, if it is possible to repeat the SP test case with Oracle MySQL and/or PS, can you report it upstream?

Revision history for this message
Nilnandan Joshi (nilnandan-joshi) wrote :

Hi Raghu,

I have tried to reproduce the same with the SP test case on PS 5.6.21 and MySQL 5.6.21 Community with 3 parallel sessions, but I was not able to reproduce it.

On PS 5.6.21,

mysql> show processlist;
+----+------+-----------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+-----------+---------------+
| 45 | root | localhost | test | Query | 0 | init | show processlist | 0 | 0 |
| 46 | root | localhost | test | Query | 0 | query end | replace into t1(a,b,who,rep_count,trx_count,trx_started)
         values(floor(3*rand()), floor(3*ra | 224 | 29262 |
| 48 | root | localhost | test | Sleep | 19 | | NULL | 0 | 0 |
| 53 | root | localhost | test | Query | 0 | update | replace into t1(a,b,who,rep_count,trx_count,trx_started)
         values(floor(3*rand()), floor(3*ra | 150 | 17067 |
| 54 | root | localhost | test | Query | 0 | query end | replace into t1(a,b,who,rep_count,trx_count,trx_started)
         values(floor(3*rand()), floor(3*ra | 179 | 25116 |
+----+------+-----------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+-----------+---------------+
5 rows in set (0.00 sec)

On MySQL 5.6.21,

mysql> show processlist;
+----+------+-----------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+
| 1 | root | localhost | test | Sleep | 109 | | NULL |
| 2 | root | localhost | test | Query | 0 | init | show processlist |
| 3 | root | localhost | test | Query | 0 | query end | replace into t1(a,b,who,rep_count,trx_count,trx_started)
         values(floor(3*rand()), floor(3*ra |
| 4 | root | localhost | test | Query | 0 | query end | replace into t1(a,b,who,rep_count,trx_count,trx_started)
         values(floor(3*rand()), floor(3*ra |
| 5 | root | localhost | test | ...


Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

This bug is still alive, at least for Percona Server.

@Nil,

This bug is not present in a single-node PXC setup either.

I believe the upstream fix fixed it only for non-replication cases.

With PS master-master replication and writes to one node from 3 parallel
sessions of that stored procedure, I can reproduce this issue.

2014-11-18 06:11:52 22338 [ERROR] Slave SQL: Could not execute Write_rows event on table sp2.t1; Duplicate entry '0' for key 'a', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log archie-bin.000052, end_log_pos 63335202, Error_code: 1062
2014-11-18 06:11:52 22338 [Warning] Slave: Duplicate entry '0' for key 'a' Error_code: 1062
2014-11-18 06:11:52 22338 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'archie-bin.000052' position 63334974

@Valerii, @Nil,

I suggest further testing with PS and MySQL; please file an upstream bug report
if/when required.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Make sure to run the SP for about 8-10 minutes, as stated in the upstream bug report.

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Raghu, if it's not 69979 you are seeing in the server (by your suggestion for further testing and possibly upstream bug reporting), then the bug should be New for server.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

@Laurynas,

I had marked it as confirmed since I was able to reproduce this on 5.6 PS.

In any case, I had discussed further testing and upstream reporting
with Valerii.

@Valerii, any updates on this?

summary: Duplicate UK values can be replicated under concurrent workload with
- READ-COMMITTED
+ READ-COMMITTED with multi-node configuration
summary: Duplicate UK values can be replicated under concurrent workload with
- READ-COMMITTED with multi-node configuration
+ READ-COMMITTED
Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

Raghu,

As Nil tried to explain to you in a separate email thread, so far he has not been able to reproduce this with PS or upstream MySQL 5.6.21. I am not able to check properly myself at the moment, as I am busy with work unrelated to bug processing this week.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Setting http://bugs.mysql.com/bug.php?id=76927 as the upstream bug, because the currently linked bug #69979 was discussing phantom duplicate values in REPEATABLE READ and was closed as Not a Bug.

Bug #76927 is about real duplicate values with the READ COMMITTED isolation level and/or innodb_locks_unsafe_for_binlog=1.
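
To see whether a given server runs with the settings named here, the relevant variables can be checked programmatically. The sketch below is Python and assumes the variables have already been read into a dict (for example from SHOW GLOBAL VARIABLES on a 5.6 server, where the isolation variable is `tx_isolation`). The function name is hypothetical, and a False result only means these two settings are not in effect globally; sessions can still override the isolation level.

```python
def is_exposed_configuration(variables):
    """Return True if `variables` matches the settings named for bug #76927.

    `variables` is assumed to be a dict of server variable names to values,
    e.g. {"tx_isolation": "READ-COMMITTED",
          "innodb_locks_unsafe_for_binlog": "OFF"}.
    """
    # Normalise "READ COMMITTED" and "read-committed" to one spelling.
    isolation = str(variables.get("tx_isolation", "")).upper().replace(" ", "-")
    unsafe = str(variables.get("innodb_locks_unsafe_for_binlog", "OFF")).upper()
    return isolation == "READ-COMMITTED" or unsafe in ("ON", "1")

print(is_exposed_configuration({
    "tx_isolation": "REPEATABLE-READ",
    "innodb_locks_unsafe_for_binlog": "OFF",
}))
```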

Revision history for this message
Alexey Kopytov (akopytov) wrote :

I could reproduce bug #76927 with MySQL 5.6.11, 5.6.24, PS 5.6.23 and PXC 5.6.22. For MySQL and PS, the UK index corruptions are silent. For PXC, the test case also breaks other cluster nodes, as described in this bug report.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Copying steps to reproduce from the upstream bug:

Start the server with --transaction-isolation="read-committed" and/or
--innodb-locks-unsafe-for-binlog=1

Create the following table and stored procedure:

---

drop table if exists t1;
create table t1(a int not null, b int not null, who int, primary key(b), unique key(a)) engine=innodb;

drop procedure if exists p1;
delimiter $
create procedure p1(me int)
l1:
  begin
  declare continue handler for 1062 begin end;
  declare continue handler for 1213 begin end;
  set @cnt_a:=null;
  repeat
    select count(*) cnt_a into
         @cnt_a from t1 group by a having cnt_a > 1 limit 1;
    if @cnt_a is not null then
       select * from t1;
       leave l1;
    end if;
    replace into t1(a,b,who) values(floor(3*rand()), floor(3*rand()), me);
  until 1=2 end repeat;
end$
delimiter ;

---

Then execute in 3 different sessions:

session1> call p1(1); check table t1;
session2> call p1(2); check table t1;
session3> call p1(3); check table t1;
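
The procedure above stops as soon as its `group by a having cnt_a > 1` query finds a value of `a` that occurs more than once. The same check can be run offline against a dump of the table. This is a minimal Python sketch; the rows are made-up data showing a hypothetical corrupted state, and `duplicate_unique_values` is an illustrative name.

```python
from collections import Counter

def duplicate_unique_values(rows, column="a"):
    """Return {value: count} for values of `column` that occur more than once."""
    counts = Counter(row[column] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}

# Hypothetical dump of t1: the unique key column `a` holds the value 1 twice,
# while the primary key column `b` is still unique.
rows = [
    {"a": 0, "b": 0, "who": 1},
    {"a": 1, "b": 1, "who": 2},
    {"a": 1, "b": 2, "who": 3},
]

print(duplicate_unique_values(rows, "a"))
print(duplicate_unique_values(rows, "b"))
```

Running the check over a full dump avoids depending on which index the server picks to answer the query.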

summary: - Duplicate UK values can be replicated under concurrent workload with
- READ-COMMITTED
+ Duplicate UK values in READ-COMMITTED (again)
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1102

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1494
