Concurrent duplicate inserts can violate a unique key constraint in InnoDB tables

Bug #1322559 reported by Maciej Dobrzanski
This bug affects 5 people
Affects / Status / Importance / Assigned to / Milestone:
  MySQL Server: Unknown / Unknown
  Percona Server (moved to https://jira.percona.com/projects/PS): Fix Released / High / Unassigned
    5.1: Invalid / Undecided / Unassigned
    5.5: Fix Released / High / Unassigned
    5.6: Fix Released / High / Unassigned

Bug Description

After upgrading the replication master from Percona Server 5.1 to the latest 5.5, replication on the slaves every now and then fails with a duplicate entry error. We have identified that the cause is duplicate inserts allowed by the master.

The phenomenon occurs in multiple tables, but here's one example:

*************************** 1. row ***************************
       Table: example
Create Table: CREATE TABLE `example` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` int(11) NOT NULL,
  `item_id` int(11) NOT NULL,
  `created_at` datetime NOT NULL,
  `remote_addr` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `index_example_on_user_id_and_item_id` (`user_id`,`item_id`),
  KEY `index_example_on_item_id` (`item_id`)
) ENGINE=InnoDB AUTO_INCREMENT=21479683 DEFAULT CHARSET=utf8

If we query by the unique key columns, the result is correct, which means that the unique key itself does not contain duplicate entries:

root@localhost [production]> select * from example where user_id = 81825835 AND item_id = 28676798;
+----------+----------+----------+---------------------+----------------+
| id       | user_id  | item_id  | created_at          | remote_addr    |
+----------+----------+----------+---------------------+----------------+
| 19282128 | 81825835 | 28676798 | 2014-04-01 03:02:02 | 177.35.148.167 |
+----------+----------+----------+---------------------+----------------+

However, all other indexes - primary and secondary - do:

root@localhost [production]> select * from example force index (index_example_on_item_id) where user_id = 81825835 AND item_id = 28676798;
+----------+----------+----------+---------------------+----------------+
| id       | user_id  | item_id  | created_at          | remote_addr    |
+----------+----------+----------+---------------------+----------------+
| 19282128 | 81825835 | 28676798 | 2014-04-01 03:02:02 | 177.35.148.167 |
| 19282129 | 81825835 | 28676798 | 2014-04-01 03:02:02 | 177.35.148.167 |
+----------+----------+----------+---------------------+----------------+

The second query returned two rows with the same pair of values in (user_id, item_id), which means the unique constraint was violated.
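
One way to scan the whole table for such duplicates is to group on the unique-key columns while ignoring the (apparently consistent) unique index, so the scan reads the clustered index instead. This is a hypothetical check, not a query from the original report:

-- hypothetical duplicate check: bypass the unique secondary index
SELECT user_id, item_id, COUNT(*) AS copies
FROM example IGNORE INDEX (index_example_on_user_id_and_item_id)
GROUP BY user_id, item_id
HAVING COUNT(*) > 1;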

We have not yet been able to reproduce this outside of production, but in production it happens on a fairly regular basis. Nevertheless, we identified a possible pattern leading to the problem, and it holds regardless of which table becomes affected. The pattern is a sequence of queries starting with an INSERT, followed by a DELETE and two concurrent INSERTs. According to the binary log, out of the two concurrent inserts, the one that is assigned the earlier auto_inc value always finishes second. According to our observations, the insert preceding the delete does not always happen, so we think it may be irrelevant to the problem.
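
Written out as hypothetical statements (the binary log is row-based, so the original statement text is not available; column values are taken from the binlog snippet below, and this is only our reading of the events, not a confirmed reproduction), the pattern looks roughly like this:

-- connection 1
INSERT INTO production.example (user_id, item_id, created_at, remote_addr)
  VALUES (80246747, 146918708, NOW(), '160.252.189.81');
-- connection 2
DELETE FROM production.example WHERE id = 877218430;
-- connections 3 and 4, effectively concurrent
INSERT INTO production.example (user_id, item_id, created_at, remote_addr)
  VALUES (80246747, 146918708, NOW(), '160.252.189.81');  -- assigned id 877218458 first, but commits second
INSERT INTO production.example (user_id, item_id, created_at, remote_addr)
  VALUES (80246747, 146918708, NOW(), '160.252.189.81');  -- assigned id 877218459, commits first
-- both inserts succeed, leaving two rows for the same (user_id, item_id)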

Here is a binary log snippet from one of the incidents:

# at 3999763
#140516 14:13:14 server id 20117 end_log_pos 3999848 Query thread_id=3139252 exec_time=0 error_code=0
SET TIMESTAMP=1400249594/*!*/;
BEGIN
/*!*/;
### INSERT INTO production.example
### SET
### @1=877218430
### @2=80246747
### @3=146918708
### @4=2014-05-16 14:13:14
### @5='160.252.189.81'
# at 3999986
#140516 14:13:14 server id 20117 end_log_pos 4000013 Xid = 11255723612
COMMIT/*!*/;

--

# at 4050167
#140516 14:13:14 server id 20117 end_log_pos 4050252 Query thread_id=3150908 exec_time=0 error_code=0
SET TIMESTAMP=1400249594/*!*/;
BEGIN
/*!*/;
### DELETE FROM production.example
### WHERE
### @1=877218430
### @2=80246747
### @3=146918708
### @4=2014-05-16 14:13:14
### @5='160.252.189.81'
# at 4050390
#140516 14:13:14 server id 20117 end_log_pos 4050417 Xid = 11255726215
COMMIT/*!*/;

--

# at 4088418
#140516 14:13:14 server id 20117 end_log_pos 4088503 Query thread_id=3134564 exec_time=0 error_code=0
SET TIMESTAMP=1400249594/*!*/;
BEGIN
/*!*/;
### INSERT INTO production.example
### SET
### @1=877218459 <--- later auto_inc value committed first
### @2=80246747
### @3=146918708
### @4=2014-05-16 14:13:14
### @5='160.252.189.81'
# at 4088641
#140516 14:13:14 server id 20117 end_log_pos 4088668 Xid = 11255727401
COMMIT/*!*/;

--

# at 4088668
#140516 14:13:14 server id 20117 end_log_pos 4088753 Query thread_id=3156683 exec_time=0 error_code=0
SET TIMESTAMP=1400249594/*!*/;
BEGIN
/*!*/;
### INSERT INTO production.example
### SET
### @1=877218458 <--- earlier auto_inc value committed second
### @2=80246747
### @3=146918708
### @4=2014-05-16 14:13:14
### @5='160.252.189.81'
# at 4088891
#140516 14:13:14 server id 20117 end_log_pos 4088918 Xid = 11255727407
COMMIT/*!*/;

All the events above occurred within one second; however, in other cases we also saw a gap of several seconds between the delete and the concurrent inserts. We don't know for certain whether the gap was only due to delayed commits or whether one of the inserts was actually run later, but we suspect that the first insert waits on some lock and the second insert, arriving some time later, is allowed to finish faster (because it doesn't see those lock(s)?). I base this claim purely on the fact that when we saw a delay between the delete and the inserts, there was a fairly large gap in the auto_inc values assigned to each insert (i.e. they were not consecutive as in the example above).

All transactions use READ COMMITTED isolation. We also ruled out the possibility of unique_checks being disabled, and we tried different options for change buffering (innodb_change_buffering), but that had no effect.
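
The relevant settings can be confirmed with a query like the following (just a sanity check, nothing specific to this bug):

SELECT @@tx_isolation, @@unique_checks, @@innodb_change_buffering;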

It's possible that there's some kind of race condition around locking a delete-marked record in the unique index that allows two concurrent threads to assume it's okay to proceed.

Revision history for this message
Maciej Dobrzanski (mushu) wrote :

The DELETE query uses the primary key to remove rows.
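
i.e. something of this form (a hypothetical reconstruction from the row event above, since the binlog only records the row image):

DELETE FROM production.example WHERE id = 877218430;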

tags: removed: 5.5
Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

I wonder if this is a duplicate of, or related to, http://bugs.mysql.com/bug.php?id=69979 (there is an interesting discussion at the end of that report, which ended up as "Not a bug").

Changed in percona-server:
status: New → Incomplete
Revision history for this message
Maciej Dobrzanski (mushu) wrote :

I saw http://bugs.mysql.com/bug.php?id=69979 the other day and it might be somehow related, although there are many differences.

In that bug the problem is that under certain circumstances an open transaction might see duplicate data; however, that problem (whether it's a bug or a "feature") is confined to that one transaction only. In our case the duplicate data is persistently stored in the table for all transactions to see, and both conflicting modifications end up as events in the binary log. Please also note that in our case the application uses READ COMMITTED, not REPEATABLE READ.

Changed in percona-server:
status: Incomplete → New
Revision history for this message
Tamas Szasz (tszasz) wrote :

The same happened in our database too.
The environment we are using is:
RHEL 6.5
Server version: 5.5.35-33.0-log Percona Server (GPL), Release rel33.0, Revision 611
1 master + 3 slaves.

We received a replication error a few days ago. After investigating it, we found that we have 2 rows with the same UNIQUE KEY value in the master database.
Affected table and data:
show create table registeredUsers\G
*************************** 1. row ***************************
       Table: registeredUsers
Create Table: CREATE TABLE `registeredUsers` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `userId` varchar(32) NOT NULL,
  `client` varchar(200) NOT NULL,
  `osVersion` varchar(10) NOT NULL,
  `deviceGroup` varchar(50) NOT NULL,
  `registrationDate` datetime NOT NULL,
  `lastAction` datetime NOT NULL,
  `cultureLanguage` varchar(2) NOT NULL,
  `cultureRegion` varchar(2) NOT NULL,
  `cultureCode` varchar(5) NOT NULL DEFAULT 'de-de',
  `lastPush` datetime DEFAULT NULL,
  `pushToken` mediumtext,
  `permaToken` varchar(74) DEFAULT NULL,
  `accessCount` int(11) NOT NULL DEFAULT '0',
  `access` varchar(1) NOT NULL DEFAULT '1',
  `provider` varchar(255) NOT NULL,
  `providerTld` varchar(5) NOT NULL,
  `environment` tinyint(1) DEFAULT '0',
  `udidMd5` varchar(32) NOT NULL,
  `development` tinyint(1) DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `userId_2` (`userId`),
  UNIQUE KEY `permaAccessToken_2` (`permaToken`),
  KEY `client` (`client`),
  KEY `lastAction` (`lastAction`),
  KEY `deviceGroup` (`deviceGroup`),
  KEY `osVersion` (`osVersion`),
  KEY `cultureCode` (`cultureCode`),
  KEY `udidMd5` (`udidMd5`)
) ENGINE=InnoDB AUTO_INCREMENT=38587059 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

SELECT id, userId FROM registeredUsers WHERE userid >= '748ec561dbc733452bfd697076787ef9' LIMIT 2;
+----------+----------------------------------+
| id       | userId                           |
+----------+----------------------------------+
| 38245456 | 748ec561dbc733452bfd697076787ef9 |
| 38245457 | 748ec561dbc733452bfd697076787ef9 |
+----------+----------------------------------+
2 rows in set (0.00 sec)

Revision history for this message
Ben Kochie (bjk) wrote :

Thanks for the additional report. Can you share the insert/update/delete/select rate(s) for your master? Rough numbers are fine.

Also, could you share the EXPLAIN for the SELECT query? I would like to know which index is being used.

Revision history for this message
Tamas Szasz (tszasz) wrote :

Hi,

Here is the requested information:

The EXPLAIN for the last query:
mysql> explain SELECT id, userId FROM registeredUsers WHERE userid >= '748ec561dbc733452bfd697076787ef9' LIMIT 2;
+----+-------------+-----------------+-------+---------------+----------+---------+------+---------+--------------------------+
| id | select_type | table           | type  | possible_keys | key      | key_len | ref  | rows    | Extra                    |
+----+-------------+-----------------+-------+---------------+----------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | registeredUsers | range | userId_2      | userId_2 | 98      | NULL | 3361005 | Using where; Using index |
+----+-------------+-----------------+-------+---------------+----------+---------+------+---------+--------------------------+
1 row in set (0.03 sec)

The attached image shows all the query rates exported from our Ganglia monitoring.

Revision history for this message
Ben Kochie (bjk) wrote :

That's interesting, and slightly different - possibly due to the way you select with >=. In our case, the SELECT returns only one row, but if we force another index, both rows are returned. This shows the UNIQUE KEY index itself to be correctly unique, while the table data is not. I will have to try a few things.

On the traffic level, we have a similar peak-time rate of ~500-700 mutations/sec and a few thousand selects/sec (which is when most of these events occur).

Revision history for this message
Tamas Szasz (tszasz) wrote :

If we use only = in the query, it returns only one row. That makes sense to me, because in the normal case there is only one row with a given value, so after finding the first row the engine doesn't look for the next one, since there can only be one match (it's an equality lookup on a unique key), right? Using >= is slightly different because you ask for a value AND anything greater than it, so the engine does not stop looking for matches after the first one.
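
In other words (queries taken from the example above):

-- equality lookup on the unique index: stops after the first match
SELECT id, userId FROM registeredUsers WHERE userId = '748ec561dbc733452bfd697076787ef9';

-- range lookup: keeps scanning past the first match, so the duplicate shows up
SELECT id, userId FROM registeredUsers WHERE userId >= '748ec561dbc733452bfd697076787ef9' LIMIT 2;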

Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

I wonder if this is related to upstream http://bugs.mysql.com/bug.php?id=73170, which was recently fixed (the fixed versions are not yet released)? Has anybody noticed this in a PS older than 5.5.32?

Revision history for this message
Ben Kochie (bjk) wrote :

We upgraded directly from PS 5.1 to 5.5.33 back in March 2014. We immediately started having occurrences of the bug. I can test with the upstream fix as soon as there is a PS release of 5.5.40 available.

Revision history for this message
Tamas Szasz (tszasz) wrote :

It only happened for us once, so I cannot say much. It is true that we have used the same infrastructure for more than 1.5 years and it happened only now, so it could be that the bug arrived with version 5.5.32+.

Revision history for this message
Ben Kochie (bjk) wrote :

Oracle has posted a patch to the upstream bug.

http://bugs.mysql.com/bug.php?id=73170

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

The symptoms seem to match those of http://bugs.mysql.com/bug.php?id=73170 ("[1 Jul 21:05] Inaam Rana" comment in http://bugs.mysql.com/bug.php?id=68021).

Linking this bug to upstream 73170 for now; the fix should ship in 5.5.40/5.6.21. Please let us know if the upgrade does not resolve the issue, or if you have other reasons to think it's not 73170 you are experiencing.

tags: added: innodb upstream
Revision history for this message
Ben Kochie (bjk) wrote :

Yes, I think so too. As soon as percona-server has a release with the fix, I can start testing.

Revision history for this message
ruiaylin (782082893-9) wrote :

I ran into this problem yesterday, with one master and three slaves; my environment is CentOS 6.4, Percona Server 5.6.19.
The application operates on the master in this order: insert one row, then delete that row.
I got alerts from all three slaves (SQL thread is not running) at the same time; the replication error is "duplicate key ...".

The table has a primary key and one unique key, and the error comes from the unique key.
I can't get the data from the table because of the delete operations, but I got the details from the binlog. I found that there are two inserts into the table with the same value in the unique key column. One more thing: the first row's auto_increment id is bigger (191) than the second one's (190), but the Xid of the first is also bigger than the second's, so I guess the transaction order in the binlog file is reversed. I am now going to upgrade to Percona Server 5.6.25.
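
For reference, the row events can be decoded from the binary log with mysqlbinlog (the file name below is just an example):

mysqlbinlog --base64-output=decode-rows -vv mysql-bin.000123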

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-789
