Concurrent duplicate inserts can violate a unique key constraint in InnoDB tables
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MySQL Server |
Unknown
|
Unknown
|
|||
Percona Server moved to https://jira.percona.com/projects/PS |
Fix Released
|
High
|
Unassigned | ||
5.1 |
Invalid
|
Undecided
|
Unassigned | ||
5.5 |
Fix Released
|
High
|
Unassigned | ||
5.6 |
Fix Released
|
High
|
Unassigned |
Bug Description
After upgrading the replication master from Percona Server from 5.1 to the latest 5.5, every now and then replication on the slaves fails with a duplicate entry error. We have identified that the reason for that are duplicate inserts allowed by the master.
The phenomenon occurs in multiple tables, but here's one example:
*******
Table: example
Create Table: CREATE TABLE `example` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`item_id` int(11) NOT NULL,
`created_at` datetime NOT NULL,
`remote_addr` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `index_
KEY `index_
) ENGINE=InnoDB AUTO_INCREMENT=
If we query by a unique key columns, the result is correct, which means that the unique key itself does not contain duplicate entries:
root@localhost [production]> select * from example where user_id = 81825835 AND item_id = 28676798;
+------
| id | user_id | item_id | created_at | remote_addr |
+------
| 19282128 | 81825835 | 28676798 | 2014-04-01 03:02:02 | 177.35.148.167 |
+------
However, all other indexes - primary and secondary - do:
root@localhost [production]> select * from example force index (index_
+------
| id | user_id | item_id | created_at | remote_addr |
+------
| 19282128 | 81825835 | 28676798 | 2014-04-01 03:02:02 | 177.35.148.167 |
| 19282129 | 81825835 | 28676798 | 2014-04-01 03:02:02 | 177.35.148.167 |
+------
The second query returned two rows with the same pair of values in (user_id, item_id), which means the unique constraint was violated.
We have not yet been able to reproduce this outside of production, but in production it happens on a fairly regular basis. Nevertheless we identified a possible pattern leading to the problem that is true regardless of the table that becomes affected. The pattern is a sequence of queries starting with an INSERT, followed by a DELETE and two concurrent INSERTs. According to the binary log, out of the two concurrent inserts, the one that is assigned an earlier auto_inc value, always finishes second. According to our observations, the insert preceding the delete does not always happen, so we think it may be irrelevant to the entire problem.
Here is a binary log snippet from one of the incidents:
# at 3999763
#140516 14:13:14 server id 20117 end_log_pos 3999848 Query thread_id=3139252 exec_time=0 error_code=0
SET TIMESTAMP=
BEGIN
/*!*/;
### INSERT INTO production.example
### SET
### @1=877218430
### @2=80246747
### @3=146918708
### @4=2014-05-16 14:13:14
### @5='160.252.189.81'
# at 3999986
#140516 14:13:14 server id 20117 end_log_pos 4000013 Xid = 11255723612
COMMIT/*!*/;
--
# at 4050167
#140516 14:13:14 server id 20117 end_log_pos 4050252 Query thread_id=3150908 exec_time=0 error_code=0
SET TIMESTAMP=
BEGIN
/*!*/;
### DELETE FROM production.example
### WHERE
### @1=877218430
### @2=80246747
### @3=146918708
### @4=2014-05-16 14:13:14
### @5='160.252.189.81'
# at 4050390
#140516 14:13:14 server id 20117 end_log_pos 4050417 Xid = 11255726215
COMMIT/*!*/;
--
# at 4088418
#140516 14:13:14 server id 20117 end_log_pos 4088503 Query thread_id=3134564 exec_time=0 error_code=0
SET TIMESTAMP=
BEGIN
/*!*/;
### INSERT INTO production.example
### SET
### @1=877218459 <--- later auto_inc value committed first
### @2=80246747
### @3=146918708
### @4=2014-05-16 14:13:14
### @5='160.252.189.81'
# at 4088641
#140516 14:13:14 server id 20117 end_log_pos 4088668 Xid = 11255727401
COMMIT/*!*/;
--
# at 4088668
#140516 14:13:14 server id 20117 end_log_pos 4088753 Query thread_id=3156683 exec_time=0 error_code=0
SET TIMESTAMP=
BEGIN
/*!*/;
### INSERT INTO production.example
### SET
### @1=877218458 <--- earlier auto_inc value committed second
### @2=80246747
### @3=146918708
### @4=2014-05-16 14:13:14
### @5='160.252.189.81'
# at 4088891
#140516 14:13:14 server id 20117 end_log_pos 4088918 Xid = 11255727407
COMMIT/*!*/;
All the events above occurred within one second, however in other cases we also saw a distance of several seconds between the delete and the concurrent inserts. We don't know for certain whether the distance was only because of delayed commits or if any of the inserts was actually ran later, but we suspect that the first inserts waits on some lock and then the second insert appears sometime later and is allowed to finish faster (because it doesn't see those lock(s)?). I base this claim purely on the fact that when we saw a delay between the delete and the inserts, there was a fairly large gap in the auto_inc values assigned to each insert (i.e. they were not consecutive as in the example above).
All transactions use READ COMMITTED isolation. We also ruled out the possibility of unique_checks being disabled. We also tried different options for change_bufferring, but that didn't have any effect.
It's possible that there's some kind of race condition around locking a delete-marked record in the unique index that allows two concurrent threads to assume it's okay to proceed.
tags: | removed: 5.5 |
Changed in percona-server: | |
status: | Incomplete → New |
The DELETE query uses primary key to remove rows.