TokuDB Hot Backup inconsistency with tokudb_commit_sync disabled
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
5.6 | Fix Released | High | Vlad Lesin |
5.7 | Fix Released | High | Vlad Lesin |

Percona Server moved to https://jira.percona.com/projects/PS; status tracked in 5.7.
Bug Description
We use the following server as a Slave with RBR (row-based replication):
mysql Ver 14.14 Distrib 5.6.27-75.0, for Linux (x86_64) using 6.0
replicating from Master:
mysql Ver 14.14 Distrib 5.5.30, for Linux (x86_64) using readline 5.1
For backup purposes we use the tokudb_backup plugin on the Slave box.
If tokudb_commit_sync is configured as:
tokudb_commit_sync = 1
we have no issues creating a backup, restoring it, and reconnecting it back to the Master using the replication position saved in the relay-log.info file.
If tokudb_commit_sync is configured as:
tokudb_commit_sync = 0
we can create a backup and restore it, but replication FAILS, reporting missing records.
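For reference, a minimal my.cnf sketch covering the two configurations described here. The variable names are the real TokuDB options from this report; the layout and comments are illustrative:

```
[mysqld]
# Durable commits: fsync the TokuDB recovery log on every commit
# (the configuration that produces consistent backups above)
tokudb_commit_sync = 1

# Deferred sync: acknowledge commits without an fsync; the log is only
# synced every tokudb_fsync_log_period milliseconds (the failing case)
# tokudb_commit_sync     = 0
# tokudb_fsync_log_period = 1000
```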
ERROR LOG:
2016-01-11 13:46:56 6043 [ERROR] Slave SQL: Could not execute Update_rows_v1 event on table DB1.RunID; Can't find record in 'RunID', Error_code: 1032; handler error HA_ERR_
2016-01-11 13:46:56 6043 [Warning] Slave: Can't find record in 'RunID' Error_code: 1032
2016-01-11 13:51:14 6987 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table DB1.WebPageResultData
2016-01-11 13:51:14 6987 [Warning] Slave: Can't find record in 'WebPageResultData' Error_code: 1032
I was able to reproduce this in a simplified test, using a slave with these parameters:
tokudb_commit_sync = 0
tokudb_fsync_log_period = 1000
Ran this simple write stress on the master:
$ for i in {1..10000}; do echo "insert into test.toku2 values ($i,'AAA')" | rsandbox_Percona-Server-5_6_28/m; echo "update test.toku2 set a='bbb' where id=$i" | rsandbox_Percona-Server-5_6_28/m; echo "delete from test.toku2 where id=$i" | rsandbox_Percona-Server-5_6_28/m; done
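The stress loop assumes an existing TokuDB table, whose definition isn't shown in the report. A plausible sketch, inferred from the INSERT/UPDATE/DELETE statements above (column types and sizes are my assumption):

```sql
-- Hypothetical schema for test.toku2; only the column names (id, a)
-- are implied by the stress statements, the types are assumed.
CREATE TABLE test.toku2 (
  id INT NOT NULL PRIMARY KEY,
  a  VARCHAR(16)
) ENGINE=TokuDB;
```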
During the test, I triggered a backup on the slave:
set tokudb_backup_dir='/home/przemyslaw.malkowski/sandboxes/rsandbox_Percona-Server-5_6_28/node2/backup/1';
Once the backup was done, I stopped the slave, replaced its data with the backup copy, then resumed replication. After the slave caught up, there is a data consistency issue:
master [localhost] {msandbox} (percona) > select * from test.toku2;
Empty set (0.01 sec)
slave2 [localhost] {msandbox} (test) > select * from test.toku2;
+-----+------+
| id | a |
+-----+------+
| 679 | bbb |
+-----+------+
1 row in set (0.00 sec)
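The leftover row is consistent with unsynced commits: with tokudb_commit_sync = 0 a commit is acknowledged (and the slave's replication position advances) before the log bytes reach disk, so a file-level hot backup can capture a state older than the saved position. A minimal Python sketch of this effect (not TokuDB code, purely illustrative: a buffered log file stands in for the transaction log, a file copy stands in for the hot backup):

```python
import os
import shutil
import tempfile

def commit(log, txn, sync):
    """Append a committed transaction; the commit is 'acknowledged' here."""
    log.write(txn + "\n")
    if sync:
        log.flush()                # push the bytes to the OS...
        os.fsync(log.fileno())     # ...and force them to disk

workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "txn.log")
backup_path = os.path.join(workdir, "txn.log.backup")

log = open(log_path, "w")
commit(log, "txn-1", sync=True)    # durable before the backup
commit(log, "txn-2", sync=False)   # acknowledged, but still only buffered

shutil.copy(log_path, backup_path) # file-level "hot backup" at this instant
log.close()                        # closing flushes txn-2 into the live file

live = open(log_path).read().splitlines()
backup = open(backup_path).read().splitlines()
print("live:  ", live)     # ['txn-1', 'txn-2']
print("backup:", backup)   # ['txn-1'] -- the acknowledged txn-2 is missing
```

Restoring such a backup and resuming replication from the saved position then re-applies or skips events relative to data that never made it into the backup, matching the 1032 errors above.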