pt-archiver --bulk-insert data loss

Bug #820079 reported by Daniel Nichter
Affects: Percona Toolkit (moved to https://jira.percona.com/projects/PT)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Using --bulk-insert deletes more rows than it inserts, losing data.

Originally reported at http://code.google.com/p/maatkit/issues/detail?id=1260

Changed in percona-toolkit:
assignee: Daniel Nichter (daniel-nichter) → nobody
Changed in percona-toolkit:
importance: Critical → High
Revision history for this message
veerabahu (moonpearl-vegtime-2000) wrote :

mysql> show create Table Test;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Test | CREATE TABLE `Test` (
  `PKey` varchar(30) NOT NULL DEFAULT '',
  `Time` double DEFAULT NULL,
  PRIMARY KEY (`PKey`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Note: Test_Archive also has exactly the same structure.

mysql> select * from Test;
+------+------+
| PKey | Time |
+------+------+
| d    |    7 |
| a    |   10 |
| b    |    9 |
| c    |    8 |
| z    |    1 |
| x    |    2 |
| v    |    3 |
| u    |    4 |
+------+------+
8 rows in set (0.00 sec)

mysql> select * from Test_Archive;
Empty set (0.00 sec)
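
For reference, a minimal setup matching the output above (assuming data.sql contains nothing beyond these eight rows, inserted in the order the SELECT shows them, which matters on this MyISAM table since swapping the positions of d and z reportedly makes the problem disappear):

CREATE TABLE `Test` (
  `PKey` varchar(30) NOT NULL DEFAULT '',
  `Time` double DEFAULT NULL,
  PRIMARY KEY (`PKey`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE `Test_Archive` LIKE `Test`;

INSERT INTO `Test` (`PKey`, `Time`) VALUES
  ('d', 7), ('a', 10), ('b', 9), ('c', 8),
  ('z', 1), ('x', 2), ('v', 3), ('u', 4);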

I then executed:

mk-archiver -u xxxx -p xxxx --source h=localhost,D=xxxxxx,t=Test --dest h=localhost,D=xxxxxx,t=Test_Archive --where="Time < 8" --limit=2 --bulk-insert --bulk-delete
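
(The command above is the maatkit-era mk-archiver. Assuming only the tool name differs, the equivalent Percona Toolkit invocation would be:)

pt-archiver -u xxxx -p xxxx --source h=localhost,D=xxxxxx,t=Test --dest h=localhost,D=xxxxxx,t=Test_Archive --where="Time < 8" --limit=2 --bulk-insert --bulk-delete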

After running it:

mysql> select * from Test;
+------+------+
| PKey | Time |
+------+------+
| a    |   10 |
| b    |    9 |
| c    |    8 |
| x    |    2 |
| v    |    3 |
+------+------+
5 rows in set (0.00 sec)

mysql> select * from Test_Archive;
+------+------+
| PKey | Time |
+------+------+
| d    |    7 |
| z    |    1 |
+------+------+
2 rows in set (0.00 sec)

Data loss. I have attached the output of this run as debug.out and the data used as data.sql.

On my initial analysis:
  - The delete query takes the PKey corresponding to the last Time value seen by the first_sql SELECT, and the same PKey is used in the condition of the get_next SQL query. However, that boundary is not guaranteed to be correct across all of the queries.
  - Things work fine if the data set is slightly modified, i.e. if the positions of d and z are swapped.
  - The problem also occurs when the primary key is an alphanumeric string.
  - Things work fine if i=Time is set in the DSN; however, in our real scenario this can't be set, for a variety of reasons.

IMO, we could either ORDER BY both PKey and Time (i.e. the primary key and the column the WHERE condition is on), or add an option to specify the column that should be used by the SELECT (get_next) and DELETE (del row sql) queries even though it is not indexed. A rough sketch of the first idea is shown below.
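
A minimal sketch of that ordering in plain SQL, against the test table above (hypothetical, simplified statements to illustrate the idea, not the SQL that mk-archiver/pt-archiver actually generates):

-- Order the chunk SELECT on the ascend column, so the boundary value used by the
-- DELETE and by the next chunk matches the rows that were actually selected.
-- (Ordering by PKey alone would be enough here, since PKey is the primary key.)
SELECT PKey, Time FROM Test WHERE Time < 8 ORDER BY PKey, Time LIMIT 2;
-- With the sample data this returns ('d',7) and ('u',4); both rows go into the
-- bulk-insert file, and 'd'..'u' becomes the boundary range of the chunk.
DELETE FROM Test WHERE Time < 8 AND PKey >= 'd' AND PKey <= 'u';
-- The next chunk resumes strictly after the last boundary value.
SELECT PKey, Time FROM Test WHERE Time < 8 AND PKey > 'u' ORDER BY PKey, Time LIMIT 2;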

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

I'm going to confirm this even though I haven't taken the time to verify it myself. It seems there are enough reports that, if we can't reproduce it, we are the ones who are wrong.

description: updated
Changed in percona-toolkit:
status: New → Confirmed
Changed in percona-toolkit:
importance: High → Undecided
Brian Fraser (fraserbn)
Changed in percona-toolkit:
assignee: nobody → Brian Fraser (fraserbn)
Changed in percona-toolkit:
assignee: Brian Fraser (fraserbn) → nobody
Revision history for this message
Frank Cizmich (frank-cizmich) wrote :
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PT-866
