pt-archiver --bulk-insert data loss

Reported by Daniel Nichter on 2011-08-02
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Toolkit
Undecided
Unassigned

Bug Description

Using --bulk-insert deletes more rows than it inserts, losing data.

Originally reported at http://code.google.com/p/maatkit/issues/detail?id=1260

Changed in percona-toolkit:
assignee: Daniel Nichter (daniel-nichter) → nobody
Changed in percona-toolkit:
importance: Critical → High

mysql> show create Table Test;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Test | CREATE TABLE `Test` (
  `PKey` varchar(30) NOT NULL DEFAULT '',
  `Time` double DEFAULT NULL,
  PRIMARY KEY (`PKey`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Note: Test_Archive also has exactly the same structure.

mysql> select * from Test;
+------+------+
| PKey | Time |
+------+------+
| d | 7 |
| a | 10 |
| b | 9 |
| c | 8 |
| z | 1 |
| x | 2 |
| v | 3 |
| u | 4 |
+------+------+
8 rows in set (0.00 sec)

mysql> select * from Test_Archive;
Empty set (0.00 sec)

Now executed,

mk-archiver -u xxxx -p xxxx --source h=localhost,D=xxxxxx,t=Test --dest h=localhost,D=xxxxxx,t=Test_Archive --where="Time < 8" --limit=2 --bulk-insert --bulk-delete

After that,

 mysql> select * from Test;
+------+------+
| PKey | Time |
+------+------+
| a | 10 |
| b | 9 |
| c | 8 |
| x | 2 |
| v | 3 |
+------+------+
5 rows in set (0.00 sec)

mysql> select * from Test_Archive;
+------+------+
| PKey | Time |
+------+------+
| d | 7 |
| z | 1 |
+------+------+
2 rows in set (0.00 sec)

Data loss ...Attached the output of this as debug.out and the data used as data.sql

On my initial analysis,
   The delete query is taking Pkey corresponding to last seen time column of first_sql. The same pKey is used in the condition of get_next sql query. However this doesn't need to be correct across all the queries.
  Note that things will work fine if the data set is slightly modified i.e. replace the position of d and z.
  The problem also occurs when the primary key is a alpha numeric string.
  Note that things will work fine if i=Time is set in DSN. However in our real scenario this can't be set due to variety of reasons.

IMO, we could do order by of both PKey and Time i.e. primary key and the column on which condition is made or give a option to specify the column name that has to be used for select (get_next) and delete (del row sql) queries eventhough they are not indexes.

Baron Schwartz (baron-xaprb) wrote :

I'm going to confirm this even though I haven't taken the time to actually verify it. It seems there are enough reports that if we can't repro it, we are wrong.

description: updated
Changed in percona-toolkit:
status: New → Confirmed
Changed in percona-toolkit:
importance: High → Undecided
Brian Fraser (fraserbn) on 2012-11-12
Changed in percona-toolkit:
assignee: nobody → Brian Fraser (fraserbn)
Changed in percona-toolkit:
assignee: Brian Fraser (fraserbn) → nobody
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Bug attachments