Comment 1 for bug 820079

Revision history for this message
veerabahu (moonpearl-vegtime-2000) wrote :

mysql> show create Table Test;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Test | CREATE TABLE `Test` (
  `PKey` varchar(30) NOT NULL DEFAULT '',
  `Time` double DEFAULT NULL,
  PRIMARY KEY (`PKey`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Note: Test_Archive also has exactly the same structure.

mysql> select * from Test;
+------+------+
| PKey | Time |
+------+------+
| d | 7 |
| a | 10 |
| b | 9 |
| c | 8 |
| z | 1 |
| x | 2 |
| v | 3 |
| u | 4 |
+------+------+
8 rows in set (0.00 sec)

mysql> select * from Test_Archive;
Empty set (0.00 sec)

Now executed,

mk-archiver -u xxxx -p xxxx --source h=localhost,D=xxxxxx,t=Test --dest h=localhost,D=xxxxxx,t=Test_Archive --where="Time < 8" --limit=2 --bulk-insert --bulk-delete

After that,

 mysql> select * from Test;
+------+------+
| PKey | Time |
+------+------+
| a | 10 |
| b | 9 |
| c | 8 |
| x | 2 |
| v | 3 |
+------+------+
5 rows in set (0.00 sec)

mysql> select * from Test_Archive;
+------+------+
| PKey | Time |
+------+------+
| d | 7 |
| z | 1 |
+------+------+
2 rows in set (0.00 sec)

Data loss ...Attached the output of this as debug.out and the data used as data.sql

On my initial analysis,
   The delete query is taking Pkey corresponding to last seen time column of first_sql. The same pKey is used in the condition of get_next sql query. However this doesn't need to be correct across all the queries.
  Note that things will work fine if the data set is slightly modified i.e. replace the position of d and z.
  The problem also occurs when the primary key is a alpha numeric string.
  Note that things will work fine if i=Time is set in DSN. However in our real scenario this can't be set due to variety of reasons.

IMO, we could do order by of both PKey and Time i.e. primary key and the column on which condition is made or give a option to specify the column name that has to be used for select (get_next) and delete (del row sql) queries eventhough they are not indexes.