Comment 6 for bug 1732165

Revision history for this message
Sveta Smirnova (svetasmirnova) wrote :

It seems what DBI delegates proper handling of data types to database driver and DBD/mysql.pm has this code:

 520 elsif ($basetype =~ /^(binary|varbinary)/)
 521 {
 522 $info->{COLUMN_SIZE} = $type_params[0];
 523 # SQL_BINARY & SQL_VARBINARY are tempting here but don't match the
 524 # semantics for mysql (not hex). SQL_CHAR & SQL_VARCHAR are correct here.
 525 $info->{DATA_TYPE} = ($basetype eq 'binary') ? SQL_CHAR() : SQL_VARCHAR();
 526 }

And SQL_CHAR/SQL_VARCHAR have character set defined (unlike binary)

So partially working workaround could be:

1. Create custom checksum table which will use binary character set instead of utf8:

CREATE TABLE `checksums` (
  `db` binary(64) NOT NULL,
  `tbl` binary(64) NOT NULL,
  `chunk` int(11) NOT NULL,
  `chunk_time` float DEFAULT NULL,
  `chunk_index` varbinary(200) DEFAULT NULL,
  `lower_boundary` blob,
  `upper_boundary` blob,
  `this_crc` binary(40) NOT NULL,
  `this_cnt` int(11) NOT NULL,
  `master_crc` binary(40) DEFAULT NULL,
  `master_cnt` int(11) DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`db`,`tbl`,`chunk`),
  KEY `ts_db_tbl` (`ts`,`db`,`tbl`)
) ENGINE=InnoDB DEFAULT CHARSET=binary;

2. Run pt-table-checksum with binary character set:

~/build/percona-toolkit/bin/pt-table-checksum --set-vars innodb_lock_wait_timeout=50 --tables=db1.t1 --empty-replicate-table --recursion-method=dsn=h=127.0.0.1,P=13001,u=root,D=percona,t=dsns --max-load="Threads_connected=1300, Threads_running=20" --progress="time,5" --chunk-time=0.005 --max-lag=1 --pause-file="/tmp/ptchecksum.pause" --no-check-binlog-format --no-check-replication-filters --chunk-size=10 h=127.0.0.1,P=13001,u=root,A=binary

3. It will hang:

Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain

But it will found correct differences and won't corrupt data:

mysql> select hex(lower_boundary), length(lower_boundary), hex(upper_boundary), length(upper_boundary) from checksums;
+----------------------------------+------------------------+----------------------------------+------------------------+
| hex(lower_boundary) | length(lower_boundary) | hex(upper_boundary) | length(upper_boundary) |
+----------------------------------+------------------------+----------------------------------+------------------------+
| 10553C504DD011E2BA213859F9E35A10 | 16 | 18553C504DD011E2BA213859F9E35A18 | 16 |
| 19553C504DD011E2BA213859F9E35A19 | 16 | 27553C504DD011E2BA213859F9E35A27 | 16 |
| 28553C504DD011E2BA213859F9E35A28 | 16 | 36553C504DD011E2BA213859F9E35A36 | 16 |
| 37553C504DD011E2BA213859F9E35A37 | 16 | 45553C504DD011E2BA213859F9E35A45 | 16 |
| 46553C504DD011E2BA213859F9E35A46 | 16 | 5553C504DD011E2BA213859F9E35A500 | 16 |
| 55553C504DD011E2BA213859F9E35A55 | 16 | 64553C504DD011E2BA213859F9E35A64 | 16 |
| 6553C504DD011E2BA213859F9E35A600 | 16 | 73553C504DD011E2BA213859F9E35A73 | 16 |
| 74553C504DD011E2BA213859F9E35A74 | 16 | 78553C504DD011E2BA213859F9E35A78 | 16 |
| 79553C504DD011E2BA213859F9E35A79 | 16 | 87553C504DD011E2BA213859F9E35A87 | 16 |
| 88553C504DD011E2BA213859F9E35A88 | 16 | 96553C504DD011E2BA213859F9E35A96 | 16 |
| 97553C504DD011E2BA213859F9E35A97 | 16 | 99553C504DD011E2BA213859F9E35A99 | 16 |
| NULL | NULL | 10553C504DD011E2BA213859F9E35A10 | 16 |
| 99553C504DD011E2BA213859F9E35A99 | 16 | NULL | NULL |
+----------------------------------+------------------------+----------------------------------+------------------------+
13 rows in set (0.00 sec)

Test for the first range:

mysql> select count(*) from db1.t1 where guid > unhex('10553C504DD011E2BA213859F9E35A10') and guid < unhex('18553C504DD011E2BA213859F9E35A18');
+----------+
| count(*) |
+----------+
| 8 |
+----------+
1 row in set (0.00 sec)

Carlos, I am still not 100% sure this is DBD::mysql bug, but if you think so its bug database is at https://rt.cpan.org/Public/Dist/Display.html?Name=DBD-mysql