pt-table-sync and text columns with just whitespace

Bug #930693 reported by Rob Tinsley
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Fix Released
Medium
Brian Fraser

Bug Description

I was having trouble syncing some tables with text columns, some of which are just whitespace.

pt-table-sync recognised that rows in the two databases were different, but the UPDATE command it was generating was setting the text column to '' (the empty string) when it should have been ' ' or "\r\n" (crlf).

The problem appears to be in this section of code ...

            if ( $self->{hex_blob}
                 && $tbl_struct->{type_for}->{$col} =~ m/blob|text|binary/ ) {
              $col = "IF(`$col`='', '', CONCAT('0x', HEX(`$col`))) AS `$col`";
            }

... where the IF condition is satisfied more often than I expected (note: MySQL really isn't my thing) ...

mysql> SHOW VARIABLES LIKE 'version%';
+-------------------------+------------------------------------------------------------+
| Variable_name | Value |
+-------------------------+------------------------------------------------------------+
| version | 5.0.77-log |
| version_bdb | Sleepycat Software: Berkeley DB 4.1.24: (January 29, 2009) |
| version_comment | Source distribution |
| version_compile_machine | i686 |
| version_compile_os | redhat-linux-gnu |
+-------------------------+------------------------------------------------------------+
5 rows in set (0.00 sec)

mysql> DESCRIBE the_table;
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
: :
| message | text | NO | | NULL | |
: :
+-------------+--------------+------+-----+---------+----------------+
11 rows in set (0.00 sec)

mysql> SELECT id,message,LENGTH(message),HEX(message),IF(message='','empty','not-empty') FROM the_table WHERE id=24169;
+-------+---------+-----------------+--------------+------------------------------------+
| id | message | LENGTH(message) | HEX(message) | IF(message='','empty','not-empty') |
+-------+---------+-----------------+--------------+------------------------------------+
| 24169 | | 1 | 20 | empty |
+-------+---------+-----------------+--------------+------------------------------------+
1 row in set (0.00 sec)

(note that this is a stock CentOS 5.7 box.)

This patch fixed things for me ...

- $col = "IF(`$col`='', '', CONCAT('0x', HEX(`$col`))) AS `$col`";
+ $col = "IF(LENGTH(`$col`)=0, '', CONCAT('0x', HEX(`$col`))) AS `$col`";

... but assumes that LENGTH() does something sensible for blob and binary columns (which it probably does, I just haven't tested it.)

Related branches

tags: added: pt-table-sync
tags: added: whitespace
Brian Fraser (fraserbn)
Changed in percona-toolkit:
status: New → Confirmed
Brian Fraser (fraserbn)
Changed in percona-toolkit:
assignee: nobody → Brian Fraser (fraserbn)
Brian Fraser (fraserbn)
Changed in percona-toolkit:
milestone: none → 2.1.4
Brian Fraser (fraserbn)
Changed in percona-toolkit:
status: Confirmed → In Progress
Changed in percona-toolkit:
importance: Undecided → Medium
Brian Fraser (fraserbn)
Changed in percona-toolkit:
status: In Progress → Fix Committed
Brian Fraser (fraserbn)
Changed in percona-toolkit:
status: Fix Committed → Fix Released
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-471

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.