Large BLOB/TEXT/BINARY Produces NULL Checksum

Bug #1592608 reported by Jervin R
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Fix Released
Undecided
Carlos Salguero

Bug Description

When checksumming columns with large BLOB/TEXT/BINARY, this can cause CONCAT_WS to overflow and result in NULL value for this_crc/master_crc. By wrapping these columns into CRC32, it removes this problem.

Submitted a pull request for pt-table-sync here:

https://github.com/percona/percona-toolkit/pull/91

pt-table-checksum is affected as well.

Revision history for this message
Jervin R (revin) wrote :
Revision history for this message
Jervin R (revin) wrote :

How to repeat:

master [localhost] {msandbox} (test) > create table t (id int unsigned not null auto_increment primary key, a mediumtext, b mediumblob)engine=innodb;
Query OK, 0 rows affected (0.02 sec)

master [localhost] {msandbox} (test) > insert into t (a, b) values (repeat('a', 2097152*2), cast(repeat('a', 2097152*2) as binary));
Query OK, 1 row affected (0.26 sec)

master [localhost] {msandbox} (test) > select length(a), length(b) from t;
+-----------+-----------+
| length(a) | length(b) |
+-----------+-----------+
| 4194304 | 4194304 |
+-----------+-----------+
1 row in set (0.00 sec)

-- When checksumming with PTDEBUG=1, the following chunk checksumming is produced:

# RowChecksum:5792 17587 Row checksum: CRC32(CONCAT_WS('#', `id`, `a`, `b`, CONCAT(ISNULL(`a`), ISNULL(`b`))))
# RowChecksum:5831 17587 Chunk checksum: COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `id`, `a`, `b`, CONCAT(ISNULL(`a`), ISNULL(`b`)))) AS UNSIGNED)), 10, 16)), 0) AS crc

-- On the slave, the CRC is NULL

slave1 [localhost] {msandbox} (test) > select * from percona.checksum \G
*************************** 1. row ***************************
            db: test
           tbl: t
         chunk: 1
    chunk_time: NULL
   chunk_index: NULL
lower_boundary: NULL
upper_boundary: NULL
      this_crc: 0
      this_cnt: 1
    master_crc: NULL
    master_cnt: NULL
            ts: 2016-06-16 19:19:30
1 row in set (0.00 sec)

-- Testing with the actual CRC query on the table, we get NULL:

master [localhost] {msandbox} (test) > select CRC32(CONCAT_WS('#', `id`, `a`, `b`, CONCAT(ISNULL(`a`), ISNULL(`b`)))) as crc from t;
+------+
| crc |
+------+
| NULL |
+------+
1 row in set, 1 warning (0.00 sec)

-- This is because of max_allowed_packet:

master [localhost] {msandbox} (test) > show warnings;
+---------+------+--------------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------------------------------------------+
| Warning | 1301 | Result of concat_ws() was larger than max_allowed_packet (4194304) - truncated |
+---------+------+--------------------------------------------------------------------------------+
1 row in set (0.00 sec)

-- Adding CRC32 to the BLOB/TEXT columns solves the problem

master [localhost] {msandbox} (test) > select CRC32(CONCAT_WS('#', `id`, CRC32(`a`), CRC32(`b`), CONCAT(ISNULL(`a`), ISNULL(`b`)))) as crc from t;
+------------+
| crc |
+------------+
| 2498444559 |
+------------+
1 row in set (0.01 sec)

Changed in percona-toolkit:
status: New → Confirmed
Changed in percona-toolkit:
assignee: nobody → Carlos Salguero (carlos-salguero)
status: Confirmed → In Progress
Changed in percona-toolkit:
status: In Progress → Fix Committed
Changed in percona-toolkit:
milestone: none → 2.2.19
Changed in percona-toolkit:
status: Fix Committed → Fix Released
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-1356

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.