pt-table-checksum gets stuck in an infinite loop

Bug #1246627 reported by Muhammad Irfan on 2013-10-31
This bug affects 1 person
Affects: Percona Toolkit
Importance: Undecided
Assigned to: Daniel Nichter

Bug Description

I encountered this problem with version 2.2.4.

Here's what I have so far. I have two table dumps, a1 and a2. With a1, you can get skipped chunks and infinite-loop errors, though sometimes the checksum for a1 completes successfully. As for a2, I was only able to narrow it down to 200000 rows.

DROP TABLE IF EXISTS `a1`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `a1` (
  `word` varchar(50) NOT NULL DEFAULT '',
  `sid` int(10) unsigned NOT NULL DEFAULT '0',
  `type` varchar(16) DEFAULT NULL,
  `score` float DEFAULT NULL,
  UNIQUE KEY `word_sid_type` (`word`,`sid`,`type`),
  KEY `sid_type` (`sid`,`type`),
  KEY `word` (`word`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `a2`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `a2` (
  `word` varchar(50) NOT NULL DEFAULT '',
  `sid` int(10) unsigned NOT NULL DEFAULT '0',
  `type` varchar(16) DEFAULT NULL,
  `score` float DEFAULT NULL,
  UNIQUE KEY `word_sid_type` (`word`,`sid`,`type`),
  KEY `sid_type` (`sid`,`type`),
  KEY `word` (`word`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

[user@localhost ~]$ pt-table-checksum --recursion-method=dsn=h=127.0.0.1,D=percona,t=dsns,P=22987 --databases=test3 --host=127.0.0.1 --port=22987 --empty-replicate-table
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LC_CTYPE = "UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
10-01T11:01:56 Skipping chunk 1 of test3.a1 because MySQL chose no index instead of the word_sid_typeindex.
10-01T11:01:56 Error checksumming table test3.a1: Possible infinite loop detected! The lower boundary for chunk 2 is <???, ???, 187641, ???, 187641, node, node> and the lower boundary for chunk 3 is also <???, ???, 187641, ???, 187641, node, node>. This usually happens when using a non-unique single column index. The current chunk index for table test3.a1 is word_sid_type which is unique and covers 3 columns.

TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
10-01T11:01:56 1 0 0 2 2 0.751 test3.a1
Checksumming test3.a2: 30% 01:08 remain
10-01T11:02:41 Skipping chunk 49 of test3.a2 because MySQL chose no index instead of the word_sid_typeindex.
^C# Caught SIGINT.
10-01T11:05:25 0 0 8431156 301 92 208.889 test3.a2
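
Judging only from the wording of the "Possible infinite loop detected!" message above, the check seems to compare the lower boundary of consecutive chunks. The following is a rough sketch of that kind of guard, not the tool's actual code; the function name `nibble` and the simplified three-column boundary tuples are assumptions made for illustration.

```python
# Hypothetical sketch of a "lower boundary did not advance" guard.
# This is NOT pt-table-checksum's implementation.

def nibble(lower_boundaries):
    """Yield (chunk number, lower boundary); raise if two consecutive
    chunks start at the same lower boundary."""
    previous = None
    for chunk_no, lower in enumerate(lower_boundaries, start=1):
        if previous is not None and lower == previous:
            raise RuntimeError(
                f"Possible infinite loop: chunk {chunk_no - 1} and "
                f"chunk {chunk_no} share lower boundary {lower!r}"
            )
        previous = lower
        yield chunk_no, lower


# Simplified (word, sid, type) boundaries; the first tuple is made up,
# the repeated one mirrors the values in the error message.
boundaries = [
    ("abc", 1, "node"),
    ("???", 187641, "node"),
    ("???", 187641, "node"),
]

try:
    for chunk_no, lower in nibble(boundaries):
        print("chunk", chunk_no, "starts at", lower)
except RuntimeError as err:
    print(err)
```

A guard like this only reports that the boundary stopped advancing; it does not explain why, which is why the message has to guess at a cause.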

description: updated
Changed in percona-toolkit:
status: New → Confirmed
assignee: nobody → Daniel Nichter (daniel-nichter)
tags: added: infinite-loop percona-35597 pt-table-checksum
Kevin Cormier (kevin-cormier) wrote :

I can confirm I'm having the same issue, specifically when the selected index is on a varchar column. If the nibble/chunk boundary falls on a string with a Chinese character in it, the next boundary is computed using a ? instead of the Chinese character.
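
The substitution described here can be reproduced outside MySQL. The sketch below is only an illustration in Python, not pt-table-checksum code: it pushes a UTF-8 value through a latin1 encoding with replacement, which is roughly what a latin1 connection does to characters it cannot represent. The helper name `through_latin1` is made up for this example, and the comparisons use plain code-point order, which only approximates a MySQL collation.

```python
# Illustration only: how a multi-byte boundary value degrades to '?'
# when it is forced through a latin1 character set.

def through_latin1(value: str) -> str:
    """Encode to latin1, replacing each unrepresentable character with '?'."""
    return value.encode("latin-1", errors="replace").decode("latin-1")


real_boundary = "吃什麼"            # a value as it is stored in the table
seen_by_tool = through_latin1(real_boundary)

print(seen_by_tool)                 # ???

# Code-point comparison; MySQL collations differ in detail.
print(seen_by_tool < "lh1815")      # True: the mangled value sorts early
print(real_boundary > "widley")     # True: the real value sorts late
```

If the mangled value sorts before the rows already processed, the next chunk starts over from an earlier position, which would be consistent with a boundary list that keeps cycling through the same values.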

mysql Ver 15.1 Distrib 5.5.40-MariaDB, for Linux (x86_64) using readline 5.1
pt-table-checksum 2.2.14

Here is an example of the boundaries that are being computed.

4 ??? lgw
5 lh1815 widgetlike
6 widley ???
7 ??? lgw
8 lh1815 widgetlike
9 widley ???
10 ??? lgw
11 lh1815 widgetlike
12 widley ???
13 ??? lgw
14 lh1815 widgetlike
15 widley ???
16 ??? lgw
17 lh1815 widgetlike
18 widley ???
19 ??? lgw
20 lh1815 widgetlike
21 widley ???

When I grab the same data directly from mysql I get the following boundaries:

lh1815 widgetlike
widley 吃什麼
 各あな <0 results returned......indicates end of table>

The tool should specifically print an error if a varchar index is being used and the db, server, and client charsets don't match. My db and server were both set to utf8; however, my client wasn't explicitly set, so it was using the default (latin1).
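
A check along those lines could look roughly like the sketch below. It is a hypothetical pre-flight helper, not something pt-table-checksum provides: the function name `charset_mismatches` is invented, and the input is assumed to be a dict built from the output of `SHOW VARIABLES LIKE 'character_set%'`. The sample values mirror the setup described above, with `character_set_connection` assumed to follow the client default.

```python
# Hypothetical pre-flight check; pt-table-checksum does not ship this function.

WATCHED = (
    "character_set_client",
    "character_set_connection",
    "character_set_database",
    "character_set_server",
)


def charset_mismatches(variables):
    """Return the watched variables whose value differs from
    character_set_database."""
    expected = variables["character_set_database"]
    return {
        name: variables[name]
        for name in WATCHED
        if name in variables and variables[name] != expected
    }


# Assumed session values: utf8 database and server, client left at latin1.
session = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "utf8",
    "character_set_server": "utf8",
}

mismatches = charset_mismatches(session)
if mismatches:
    print("charset mismatch:", mismatches)
```

Comparing against the database charset is a design choice for this sketch; a real check might instead compare against the charset of the indexed column.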

I fixed the issue by adding A=utf8 to the end of the command.

Also, sorry for the change of tone halfway through the post. I was originally going to post a +1 along with as much debug info as I could find. I ended up solving my own problem, so I figured I'd at least post my findings in the hope that they will be useful to someone in the future.
