pt-table-checksum ignoring primary key

Bug #978432 reported by Dave Juntgen
14
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Fix Released
High
Daniel Nichter
2.0
Invalid
High
Daniel Nichter
2.1
Fix Released
High
Daniel Nichter

Bug Description

pt-table-checksum is ignoring the use of a primary key.

$ ./pt-table-checksum --user=X --password=X -d wc_acj -t multi_resource_apt wc-c1-db1
04-10T19:06:18 Error checksumming table wc_acj.multi_resource_apt: Possible infinite loop detected! The lower boundary for chunk 1 is <738> and the lower boundary for chunk 2 is also <738>. This usually happens when using a non-unique single column index. The current chunk index for table wc_acj.multi_resource_apt is resid which is not unique and covers 1 column.

            TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
04-10T19:06:18 1 0 0 1 1 0.005 wc_acj.multi_resource_apt

Running PTDEBUG=1, I see the following.

# SchemaIterator:4807 17970 Found 1 tables in database wc_acj
# TableParser:1783 17970 /*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := REPLACE(REPLACE(@@SQL_MODE, 'ANSI_QUOTES', ''), ',,', ','), @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
# TableParser:1788 17970 DBI::db=HASH(0x2f246b0) USE `wc_acj`
# TableParser:1792 17970 SHOW CREATE TABLE `wc_acj`.`multi_resource_apt`
# TableParser:1804 17970 /*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
# TableParser:2013 17970 Storage engine: InnoDB
# TableParser:1834 17970 Table cols: `apt_id`, `res_id`
# TableParser:2013 17970 Storage engine: InnoDB
# TableParser:2029 17970 Parsed key: PRIMARY KEY (`apt_id`,`res_id`),
# TableParser:2055 17970 PRIMARY key cols: `apt_id`, `res_id`
# TableParser:2077 17970 This key is the clustered key
# TableParser:2029 17970 Parsed key: KEY `resid` (`res_id`)
# TableParser:2055 17970 resid key cols: `res_id`
# SchemaIterator:4698 17970 Next schema object: wc_acj multi_resource_apt
# RowChecksum:3226 17970 Checksum strat: $VAR1 = {
# crc_type => 'int',
# crc_width => 16,
# func => 'CRC32',
# opt_slice => undef
# };
#
# RowChecksum:3208 17970 Row checksum: CRC32(CONCAT_WS('#', `apt_id`, `res_id`))
# RowChecksum:3247 17970 Chunk checksum: COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `apt_id`, `res_id`)) AS UNSIGNED)), 10, 16)), 0) AS crc
# NibbleIterator:3993 17970 EXPLAIN SELECT * FROM `wc_acj`.`multi_resource_apt` WHERE 1=1
# NibbleIterator:3995 17970 $VAR1 = {
# extra => 'Using index',
# id => '1',
# key => 'resid',
# key_len => '4',
# possible_keys => undef,
# ref => undef,
# rows => '396069',
# select_type => 'SIMPLE',
# table => 'multi_resource_apt',
# type => 'index'
# };
#
# NibbleIterator:3870 17970 One nibble: no

####################
#
# HERE - MySQL ignore's the primary key, in return so does pt-table-checksum
# I think it would be wise to force PRIMARY KEY indexes to be used regardless of other indexes.
# Even though the cardinality is not the highest, the primary key is still unique.
#
####################
# NibbleIterator:3908 17970 MySQL wants to use index resid
# NibbleIterator:3920 17970 Wanted index is a possible index
# NibbleIterator:3938 17970 No PRIMARY or unique indexes; will use index with highest cardinality
# NibbleIterator:3969 17970 SHOW INDEXES FROM `wc_acj`.`multi_resource_apt` WHERE Key_name = 'resid'
# NibbleIterator:3979 17970 Index resid cardinality: 10
# NibbleIterator:3958 17970 Best index: resid
# TableNibbler:2204 17970 Will ascend index resid
# TableNibbler:2211 17970 Will ascend columns res_id
# TableNibbler:2222 17970 Will ascend, in ordinal position: 1
# NibbleIterator:3591 17970 Ascend params: $VAR1 = {
# boundaries => {
# '<' => '((`res_id` < ?))',
# '<=' => '((`res_id` <= ?))',
# '>' => '((`res_id` > ?))',
# '>=' => '((`res_id` >= ?))'
# },
# cols => [
# 'apt_id',
# 'res_id'
# ],
# index => 'resid',
# scols => [
# 'res_id'
# ],
# slice => [
# 1
# ],
# where => '((`res_id` > ?))'
# };
#

Related branches

Dave Juntgen (djuntgen)
description: updated
description: updated
Revision history for this message
Dave Juntgen (djuntgen) wrote :

I know that there is a setting to force a key to be used, --chunk-index=PRIMARY, however as a feature I think that if a primary index exist and no other unique key exists that the primary key should take precedence over any non unique key regardless what MySQL Explain says.

Revision history for this message
Dave Juntgen (djuntgen) wrote :

Additional information regarding invalid index selection. Different error but similar problem:

04-13T05:08:37 Error checksumming table wc_dow.patient_pdups: Possible infinite loop detected! The lower boundary for chunk 2 is <9> and the lower boundary for chunk 3 is also <9>. This usually happens when using a non-unique single column index. The current chunk index for table wc_dow.patient_pdups is idx3 which is not unique and covers 1 column.

Indexes on table:

+---------------+------------+----------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------------+------------+----------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+
| patient_pdups | 0 | idx1 | 1 | pat_id | A | 110893 | NULL | NULL | | BTREE | |
| patient_pdups | 0 | idx1 | 2 | pat_id_match | A | 443575 | NULL | NULL | | BTREE | |
| patient_pdups | 0 | idx2 | 1 | pat_id_match | A | 147858 | NULL | NULL | | BTREE | |
| patient_pdups | 0 | idx2 | 2 | pat_id | A | 443575 | NULL | NULL | | BTREE | |
| patient_pdups | 1 | idx3 | 1 | rating | A | 13 | NULL | NULL | | BTREE | |
+---------------+------------+----------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+

As you can see, idx1 and idx2 are unique indexes, but pt-table-checksum is using idx3, a non unique index, for its chucking.

Revision history for this message
Dave Juntgen (djuntgen) wrote :

Here is the bug:

...
# NibbleIterator:3938 24062 No PRIMARY or unique indexes; will use index with highest cardinality
...

NibbleIterator states there are no PRIMARY or Unique indexes, when in fact there are unique indexes on the table as idx1 and idx2.

See attached patch file.

Revision history for this message
Dave Juntgen (djuntgen) wrote :
Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

Dave,

My memory is that the tool will prefer primary keys over any others regardless. I wonder why that is not happening in this case. Can you paste in SHOW CREATE TABLE for the index?

Revision history for this message
Dave Juntgen (djuntgen) wrote :

mysql> SHOW CREATE TABLE wc_acj.multi_resource_apt\G
*************************** 1. row ***************************
       Table: multi_resource_apt
Create Table: CREATE TABLE `multi_resource_apt` (
  `apt_id` int(10) unsigned NOT NULL DEFAULT '0',
  `res_id` int(10) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`apt_id`,`res_id`),
  KEY `resid` (`res_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

tags: added: chunking pt-table-checksum
Changed in percona-toolkit:
status: New → In Progress
assignee: nobody → Daniel Nichter (daniel-nichter)
Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

I've reproduced this error:

05-04T15:42:32 Error checksumming table test.multi_resource_apt: Possible infinite loop detected! The lower boundary for chunk 1 is <1> and the lower boundary for chunk 2 is also <1>. This usually happens when using a non-unique single column index. The current chunk index for table test.multi_resource_apt is resid which is not unique and covers 1 column.

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

The problem was more than the tool just ignoring the primary key, it was related to the row estimate (which, yes, seems unrelated). The tool does "EXPLAIN SELECT * FROM table WHERE 1=1" to get the row estimate and also the MySQL-chosen index for that query, but that query is not representative of the actual nibbling query (the query that ascends the table, not the one that selects the rows to checksum), so in cases like this, what's a good index for "SELECT * FROM table WHERE 1=1" may not be a good index for the more complex nibbling query.

So the new code does EXPLAIN for the actual nibbling query and if MySQL chooses a different index than the tool originally chose, then the MySQL index is used because MySQL should know best. I know the two indexes (the tool's and MySQL's) can differ when there's a --where clause that the tool can't account for but MySQL can. They will probably also differ in more complex scenarios with various MySQL optimizer tricks that the tool can't account for.

The original pt-table-checksum did this--used MySQL's chosen index--and the new pt-table-checksum does too but in a simplistic manner which doesn't work in cases like this where the PK is not always MySQL's preferred choice.

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

I realized a simpler, better solution. The very original issue for this bug was http://code.google.com/p/maatkit/issues/detail?id=378

MySQL's chosen index is only something we should prefer if --where is used. Else, we can chose our own index and disregard the MySQL index from EXPLAIN SELECT * FROM tbl WHERE 1=1 (the row estimate) because this query is nothing like the checksum queries. If there's a --where, however, then it's used instead of 1=1 and then the MySQL is important because it tells us how MySQL plans to optimize for the --where.

The latest code of the branch simply does:

   if ( !$where ) {
      $mysql_index = undef;
   }

The test for this bug passes and all original tests pass too.

Revision history for this message
Dave Juntgen (djuntgen) wrote : Re: [Bug 978432] Re: pt-table-checksum ignoring primary key
Download full text (5.4 KiB)

I like simple solutions. Thanks.

On Tue, May 8, 2012 at 2:40 PM, Daniel Nichter <email address hidden> wrote:
> I realized a simpler, better solution.  The very original issue for this
> bug was http://code.google.com/p/maatkit/issues/detail?id=378
>
> MySQL's chosen index is only something we should prefer if --where is
> used.  Else, we can chose our own index and disregard the MySQL index
> from EXPLAIN SELECT * FROM tbl WHERE 1=1 (the row estimate) because this
> query is nothing like the checksum queries.  If there's a --where,
> however, then it's used instead of 1=1 and then the MySQL is important
> because it tells us how MySQL plans to optimize for the --where.
>
> The latest code of the branch simply does:
>
>   if ( !$where ) {
>      $mysql_index = undef;
>   }
>
> The test for this bug passes and all original tests pass too.
>
> ** Bug watch added: code.google.com/p/maatkit/issues #378
>   http://code.google.com/p/maatkit/issues/detail?id=378
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/978432
>
> Title:
>  pt-table-checksum ignoring primary key
>
> Status in Percona Toolkit:
>  In Progress
> Status in Percona Toolkit 2.0 series:
>  In Progress
> Status in Percona Toolkit 2.1 series:
>  In Progress
>
> Bug description:
>  pt-table-checksum is ignoring the use of a primary key.
>
>  $ ./pt-table-checksum --user=X --password=X -d wc_acj -t multi_resource_apt wc-c1-db1
>  04-10T19:06:18 Error checksumming table wc_acj.multi_resource_apt: Possible infinite loop detected!  The lower boundary for chunk 1 is <738> and the lower boundary for chunk 2 is also <738>.  This usually happens when using a non-unique single column index.  The current chunk index for table wc_acj.multi_resource_apt is resid which is not unique and covers 1 column.
>
>              TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
>  04-10T19:06:18      1      0        0       1       1   0.005 wc_acj.multi_resource_apt
>
>  Running PTDEBUG=1, I see the following.
>
>  # SchemaIterator:4807 17970 Found 1 tables in database wc_acj
>  # TableParser:1783 17970 /*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := REPLACE(REPLACE(@@SQL_MODE, 'ANSI_QUOTES', ''), ',,', ','), @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
>  # TableParser:1788 17970 DBI::db=HASH(0x2f246b0) USE `wc_acj`
>  # TableParser:1792 17970 SHOW CREATE TABLE `wc_acj`.`multi_resource_apt`
>  # TableParser:1804 17970 /*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
>  # TableParser:2013 17970 Storage engine: InnoDB
>  # TableParser:1834 17970 Table cols: `apt_id`, `res_id`
>  # TableParser:2013 17970 Storage engine: InnoDB
>  # TableParser:2029 17970 Parsed key: PRIMARY KEY (`apt_id`,`res_id`),
>  # TableParser:2055 17970 PRIMARY key cols: `apt_id`, `res_id`
>  # TableParser:2077 17970 This key is the clustered key
>  # TableParser:2029 17970 Parsed key: KEY `resid` (`res_id`)
>  # TableParser:2055 17970 resid key cols: `res_id`
>  # SchemaIterator:4698 17970 Next schema object: wc_acj multi_resource_apt
>  # RowChecksum:3226 17970 Checksum strat: $VAR1 = ...

Read more...

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

Turns out the problem doesn't happen in 2.0 because in that series the tool still used SHOW TABLE STATUS for the row estimate, thereby not returning a MySQL-chosen index. But when --where is used, then it uses EXPLAIN and return's MySQL chosen index.

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

Tested in 2.0 to ensure that it stays ok in 2.0.

Revision history for this message
Sheeri K. Cabral (awfief) wrote :

I'm running pt-table-checksum version 2.1.3 and I'm getting this issue...

08-10T10:44:12 Error checksumming table bugs.bug_group_map: Possible infinite loop detected! The lower boundary for chunk 4 is <26> and the lower boundary for chunk 5 is also <26>. This usually happens when using a non-unique single column index. The current chunk index for table bugs.bug_group_map is bug_group_map_group_id_idx which is not unique and covers 1 column.

            TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
08-10T10:44:12 1 0 0 4 4 0.015 bugs.bug_group_map
08-10T10:50:11 Error checksumming table bugs.user_group_map: Possible infinite loop detected! The lower boundary for chunk 2 is <69> and the lower boundary for chunk 3 is also <69>. This usually happens when using a non-unique single column index. The current chunk index for table bugs.user_group_map is fk_user_group_map_group_id_groups_id which is not unique and covers 1 column.

08-10T10:50:11 1 0 0 2 2 0.092 bugs.user_group_map
08-10T10:50:12 0 1 0 1 0 0.010 mysql.proc
08-10T10:50:13 0 1 300 1 0 0.012 mysql.tables_priv
08-10T10:50:13 0 1 30 1 0 0.010 mysql.user

real 18m47.019s
user 0m3.645s
sys 0m0.467s

[reply] [-] Comment 7 Sheeri Cabral [:sheeri] 2012-08-28 16:27:19 EDT

Here's the bugs.bug_group_map table:

       Table: bug_group_map
Create Table: CREATE TABLE `bug_group_map` (
  `bug_id` mediumint(9) NOT NULL,
  `group_id` mediumint(9) NOT NULL,
  UNIQUE KEY `bug_group_map_bug_id_idx` (`bug_id`,`group_id`),
  KEY `bug_group_map_group_id_idx` (`group_id`),
  CONSTRAINT `fk_bug_group_map_bug_id_bugs_bug_id` FOREIGN KEY (`bug_id`) REFERENCES `bugs` (`bug_id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `fk_bug_group_map_group_id_groups_id` FOREIGN KEY (`group_id`) REFERENCES `groups` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.03 sec)

and the bugs.user_group_map table:
       Table: user_group_map
Create Table: CREATE TABLE `user_group_map` (
  `user_id` mediumint(9) NOT NULL,
  `group_id` mediumint(9) NOT NULL,
  `isbless` tinyint(4) NOT NULL DEFAULT '0',
  `grant_type` tinyint(4) NOT NULL DEFAULT '0',
  UNIQUE KEY `user_group_map_user_id_idx` (`user_id`,`group_id`,`grant_type`,`isbless`),
  KEY `fk_user_group_map_group_id_groups_id` (`group_id`),
  CONSTRAINT `fk_user_group_map_group_id_groups_id` FOREIGN KEY (`group_id`) REFERENCES `groups` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `fk_user_group_map_user_id_profiles_userid` FOREIGN KEY (`user_id`) REFERENCES `profiles` (`userid`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-305

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.