pt-table-checksum fails with binary log error in mysql >= 5.5.18

Bug #919499 reported by Bryan Aldridge
60
This bug affects 12 people
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Fix Released
High
Daniel Nichter
2.0
Fix Released
High
Daniel Nichter
2.1
Fix Released
High
Daniel Nichter

Bug Description

When trying to run pt-table-checksum (percona-toolkit-2.0.2-1) against a master (with one slave) both running MySQL-server-5.5.19, the tool generates the following message for all tables/databases:

$ pt-table-checksum --ask-pass --user root

01-20T16:16:23 Error checksumming table mysql.user: Error executing checksum query: Checksum query for table mysql.user caused MySQL error 1592:
    Level: Note
     Code: 1592
  Message: Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. REPLACE... SELECT is unsafe because the order in which rows are retrieved by the SELECT determines which (if any) rows are replaced. This order cannot be predicted and may differ on master and the slave.
    Query: REPLACE INTO `percona`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT ?, ?, ?, ?, ?, ?, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `host`, `user`, `password`, `select_priv`, `insert_priv`, `update_priv`, `delete_priv`, `create_priv`, `drop_priv`, `reload_priv`, `shutdown_priv`, `process_priv`, `file_priv`, `grant_priv`, `references_priv`, `index_priv`, `alter_priv`, `show_db_priv`, `super_priv`, `create_tmp_table_priv`, `lock_tables_priv`, `execute_priv`, `repl_slave_priv`, `repl_client_priv`, `create_view_priv`, `show_view_priv`, `create_routine_priv`, `alter_routine_priv`, `create_user_priv`, `event_priv`, `trigger_priv`, `create_tablespace_priv`, `ssl_type`, `ssl_cipher`, `x509_issuer`, `x509_subject`, `max_questions`, `max_updates`, `max_connections`, `max_user_connections`, `plugin`, `authentication_string`, CONCAT(ISNULL(`plugin`), ISNULL(`authentication_string`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `mysql`.`user` /*checksum table*/

I suspect it is related to the first "Incompatible change: replication" under "Bug Fixed" here: http://dev.mysql.com/doc/refman/5.5/en/news-5-5-18.html

Related branches

Revision history for this message
Scott Nemes (scottmnemes) wrote :

I received the same error when running pt-table-checksum.

Details:

Percona Server v5.5.19-55 (installed using Percona-Server-server-55-5.5.19-rel24.0.204.rhel5.x86_64.rpm)
pt-table-checksum v2.0.2
CentOS release 5.7 (Final) x86_64

Revision history for this message
Derek Downey (derek-9) wrote :

I also receive this error. My install is from the IUS yum repository:

Name : mysql55-server
Arch : x86_64
Version : 5.5.19
Release : 1.ius.el5

My master server is set to MIXED:

mysql> SHOW VARIABLES LIKE 'binlog_format';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | MIXED |
+---------------+-------+
1 row in set (0.04 sec)

This seems to just be a warning as the results end up in the slave.

I cannot reproduce a REPLACE...SELECT warning on the master, even using prepared statements (which I suspected as a bug with myslq 5.5.18+). My test was something like this:

> CREATE TABLE foo.bar ( tbl char(10) PRIMARY KEY, myCol char(10) );
Query OK, 0 rows affected (0.04 sec)

> REPLACE INTO bar VALUES ('test', 'something');
Query OK, 1 row affected (0.07 sec)

> SELECT * FROM bar WHERE tbl='test';
+------+-----------+
| tbl | myCol |
+------+-----------+
| test | something |
+------+-----------+
1 row in set (0.05 sec)

>PREPARE STMT FROM 'REPLACE INTO bar SELECT ?, ?';
Query OK, 0 rows affected (0.05 sec)
Statement prepared

>set @pTbl = 'test';
Query OK, 0 rows affected (0.04 sec)

l> set @pVar = 'another';
Query OK, 0 rows affected (0.04 sec)

execute STMT USING @pTbl, @pVar;
Query OK, 2 rows affected (0.04 sec)
Records: 1 Duplicates: 1 Warnings: 0

> SELECT * FROM bar WHERE tbl='test';
+------+---------+
| tbl | myCol |
+------+---------+
| test | another |
+------+---------+
1 row in set (0.04 sec)

Not sure if any of that helps.

Changed in percona-toolkit:
status: New → Confirmed
tags: added: mysql-5.5
Revision history for this message
Derek Downey (derek-9) wrote :

This is what I get for not reading the source code. It doesn't matter what my server configuration for 'binlog_format' is set to, as the tool will change the session binlog_format to STATEMENT.

Just adding this as a note in case others were not aware (as I wasn't).

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

Right. I think we need to look into whether the tool should just ignore this and go ahead. I might also ask Oracle to take a look at whether this behavior should be configurable somehow.

Revision history for this message
Zach Carlson (zcarlson) wrote :

It appears pt-table-checksum already has a section of code dedicated to ignoring warning codes -- in fact, *this specific warning code*. Line 6807:

6807:my %ignore_code = (
6808- # Error: 1592 SQLSTATE: HY000 (ER_BINLOG_UNSAFE_STATEMENT)
6809- # Message: Statement may not be safe to log in statement format.
6810- # Ignore this warning because we have purposely set statement-based
6811- # replication.
6812- 1592 => 1,
6813-);

The problem I ran into was that this variable was in a set of curly braces (i.e. in its own scope somewhere), but the code being executed was in an eval outside of that scope. So it appeared as if the variable never existed; because of that, it figured code 1592 should not be ignored. I fixed this by moving it outside of its curly braces; the fix is enclosed in the patch file attached to this post, but you may want to figure out a better solution if there is one.

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

We need to re-apply this fix to the next release and add a test case for it to avoid a regression in the future.

Changed in percona-toolkit:
importance: Undecided → High
Revision history for this message
Marco (marco-andreatta) wrote :

I received the same error when running pt-table-checksum, for all functions but CRC32.
i.e
the following works:
shell> pt-table-checksum --function CRC32

the following don't work:
shell> pt-table-checksum --function FNV1A_64
shell> pt-table-checksum --function MURMUR_HASH

The zcarlson's patch fix it.

Details:

Percona-Server-server-51-5.1.58-rel12.9.271.rhel5
pt-table-checksum v2.0.3
Linux db1.hostelsclub.com 2.6.18-274.18.1.el5 #1 SMP Fri Jan 20 15:11:18 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

Revision history for this message
Howard Chou (maingroup) wrote :

I used pt-table-checksum from the 2.1 branch which has Zach Carlson's patch in it. The bug is still there.

pt-table-checksum -host myhost -u login -pMypasswd

I tried this :
pt-table-checksum --function CRC32 -host myhost -u login -pMypasswd

Both times I got error:
 MySQL error 1592:
    Level: Note
     Code: 1592
  Message: Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. REPLACE... SELECT is unsafe because the order in which rows are retrieved by the SELECT determines which (if any) rows are replaced. This order cannot be predicted and may differ on master and the slave.

Revision history for this message
Howard Chou (maingroup) wrote :

This is a correction to my previous post.

When I looked in the pt-table-checksum source code I just downloaded, I realized that it did not include Zach's fix.
  I edited the Perl script according to Zach's patch. It worked perfectly.

I issued the following command:

pt-table-checksum -h myhost -u login -pMypasswd

No error was received.

Thank you Percona for providing this wonderful tool!

Revision history for this message
Marco Schirrmeister (marcoschirrmeister) wrote :

I also just got this error.

My MySQL version is 5.5.22.

To me it looks like it is OS depended.
The tool runs fine without any error when I run it from a CentOS 6.2 64bit machine.
I get the error when I run it from a CentOS 5.8 64bit machine.

Same command against the same remote database server.
pt-table-checksum dbsrv1 -u mschirrmeister -p xxx --replicate percona.checksums --create-replicate-table --databases certs --nocheck-replication-filters --replicate-database percona

CentOS6
Linux toolbox2 2.6.32-220.4.2.el6.x86_64 #1 SMP Tue Feb 14 04:00:16 GMT 2012 x86_64 x86_64 x86_64 GNU/Linux
perl-5.10.1-119.el6_1.1.x86_64
percona-toolkit-2.0.4-1

CentOS5
Linux toolbox1 2.6.18-194.32.1.el5 #1 SMP Wed Jan 5 17:52:25 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
perl-5.8.8-38.el5
percona-toolkit-2.0.4-1

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

I believe there is some subtlety in the scoping of the Perl variable. So it is probably the difference in the Perl version that is causing the different behavior.

Revision history for this message
Paulus (paulmd-ya) wrote :

Looks like perl 5.8 is affected. Upgraded perl to 5.10 and the bug has gone.

Revision history for this message
Michael Marx (michael-marx) wrote :

I am having the exact same issue as reported by Bryan Aldridge initially; pt-table-checksum failing with unsafe binary log statement. I read through this bug, but a little confused as to what the resolution was. Was the problem with the declaration of the variable in the code or was it due to the version of Perl? I am running Perl 5.8.8.

Changed in percona-toolkit:
importance: High → Undecided
Revision history for this message
Dave Coutu (cout) wrote :

Adding myself as another affected user.

CentOS 5.7 & 5.8
perl-5.8.8-32.el5_6.3
perl-DBD-MySQL-3.0007-2.el5
percona-toolkit-2.1.1-1
Running against MySQL 5.5.20-1

Like Michael I'd like to know what the resolution is. Just need to know whether or not I have to invest some time in bringing up a CentOS 6 box to run pt-table-checksum, or if there is a further fix coming in the 2.1.x train. Zach's patch looks like it is in the 2.1.1 version of pt-table-checksum so it hasn't addressed the issue for me yet. Thanks!

Changed in percona-toolkit:
milestone: none → 2.1.2
assignee: nobody → Daniel Nichter (daniel-nichter)
importance: Undecided → High
Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

Indeed, closures in Perl 5.8 either work differently or there is a bug, as demonstrated by:

#!/usr/bin/perl

sub main {
   exec_nibble();
   return 0;
}

{
my %foo = ( 1 => 2 );

sub exec_nibble {
   my $callback = sub {
      print "OK in callback in exec_nibble()\n" if $foo{1};
   };
   $callback->();
}
}

if ( !caller ) { exit main(@ARGV); }
1;

On 5.8 that prints nothing, but on 5.10 it prints the OK message. I would argue that this is more a bug than how closures work in 5.8 because what's interesting is if you add something like print Dumper(\%foo); either before or after the callback, then %foo becomes visible in the callback and the OK message is printed. Somehow touching %foo in the sub's scope brings it to life in the callback's/closure's scope, so sometimes it's in scope and sometimes it's not even though the scopes don't change.

Odd, but oh well; no need to file a Perl 5.8 bug. We'll just remove the outer { } and make the hashes global vars.

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

Further evidence that this is probably a Perl 5.8 closure bug is that if you use that snippet as a module, i.e. "require test.pl", then call main(), then it works (the OK message is printed). This came up in testing because that's normally how tests call tools (require "pt-table-checksum" then pt_table_checksum::main()), but that wasn't reproducing the problem even though the problem occurred when I ran the tool from the command line. No problem, the test can just call pt-table-checksum as if from the command line.

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

Failing test on CentOS 5 with Perl 5.8:

not ok 6 - Bug 987393 (Perl 5.8 scoping): no errors
# Failed test 'Bug 987393 (Perl 5.8 scoping): no errors'
# in bugs.t at line 2277.
# got: '1'
# expected: '0'
not ok 7 - Bug 987393 (Perl 5.8 scoping): checksummed table
# Failed test 'Bug 987393 (Perl 5.8 scoping): checksummed table'
# in bugs.t at line 2283.
# got: '0'
# expected: '109'
# Looks like you failed 2 tests of 7.

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

This is tested and fixed in the 2.0 and 2.1 branches. You can download it now directly from those branches, or this fix will be in the 2.0.5 and 2.1.2 releases.

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

Might be the reason for the 5.8/5.10 behavior difference: http://perldoc.perl.org/perl590delta.html#Closures%2c-eval-and-lexicals

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-290

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.