Comment 4 for bug 1298689

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote : Re: [Bug 1298689] Re: Possible bug in the in MySQL + Galera and PXC when running PT-table-checksum on a slave.

On 30 Mar 2014 01:15, "Marco Tusa" <email address hidden> wrote:
>
> Ragu,
> I know very well that STATEMENT is not the official replication way for
>

>
>
> But this has NOTHING to do with that trick, but with the fact that
> galera is not enforcing the ROW format at session level.

This has been fixed in 5.6 and a warning is emitted in 5.5 (no error here
for backward compatibility).

>
> As such the pt-table-checksum is able to change it (Line 9097)
> if ( $original_binlog_format !~ /STATEMENT/i ) {
> $sql = q{/*!50108 SET @@binlog_format := 'STATEMENT'*/};
> eval {
> PTDEBUG && _d($dbh, $sql);
> $dbh->do($sql);
> };
>
>
> At the same time if I have other activity running on the other node ...
> or on this Master node, this activity is reported in ROW format:
>
> root@localhost:pm) [sakila]>insert into payment3 select * from payment2;
> #140329 15:18:19 server id 3 end_log_pos 4991358 Write_rows: table id
> 57 flags: STMT_END_F
> ### INSERT INTO sakila.payment3
> ### SET
> ### @1=1 /* SHORTINT meta=0 nullable=1 is_null=0 */
> ### @2=1 /* SHORTINT meta=0 nullable=0 is_null=0 */
> ### @3=1 /* TINYINT meta=0 nullable=0 is_null=0 */
> ### @4=76 /* INT meta=0 nullable=1 is_null=0 */
> ### @5=000000002.990000000 /* DECIMAL(5,2) meta=1282 nullable=0
> is_null=0 */
> ### @6=2005-05-25 11:30:37 /* DATETIME meta=0 nullable=0 is_null=0 */
> ### @7=1140059550 /* TIMESTAMP meta=0 nullable=0 is_null=0 */
> ### INSERT INTO sakila.payment3
>
>
> Given all the above, it seems that WE CAN use pt-table-checksum, but I
> am unsure about the GALERA stability, and WHY then we have that trick.
>
> The parameter wsrep_forced_binlog_format was introduced in Galera on the
> 2011-11-08 (Seppo comment
> https://bugs.launchpad.net/codership-mysql/+bug/887564).
> The trick seems from 4/24/12 Alex comment
> (
https://groups.google.com/forum/#!msg/codership-team/kKqigq31dVQ/TV-DTUI0YYYJ
)
>
> SO it comes AFTER, the wsrep_forced_binlog_format, and as such I am tent
> to assume it was introduced because there was a problem on the use of
> wsrep_forced_binlog_format...
>
> So after all this I think we can summarize my question as:
>
> "If we have to use the pt-table-checksum, will be the usage of
> wsrep_forced_binlog_format=NONE to be consider SAFE and CONSISTENT for
> galera replication, OR we have to assume that this parameter should
> NEVER be set/modify?
>
> In the case it cannot be used, what is the way then to use
> pt-table-checksum.
> In any case what is the reason of having the /*!99997*/ trick?
> "
> Should be consider all this as a documentation BUG or an dangerous
> exposed feature or a real bug?
>
>
> ------------------------------------------------
> Marco Tusa,
> Mobile: +1-613-282-7337
> do something for your planet!
> http://www.tusacentral.net/
> ------------------------------------------------
> Save a tree... please don't print this e-mail unless you really need to
>
> On 29/03/2014 03:27, Raghavendra D Prabhu wrote:
> >> The issue is that when running pt-table-checksum on node1 all the
statement that pt-table-checksum execute to test the data
> > > on the nodes, are replicated with ROW replication.
> >
> > This is because STATEMENT binlog format is not supported in
> > wsrep/galera. pt-checksum possibly needs fixes for PXC in this regard.
> >
>
> --
> You received this bug notification because you are subscribed to MySQL
> patches by Codership.
> Matching subscriptions: wsrep
> https://bugs.launchpad.net/bugs/1298689
>
> Title:
> Possible bug in the in MySQL + Galera and PXC when running PT-table-
> checksum on a slave.
>
> Status in MySQL patches by Codership:
> New
>
> Bug description:
> Current nodes setup
>
> node1 <-> node2 <-> node3
> |
> \/
> Slave
>
> The issue is that when running pt-table-checksum on node1 all the
statement that pt-table-checksum execute to test the data on the nodes, are
replicated with ROW replication.
> To note that I am testing on percona toolkit 2.2.7 and the version of
pt-table-checksum is implementing the /*!99997*/ trick as describe at
https://groups.google.com/forum/#!msg/codership-team/kKqigq31dVQ/TV-DTUI0YYYJ
.
>
> From MySQL point of view I have tested with PXC 5.5.34 and Codership
5.5.43 (wsrep 25.9)
>
> When pt-table-checksum is executed both the servers print the DEBUG
information,
> 140327 15:38:48 [Note] WSREP: consistency check: REPLACE INTO
`percona`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary,
upper_boundary, this_cnt, this_crc)
> SELECT /*!99997*/ 'sakila', 'staff', '1', NULL, NULL, NULL, COUNT(*)
AS cnt,
> COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `staff_id`,
`first_name`, `last_name`, `address_id`, `picture`, `email`,
> `store_id`, `active`, `username`, `password`, `last_update` + 0,
CONCAT(ISNULL(`picture`), ISNULL(`email`), ISNULL(`password`))))
> AS UNSIGNED)), 10, 16)), 0) AS crc FROM `sakila`.`staff` /*checksum
table*/
>
> But then the information is passed as ROW format as UPDATE:
> BINLOG '
>
SH40UxMjAAAATAAAAA2SAAAAACQAAAAAAAEAB3BlcmNvbmEACWNoZWNrc3VtcwAM/v4DBA/8/P4D
> /gMHDf7A/sAEWAICAv54/nh4Bg==
>
SH40UxgjAAAAeQAAAIaSAAAAACQAAAAAAAEADP////949gZzYWtpbGEFc3RhZmYBAAAACDIzMzY2
>
OGFlAgAAAEh+NFNw8AZzYWtpbGEFc3RhZmYBAAAANzMqPggyMzM2NjhhZQIAAAAIMjMzNjY4YWUC
> AAAASH40Uw==
> '/*!*/;
> ### UPDATE percona.checksums
> ### WHERE
> ### @1='sakila' /* STRING(192) meta=65216 nullable=0
is_null=0 */
> ### @2='staff' /* STRING(192) meta=65216 nullable=0
is_null=0 */
> ### @3=1 /* INT meta=0 nullable=0 is_null=0 */
> ### @4=NULL /* INT meta=4 nullable=1 is_null=1 */
> ### @5=NULL /* INT meta=600 nullable=1 is_null=1 */
> ### @6=NULL /* INT meta=2 nullable=1 is_null=1 */
> ### @7=NULL /* INT meta=2 nullable=1 is_null=1 */
> ### @8='233668ae' /* STRING(120) meta=65144 nullable=0
is_null=0 */
> ### @9=2 /* INT meta=0 nullable=0 is_null=0 */
> ### @10=NULL /* INT meta=65144 nullable=1 is_null=1 */
> ### @11=NULL /* INT meta=0 nullable=1 is_null=1 */
> ### @12=1395949128 /* TIMESTAMP meta=0 nullable=0
is_null=0 */
> ### SET
> ### @1='sakila' /* STRING(192) meta=65216 nullable=0
is_null=0 */
> ### @2='staff' /* STRING(192) meta=65216 nullable=0
is_null=0 */
> ### @3=1 /* INT meta=0 nullable=0 is_null=0 */
> ### @4=0.166211 /* FLOAT meta=4 nullable=1
is_null=0 */
> ### @5=NULL /* FLOAT meta=600 nullable=1 is_null=1 */
> ### @6=NULL /* FLOAT meta=2 nullable=1 is_null=1 */
> ### @7=NULL /* FLOAT meta=2 nullable=1 is_null=1 */
> ### @8='233668ae' /* STRING(120) meta=65144 nullable=0
is_null=0 */
> ### @9=2 /* INT meta=0 nullable=0 is_null=0 */
> ### @10='233668ae' /* STRING(120) meta=65144 nullable=1
is_null=0 */
> ### @11=2 /* INT meta=0 nullable=1 is_null=0 */
> ### @12=1395949128 /* TIMESTAMP meta=0 nullable=0
is_null=0 */
> # at 37510
>
> Given that there is no checksum execution on the SLAVE.
>
>
> pt-table-checksum code related to the patch
> #
########################################################################
> # Checksum args and the DMS part of the checksum query for each
table.
> #
########################################################################
> my %crc_args = $rc->get_crc_args(dbh => $master_dbh);
> my $checksum_dml = "REPLACE INTO $repl_table "
> . "(db, tbl, chunk, chunk_index,"
> . " lower_boundary, upper_boundary, this_cnt,
this_crc) "
> . "SELECT"
> . ($cluster->is_cluster_node($master_cxn) ? '
/*!99997*/' : '')
> . " ?, ?, ?, ?, ?, ?,";
> my $past_cols = " COUNT(*), '0'";
>
>
>
>
>
> HOW to replicate:
> ================================
> 1) install a cluster using either PXC or MySQL/Galera both(5.5)
> 2) connect another MySQL (simple) as slave to node1
> 3) load a test schema
> 4) remove some records on the SLAVE
> 5) run pt-table-checksum as /pt-table-checksum -u<user> -p<pw> -h
192.168.0.35 -P3306 --databases sakila,test --recursion-method
dsn=h=192.168.0.35,P=3306,u=<user>,p=<pw>,D=percona,t=dsns
--no-check-binlog-format
> dsns table:
> (root@localhost:pm) [percona]>select * from dsns;
> +----+-----------+---------------------------------------+
> | id | parent_id | dsn |
> +----+-----------+---------------------------------------+
> | 6 | NULL | h=192.168.0.35,P=5510,u=stress,p=tool |
> +----+-----------+---------------------------------------+
> 1 row in set (0.00 sec)
> 6) review data from the pt-table-checksum and from the query on
master/slave
> SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS
chunks
> FROM percona.checksums
> WHERE (
> master_cnt <> this_cnt
> OR master_crc <> this_crc
> OR ISNULL(master_crc) <> ISNULL(this_crc))
> GROUP BY db, tbl;
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/codership-mysql/+bug/1298689/+subscriptions