Ragu,
I know very well that STATEMENT is not the official replication way for
galera.
But this is exactly what I am trying to underline, there is an
inconsistency from my point of view.
Let me explain the whole in more details.
If you check the link in the bug report you will see that there was a
"trick" design for the pt-table-checksum, trick that still there given
it report the info in the log when use.
This trick DOES not work when wsrep_forced_binlog_format=ROW.
Given it force the log to stay as ROW whatever instruction.
If we change wsrep_forced_binlog_format=STATEMENT, obviously it works,
but given the incompatibility with the GALERA replication you have 99%
that the node will crash with:
140328 22:43:41 [Warning] WSREP: BF applier failed to
open_and_lock_tables: 1666, fatal: 0 wsrep = (exec_mode: 1
conflict_state: 0 seqno: 291772803)
140328 22:43:41 [ERROR] Slave SQL: Error executing row event: 'Cannot
execute statement: impossible to write to binary log since statement is
in row format and BINLOG_FORMAT = STATEMENT.', Error_code: 1666
140328 22:43:41 [Warning] WSREP: RBR event 2 Update_rows apply warning:
1666, 291772803
140328 22:43:41 [ERROR] WSREP: Failed to apply trx: source:
1f284b88-7401-11e3-afd2-76df52ec0065 version: 2 local: 0 state: APPLYING
flags: 1 conn_id: 37912472 trx_id: 828722030 seqnos (l: 92316516, g:
291772803, s: 291772802, d: 291772802, ts: 1396064621012184336)
140328 22:43:41 [ERROR] WSREP: Failed to apply trx 291772803 10 times
140328 22:43:41 [ERROR] WSREP: Node consistency compromized, aborting...
140328 22:43:41 [Note] WSREP: Closing send monitor...
Testing wsrep_forced_binlog_format=NONE ...
pt-table-checksum is workign fine, the binary log show that the
statement are logged in STATEMENT mode:
#140329 15:03:52 server id 35 end_log_pos 1798 Query
thread_id=44936 exec_time=0 error_code=0
use `sakila`/*!*/;
SET TIMESTAMP=1396119832/*!*/;
REPLACE INTO `percona`.`checksums` (db, tbl, chunk, chunk_index,
lower_boundary, upper_boundary, this_cnt, this_crc) SELECT /*!99997*/
'sakila', 'actor', '1', NULL, NULL, NULL, COUNT(*) AS cnt,
COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `actor_id`,
`first_name`, `last_name`, `last_update` + 0)) AS UNSIGNED)), 10, 16)),
0) AS crc FROM `sakila`.`actor` /*checksum table*/
/*!*/;
# at 1798
#140329 15:03:52 server id 35 end_log_pos 1825 Xid = 306
COMMIT/*!*/;
But this has NOTHING to do with that trick, but with the fact that
galera is not enforcing the ROW format at session level.
As such the pt-table-checksum is able to change it (Line 9097)
if ( $original_binlog_format !~ /STATEMENT/i ) {
$sql = q{/*!50108 SET @@binlog_format := 'STATEMENT'*/};
eval { PTDEBUG && _d($dbh, $sql); $dbh->do($sql);
};
At the same time if I have other activity running on the other node ...
or on this Master node, this activity is reported in ROW format:
root@localhost:pm) [sakila]>insert into payment3 select * from payment2;
#140329 15:18:19 server id 3 end_log_pos 4991358 Write_rows: table id
57 flags: STMT_END_F
### INSERT INTO sakila.payment3
### SET
### @1=1 /* SHORTINT meta=0 nullable=1 is_null=0 */
### @2=1 /* SHORTINT meta=0 nullable=0 is_null=0 */
### @3=1 /* TINYINT meta=0 nullable=0 is_null=0 */
### @4=76 /* INT meta=0 nullable=1 is_null=0 */
### @5=000000002.990000000 /* DECIMAL(5,2) meta=1282 nullable=0
is_null=0 */
### @6=2005-05-25 11:30:37 /* DATETIME meta=0 nullable=0 is_null=0 */
### @7=1140059550 /* TIMESTAMP meta=0 nullable=0 is_null=0 */
### INSERT INTO sakila.payment3
Given all the above, it seems that WE CAN use pt-table-checksum, but I
am unsure about the GALERA stability, and WHY then we have that trick.
SO it comes AFTER, the wsrep_forced_binlog_format, and as such I am tent
to assume it was introduced because there was a problem on the use of
wsrep_forced_binlog_format...
So after all this I think we can summarize my question as:
"If we have to use the pt-table-checksum, will be the usage of
wsrep_forced_binlog_format=NONE to be consider SAFE and CONSISTENT for
galera replication, OR we have to assume that this parameter should
NEVER be set/modify?
In the case it cannot be used, what is the way then to use
pt-table-checksum.
In any case what is the reason of having the /*!99997*/ trick?
"
Should be consider all this as a documentation BUG or an dangerous
exposed feature or a real bug?
------------------------------------------------
Marco Tusa,
Mobile: +1-613-282-7337
do something for your planet! http://www.tusacentral.net/
------------------------------------------------
Save a tree... please don't print this e-mail unless you really need to
On 29/03/2014 03:27, Raghavendra D Prabhu wrote:
>> The issue is that when running pt-table-checksum on node1 all the statement that pt-table-checksum execute to test the data
> > on the nodes, are replicated with ROW replication.
>
> This is because STATEMENT binlog format is not supported in
> wsrep/galera. pt-checksum possibly needs fixes for PXC in this regard.
>
Ragu,
I know very well that STATEMENT is not the official replication way for
galera.
But this is exactly what I am trying to underline, there is an
inconsistency from my point of view.
Let me explain the whole in more details.
If you check the link in the bug report you will see that there was a binlog_ format= ROW.
"trick" design for the pt-table-checksum, trick that still there given
it report the info in the log when use.
This trick DOES not work when wsrep_forced_
Given it force the log to stay as ROW whatever instruction.
If we change wsrep_forced_ binlog_ format= STATEMENT, obviously it works, lock_tables: 1666, fatal: 0 wsrep = (exec_mode: 1 7401-11e3- afd2-76df52ec00 65 version: 2 local: 0 state: APPLYING 4336)
but given the incompatibility with the GALERA replication you have 99%
that the node will crash with:
140328 22:43:41 [Warning] WSREP: BF applier failed to
open_and_
conflict_state: 0 seqno: 291772803)
140328 22:43:41 [ERROR] Slave SQL: Error executing row event: 'Cannot
execute statement: impossible to write to binary log since statement is
in row format and BINLOG_FORMAT = STATEMENT.', Error_code: 1666
140328 22:43:41 [Warning] WSREP: RBR event 2 Update_rows apply warning:
1666, 291772803
140328 22:43:41 [ERROR] WSREP: Failed to apply trx: source:
1f284b88-
flags: 1 conn_id: 37912472 trx_id: 828722030 seqnos (l: 92316516, g:
291772803, s: 291772802, d: 291772802, ts: 139606462101218
140328 22:43:41 [ERROR] WSREP: Failed to apply trx 291772803 10 times
140328 22:43:41 [ERROR] WSREP: Node consistency compromized, aborting...
140328 22:43:41 [Note] WSREP: Closing send monitor...
Testing wsrep_forced_ binlog_ format= NONE ... 1396119832/ *!*/; .`checksums` (db, tbl, chunk, chunk_index, LOWER(CONV( BIT_XOR( CAST(CRC32( CONCAT_ WS('#', `actor_id`,
pt-table-checksum is workign fine, the binary log show that the
statement are logged in STATEMENT mode:
#140329 15:03:52 server id 35 end_log_pos 1798 Query
thread_id=44936 exec_time=0 error_code=0
use `sakila`/*!*/;
SET TIMESTAMP=
REPLACE INTO `percona`
lower_boundary, upper_boundary, this_cnt, this_crc) SELECT /*!99997*/
'sakila', 'actor', '1', NULL, NULL, NULL, COUNT(*) AS cnt,
COALESCE(
`first_name`, `last_name`, `last_update` + 0)) AS UNSIGNED)), 10, 16)),
0) AS crc FROM `sakila`.`actor` /*checksum table*/
/*!*/;
# at 1798
#140329 15:03:52 server id 35 end_log_pos 1825 Xid = 306
COMMIT/*!*/;
But this has NOTHING to do with that trick, but with the fact that
galera is not enforcing the ROW format at session level.
As such the pt-table-checksum is able to change it (Line 9097) binlog_ format !~ /STATEMENT/i ) {
PTDEBUG && _d($dbh, $sql);
$dbh- >do($sql) ;
if ( $original_
$sql = q{/*!50108 SET @@binlog_format := 'STATEMENT'*/};
eval {
};
At the same time if I have other activity running on the other node ...
or on this Master node, this activity is reported in ROW format:
root@localhost:pm) [sakila]>insert into payment3 select * from payment2; 990000000 /* DECIMAL(5,2) meta=1282 nullable=0
#140329 15:18:19 server id 3 end_log_pos 4991358 Write_rows: table id
57 flags: STMT_END_F
### INSERT INTO sakila.payment3
### SET
### @1=1 /* SHORTINT meta=0 nullable=1 is_null=0 */
### @2=1 /* SHORTINT meta=0 nullable=0 is_null=0 */
### @3=1 /* TINYINT meta=0 nullable=0 is_null=0 */
### @4=76 /* INT meta=0 nullable=1 is_null=0 */
### @5=000000002.
is_null=0 */
### @6=2005-05-25 11:30:37 /* DATETIME meta=0 nullable=0 is_null=0 */
### @7=1140059550 /* TIMESTAMP meta=0 nullable=0 is_null=0 */
### INSERT INTO sakila.payment3
Given all the above, it seems that WE CAN use pt-table-checksum, but I
am unsure about the GALERA stability, and WHY then we have that trick.
The parameter wsrep_forced_ binlog_ format was introduced in Galera on the /bugs.launchpad .net/codership- mysql/+ bug/887564). /groups. google. com/forum/ #!msg/codership -team/kKqigq31d VQ/TV-DTUI0YYYJ)
2011-11-08 (Seppo comment
https:/
The trick seems from 4/24/12 Alex comment
(https:/
SO it comes AFTER, the wsrep_forced_ binlog_ format, and as such I am tent binlog_ format. ..
to assume it was introduced because there was a problem on the use of
wsrep_forced_
So after all this I think we can summarize my question as:
"If we have to use the pt-table-checksum, will be the usage of binlog_ format= NONE to be consider SAFE and CONSISTENT for
wsrep_forced_
galera replication, OR we have to assume that this parameter should
NEVER be set/modify?
In the case it cannot be used, what is the way then to use
pt-table-checksum.
In any case what is the reason of having the /*!99997*/ trick?
"
Should be consider all this as a documentation BUG or an dangerous
exposed feature or a real bug?
------- ------- ------- ------- ------- ------- ------ www.tusacentral .net/ ------- ------- ------- ------- ------- ------
Marco Tusa,
Mobile: +1-613-282-7337
do something for your planet!
http://
-------
Save a tree... please don't print this e-mail unless you really need to
On 29/03/2014 03:27, Raghavendra D Prabhu wrote:
>> The issue is that when running pt-table-checksum on node1 all the statement that pt-table-checksum execute to test the data
> > on the nodes, are replicated with ROW replication.
>
> This is because STATEMENT binlog format is not supported in
> wsrep/galera. pt-checksum possibly needs fixes for PXC in this regard.
>