inconsistency in multi-master test when NOT using binlogging

Bug #872193 reported by Seppo Jaakola
Affects                      Status         Importance   Assigned to     Milestone
MySQL patches by Codership   Fix Released   Critical     Seppo Jaakola
5.1                          Fix Released   Critical     Seppo Jaakola
5.5                          Fix Released   Critical     Seppo Jaakola

Bug Description

This is a sibling bug to: https://bugs.launchpad.net/codership-mysql/+bug/869538

A similar inconsistency occurs in the multi-master test when binlogging has not been enabled. However, the cause is different.

How to reproduce:
1. Start two nodes and make sure that neither log-bin nor log-slave-updates is enabled.
2. Start a load balancer (glbd) to distribute the load with a round-robin policy.
3. Launch a randgen load against the load balancer.

The randgen command line used was:
./gentest.pl --gendata=conf/drizzle/translog_drizzle.zz --grammar=conf/drizzle/translog_concurrent1.yy --queries=2000 --threads=6 --dsn=dbi:mysql:host=127.0.0.1:port=3306:user=root:password=rootpass:database=randgen --debug --sqltrace

One node will fail to apply a replicated transaction, with an error like:

111006 9:41:10 [ERROR] Slave SQL: Could not execute Delete_rows event on table randgen.BB; Can't find record in 'BB', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 259, Error_code: 1032
111006 9:41:10 [Warning] WSREP: RBR event 2 Delete_rows apply warning: 120, 18623
111006 9:42:10 [ERROR] WSREP: Failed to apply trx: source: 91e241d8-efe5-11e0-0800-0883f5b4cb25 version: 1 local: 0 state: CERTIFYING flags: 1 conn_id: 17 trx_id: 295490 seqnos (l: 1941, g: 18623, s: 18620, d: 18599, ts: 1317883179974328479)
111006 9:42:10 [ERROR] WSREP: Failed to apply app buffer: (M<8D>N^S, seqno: 18623, status: WSREP_FATAL

Seppo Jaakola (seppo-jaakola) wrote :

The anatomy of this bug is as follows:
1. A large transaction being processed on node A suffers a statement rollback.
2. The failing statement is rolled back on node A, and the transaction continues.
3. The large transaction commits, but the populated write set (WS) still contains the failed statement.
4. Node B applies this WS, including the failed statement, and the nodes are now inconsistent.
5. Node B commits another transaction that uses information received in the failed statement.
6. When node A tries to apply the WS from node B, the inconsistency caused by the failed statement makes applying impossible.
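
The six steps above can be sketched with a toy simulation. This is only an illustration under stated assumptions, not the actual wsrep code or the actual patch: the names Node, Transaction, ApplyError and simulate are hypothetical, a table is modeled as a plain dict, and the flag truncate_on_stmt_rollback stands in for "the write set forgets a statement that was rolled back", which is one plausible way such a bug could be avoided.

```python
class ApplyError(Exception):
    """Raised when a replicated row event cannot be applied (hypothetical)."""


class Node:
    """Toy replica: one table, modeled as a dict of primary key -> value."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def apply(self, write_set):
        # Apply row events in order, like a row-based replication applier.
        for op, key, value in write_set:
            if op == "insert":
                self.rows[key] = value
            elif op == "delete":
                if key not in self.rows:
                    raise ApplyError(f"node {self.name}: can't find record {key}")
                del self.rows[key]


class Transaction:
    """Toy transaction that collects a write set while it runs locally."""

    def __init__(self, node, truncate_on_stmt_rollback):
        self.node = node
        self.truncate = truncate_on_stmt_rollback
        self.write_set = []

    def run_statement(self, ops, fails=False):
        mark = len(self.write_set)        # write set length before the statement
        snapshot = dict(self.node.rows)   # local state before the statement
        for op in ops:
            self.node.apply([op])
            self.write_set.append(op)
        if fails:
            # Statement rollback: local changes are always undone.
            self.node.rows = snapshot
            if self.truncate:
                # Assumed remedy: drop the failed statement's row events too.
                del self.write_set[mark:]


def simulate(truncate_on_stmt_rollback):
    node_a, node_b = Node("A"), Node("B")

    # Steps 1-3: transaction on A with one failing statement, then commit.
    trx_a = Transaction(node_a, truncate_on_stmt_rollback)
    trx_a.run_statement([("insert", 1, "ok")])
    trx_a.run_statement([("insert", 2, "rolled back")], fails=True)

    # Step 4: B applies A's write set.
    node_b.apply(trx_a.write_set)
    consistent = node_a.rows == node_b.rows

    # Step 5: B deletes row 2 if it can see it.
    trx_b = Transaction(node_b, truncate_on_stmt_rollback)
    trx_b.run_statement([("delete", k, None) for k in list(node_b.rows) if k == 2])

    # Step 6: A applies B's write set.
    try:
        node_a.apply(trx_b.write_set)
        error = None
    except ApplyError as exc:
        error = str(exc)
    return consistent, error


if __name__ == "__main__":
    print("write set keeps failed statement: ", simulate(False))
    print("write set drops failed statement: ", simulate(True))
```

With the flag off, node B ends up with a row that node A never kept, and the delete replicated back from B fails on A in the same way as the Delete_rows error in the log above. With the flag on, both nodes stay identical and nothing fails to apply.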

Seppo Jaakola (seppo-jaakola) wrote :

The fix for statement rollback handling in the wsrep-5.5 branch was pushed in change set #3582:
http://bazaar.launchpad.net/~codership/codership-mysql/wsrep-5.5/revision/3582

Seppo Jaakola (seppo-jaakola) wrote :

The fix for statement rollback handling in the wsrep-5.1 branch was pushed in change set #3145:
http://bazaar.launchpad.net/~codership/codership-mysql/wsrep-5.1/revision/3145
