Percona XtraDB Cluster - HA scalable solution for MySQL

wsrep hton refactoring

Reported by Teemu Ollakka on 2013-10-22
Affects:
MySQL patches by Codership (status tracked in 5.6), assigned to Teemu Ollakka
Percona XtraDB Cluster (status tracked in Trunk)

Bug Description

This ticket tracks wsrep hton refactoring efforts that are not directly tied to specific bugs.

Analysis of several bugs has indicated that the wsrep hton (handlerton) and state code is overly complicated and easily leads to situations that are extremely hard to diagnose and debug. Some recent observations:

1) Transaction cleanup is done in wsrep_cleanup_transaction(), which is called from thd->transaction.cleanup(). Decoupling cleanup from the wsrep_commit()/wsrep_rollback() calls in this way requires an extra state, LOCAL_COMMIT, and an extra variable, thd->wsrep_seqno_changed. It would be better to move cleanup and state reset into wsrep_commit()/wsrep_rollback() and to the end of replay/TOI.

2) Cleanup and state reset after a transaction finishes are incomplete and inconsistent. Some variables retain their non-default values across transaction boundaries. Because the state is then uncertain, after-crash diagnosis can be impossible without extensive debug logs.

3) Too few debug-time sanity checks.
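The three observations above can be illustrated with a minimal C++ sketch. It is not the wsrep patch code: WsrepTrxState, TrxState, trx_begin(), trx_commit() and trx_rollback() are hypothetical names, and the real THD fields differ. The sketch only shows the idea of resetting all per-transaction state in one place, directly from the commit and rollback paths, and asserting in debug builds that nothing leaked across the transaction boundary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for per-connection wsrep state (not the real THD layout).
enum class TrxState { Idle, Executing, Committing, Aborting };

struct WsrepTrxState {
    static constexpr std::int64_t kSeqnoUndefined = -1;

    TrxState state = TrxState::Idle;
    std::int64_t seqno = kSeqnoUndefined;
    bool must_replay = false;

    // True when every field holds its default value.
    bool is_clean() const {
        return state == TrxState::Idle && seqno == kSeqnoUndefined && !must_replay;
    }

    // The single place where all per-transaction fields are reset.
    void reset() { *this = WsrepTrxState{}; }
};

void trx_begin(WsrepTrxState& s) {
    // Debug-time sanity check: nothing may survive from the previous transaction.
    assert(s.is_clean());
    s.state = TrxState::Executing;
}

void trx_commit(WsrepTrxState& s, std::int64_t seqno) {
    assert(s.state == TrxState::Executing);
    s.state = TrxState::Committing;
    s.seqno = seqno;
    // ... the actual commit work would happen here ...
    s.reset();  // cleanup is part of commit, not a separate later hook
}

void trx_rollback(WsrepTrxState& s) {
    assert(s.state == TrxState::Executing);
    s.state = TrxState::Aborting;
    // ... the actual rollback work would happen here ...
    s.reset();  // the same reset on the rollback path
}
```

Because both paths end in the same reset(), no extra "cleanup pending" state or flag is needed in this sketch, and the assertion in trx_begin() catches any field that a future change forgets to reset.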

Some of the recent bugs that are directly or indirectly related to issues described above:

Teemu Ollakka (teemu-ollakka) wrote :

The rest of the wsrep patch refactoring and cleanup should go to lp:1233627.

Seppo Jaakola (seppo-jaakola) wrote :

Cleanup for the IO cache is missing in the slave thread (applier).
This was added to the wsrep-5.6 branch in revision:

Seppo Jaakola (seppo-jaakola) wrote :

The slave thread writes binlog events even when binlogging has not been enabled. This is unnecessary, as a slave transaction will not be replicated further. An extended fix was pushed to the wsrep-5.6 branch, where both the slave thread (applier) and the replaying thread skip binlog operations if the binlog is not enabled.

The actual fix goes in revision:
Revision: removes the original fix in revision #4001, which is now obsolete.
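
The rule described in this comment can be sketched as a small guard. This is only an illustration under stated assumptions: ThreadRole, Session, should_write_binlog() and record_event() are hypothetical names, not functions from the wsrep patch, and the handling of local threads (which keep recording events regardless of the binlog setting) is an assumption of the sketch rather than something stated in this report.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical roles for a server thread; not the real wsrep enum.
enum class ThreadRole { Local, Applier, Replayer };

struct Session {
    ThreadRole role;
    std::vector<std::string> binlog_cache;  // stand-in for the binlog IO cache
};

// Applier and replayer threads skip binlog work when the binlog is disabled,
// since their transactions are not replicated any further.
// Assumption: local threads always record events.
bool should_write_binlog(const Session& s, bool binlog_enabled) {
    if (binlog_enabled) return true;
    return s.role == ThreadRole::Local;
}

void record_event(Session& s, bool binlog_enabled, const std::string& event) {
    if (!should_write_binlog(s, binlog_enabled)) return;  // skip binlog operation
    s.binlog_cache.push_back(event);
}
```

With the guard in one function, the applier and the replaying thread share the same decision instead of each carrying its own check.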
