Comment 8 for bug 1288528

Raghavendra D Prabhu (raghavendra-prabhu) wrote : Re: SST Resumes Even When Donor Was Already Detected as SYNCED State

> Ok, good point about that wsrep_desync is global. The problem is now, if wsrep_desync is already ON, should we allow SET GLOBAL wsrep_desync=ON again (since as you said it should be idempotent)? Or should we block? Or return error?

There is/was a bug in wsrep_desync.

I have fixed it in https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1281696 as follows:

  Bug#1281696 Fix the wsrep_desync warnings; also return true in wsrep_desync_check to avoid error in wsrep_desync_update
diff:
=== modified file 'sql/wsrep_var.cc'
--- sql/wsrep_var.cc 2014-01-14 15:59:04 +0000
+++ sql/wsrep_var.cc 2014-02-27 11:31:26 +0000
@@ -494,16 +494,20 @@

 bool wsrep_desync_check (sys_var *self, THD* thd, set_var* var)
 {
-  bool new_wsrep_desync = var->value->val_bool();
+  bool new_wsrep_desync = var->save_result.ulonglong_value;
   if (wsrep_desync == new_wsrep_desync) {
     if (new_wsrep_desync) {
+      WSREP_DEBUG("wsrep_desync is already ON.");
       push_warning (thd, MYSQL_ERROR::WARN_LEVEL_WARN,
                    ER_WRONG_VALUE_FOR_VAR,
                    "'wsrep_desync' is already ON.");
+      return true;
     } else {
+      WSREP_DEBUG("wsrep_desync is already OFF.");
       push_warning (thd, MYSQL_ERROR::WARN_LEVEL_WARN,
                    ER_WRONG_VALUE_FOR_VAR,
                    "'wsrep_desync' is already OFF.");
+      return true;
     }
   }
   return 0;

The problem reported in this bug, however, is not due to
wsrep_desync. This is probably how it goes:

a) Node receives State transfer request.
b) Node becomes Donor and enters Donor/Desynced state.
c) During this, a mysql client sets wsrep_desync=ON; this
returns the warning 'wsrep_desync is already ON'.
d) The client then sets wsrep_desync=OFF while the state transfer
is still in progress. This is not prevented.
e) SST from this donor node can probably break due to this.

So steps a) - c) are fine; it is step d) that is
problematic.

I think this can be handled in mysqld itself, probably with
LOCK_wsrep_sst, in wsrep_desync_update.
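
To make the idea concrete, here is a minimal standalone sketch of that kind of guard. It is not the actual mysqld code: NodeState, begin_sst, end_sst and set_desync are hypothetical names, the std::mutex only plays the role of LOCK_wsrep_sst, and the refusal message is made up. It also assumes that donor state and the wsrep_desync value can be modelled as one flag, which is a simplification of steps a) - d) above.

```cpp
#include <mutex>
#include <string>

// Hypothetical model of the node; none of these are real mysqld symbols.
struct NodeState {
    std::mutex sst_lock;          // stands in for LOCK_wsrep_sst
    bool sst_in_progress = false; // true while this node is an SST donor
    bool desync = false;          // stands in for the global wsrep_desync
};

// Steps a) and b): the node accepts a state transfer request and
// enters Donor/Desynced state.
void begin_sst(NodeState& n) {
    std::lock_guard<std::mutex> guard(n.sst_lock);
    n.sst_in_progress = true;
    n.desync = true;
}

// State transfer finished; the node resyncs.
void end_sst(NodeState& n) {
    std::lock_guard<std::mutex> guard(n.sst_lock);
    n.sst_in_progress = false;
    n.desync = false;
}

// Models SET GLOBAL wsrep_desync=<value>. Returns an empty string when the
// value was changed, otherwise the reason it was left untouched.
std::string set_desync(NodeState& n, bool value) {
    std::lock_guard<std::mutex> guard(n.sst_lock);
    if (n.desync == value) {
        // Step c): setting the same value again only produces a warning.
        return value ? "'wsrep_desync' is already ON."
                     : "'wsrep_desync' is already OFF.";
    }
    if (!value && n.sst_in_progress) {
        // Step d): refuse to resync while this node is still a donor.
        // (Illustrative message, not an existing server error.)
        return "wsrep_desync cannot be set to OFF during state transfer.";
    }
    n.desync = value;
    return "";
}
```

With this in place, a client that sets OFF between begin_sst and end_sst gets a refusal and the donor stays desynced. Taking the same lock in all three functions is what keeps the check and the update from racing with the start or end of the transfer.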