Inconsistent behavior on data consistency abort
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC) | Fix Committed | Undecided | Unassigned |
5.7 | Fix Committed | Undecided | Unassigned |
Bug Description
Currently, when nodes abort on a data consistency problem, the cluster may behave in different ways, depending on the total number of nodes and on how fast the other nodes manage to quit the cluster.
For a two-node cluster, in all my tests the writer remains Primary; the abort behaves like a clean shutdown of the second node, so quorum is not lost.
In a three-node cluster, however, the writer sometimes remains Primary and sometimes does not, depending on how fast the other two nodes leave the cluster. If the writer sees both of them leaving the cluster at the same time, it enters the non-Primary state.
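For reference, the state in question can be checked on the surviving node with the standard wsrep status variables (a minimal check, not specific to this bug):
mysql> show status like 'wsrep_cluster_status';
-- 'Primary' when the writer kept quorum, 'non-Primary' when it lost it
mysql> show status like 'wsrep_cluster_size';
-- number of nodes in the component this node currently belongs to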
Tested on version:
mysql> select @@version, @@version_comment\G
*************************** 1. row ***************************
        @@version: 5.7.18-15-57-log
@@version_comment: Percona XtraDB Cluster (GPL), Release rel15, Revision 7693d6e, WSREP version 29.20, wsrep_29.20
1 row in set (0.00 sec)
Test case:
use test
create table t1 (id int primary key);
insert into t1 values (1),(2);
set wsrep_on=0; insert into t1 values (7); set wsrep_on=1; delete from t1 where id=7;
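The last line above creates the inconsistency on purpose: with wsrep_on=0 the INSERT is applied only locally on the writer and is not replicated, so the replicated DELETE that follows finds no row with id=7 on the other nodes, and their appliers abort on the consistency failure. The divergence can be confirmed before issuing the DELETE (a minimal check):
-- on the writer:
mysql> select * from t1;  -- returns ids 1, 2, 7
-- on any other node:
mysql> select * from t1;  -- returns ids 1, 2 only; row 7 was never replicated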
Result on the writer node in a 2-node cluster:
[... log output truncated ...]
view (view_id(
memb {
97ba771e,0
}
joined {
}
left {
}
partitioned {
9cf75139,0
}
)
[... log output truncated ...]
Result on the writer node in a 3-node cluster, when both peers exit at the same time:
[... log output truncated ...]
view (view_id(
memb {
97ba771e,0
}
joined {
}
left {
}
partitioned {
39e4cef7,0
71f2b1d2,0
}
)
[... log output truncated ...]
In all cases, the nodes that abort seem to send the leave message properly:
[... log output truncated ...]
IMHO the behavior should be consistent: either we treat the leaving nodes as if it were a clean shutdown, and the remaining node re-configures the PC as a single-node primary cluster, or we put it in the non-Primary state when it loses quorum (the cluster doesn't have >50% of nodes up).
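As a side note (a workaround for the quorum loss, not a fix for the inconsistent behavior itself): the quorum calculation can be biased so that a designated node survives losing both peers at once. Assuming the other nodes keep the default weight of 1, something like:
mysql> set global wsrep_provider_options = 'pc.weight=3';
-- writer holds weight 3 out of a total of 5, which stays above 50%,
-- so it should remain Primary even if both peers partition away together
pc.weight is a standard Galera provider option; the exact value to use depends on the topology.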
Przemek,
I just tried this multiple times, including a scenario where both nodes leave at the same time, but couldn't reproduce the issue. Do you have anything specific in your my.cnf?
Can you share your my.cnf?
-------
2017-07-17T08:50:21.717558Z 0 [Note] WSREP: 2.2 (n3): State transfer from 0.0 (n1) complete.
2017-07-17T08:50:21.718225Z 0 [Note] WSREP: Member 2.2 (n3) synced with group.
2017-07-17T08:50:22.834585Z 0 [Note] WSREP: 1.0 (n2): State transfer from 0.0 (n1) complete.
2017-07-17T08:50:22.835420Z 0 [Note] WSREP: Member 1.0 (n2) synced with group.
2017-07-17T08:50:36.900232Z 0 [Note] WSREP: forgetting ed17fecc (tcp://127.0.0.1:5030)
2017-07-17T08:50:36.900323Z 0 [Note] WSREP: forgetting edcc146d (tcp://127.0.0.1:6030)
2017-07-17T08:50:36.900459Z 0 [Note] WSREP: Node ec95cd79 state primary
2017-07-17T08:50:36.900572Z 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(PRIM,ec95cd79,4)
memb {
ec95cd79,0
}
joined {
}
left {
}
partitioned {
ed17fecc,0
edcc146d,2
}
)
**************** BOTH NODES NOTIFIED TO LEAVE AT THE SAME TIME ****************
2017-07-17T08:50:36.900598Z 0 [Note] WSREP: Save the discovered primary-component to disk
2017-07-17T08:50:36.900810Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
2017-07-17T08:50:36.901330Z 0 [Note] WSREP: forgetting ed17fecc (tcp://127.0.0.1:5030)
2017-07-17T08:50:36.901356Z 0 [Note] WSREP: forgetting edcc146d (tcp://127.0.0.1:6030)
2017-07-17T08:50:36.901518Z 0 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 0099ca74-6acd-11e7-b21f-9715285cfec1
2017-07-17T08:50:36.901586Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 0099ca74-6acd-11e7-b21f-9715285cfec1
2017-07-17T08:50:36.901600Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: 0099ca74-6acd-11e7-b21f-9715285cfec1 from 0 (n1)
2017-07-17T08:50:36.901612Z 0 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 3,
members = 1/1 (primary/total),
act_id = 4,
last_appl. = 0,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = ec962b8c-6acc-11e7-9cca-5bbfe78342d4
2017-07-17T08:50:36.901624Z 0 [Note] WSREP: Flow-control interval: [100, 100]
2017-07-17T08:50:36.901666Z 2 [Note] WSREP: New cluster view: global state: ec962b8c-6acc-11e7-9cca-5bbfe78342d4:4, view# 4: Primary, number of nodes: 1, my index: 0, protocol version 3
2017-07-17T08:50:36.901676Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-07-17T08:50:36.901686Z 2 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-07-17T08:50:36.901750Z 2 [Note] WSREP: Assign initial position for certification: 4, protocol version: 3
2017-07-17T08:50:36.901793Z 0 [Note] WSREP: Service thread queue flushed.