removing node from a cluster via SET GLOBAL wsrep_cluster_address - hangs node being removed
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
MySQL patches by Codership | Fix Released | Low | Seppo Jaakola | |
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Fix Released | Undecided | Unassigned | |
Bug Description
Hi,
Trying to remove a single node from a 3-node cluster via:
SET GLOBAL wsrep_cluster_
using Server version: 5.5.17-22.1-log Percona XtraDB Cluster (GPL), Release 22.1, Revision 3683 wsrep_22.3.r3683
The command hangs the server: it never returns, the server is no longer accessible, and I have to use kill -9 to get the instance usable again.
From the error log from the node being removed:
120213 22:15:09 [Note] WSREP: Stop replication
120213 22:15:09 [Note] WSREP: Closing send monitor...
120213 22:15:09 [Note] WSREP: Closed send monitor.
120213 22:15:09 [Note] WSREP: gcomm: terminating thread
120213 22:15:09 [Note] WSREP: gcomm: joining thread
120213 22:15:09 [Note] WSREP: gcomm: closing backend
120213 22:15:09 [Note] WSREP: evs::proto(
120213 22:15:09 [Note] WSREP: evs::proto(
120213 22:15:09 [Note] WSREP: GMCast:
} joined {
} left {
} partitioned {
})
120213 22:15:09 [Note] WSREP: New COMPONENT: primary = no, my_idx = 0, memb_num = 1
120213 22:15:09 [Note] WSREP: GMCast:
120213 22:15:09 [Note] WSREP: gcomm: closed
120213 22:15:09 [Note] WSREP: Flow-control interval: [230, 256]
120213 22:15:09 [Note] WSREP: Received NON-PRIMARY.
120213 22:15:09 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 4046094)
120213 22:15:09 [Note] WSREP: Received self-leave message.
120213 22:15:09 [Note] WSREP: New cluster view: global state: 0ec148da-
120213 22:15:09 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120213 22:15:09 [Note] WSREP: Flow-control interval: [0, 0]
120213 22:15:09 [Note] WSREP: Received SELF-LEAVE. Closing connection.
120213 22:15:09 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 4046094)
120213 22:15:09 [Note] WSREP: RECV thread exiting 0: Success
120213 22:15:09 [Note] WSREP: recv_thread() joined.
120213 22:15:09 [Note] WSREP: Closing slave action queue.
120213 22:15:09 [Note] WSREP: New cluster view: global state: 0ec148da-
120213 22:15:09 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120213 22:15:09 [Note] WSREP: applier thread exiting (code:0)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:09 [Note] WSREP: applier thread exiting (code:5)
120213 22:15:11 [Note] WSREP: rollbacker thread exiting
Contrast this with a valid gcomm address (which works as expected although it's not removing the node from the cluster):
mysql> set global wsrep_cluster_
Query OK, 0 rows affected (3.51 sec)
Thanks
Patrick
Changed in codership-mysql:
status: New → In Progress
assignee: nobody → Seppo Jaakola (seppo-jaakola)
milestone: none → 5.5.20-23.4
importance: Undecided → Low
Changed in codership-mysql:
status: Fix Committed → Fix Released
Changed in percona-xtradb-cluster:
status: New → Fix Released
Trying with lp:codership-mysql (wsrep-5.5) I cannot reproduce this issue. Tried with:
SET GLOBAL wsrep_cluster_address='gcomm://';
and then joining node back with:
SET GLOBAL wsrep_cluster_address='gcomm://127.0.0.1:4567';
and it works both ways with current wsrep-5.5 trunk version.
However, command:
SET GLOBAL wsrep_cluster_address='';
...is rejected
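
One way to check whether a node actually left or rejoined the cluster after each SET GLOBAL is to look at the wsrep variables from a client session. The statements below are a sketch and not part of the original report; they assume the build in use exposes the usual wsrep status variables, and the values reported may differ between versions. On a node that has hung as described above, the server is not accessible, so they would have to be run on one of the remaining nodes instead.

```sql
-- Address this node is currently configured to connect to.
SHOW GLOBAL VARIABLES LIKE 'wsrep_cluster_address';

-- Number of members in the component this node belongs to.
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';

-- Whether that component is a primary component.
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';

-- Human-readable replication state of this node.
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';

-- Whether this node is currently accepting queries.
SHOW GLOBAL STATUS LIKE 'wsrep_ready';
```

Comparing wsrep_cluster_size on the remaining nodes before and after the change should show whether the membership really changed.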