Node fails to gracefully leave the cluster
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Galera | Fix Released | Medium | Teemu Ollakka |
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Fix Released | Undecided | Unassigned |
Bug Description
This happened when 2 nodes in a 3-node cluster had to abort due to an inconsistency introduced by lp:587170. One of them failed to send its LEAVE message, which caused the surviving node (master) to lose the primary component and led to downtime.
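A rough illustration of why the missing LEAVE message matters, assuming the weighted-quorum rule described in the Galera documentation (a component stays primary when its weight exceeds half of the previous primary component's weight, after subtracting nodes known to have left gracefully) and assuming every node has weight 1. The function `has_quorum` and its parameter names are hypothetical and not part of Galera; this is a sketch of the arithmetic, not of the actual implementation.

```python
def has_quorum(component_weight, previous_weight, left_gracefully_weight):
    """Simplified quorum check; assumes each node carries weight 1.

    component_weight:       total weight of nodes in the new component
    previous_weight:        total weight of the previous primary component
    left_gracefully_weight: weight of nodes whose LEAVE message was delivered
    """
    return component_weight > (previous_weight - left_gracefully_weight) / 2


# 3-node cluster, both aborting nodes deliver their LEAVE message:
# the survivor needs more than (3 - 2) / 2 = 0.5, and it has 1.
both_left = has_quorum(component_weight=1, previous_weight=3,
                       left_gracefully_weight=2)

# 3-node cluster, only one aborting node delivers LEAVE (the case reported here):
# the survivor needs more than (3 - 1) / 2 = 1.0, and it has only 1.
one_left = has_quorum(component_weight=1, previous_weight=3,
                      left_gracefully_weight=1)

print(both_left)  # True
print(one_left)   # False
```

Under these assumptions, the survivor would have stayed primary had both nodes left gracefully, which is consistent with the `primary = no` line in the master log below.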
Master:
=======
130128 18:24:06 [Warning] IP address '190.129.175.58' could not be resolved: Name or service not known
130128 18:57:36 [Note] WSREP: (62f840e0-
nonlive peers: tcp://10.
130128 18:57:37 [Note] WSREP: (62f840e0-
800-64742e6f03f5 (tcp://
130128 18:57:40 [Note] WSREP: (62f840e0-
800-350da336b5e7 (tcp://
130128 18:57:41 [Note] WSREP: evs::proto(
c9b518a6c,3)) suspecting node: bcf36816-
130128 18:57:41 [Note] WSREP: evs::proto(
c9b518a6c,3)) suspecting node: fd286470-
130128 18:57:42 [Note] WSREP: view(view_
} joined {
} left {
} partitioned {
})
130128 18:57:42 [Note] WSREP: view(view_
} joined {
} left {
} partitioned {
})
130128 18:57:42 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
Slave 1:
========
130128 18:57:34 [ERROR] WSREP: Node consistency compromized, aborting...
130128 18:57:34 [Note] WSREP: Closing send monitor...
130128 18:57:34 [Note] WSREP: Closed send monitor.
130128 18:57:34 [Note] WSREP: gcomm: terminating thread
130128 18:57:34 [Note] WSREP: gcomm: joining thread
130128 18:57:34 [Note] WSREP: gcomm: closing backend
130128 18:57:34 [Note] WSREP: view(view_
} joined {
} left {
} partitioned {
})
130128 18:57:34 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
130128 18:57:34 [Note] WSREP: view((empty))
130128 18:57:34 [Note] WSREP: gcomm: closed
130128 18:57:34 [Note] WSREP: Flow-control interval: [16, 16]
130128 18:57:34 [Note] WSREP: Received NON-PRIMARY.
130128 18:57:34 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 18533923)
130128 18:57:34 [Note] WSREP: Received self-leave message.
Slave 2:
=======
130128 18:57:44 [ERROR] WSREP: Node consistency compromized, aborting...
130128 18:57:44 [Note] WSREP: Closing send monitor...
130128 18:57:44 [Note] WSREP: Closed send monitor.
130128 18:57:44 [Note] WSREP: gcomm: terminating thread
130128 18:57:44 [Note] WSREP: gcomm: joining thread
130128 18:57:44 [Note] WSREP: (fd286470-
inting to uuid fd286470-
130128 18:57:44 [Note] WSREP: (fd286470-
inting to uuid fd286470-
130128 18:57:44 [Note] WSREP: gcomm: closing backend
130128 18:57:44 [ERROR] WSREP: failed to close gcomm backend connection: 131: Forbidden state transition: INSTALL -> LEAVING (FATAL)
at gcomm/src/
130128 18:57:44 [Note] WSREP: Received self-leave message.
Changed in galera:
status: Fix Committed → Fix Released
Changed in percona-xtradb-cluster:
milestone: none → 5.5.30-23.7.4
Changed in percona-xtradb-cluster:
status: New → Fix Released
Internal trac reference: #607