Cluster crashes and cannot shut down normally under very high load
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Galera | Fix Released | Low | Teemu Ollakka |
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Fix Released | Undecided | Unassigned |
Bug Description
I have a three-node XtraDB cluster.
Sysbench command:
#######
for ((i=1;i<
#######
When --num-threads > 500, the MySQL error log sometimes prints:
#######
121112 17:15:22 [Note] WSREP: (52d1b2b7-
121112 17:15:22 [Note] WSREP: (52d1b2b7-
121112 17:15:24 [Warning] WSREP: readjusting seq range 801 to 255
121112 17:15:24 [ERROR] WSREP: exception caused by message: evs::msg{
}
121112 17:15:24 [ERROR] WSREP: state after handling message: evs::proto(
current_
} joined {
} left {
} partitioned {
}),
input_map=
}
121112 17:15:24 [ERROR] WSREP: exception from gcomm, backend must be restarted:
} evs::input_map: {aru_seq=
}
121112 17:15:24 [Note] WSREP: Received self-leave message.
121112 17:15:24 [Note] WSREP: Flow-control interval: [0, 0]
121112 17:15:24 [Note] WSREP: Received SELF-LEAVE. Closing connection.
121112 17:15:24 [Note] WSREP: Shifting SYNCED -> CLOSED (TO: 1229737)
121112 17:15:24 [Note] WSREP: RECV thread exiting 0: Success
#######
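When a log like the one above is long, it can help to pull out only the WSREP warnings and errors. The following is a minimal Python sketch, assuming the `YYMMDD HH:MM:SS [Level] message` line layout seen in the excerpt; the function name `wsrep_events` is hypothetical and not part of any MySQL or Galera tool.

```python
import re

# Assumed layout, taken from the excerpt: "121112 17:15:24 [ERROR] WSREP: ..."
LOG_LINE = re.compile(r"^(\d{6})\s+(\d{1,2}:\d{2}:\d{2}) \[(\w+)\] (.*)$")


def wsrep_events(log_text, levels=("ERROR", "Warning")):
    """Return (date, time, level, message) tuples for WSREP lines at the given levels."""
    events = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        # Continuation lines such as "}" do not match the pattern and are skipped.
        if match and match.group(3) in levels and match.group(4).startswith("WSREP:"):
            events.append(match.groups())
    return events
```

Passing the error log text to `wsrep_events` returns only the `[Warning]` and `[ERROR]` entries, which narrows the search to the lines just before the self-leave.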
The status on all cluster nodes looks like this:
mysql> show status like 'wsrep%';
+------
| Variable_name | Value |
+------
| wsrep_local_
| wsrep_protocol_
| wsrep_last_
| wsrep_replicated | 238039 |
| wsrep_replicate
| wsrep_received | 770 |
| wsrep_received_
| wsrep_local_commits | 238039 |
| wsrep_local_
| wsrep_local_
| wsrep_local_replays | 0 |
| wsrep_local_
| wsrep_local_
| wsrep_local_
| wsrep_local_
| wsrep_flow_
| wsrep_flow_
| wsrep_flow_
| wsrep_cert_
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 0.000000 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 0.000000 |
| wsrep_local_state | 0 |
| wsrep_local_
| wsrep_cert_
| wsrep_causal_reads | 0 |
| wsrep_cluster_
| wsrep_cluster_size | 2 |
| wsrep_cluster_
| wsrep_cluster_
| wsrep_connected | ON |
| wsrep_local_index | 0 |
| wsrep_provider_name | Galera |
| wsrep_provider_
| wsrep_provider_
| wsrep_ready | OFF |
+------
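The rows that matter most in the output above are `wsrep_ready` (OFF), `wsrep_connected` (ON) and `wsrep_cluster_size` (2 in a three-node cluster). As a rough sketch, the check below parses the table text printed by the mysql client and reports those conditions. The helper names `parse_status` and `health_problems` are hypothetical, and the rules are an assumption about what counts as unhealthy, not an official Galera health check.

```python
def parse_status(table_text):
    """Parse '| name | value |' rows from mysql client output into a dict."""
    status = {}
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Border lines, the header row, and truncated rows are ignored.
        if len(cells) == 2 and cells[0].startswith("wsrep_"):
            status[cells[0]] = cells[1]
    return status


def health_problems(status, expected_nodes):
    """Return human-readable problems; an empty list means no problem was detected."""
    problems = []
    if status.get("wsrep_ready") != "ON":
        problems.append("wsrep_ready is not ON")
    if status.get("wsrep_connected") != "ON":
        problems.append("wsrep_connected is not ON")
    size = int(status.get("wsrep_cluster_size", 0))
    if size < expected_nodes:
        problems.append(f"wsrep_cluster_size is {size}, expected {expected_nodes}")
    return problems
```

Calling `health_problems(parse_status(output), expected_nodes=3)` on the output above would flag both the OFF `wsrep_ready` value and the missing node.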
Only the write node can be shut down normally.
Read node shutdown logs:
#######
121112 19:42:13 [Note] /usr/local/
121112 19:42:13 [Note] WSREP: Stop replication
121112 19:42:13 [Note] WSREP: Provider disconnect
121112 19:42:13 [Note] WSREP: Closing send monitor...
121112 19:42:13 [Note] WSREP: Closed send monitor.
121112 19:42:13 [Note] WSREP: closing connection 74
121112 19:42:13 [Note] WSREP: Before Lock_thread_count
121112 19:42:15 [Note] WSREP: waiting for client connections to close: 33 (the shutdown waits here for a long time with no further output)
#######
affects: codership-mysql → percona-xtradb-cluster
Changed in galera:
importance: Undecided → Low
Changed in galera:
milestone: none → 23.2.4
Changed in galera:
status: Fix Committed → Fix Released
Changed in percona-xtradb-cluster:
milestone: none → 5.5.30-23.7.4
Changed in percona-xtradb-cluster:
status: New → Fix Released
Hi,
Could you post the value of your wsrep_provider_options variable?
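The value can be read with `SHOW VARIABLES LIKE 'wsrep_provider_options';`, which returns one long string of `key = value` pairs separated by semicolons. Since the errors in the report come from the evs layer, the `evs.*` settings are presumably the ones of interest. The sketch below splits such a string into a dictionary and filters those keys. The function names are hypothetical, and the sample string in the usage note is illustrative only, not the reporter's configuration.

```python
def parse_provider_options(value):
    """Split a 'key = value; key = value; ...' string into a dict of strings."""
    options = {}
    for item in value.split(";"):
        key, sep, val = item.partition("=")
        if sep:  # skip empty fragments such as the one after a trailing ';'
            options[key.strip()] = val.strip()
    return options


def evs_options(options):
    """Keep only the options whose names start with 'evs.'."""
    return {k: v for k, v in options.items() if k.startswith("evs.")}
```

For example, `evs_options(parse_provider_options("base_port = 4567; evs.send_window = 4; evs.user_send_window = 2; "))` keeps only the two `evs.` entries, which makes them easy to paste into a reply.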