clustercheck hangs when server is hung

Bug #1035927 reported by Koa McCullough on 2012-08-12
This bug affects 2 people
Affects: Percona XtraDB Cluster moved to
Status: Fix Released
Assigned to: Raghavendra D Prabhu

Bug Description

When the MySQL server on a node is hung, clustercheck also hangs and never fails the server over.

Steps to reproduce:

1) Set up a 3-node cluster
2) Configure HAProxy to run with one active writer and 2 backup servers
3) Start a sysbench test
4) Log into the writer node and run 'kill -SIGSTOP <mysqld_pid>' as root to hang the server
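
For repeated test runs, step 4 can also be scripted. The following is a minimal Python sketch and not part of the original report; the helper names `suspend` and `resume` are hypothetical. It sends the same SIGSTOP signal as the kill command above, and adds a SIGCONT helper so the server can be un-hung once the test is done.

```python
import os
import signal


def suspend(pid: int) -> None:
    """Freeze a process, equivalent to `kill -SIGSTOP <pid>`.

    The process stays alive and keeps its sockets open, but stops
    executing, so clients connected to it wait indefinitely.
    """
    os.kill(pid, signal.SIGSTOP)


def resume(pid: int) -> None:
    """Let a frozen process continue, equivalent to `kill -SIGCONT <pid>`."""
    os.kill(pid, signal.SIGCONT)
```

Both helpers need permission to signal the target process, which for mysqld normally means running as root or as the mysql user.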

Setup details can be found here:

The sysbench test should time out, and the HAProxy writer VIP should also hang.

Attached is a Python version of clustercheck that times out after 30 seconds and marks the node as failed. I didn't create a branch because this would add a new Python dependency to PXC, and I wasn't sure whether it needed to be written in Perl in order to be accepted.
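
The attachment itself is not reproduced on this page. As a rough illustration of the approach, here is a minimal sketch, not the attached script. It assumes the check runs the mysql command-line client and reads the Galera status variable `wsrep_local_state` (where 4 means Synced). The function name `check_node`, the constants, and the 200/503 return codes are hypothetical, modelled on an HTTP-style health check. The point is the `timeout` argument: if the client does not return in time, it is killed and the node is reported as failed instead of the check hanging.

```python
import subprocess

# Assumption: 30 seconds, matching the timeout mentioned in the report.
CHECK_TIMEOUT_SECONDS = 30

# Assumption: credentials come from the client's option files.
MYSQL_CMD = [
    "mysql",
    "--batch",
    "--skip-column-names",
    "--execute=SHOW STATUS LIKE 'wsrep_local_state'",
]


def check_node(cmd=MYSQL_CMD, timeout=CHECK_TIMEOUT_SECONDS):
    """Return (status_code, message); 200 only if the node reports Synced."""
    try:
        # subprocess.run kills the child and raises if it exceeds `timeout`.
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return 503, "check timed out"
    except OSError as exc:
        return 503, f"could not run client: {exc}"

    if result.returncode != 0:
        return 503, "client exited with an error"

    # Expected output is one line: "wsrep_local_state<TAB>4".
    fields = result.stdout.split()
    if fields and fields[-1] == "4":
        return 200, "node is synced"
    return 503, "node is not synced"
```

A wrapper would call `check_node()` and translate the returned code into whatever response the load balancer's health check expects.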

Tested. The existing script can be reused by using timeout in front of mysql like

Tested the python script too, and it works.

Changed in percona-xtradb-cluster:
status: New → Confirmed
Changed in percona-xtradb-cluster:
milestone: none → 5.5.30-23.7.4
assignee: nobody → Raghavendra D Prabhu (raghavendra-prabhu)
status: Confirmed → Fix Released

Percona now uses JIRA for bug reports so this bug report is migrated to:
