Malformed 3 unit cluster (percona-cluster)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Landscape Server | Invalid | Undecided | Unassigned |
OpenStack Percona Cluster Charm | Fix Released | High | Unassigned |
percona-cluster (Juju Charms Collection) | Invalid | High | Unassigned |
Bug Description
Using percona-cluster r247 from the charm store, on xenial.
What happened initially was that keystone/0 failed in the shared-
2017-01-10 16:32:26 INFO shared-
Investigation showed that mysql/2 was blocked and that there was no mysql running on that unit:
mysql/2 blocked idle 1/lxd/4 10.2.103.76 Unit is not in sync
hacluster-mysql/2 active idle 10.2.103.76 Unit is ready and clustered
Running "show global status" in mysql on the leader unit (mysql/0) confirmed that the cluster had only two members (wsrep.txt):
| wsrep_cluster_size | 2 |
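The cluster-size check above can be scripted. The following is a minimal sketch, assuming the status output has been captured as text in the tabular form shown above; the function names and the expected size of 3 are illustrative and not part of the charm.

```python
import re
from typing import Optional

# Matches a row such as "| wsrep_cluster_size | 2 |" in mysql client table output.
_ROW = re.compile(r"^\|\s*wsrep_cluster_size\s*\|\s*(\d+)\s*\|\s*$", re.MULTILINE)


def parse_cluster_size(status_output: str) -> Optional[int]:
    """Return wsrep_cluster_size from captured status text, or None if absent."""
    match = _ROW.search(status_output)
    return int(match.group(1)) if match else None


def cluster_is_complete(status_output: str, expected_units: int) -> bool:
    """True only when the reported size equals the number of deployed units."""
    size = parse_cluster_size(status_output)
    return size is not None and size == expected_units


if __name__ == "__main__":
    sample = "| wsrep_cluster_size | 2 |"  # the row reported on mysql/0
    print(parse_cluster_size(sample))
    print(cluster_is_complete(sample, expected_units=3))
```

A missing row is treated as an incomplete cluster rather than an error, so the check fails safe when the status text cannot be parsed.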
crm status on mysql/2 was oblivious to the failure (crm_status.txt):
root@juju-
Last updated: Tue Jan 10 17:12:38 2017 Last change: Tue Jan 10 16:09:10 2017 by hacluster via crmd on juju-eb69cd-3-lxd-0
Stack: corosync
Current DC: juju-eb69cd-1-lxd-4 (version 1.1.14-70404b0) - partition with quorum
3 nodes and 4 resources configured
Online: [ juju-eb69cd-1-lxd-4 juju-eb69cd-3-lxd-0 juju-eb69cd-4-lxd-4 ]
Full list of resources:
Resource Group: grp_percona_cluster
res_mysql_vip (ocf::heartbeat
Clone Set: cl_mysql_monitor [res_mysql_monitor]
Started: [ juju-eb69cd-1-lxd-4 juju-eb69cd-3-lxd-0 juju-eb69cd-4-lxd-4 ]
systemctl status confirmed mysql had exited (systemctl_
root@juju-
● mysql.service - LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon
Loaded: loaded (/etc/init.d/mysql; bad; vendor preset: enabled)
Active: active (exited) since Tue 2017-01-10 16:08:03 UTC; 1h 40min ago
Docs: man:systemd-
Jan 10 16:07:39 juju-eb69cd-1-lxd-4 systemd[1]: Stopped LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon.
Jan 10 16:07:39 juju-eb69cd-1-lxd-4 systemd[1]: Starting LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon...
Jan 10 16:07:39 juju-eb69cd-1-lxd-4 mysql[21801]: * Starting MySQL (Percona XtraDB Cluster) database server mysqld
Jan 10 16:07:42 juju-eb69cd-1-lxd-4 mysql[21801]: * State transfer in progress, setting sleep higher mysqld
Jan 10 16:08:03 juju-eb69cd-1-lxd-4 mysql[21801]: ...done.
Jan 10 16:08:03 juju-eb69cd-1-lxd-4 systemd[1]: Started LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon.
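The "active (exited)" state in the output above is what makes this failure easy to miss: systemd still counts the unit as active even though no mysqld was running on the unit. A minimal sketch of detecting that state from captured systemctl status text follows. The function names are hypothetical, and the parsing assumes the "Active:" line format shown above.

```python
import re
from typing import Optional, Tuple

# Captures the state and sub-state from a line like
# "Active: active (exited) since Tue 2017-01-10 16:08:03 UTC; 1h 40min ago".
_ACTIVE = re.compile(r"^\s*Active:\s*(\S+)\s*\(([^)]+)\)", re.MULTILINE)


def parse_active_state(status_text: str) -> Optional[Tuple[str, str]]:
    """Return (state, sub_state) from systemctl status text, or None if not found."""
    match = _ACTIVE.search(status_text)
    return (match.group(1), match.group(2)) if match else None


def daemon_missing(status_text: str) -> bool:
    """True when the unit is reported active but its process has exited."""
    return parse_active_state(status_text) == ("active", "exited")


if __name__ == "__main__":
    line = "   Active: active (exited) since Tue 2017-01-10 16:08:03 UTC; 1h 40min ago"
    print(parse_active_state(line))
    print(daemon_missing(line))
```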
I did a "service mysql start", to no effect. Then I did a "service mysql stop" followed by a "service mysql start", and that fixed things. The cluster now has 3 units, and even juju status agreed the next time update-status ran.
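A likely reason the plain start had no effect is that systemd already considered mysql.service active, so there was nothing for it to start; stopping first clears that state. The workaround can be sketched as below, assuming the same "service" commands quoted above. The helper name and the injectable runner are illustrative only, and this is not what the charm itself does.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(list(cmd), check=True)


def restart_via_stop_start(service: str = "mysql", run: Runner = _run) -> List[List[str]]:
    """Issue an explicit stop followed by a start, returning the commands run.

    A stop is sent first because a bare start was observed to do nothing
    while the unit was still reported as active.
    """
    commands = [["service", service, "stop"], ["service", service, "start"]]
    for cmd in commands:
        run(cmd)
    return commands
```

Passing a custom runner (for example one that only logs the commands) allows a dry run without touching the service.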
I'm attaching the files mentioned above, along with a tarball called mysql-2.tar.bz2, which contains /var/log from mysql/2 before we tried to fix it.
Changed in landscape:
milestone: none → 17.01
Changed in landscape:
milestone: 17.01 → 17.02
Changed in charm-percona-cluster:
assignee: nobody → David Ames (thedac)
importance: Undecided → High
status: New → In Progress
Changed in percona-cluster (Juju Charms Collection):
status: In Progress → Invalid
summary:
- Malformed 3 unit cluster
+ Malformed 3 unit cluster (percona-cluster)
Changed in charm-percona-cluster:
status: In Progress → Incomplete
/var/log/* from mysql/2 before the attempts to fix it via service restarts.