The `reboot-cluster-from-complete-outage` action fails after power-loss binary log corruption
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MySQL InnoDB Cluster Charm | New | Undecided | Unassigned | | |
Bug Description
Under certain circumstances, power loss causes MySQL state files to become corrupted on charm units. If present on any unit, this corruption causes the charm to fail the `reboot-
The following is a snapshot of the action result:
https:/
As a result, the following "clone method" workaround is required to recover from this critical outage:
1. Obtain the passwords:
- `juju run --unit mysql-innodb-
- `juju run --unit mysql-innodb-
2. Access each downed unit and clone the instance from the working unit:
- SSH to the downed unit: `juju ssh mysql-innodb-
- Obtain a MySQL shell: `mysql -u root -p # use 'mysql.passwd' when prompted`
- Clone the working unit (please note **errata** below):
```sql
STOP GROUP_REPLICATION \W;
SET GLOBAL super_read_only = 0;
CLONE INSTANCE FROM 'clusteruser'@'[IP of the working unit]':3306 IDENTIFIED BY '[use cluster-password]' REQUIRE SSL;
```
3. Join each downed unit back into the cluster:
- Grab a new MySQL shell: `mysqlsh`
- Join the cluster:
```python
shell.
cluster = dba.get_cluster()
cluster.
```
**Errata** for the "clone method":
- If `CLONE INSTANCE` fails with an error stating that the clone plugin is not loaded, install the plugin first:
```sql
INSTALL PLUGIN clone SONAME 'mysql_clone.so';
```
- Where an error is raised regarding `clone_
```sql
SET GLOBAL clone_valid_
```
If possible, could this also be made into a separate action?
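As a rough illustration of what such an action might wrap, here is a minimal Python sketch that only assembles the SQL statements from step 2 of the workaround for one downed unit. It does not connect to MySQL or call Juju. The function names (`quote_sql_string`, `build_clone_statements`) and their parameters are hypothetical and not part of the charm. The sketch assumes the default `clusteruser` account on port 3306, as in the workaround above, and assumes the server uses default backslash escaping in string literals. Placing the optional `INSTALL PLUGIN` statement after `super_read_only` is disabled is also an assumption.

```python
def quote_sql_string(value: str) -> str:
    """Return value as a single-quoted MySQL string literal.

    Assumes the server's default backslash-escape behaviour
    (i.e. NO_BACKSLASH_ESCAPES is not set).
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_clone_statements(
    donor_ip: str,
    cluster_password: str,
    donor_user: str = "clusteruser",
    donor_port: int = 3306,
    install_plugin: bool = False,
) -> list:
    """Build the statements from step 2 of the workaround, in order.

    Hypothetical helper: it returns the statements as strings and
    leaves execution on the downed unit to the caller.
    """
    statements = [
        "STOP GROUP_REPLICATION",
        "SET GLOBAL super_read_only = 0",
    ]
    if install_plugin:
        # Erratum: only needed when CLONE INSTANCE reports that the
        # clone plugin is not loaded.
        statements.append("INSTALL PLUGIN clone SONAME 'mysql_clone.so'")
    statements.append(
        "CLONE INSTANCE FROM "
        f"{quote_sql_string(donor_user)}@{quote_sql_string(donor_ip)}:{int(donor_port)} "
        f"IDENTIFIED BY {quote_sql_string(cluster_password)} REQUIRE SSL"
    )
    return statements


if __name__ == "__main__":
    # Placeholder values; substitute the working unit's IP and the
    # cluster password obtained in step 1.
    for stmt in build_clone_statements("10.0.0.5", "example-password"):
        print(stmt + ";")
```

A real action would still need to run these statements on each downed unit and then rejoin the unit to the cluster as in step 3.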