Cluster stuck on "SystemError: RuntimeError: Cluster.add_instance: RESET MASTER is not allowed because Group Replication is running."
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MySQL InnoDB Cluster Charm | Fix Released | High | David Ames |
Bug Description
Hi,
Running latest mysql-innodb-
I am seeing the cluster stuck on "Not all instances clustered".
I am aware of LP #1881735 and #1901771.
My issue does not seem to be directly related to either.
Looking at the Juju logs, I can see that the charm consistently fails with:
2021-01-21 19:34:43 INFO juju-log Adding instance, <redacted>.34, to the cluster.
2021-01-21 19:34:43 ERROR juju-log Failed adding instance <redacted>.34 to cluster: Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
Clone based recovery selected through the recoveryMethod option
NOTE: Group Replication will communicate with other members using '<redacted>
Validating instance configuration at <redacted>
This instance reports its own address as <redacted>.34:3306
Instance configuration is suitable.
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.
Adding instance to the cluster...
Traceback (most recent call last):
File "<string>", line 3, in <module>
SystemError: RuntimeError: Cluster.
This repeats on essentially every update_status run.
Going back to the addInstance command that we are trying to execute here, the documentation states:
https:/
auto: let Group Replication choose whether or not a full snapshot has to be taken, based on what the target server supports and the group_replicati
If I switch the "recoveryMethod" from "clone" to "auto" for that particular addInstance call, it works:
https:/
So, is there any reason why we are using clone as the recovery method instead of auto?
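For reference, the change being asked about can be sketched as follows. This is a minimal illustration, not the charm's actual code: the helper name build_add_instance_script, the example addresses, and the cluster name are hypothetical. It only builds the text of a MySQL Shell Python-mode script that calls cluster.add_instance() with the recoveryMethod option, so the two settings can be compared side by side. Credentials are left out of the URIs on purpose.

```python
import json

# Values the AdminAPI recoveryMethod option is documented to accept.
VALID_RECOVERY_METHODS = ("auto", "clone", "incremental")


def build_add_instance_script(cluster_uri, cluster_name, instance_uri,
                              recovery_method="auto"):
    """Return MySQL Shell (Python mode) script text that adds one instance.

    Hypothetical helper for illustration only; all arguments are
    placeholders supplied by the caller.
    """
    if recovery_method not in VALID_RECOVERY_METHODS:
        raise ValueError(
            "recovery_method must be one of %s" % (VALID_RECOVERY_METHODS,))
    # A JSON object with string values is also a valid Python dict literal.
    options = json.dumps({"recoveryMethod": recovery_method})
    return "\n".join([
        "shell.connect(%r)" % cluster_uri,
        "cluster = dba.get_cluster(%r)" % cluster_name,
        "cluster.add_instance(%r, %s)" % (instance_uri, options),
    ])


if __name__ == "__main__":
    # Example addresses and names are made up (documentation address range).
    print(build_add_instance_script(
        "clusteruser@192.0.2.10:3306",
        "exampleCluster",
        "clusteruser@192.0.2.34:3306",
        recovery_method="auto",
    ))
```

The printed text could then be saved to a file and run with MySQL Shell in Python mode. Passing recovery_method="clone" instead produces the call that corresponds to the failing log above.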
Changed in charm-mysql-innodb-cluster: | |
status: | Incomplete → Confirmed |
importance: | Undecided → High |
assignee: | nobody → David Ames (thedac) |
Changed in charm-mysql-innodb-cluster: | |
status: | Confirmed → In Progress |
Changed in charm-mysql-innodb-cluster: | |
milestone: | none → 21.10 |
Changed in charm-mysql-innodb-cluster: | |
status: | Fix Committed → Fix Released |
As you can see, the first functional test [1] from [0] failed because it did not cluster with auto. I originally set recoveryMethod to clone because of these kinds of problems. I am open to changing it, but it will need to be thoroughly tested.
Having the logs from all the mysql-innodb-cluster units from your particular failure, as well as the bundle, would be very helpful.
I'll test the auto setting myself and report back.
[0] https://review.opendev.org/c/openstack/charm-mysql-innodb-cluster/+/771882
[1] https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_func_smoke/openstack/charm-mysql-innodb-cluster/771882/2/21322/index.html