leader fails to connect to itself during leader elected

Bug #1895315 reported by Alexander Balderson
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MySQL InnoDB Cluster Charm
Invalid
Undecided
Unassigned

Bug Description

I have a 3 unit cluster Im trying to stand up ontop of vsphere:
the two non leader units report:
mysql-innodb-cluster/1 blocked executing 18 10.245.206.139 (leader-settings-changed) 'cluster' missing, Instance not yet configured for clustering
mysql-innodb-cluster/0 blocked executing 17 10.245.206.140 (leader-settings-changed) 'cluster' missing, Instance not yet configured for clustering

and the leader failed to run its leader-elected hook:
mysql-innodb-cluster/2* error idle 19 10.245.206.138 hook failed: "leader-elected"

the logs for the leader show it failing to connect to 10.245.206.138:

2020-09-10 23:26:20 DEBUG juju-log Opening db connection for clusteruser@10.245.206.138
2020-09-10 23:26:20 DEBUG jujuc server.go:211 running hook tool "juju-log"
2020-09-10 23:26:20 DEBUG juju-log Could not connect to clusteruser@10.245.206.138
2020-09-10 23:26:20 DEBUG jujuc server.go:211 running hook tool "juju-log"
2020-09-10 23:26:20 ERROR juju-log Hook error:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/charm/reactive/mysql_innodb_cluster_handlers.py", line 124, in initialize_cluster
    instance.configure_instance(instance.cluster_address)
  File "lib/charm/openstack/mysql_innodb_cluster.py", line 570, in configure_instance
    self.wait_until_connectable(username=self.cluster_user,
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/tenacity/__init__.py", line 329, in wrapped_f
    return self.call(f, *args, **kw)
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/tenacity/__init__.py", line 409, in call
    do = self.iter(retry_state=retry_state)
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/tenacity/__init__.py", line 368, in iter
    raise retry_exc.reraise()
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/tenacity/__init__.py", line 186, in reraise
    raise self.last_attempt.result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/tenacity/__init__.py", line 412, in call
    result = fn(*args, **kwargs)
  File "lib/charm/openstack/mysql_innodb_cluster.py", line 1374, in wait_until_connectable
    raise CannotConnectToMySQL("Unable to connect to MySQL")
charm.openstack.mysql_innodb_cluster.CannotConnectToMySQL: Unable to connect to MySQL

It also looks like mysql restarted about 40 seconds before this hook ran; maybe its possible it hadnt come fully back up?
Sep 10 23:24:55 juju-6de53f-19 systemd[1]: Stopping MySQL Community Server...
Sep 10 23:25:01 juju-6de53f-19 systemd[1]: mysql.service: Succeeded.
Sep 10 23:25:01 juju-6de53f-19 systemd[1]: Stopped MySQL Community Server.
Sep 10 23:25:01 juju-6de53f-19 systemd[1]: Starting MySQL Community Server...
Sep 10 23:25:01 juju-6de53f-19 CRON[20887]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Sep 10 23:25:19 juju-6de53f-19 systemd[1]: Started MySQL Community Server.
Sep 10 23:25:28 juju-6de53f-19 systemd[1]: tmp-snap.rootfs_zDUdu1.mount: Succeeded.
Sep 10 23:25:43 juju-6de53f-19 systemd[1]: mysql.service: Main process exited, code=exited, status=16/n/a
Sep 10 23:25:43 juju-6de53f-19 systemd[1]: mysql.service: Failed with result 'exit-code'.
Sep 10 23:25:43 juju-6de53f-19 systemd[1]: mysql.service: Scheduled restart job, restart counter is at 1.
Sep 10 23:25:43 juju-6de53f-19 systemd[1]: Stopped MySQL Community Server.
Sep 10 23:25:43 juju-6de53f-19 systemd[1]: Starting MySQL Community Server...
Sep 10 23:25:52 juju-6de53f-19 systemd[1]: Started MySQL Community Server.

The full run with bundle/logs can be found at https://solutions.qa.canonical.com/kubernetes/testRun/99949bc5-72c9-4f1d-b228-19dd96de9f87

Revision history for this message
Alexander Balderson (asbalderson) wrote :

Running juju resolved on the unit allowed everything to move forward; but this should work without having to resolve a unit or retry hooks.

Revision history for this message
Konstantinos Kaskavelis (kaskavel) wrote :

Closing this due to inactivity (low number of occurrences, and no hit for more than one year)

tags: added: solutions-qa-expired
Changed in charm-mysql-innodb-cluster:
status: New → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.