The crashdumps including [0] all have the same failure mode. The leader is trying to add an instance to the cluster:

2020-10-14 11:53:43 INFO juju-log cluster:4: Adding instance, 172.17.112.7, to the cluster.
2020-10-14 11:55:38 ERROR juju-log cluster:4: Failed adding instance 172.17.112.7 to cluster: Logger: Tried to log to an uninitialized logger.
WARNING: A GTID set check of the MySQL instance at '172.17.112.7:3306' determined that it contains transactions that do not originate from the cluster, which must be discarded before it can join the cluster.
172.17.112.7:3306 has the following errant GTIDs that do not exist in the cluster:
49f37ae7-0e13-11eb-b788-fa163ea0e822:1-16
WARNING: Discarding these extra GTID events can either be done manually or by completely overwriting the state of 172.17.112.7:3306 with a physical snapshot from an existing cluster member. To use this method by default, set the 'recoveryMethod' option to 'clone'.
Having extra GTID events is not expected, and it is recommended to investigate this further and ensure that the data can be removed prior to choosing the clone recovery method.
Clone based recovery selected through the recoveryMethod option
NOTE: Group Replication will communicate with other members using '172.17.112.7:33061'. Use the localAddress option to override.
Validating instance configuration at 172.17.112.7:3306...
This instance reports its own address as 172.17.112.7:3306
Instance configuration is suitable.
A new instance will be added to the InnoDB cluster. Depending on the amount of data on the cluster this might take from a few seconds to several hours.
Adding instance to the cluster...
ERROR: Unable to start Group Replication for instance '172.17.112.7:3306'. Please check the MySQL server error log for more information.
Traceback (most recent call last):
  File "", line 3, in
SystemError: RuntimeError: Cluster.add_instance: Group Replication failed to start: MySQL Error 3092 (HY000): 172.17.112.7:3306: The server is not configured properly to be an active member of the group. Please see more details on error log.
2020-10-14 11:55:38 DEBUG juju-log cluster:4: tracer: set flag leadership.changed.cluster-instances-configured
2020-10-14 11:55:38 DEBUG juju-log cluster:4: tracer>
tracer: set flag leadership.set.cluster-instances-configured
tracer: ++ queue handler reactive/mysql_innodb_cluster_handlers.py:168:add_instances_to_cluster
tracer: -- dequeue handler reactive/mysql_innodb_cluster_handlers.py:134:configure_instances_for_clustering
2020-10-14 11:55:38 INFO juju-log cluster:4: Invoking reactive handler: hooks/relations/mysql-router/provides.py:47:joined:db-router
2020-10-14 11:55:39 INFO juju-log cluster:4: Invoking reactive handler: hooks/relations/mysql-router/provides.py:53:changed:db-router
2020-10-14 11:55:39 INFO juju-log cluster:4: Invoking reactive handler: hooks/relations/mysql-innodb-cluster/peers.py:69:joined:cluster
2020-10-14 11:55:39 INFO juju-log cluster:4: Invoking reactive handler: hooks/relations/mysql-innodb-cluster/peers.py:75:changed:cluster
2020-10-14 11:55:39 INFO juju-log cluster:4: Invoking reactive handler: hooks/relations/tls-certificates/requires.py:109:broken:certificates
2020-10-14 11:55:39 INFO juju-log cluster:4: Invoking reactive handler: reactive/mysql_innodb_cluster_handlers.py:168:add_instances_to_cluster
2020-10-14 11:55:39 DEBUG juju-log cluster:4: Adding instances to cluster.
2020-10-14 11:55:39 WARNING juju-log cluster:4: Instance: 172.17.112.26, already clustered.
2020-10-14 11:55:40 DEBUG juju-log cluster:4: Checking cluster status.
2020-10-14 11:55:41 INFO juju-log cluster:4: Adding instance, 172.17.112.7, to the cluster.
2020-10-14 11:55:41 ERROR juju-log cluster:4: Failed adding instance 172.17.112.7 to cluster: Logger: Tried to log to an uninitialized logger.
NOTE: The target instance '172.17.112.7:3306' has not been pre-provisioned (GTID set is empty). The Shell is unable to decide whether incremental state recovery can correctly provision it.
Clone based recovery selected through the recoveryMethod option
NOTE: Group Replication will communicate with other members using '172.17.112.7:33061'. Use the localAddress option to override.
Traceback (most recent call last):
  File "", line 3, in
SystemError: RuntimeError: Cluster.add_instance: The port '33061' for localAddress option is already in use. Specify an available port to be used with localAddress option or free port '33061'.

On the instance in question we have:

2020-10-14T11:53:31.490181Z 13 [System] [MY-011086] [Server] Received RESTART from user clusteruser. Restarting mysqld (Version: 8.0.21-0ubuntu0.20.04.4).
2020-10-14T11:53:33.335599Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.21-0ubuntu0.20.04.4) (Ubuntu).
2020-10-14T11:53:34.336202Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.21-0ubuntu0.20.04.4) starting as process 24580
2020-10-14T11:53:34.359908Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2020-10-14T11:53:34.793834Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2020-10-14T11:53:34.954980Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Bind-address: '::' port: 33060, socket: /var/run/mysqld/mysqlx.sock
2020-10-14T11:53:35.044728Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2020-10-14T11:53:35.045027Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2020-10-14T11:53:35.077034Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.21-0ubuntu0.20.04.4' socket: '/var/run/mysqld/mysqld.sock' port: 3306 (Ubuntu).
2020-10-14T11:53:44.625889Z 9 [ERROR] [MY-011685] [Repl] Plugin group_replication reported: 'The group_replication_group_name option is mandatory'
2020-10-14T11:53:44.625978Z 9 [ERROR] [MY-011660] [Repl] Plugin group_replication reported: 'Unable to start Group Replication on boot'
2020-10-14T11:53:44.806068Z 9 [Warning] [MY-010604] [Repl] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=juju-40b2cf-zaza-7e27fd55bcd4-2-relay-bin' to avoid this problem.
2020-10-14T11:53:44.814271Z 9 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2020-10-14T11:53:44.825541Z 9 [System] [MY-013587] [Repl] Plugin group_replication reported: 'Plugin 'group_replication' is starting.'
2020-10-14T11:53:44.842678Z 12 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2020-10-14T11:54:14.863847Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Timeout while waiting for the group communication engine's communications status to change!'
2020-10-14T11:54:14.864065Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:54:48.430647Z 9 [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group'
2020-10-14T11:54:48.430956Z 9 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.'
2020-10-14T11:54:48.440907Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-10-14T11:54:55.711386Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to bind to INADDR_ANY:33061 (socket=44, errno=98)!'
2020-10-14T11:54:55.711451Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2020-10-14T11:54:55.711556Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:54:55.769352Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-10-14T11:55:00.852400Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to bind to INADDR_ANY:33061 (socket=45, errno=98)!'
2020-10-14T11:55:00.852476Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2020-10-14T11:55:00.852516Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:55:00.919204Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-10-14T11:55:06.003752Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to bind to INADDR_ANY:33061 (socket=46, errno=98)!'
2020-10-14T11:55:06.003838Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2020-10-14T11:55:06.003888Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:55:06.071541Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-10-14T11:55:11.272759Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to bind to INADDR_ANY:33061 (socket=47, errno=98)!'
2020-10-14T11:55:11.272919Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2020-10-14T11:55:11.273121Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:55:11.352668Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-10-14T11:55:16.437659Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to bind to INADDR_ANY:33061 (socket=48, errno=98)!'
2020-10-14T11:55:16.437823Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2020-10-14T11:55:16.437957Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:55:16.500998Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-10-14T11:55:21.612663Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to bind to INADDR_ANY:33061 (socket=49, errno=98)!'
2020-10-14T11:55:21.612782Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2020-10-14T11:55:21.612915Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:55:21.678882Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-10-14T11:55:26.756625Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to bind to INADDR_ANY:33061 (socket=50, errno=98)!'
2020-10-14T11:55:26.756699Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2020-10-14T11:55:26.756747Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:55:26.819219Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-10-14T11:55:31.903793Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to bind to INADDR_ANY:33061 (socket=51, errno=98)!'
2020-10-14T11:55:31.903863Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2020-10-14T11:55:31.903911Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:55:31.972750Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2020-10-14T11:55:37.055100Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to bind to INADDR_ANY:33061 (socket=52, errno=98)!'
2020-10-14T11:55:37.055192Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Unable to announce tcp port 33061. Port already in use?'
2020-10-14T11:55:37.055336Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:55:37.123399Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'

I am tempted to blame a busy under-cloud here. There are roughly 30 seconds between the join attempt and the first timeout:

2020-10-14T11:53:44.842678Z 12 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2020-10-14T11:54:14.863847Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Timeout while waiting for the group communication engine's communications status to change!'
2020-10-14T11:54:14.864065Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error joining the group while waiting for the network layer to become ready.'
2020-10-14T11:54:48.430647Z 9 [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group'

We may need to find a timeout value that we can raise to avoid this failure mode.

[0] https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_func_full/openstack/charm-ceph-radosgw/757922/2/7146/index.html
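
For anyone who wants to double-check the gap between the join attempt and the timeouts, here is a minimal Python sketch that diffs the mysqld error-log timestamps quoted above. The helper names parse_ts and gap_seconds are made up for this illustration, and it assumes the timestamps keep the "YYYY-MM-DDTHH:MM:SS.ffffffZ" shape seen in this log.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse a mysqld error-log timestamp such as 2020-10-14T11:53:44.842678Z.

    Assumes the format seen in the log excerpt; the result is a naive
    datetime, which is fine for computing differences within one log.
    """
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def gap_seconds(start: str, end: str) -> float:
    """Return the elapsed seconds between two log timestamps."""
    return (parse_ts(end) - parse_ts(start)).total_seconds()


# Timestamps copied from the log excerpt above.
attempt = "2020-10-14T11:53:44.842678Z"       # group_replication_applier channel set up
gcs_timeout = "2020-10-14T11:54:14.863847Z"   # [GCS] Timeout while waiting ...
view_timeout = "2020-10-14T11:54:48.430647Z"  # Timeout on wait for view after joining group

print(f"attempt -> GCS timeout:  {gap_seconds(attempt, gcs_timeout):.3f} s")
print(f"attempt -> view timeout: {gap_seconds(attempt, view_timeout):.3f} s")
```

This gives about 30.0 seconds to the first GCS timeout and about 63.6 seconds to the view timeout, which matches the rough 30 second figure above.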