Strange issue, nodes join the cluster and then suddenly abort

Bug #2003643 reported by Zeeshan Saeed Paracha
This bug affects 1 person
Affects   Status   Importance   Assigned to   Milestone
Galera    New      Undecided    Unassigned

Bug Description

Please advise based on the logs below.
I have been unable to find the cause.

Distributor ID: Ubuntu
Description: Ubuntu 20.04.5 LTS
Release: 20.04
Codename: focal

Galera setup:

The cluster has 4 nodes in total. Three of them are connected; the fourth is not.
The fourth node runs on aarch64 hardware.

The logs below are from the fourth node, which connects and is later kicked out for no apparent reason:

(systemctl status mariadb)

● mariadb.service - MariaDB 10.3.37 database server
     Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Tue 2023-01-24 04:42:10 UTC; 1min 19s ago
       Docs: man:mysqld(8)
             https://mariadb.com/kb/en/library/systemd/
    Process: 838546 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mys>
    Process: 838547 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSI>
    Process: 838549 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || >
    Process: 838670 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_S>
   Main PID: 838670 (code=exited, status=1/FAILURE)
     Status: "MariaDB server is down"

Jan 24 04:41:37 instance-20220523-1749 systemd[1]: Starting MariaDB 10.3.37 database serv>
Jan 24 04:41:39 instance-20220523-1749 sh[838550]: WSREP: Recovered position f1e1e4a1-9a0>
Jan 24 04:41:39 instance-20220523-1749 mysqld[838670]: 2023-01-24 4:41:39 0 [Note] /usr/>
Jan 24 04:42:10 instance-20220523-1749 systemd[1]: mariadb.service: Main process exited, >
Jan 24 04:42:10 instance-20220523-1749 systemd[1]: mariadb.service: Failed with result 'e>
Jan 24 04:42:10 instance-20220523-1749 systemd[1]: Failed to start MariaDB 10.3.37 databa>

Linux instance-20220523-1749 5.15.0-1016-oracle #20~20.04.1-Ubuntu SMP Mon Aug 8 07:30:37 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux

mysql Ver 15.1 Distrib 10.3.37-MariaDB, for debian-linux-gnu (aarch64) using readline 5.2

(tail -f /var/log/mysql/error.log)

2023-01-22 6:26:52 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
2023-01-22 6:26:52 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
2023-01-22 6:26:52 0 [Note] WSREP: wsrep_load(): Galera 3.29(ra60e019) by Codership Oy <email address hidden> loaded successfully.
2023-01-22 6:26:52 0 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
2023-01-22 6:26:52 0 [Note] WSREP: Found saved state: f1e1e4a1-9a0f-11ed-9c0f-1be97a0e7b5b:-1, safe_to_bootstrap: 1
2023-01-22 6:26:52 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 10.0.1.169; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; p
2023-01-22 6:26:52 0 [Note] WSREP: Assign initial position for certification: 0, protocol version: -1
2023-01-22 6:26:52 0 [Note] WSREP: wsrep_sst_grab()
2023-01-22 6:26:52 0 [Note] WSREP: Start replication
2023-01-22 6:26:52 0 [Note] WSREP: Setting initial position to f1e1e4a1-9a0f-11ed-9c0f-1be97a0e7b5b:0
2023-01-22 6:26:52 0 [Note] WSREP: protonet asio version 0
2023-01-22 6:26:52 0 [Note] WSREP: Using CRC-32C for message checksums.
2023-01-22 6:26:52 0 [Note] WSREP: backend: asio
2023-01-22 6:26:52 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
2023-01-22 6:26:52 0 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2023-01-22 6:26:52 0 [Note] WSREP: restore pc from disk failed
2023-01-22 6:26:52 0 [Note] WSREP: GMCast version 0
2023-01-22 6:26:52 0 [Note] WSREP: (c273183b, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2023-01-22 6:26:52 0 [Note] WSREP: (c273183b, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2023-01-22 6:26:52 0 [Note] WSREP: EVS version 0
2023-01-22 6:26:52 0 [Note] WSREP: gcomm: connecting to group 'MariaDB Galera Cluster', peer '35.212.132.67:,85.122.127.235:,192.3.91.168:,10.0.1.169:'
2023-01-22 6:26:52 0 [Note] WSREP: (c273183b, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://10.0.1.169:4567
2023-01-22 6:26:52 0 [Note] WSREP: (c273183b, 'tcp://0.0.0.0:4567') connection established to 6ab548fb tcp://85.122.127.235:4567
2023-01-22 6:26:52 0 [Note] WSREP: (c273183b, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2023-01-22 6:26:52 0 [Note] WSREP: (c273183b, 'tcp://0.0.0.0:4567') connection established to 7191d038 tcp://192.3.91.168:4567
2023-01-22 6:26:53 0 [Note] WSREP: (c273183b, 'tcp://0.0.0.0:4567') connection established to 5149ecf7 tcp://35.212.132.67:4567
2023-01-22 6:26:53 0 [Note] WSREP: (c273183b, 'tcp://0.0.0.0:4567') connection established to 7191d038 tcp://192.3.91.168:4567
2023-01-22 6:26:53 0 [Note] WSREP: (c273183b, 'tcp://0.0.0.0:4567') connection established to 5149ecf7 tcp://35.212.132.67:4567
2023-01-22 6:26:54 0 [Note] WSREP: declaring 5149ecf7 at tcp://35.212.132.67:4567 stable
2023-01-22 6:26:54 0 [Note] WSREP: declaring 6ab548fb at tcp://85.122.127.235:4567 stable
2023-01-22 6:26:54 0 [Note] WSREP: declaring 7191d038 at tcp://192.3.91.168:4567 stable
2023-01-22 6:26:55 0 [Note] WSREP: view(view_id(NON_PRIM,5149ecf7,2792) memb {
5149ecf7,0
6ab548fb,0
7191d038,0
c273183b,0
} joined {
} left {
} partitioned {
07d1e878,0
77a052d6,0
9e708b0d,0
a1d8493f,0
a79d17a2,0
b2984c45,0
b60181c9,0
bad3a406,0
c401108d,0
d392217d,0
dc8cc78b,0
})
2023-01-22 6:26:56 0 [Note] WSREP: (c273183b, 'tcp://0.0.0.0:4567') turning message relay requesting off
2023-01-22 6:27:23 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
at gcomm/src/pc.cpp:connect():160
2023-01-22 6:27:23 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
2023-01-22 6:27:23 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1457: Failed to open channel 'MariaDB Galera Cluster' at 'gcomm://35.212.132.67,85.122.127.235,192.3.91.168,10.0.1.169': -110 (Connection timed out)
2023-01-22 6:27:23 0 [ERROR] WSREP: gcs connect failed: Connection timed out
2023-01-22 6:27:23 0 [ERROR] WSREP: wsrep::connect(gcomm://35.x.x.x,85.122.x.x,192.3.x.x,10.0.x.x) failed: 7
2023-01-22 6:27:23 0 [ERROR] Aborting
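
The log above shows the joining node (c273183b) establishing connections to all three peers, then receiving a view marked NON_PRIM, and about 30 seconds later aborting with "failed to reach primary view". In Galera, a NON_PRIM view is a component without quorum, so checking the view state and membership is a reasonable first triage step. The following is a minimal sketch, not an official tool: `parse_view` and `SAMPLE_LOG` are hypothetical names introduced here, and the parser assumes the view block is laid out exactly as in the log above.

```python
import re

# Matches e.g. "view(view_id(NON_PRIM,5149ecf7,2792) memb {"
VIEW_RE = re.compile(r"view\(view_id\((?P<state>\w+),(?P<repr>[0-9a-f]+),(?P<seq>\d+)\)")
# Matches section switches such as "} joined {" or "} partitioned {"
SECTION_RE = re.compile(r"\}\s*(\w+)\s*\{")

# The view block copied from the error log above (assumed layout).
SAMPLE_LOG = """\
2023-01-22 6:26:55 0 [Note] WSREP: view(view_id(NON_PRIM,5149ecf7,2792) memb {
5149ecf7,0
6ab548fb,0
7191d038,0
c273183b,0
} joined {
} left {
} partitioned {
07d1e878,0
77a052d6,0
9e708b0d,0
a1d8493f,0
a79d17a2,0
b2984c45,0
b60181c9,0
bad3a406,0
c401108d,0
d392217d,0
dc8cc78b,0
})
2023-01-22 6:26:56 0 [Note] WSREP: turning message relay requesting off
"""


def parse_view(log_text):
    """Return the first gcomm view found in log_text as a dict, or None."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = VIEW_RE.search(line)
        if match is None:
            continue
        view = {"state": match.group("state"),
                "memb": [], "joined": [], "left": [], "partitioned": []}
        section = "memb"  # the view line itself opens the "memb {" section
        for entry in lines[i + 1:]:
            entry = entry.strip()
            if entry.startswith("}"):
                switch = SECTION_RE.match(entry)
                if switch is None:  # a bare "})" closes the view block
                    break
                section = switch.group(1)
                continue
            if entry:
                # Entries look like "5149ecf7,0"; keep only the node id.
                view.setdefault(section, []).append(entry.split(",")[0])
        return view
    return None


if __name__ == "__main__":
    view = parse_view(SAMPLE_LOG)
    print(view["state"], len(view["memb"]), "members,",
          len(view["partitioned"]), "partitioned")
```

For the view logged above, this reports a NON_PRIM state with 4 members and 11 partitioned node ids. Since the three existing peers appear in the same non-primary view, it may be worth confirming on those nodes whether they currently form a primary component before looking at the aarch64 node itself.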
