fail to bootstrap a 10.2 galera cluster

Bug #1787305 reported by DevX
This bug affects 1 person
Affects: OpenStack-Ansible
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

The galera_server role fails to converge when bootstrapping a new MariaDB 10.2 cluster.

Example playbook:

  roles:
    - role: galera_server
      galera_major_version: 10.2
      galera_minor_version: 16

Expected: a new galera cluster running 10.2

Actual output:
(ansible25) root@logging1:/opt/openstack-ansible-ops/osquery# ansible mariadb -m shell -a "mysql -h 127.0.0.1 -e '\s'"
logging2_mariadb_container-237de7ac | FAILED | rc=1 >>
ERROR 2002 (HY000): Can't connect to MySQL server on '127.0.0.1' (115)non-zero return code

logging1_mariadb_container-6618a0d1 | SUCCESS | rc=0 >>
--------------
mysql Ver 15.1 Distrib 10.2.16-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2

Connection id: 33
Current database:
Current user: root@localhost
SSL: Not in use
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server: MariaDB
Server version: 10.2.16-MariaDB-10.2.16+maria~xenial-log mariadb.org binary distribution
Protocol version: 10
Connection: 127.0.0.1 via TCP/IP
Server characterset: utf8
Db characterset: utf8
Client characterset: utf8
Conn. characterset: utf8
TCP port: 3306
Uptime: 1 min 9 sec

Threads: 12 Questions: 415 Slow queries: 0 Opens: 172 Flush tables: 1 Open tables: 28 Queries per second avg: 6.014
--------------

logging3_mariadb_container-1439a8e7 | FAILED | rc=1 >>
ERROR 2013 (HY000): Lost connection to MySQL server at 'handshake: reading inital communication packet', system error: 104non-zero return code

(ansible25) root@logging1:/opt/openstack-ansible-ops/osquery# openstack-ansible site.yml $USER_VARS -vvv^C
(ansible25) root@logging1:/opt/openstack-ansible-ops/osquery# ansible mariadb -m shell -a "systemctl status mariadb.service"
logging1_mariadb_container-6618a0d1 | SUCCESS | rc=0 >>
● mariadb.service - MariaDB 10.2.16 database server
   Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─environment.conf, limits.conf, migrated-from-my.cnf-settings.conf, restart.conf, slice.conf, timeout.conf, without-privatedevices.conf
   Active: active (running) since Thu 2018-08-16 09:22:58 CDT; 1min 29s ago
     Docs: man:mysqld(8)
           https://mariadb.com/kb/en/library/systemd/
  Process: 6369 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 6367 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS)
  Process: 6085 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
  Process: 6083 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 6082 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
 Main PID: 6326 (mysqld)
   Status: "Taking your SQL requests now..."
    Tasks: 45
      CPU: 4.003s
   CGroup: /galera.slice/mariadb.service
           ├─6326 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
           ├─7959 sh -c wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.0.239.32:4444/xtrabackup_sst//10.0.239.32' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --binlog '/var/lib/mysql/mariadb-bin' --gtid 'dfc02e4e-a15f-11e8-a81a-92f7342753a2:7' --gtid-domain-id '0'
           ├─7960 /bin/bash -ue /usr//bin/wsrep_sst_xtrabackup-v2 --role donor --address 10.0.239.32:4444/xtrabackup_sst//10.0.239.32 --socket /var/run/mysqld/mysqld.sock --datadir /var/lib/mysql/ --binlog /var/lib/mysql/mariadb-bin --gtid dfc02e4e-a15f-11e8-a81a-92f7342753a2:7 --gtid-domain-id 0
           └─8154 sleep 10

Aug 16 09:22:55 logging1-mariadb-container-6618a0d1 systemd[1]: Starting MariaDB 10.2.16 database server...
Aug 16 09:22:58 logging1-mariadb-container-6618a0d1 mysqld[6085]: WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
Aug 16 09:22:58 logging1-mariadb-container-6618a0d1 mysqld[6326]: 2018-08-16 9:22:58 140259370625216 [Note] /usr/sbin/mysqld (mysqld 10.2.16-MariaDB-10.2.16+maria~xenial-log) starting as process 6326 ...
Aug 16 09:22:58 logging1-mariadb-container-6618a0d1 systemd[1]: Started MariaDB 10.2.16 database server.

logging3_mariadb_container-1439a8e7 | FAILED | rc=3 >>
● mariadb.service - MariaDB 10.2.16 database server
   Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─environment.conf, limits.conf, migrated-from-my.cnf-settings.conf, restart.conf, slice.conf, timeout.conf, without-privatedevices.conf
   Active: activating (auto-restart) (Result: signal) since Thu 2018-08-16 09:24:23 CDT; 4s ago
     Docs: man:mysqld(8)
           https://mariadb.com/kb/en/library/systemd/
  Process: 7861 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=killed, signal=ABRT)
  Process: 7595 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
  Process: 7593 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 7592 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
 Main PID: 7861 (code=killed, signal=ABRT)

Aug 16 09:24:23 logging3-mariadb-container-1439a8e7 systemd[1]: mariadb.service: Unit entered failed state.
Aug 16 09:24:23 logging3-mariadb-container-1439a8e7 systemd[1]: mariadb.service: Failed with result 'signal'.non-zero return code

logging2_mariadb_container-237de7ac | FAILED | rc=3 >>
● mariadb.service - MariaDB 10.2.16 database server
   Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─environment.conf, limits.conf, migrated-from-my.cnf-settings.conf, restart.conf, slice.conf, timeout.conf, without-privatedevices.conf
   Active: activating (start) since Thu 2018-08-16 09:24:17 CDT; 10s ago
     Docs: man:mysqld(8)
           https://mariadb.com/kb/en/library/systemd/
  Process: 7132 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
  Process: 7130 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 7129 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
 Main PID: 7371 (mysqld)
    Tasks: 14
      CPU: 1.268s
   CGroup: /galera.slice/mariadb.service
           ├─7371 /usr/sbin/mysqld --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
           ├─7379 sh -c wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.0.239.32' --datadir '/var/lib/mysql/' --parent '7371' --binlog '/var/lib/mysql/mariadb-bin'
           ├─7380 /bin/bash -ue /usr//bin/wsrep_sst_xtrabackup-v2 --role joiner --address 10.0.239.32 --datadir /var/lib/mysql/ --parent 7371 --binlog /var/lib/mysql/mariadb-bin
           ├─7586 /bin/bash -ue /usr//bin/wsrep_sst_xtrabackup-v2 --role joiner --address 10.0.239.32 --datadir /var/lib/mysql/ --parent 7371 --binlog /var/lib/mysql/mariadb-bin
           ├─7596 socat -u TCP-LISTEN:4444,reuseaddr stdio
           └─7597 xbstream -x

Aug 16 09:24:17 logging2-mariadb-container-237de7ac systemd[1]: Starting MariaDB 10.2.16 database server...
Aug 16 09:24:19 logging2-mariadb-container-237de7ac mysqld[7132]: WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
Aug 16 09:24:19 logging2-mariadb-container-237de7ac mysqld[7371]: 2018-08-16 9:24:19 140558556211392 [Note] /usr/sbin/mysqld (mysqld 10.2.16-MariaDB-10.2.16+maria~xenial-log) starting as process 7371 ...non-zero return code

DevX (palma-victor)
description: updated
DevX (palma-victor) wrote :

The issue is that Galera clusters on MariaDB versions newer than 10.1 require xtrabackup 2.4 or later.

I have proposed a fix in review https://review.openstack.org/592531
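
The version rule described in this comment can be sketched as a small shell check. This is only an illustration, not the change proposed in the review: the function names `version_ge` and `xtrabackup_ok_for` are hypothetical, "greater than 10.1" is assumed to mean 10.2 and later, and `sort -V` assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is at least version $2.
version_ge() {
    # sort -V (GNU coreutils) orders version strings; if the lowest of the
    # two is $2, then $1 >= $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: succeeds when the given xtrabackup version satisfies
# the rule from the comment above for the given MariaDB major version.
#   $1 = MariaDB major version (for example 10.2)
#   $2 = xtrabackup version (for example 2.4.9)
xtrabackup_ok_for() {
    mariadb_major="$1"
    xtrabackup_version="$2"
    if version_ge "$mariadb_major" "10.2"; then
        # Assumption: "greater than 10.1" means 10.2 and later.
        version_ge "$xtrabackup_version" "2.4"
    else
        # No minimum is stated in this report for 10.1 and earlier.
        return 0
    fi
}
```

For example, `xtrabackup_ok_for 10.2 2.3.10 || echo "xtrabackup too old"` would print the warning, while the same call with `2.4.9` would not. The installed version would have to be supplied by the caller, for instance from the output of `xtrabackup --version`.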

Changed in openstack-ansible:
assignee: nobody → DevX (palma-victor)
status: New → In Progress
Changed in openstack-ansible:
assignee: DevX (palma-victor) → Jesse Pretorius (jesse-pretorius)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on openstack-ansible-galera_server (master)

Change abandoned by Jesse Pretorius (odyssey4me) (<email address hidden>) on branch: master
Review: https://review.openstack.org/592531
Reason: The role is now hard set to install and bootstrap MariaDB 2.4, so I guess this is no longer needed.

Changed in openstack-ansible:
status: In Progress → Fix Released
assignee: Jesse Pretorius (jesse-pretorius) → nobody