galera_server : Create galera users fails on CentOS7

Bug #1745281 reported by Marcin Dulak on 2018-01-25
This bug affects 1 person
Affects: openstack-ansible
Importance: Critical
Assigned to: Major Hayden

Bug Description

openstack-ansible bbf804f43363d9a867b2fd7af995bd892f3fa746 on CentOS7.

TASK [galera_server : Create galera users] fails at the second run of setup-infrastructure.yml in a three node infra setup.
The first run of setup-infrastructure.yml failed due to https://bugs.launchpad.net/openstack-ansible/+bug/1745270.
Entering the galera-container that failed the TASK shows mysqld is stopped.

[root@dev-os1-hci2-galera-container-1fea4829 ~]# mysql
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1' (111 "Connection refused")
[root@dev-os1-hci2-galera-container-1fea4829 ~]# service mysqld status
Redirecting to /bin/systemctl status mysqld.service
● mariadb.service - MariaDB 10.1.30 database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─environment.conf, limits.conf, migrated-from-my.cnf-settings.conf, restart.conf, slice.conf, timeout.conf, without-privatedevices.conf
   Active: inactive (dead) since Thu 2018-01-25 02:36:22 UTC; 12min ago
     Docs: man:mysqld(8)
           https://mariadb.com/kb/en/library/systemd/
 Main PID: 2184 (code=exited, status=0/SUCCESS)
   Status: "MariaDB server is down"

Jan 25 00:55:33 dev-os1-hci2-galera-container-1fea4829 systemd[1]: Starting MariaDB 10.1.30 database server...
Jan 25 00:57:28 dev-os1-hci2-galera-container-1fea4829 sh[1950]: WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
Jan 25 00:57:29 dev-os1-hci2-galera-container-1fea4829 mysqld[2184]: 2018-01-25 0:57:29 140341256513792 [Note] /usr/sbin/mysqld (mysqld 10.1.30-MariaDB) starting as process 2184 ...
Jan 25 00:57:30 dev-os1-hci2-galera-container-1fea4829 systemd[1]: Started MariaDB 10.1.30 database server.
Jan 25 02:36:17 dev-os1-hci2-galera-container-1fea4829 systemd[1]: Stopping MariaDB 10.1.30 database server...
Jan 25 02:36:22 dev-os1-hci2-galera-container-1fea4829 systemd[1]: Stopped MariaDB 10.1.30 database server.
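
To line up the stop with the playbook run, it can help to pull the systemd start/stop events out of the journal excerpt above. The following is only a rough sketch: the function names are made up for illustration, the year is passed in because the journal lines do not carry one, and the month parsing assumes an English locale.

```python
import re
from datetime import datetime

# Journal lines copied from the status output above.
JOURNAL = """\
Jan 25 00:55:33 dev-os1-hci2-galera-container-1fea4829 systemd[1]: Starting MariaDB 10.1.30 database server...
Jan 25 00:57:30 dev-os1-hci2-galera-container-1fea4829 systemd[1]: Started MariaDB 10.1.30 database server.
Jan 25 02:36:17 dev-os1-hci2-galera-container-1fea4829 systemd[1]: Stopping MariaDB 10.1.30 database server...
Jan 25 02:36:22 dev-os1-hci2-galera-container-1fea4829 systemd[1]: Stopped MariaDB 10.1.30 database server.
"""

# Only systemd unit state messages are matched; other journal lines are skipped.
EVENT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ systemd\[\d+\]: "
    r"(?P<event>Starting|Started|Stopping|Stopped) "
)


def service_events(lines, year):
    """Return (datetime, event) pairs for systemd unit state messages."""
    events = []
    for line in lines:
        match = EVENT_RE.match(line)
        if match:
            when = datetime.strptime(
                f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S"
            )
            events.append((when, match["event"]))
    return events


def uptime_before_stop(events):
    """Return how long the unit ran between 'Started' and the next 'Stopping'."""
    started = None
    for when, event in events:
        if event == "Started":
            started = when
        elif event == "Stopping" and started is not None:
            return when - started
    return None


events = service_events(JOURNAL.splitlines(), 2018)
print(uptime_before_stop(events))
```

The "Stopping" timestamp is the one to compare against the timestamps in the ansible output.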

Attached the relevant ansible output.

Marcin Dulak (marcin-dulak) wrote :
Changed in openstack-ansible:
assignee: nobody → Jean-Philippe Evrard (jean-philippe-evrard)
Marcin Dulak (marcin-dulak) wrote :

The problem is still present on a fresh installation of the latest master

https://github.com/openstack/openstack-ansible/commit/e883d33630b06c921ec71b09169940dc7a2c6914
https://github.com/openstack/openstack-ansible-galera_server/commit/486e565172ab8e1496e64d524da4ba5dfd0d1fac

The second run of
ansible-playbook /opt/openstack-ansible/playbooks/setup-infrastructure.yml
fails because mysqld is not running on dev-os1-hci2-galera-container when
TASK [galera_server : Create galera users]
executes.

It seems that one of the openstack-ansible-galera_server tasks is stopping mysqld incorrectly. Even if I manually run
[root@dev-os1-hci2-galera-container-8bd53def ~]# service mysqld start
before running
ansible-playbook /opt/openstack-ansible/playbooks/setup-infrastructure.yml
mysqld is stopped again on dev-os1-hci2-galera-container by the time
TASK [galera_server : Create galera users]
executes.
I can make this task finish successfully if I quickly run service mysqld start by hand.

This is the same behavior as in the original report.

Marcin Dulak (marcin-dulak) wrote :

openstack_inventory.json
openstack_user_config.yml
user_variables.yml

Marcin Dulak (marcin-dulak) wrote :

The problem is related to https://github.com/openstack/openstack-ansible-galera_server/blob/579550d32d0d60e621cc12c85d5c0694580ef211/tasks/galera_install.yml#L74

It removes the MariaDB packages every time setup-infrastructure.yml is run, as /var/log/yum.log on dev-os1-hci2-galera-container shows:
...
Feb 17 19:30:43 Erased: percona-toolkit-3.0.6-1.el7.x86_64
Feb 17 19:30:43 Erased: percona-xtrabackup-2.3.10-1.el7.x86_64
Feb 17 19:30:43 Erased: perl-DBD-MySQL-4.023-5.el7.x86_64
Feb 17 19:30:43 Erased: MariaDB-shared-10.1.30-1.el7.centos.x86_64
Feb 17 19:30:48 Erased: MariaDB-server-10.1.30-1.el7.centos.x86_64
Feb 17 19:32:27 Installed: MariaDB-shared-10.1.30-1.el7.centos.x86_64
Feb 17 19:32:28 Installed: perl-DBD-MySQL-4.023-5.el7.x86_64
Feb 17 19:32:29 Installed: percona-toolkit-3.0.6-1.el7.x86_64
Feb 17 19:32:30 Installed: percona-xtrabackup-2.3.10-1.el7.x86_64
Feb 17 19:32:49 Installed: MariaDB-server-10.1.30-1.el7.centos.x86_64
...
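
The erase/reinstall cycle can be picked out of yum.log mechanically. This is a minimal sketch, assuming the "Mon DD HH:MM:SS Action: package" line layout seen in the excerpt above; the function name find_reinstalled is hypothetical and not part of any openstack-ansible tooling.

```python
import re

# Entries copied from the /var/log/yum.log excerpt above.
SAMPLE = """\
Feb 17 19:30:43 Erased: percona-toolkit-3.0.6-1.el7.x86_64
Feb 17 19:30:43 Erased: percona-xtrabackup-2.3.10-1.el7.x86_64
Feb 17 19:30:43 Erased: perl-DBD-MySQL-4.023-5.el7.x86_64
Feb 17 19:30:43 Erased: MariaDB-shared-10.1.30-1.el7.centos.x86_64
Feb 17 19:30:48 Erased: MariaDB-server-10.1.30-1.el7.centos.x86_64
Feb 17 19:32:27 Installed: MariaDB-shared-10.1.30-1.el7.centos.x86_64
Feb 17 19:32:28 Installed: perl-DBD-MySQL-4.023-5.el7.x86_64
Feb 17 19:32:29 Installed: percona-toolkit-3.0.6-1.el7.x86_64
Feb 17 19:32:30 Installed: percona-xtrabackup-2.3.10-1.el7.x86_64
Feb 17 19:32:49 Installed: MariaDB-server-10.1.30-1.el7.centos.x86_64
"""

# Assumed line layout: "<Mon DD HH:MM:SS> <Action>: <package>".
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<action>Installed|Updated|Erased): (?P<package>\S+)$"
)


def find_reinstalled(lines):
    """Return (package, erased_at, installed_at) for each package that was
    erased and later installed again, in the order of the reinstalls."""
    erased = {}
    reinstalled = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything in another format
        package = match["package"]
        if match["action"] == "Erased":
            erased[package] = match["stamp"]
        elif match["action"] == "Installed" and package in erased:
            reinstalled.append((package, erased.pop(package), match["stamp"]))
    return reinstalled


result = find_reinstalled(SAMPLE.splitlines())
for package, erased_at, installed_at in result:
    print(f"{package}: erased {erased_at}, installed {installed_at}")
```

Any package that shows up here with identical versions on both sides was removed and put back without being upgraded, which is the pattern that leaves mysqld stopped during the run.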

Changed in openstack-ansible:
assignee: Jean-Philippe Evrard (jean-philippe-evrard) → nobody
status: New → Confirmed
assignee: nobody → Major Hayden (rackerhacker)
importance: Undecided → High
importance: High → Critical