mysql-systemd start-post never returns when pid_file specified in multiple sections of the my.cnf

Bug #1490897 reported by David Kedves on 2015-09-01
This bug affects 1 person
Affects: Percona XtraDB Cluster (status tracked in 5.6)
Series 5.6: Importance: Undecided; Assigned to: Tomislav Plavcic

Bug Description

Environment:
 OS: CentOS 7.0 (systemd)

Package versions:
  [vagrant@galera_node1 ~]$ sudo rpm -qa | grep -i percona
  Percona-XtraDB-Cluster-galera-3-3.11-1.rhel7.x86_64
  percona-xtrabackup-2.2.12-1.el7.x86_64
  Percona-XtraDB-Cluster-server-56-5.6.24-25.11.1.el7.x86_64
  Percona-XtraDB-Cluster-shared-56-5.6.24-25.11.1.el7.x86_64
  Percona-XtraDB-Cluster-client-56-5.6.24-25.11.1.el7.x86_64
  percona-toolkit-2.2.15-1.noarch

Description of the issue:
  I installed the Percona Galera server, created my own my.cnf, and then tried to start it via
 "systemctl start <email address hidden>". The command had not returned even after ~15 minutes,
  so I checked what was going on in the system:

root 5220 0.0 0.1 132496 1152 pts/0 S+ 04:15 0:00 systemctl start <email address hidden>
root 5309 0.0 0.1 115348 1732 ? Ss 04:15 0:00 /bin/sh /usr/bin/mysqld_safe --basedir=/usr --wsrep-new-cluster
root 5310 0.0 0.1 115216 1664 ? Ss 04:15 0:00 /bin/bash -ue /usr/bin/mysql-systemd start-post 5309
mysql 6206 0.0 11.2 1249676 99884 ? Sl 04:15 0:01 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --wsrep-provider=/usr/lib64/libgalera_smm.so --wsrep-new-cluster --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysql.pid --socket=/var/lib/mysql/mysql.sock --port=3306 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1

I checked the mysql-systemd script (and added some printouts to it), as that was the process that appeared to be hanging:

[root@galera_node1 ~]# /bin/bash -ue /usr/bin/mysql-systemd start-post 5309
action: start-post
pid_file_path: /var/run/mysqld/mysql.pid /var/run/mysqld/mysql.pid

^C
[root@galera_node1 ~]# vi /etc/my.cnf
[root@galera_node1 ~]# /bin/bash -ue /usr/bin/mysql-systemd start-post 5309
action: start-post
pid_file_path: /var/run/mysqld/mysql.pid
6206
 SUCCESS!
wait_for_pid created finished.

Apparently the problem was caused by the 'pid_file' config option: I had it in both the [mysqld] and [mysqld_safe] groups, so the script picked up the path twice (see the doubled pid_file_path in the first run above), and this caused the mysql-systemd script to hang.
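One way the doubled value could be avoided is to keep only the last occurrence of an option when it appears in several groups. The following is a minimal sketch, not the actual mysql-systemd code: the function name last_option_value is hypothetical, and it assumes the options arrive one per line in the "--name=value" form that my_print_defaults prints.

```shell
#!/bin/bash
# Hypothetical helper (not part of mysql-systemd): read option lines such as
# "--pid-file=/path" on stdin and print the value of the LAST occurrence of
# the named option, so a repeated option yields exactly one value.
last_option_value() {
    local name="$1"
    # Accept both '-' and '_' in the option name (pid-file / pid_file),
    # since MySQL treats them as interchangeable in option names.
    local pattern="${name//[-_]/[-_]}"
    sed -n -E "s/^--${pattern}=//p" | tail -n 1
}

# Assumed usage inside a start-post style script:
#   pid_file=$(my_print_defaults mysqld mysqld_safe | last_option_value pid-file)
```

With this approach, a my.cnf that sets pid_file under both [mysqld] and [mysqld_safe] would produce a single path rather than two paths joined by a space.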

My fix / workaround:
  After I removed the repeated pid_file option from the [mysqld_safe] group of my my.cnf, mysql-systemd exited properly (see the second run above).

But I still think that the 'mysql-systemd' helper script should tolerate this (and must not *hang*).
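As an illustration of what "must not hang" could look like, here is a minimal sketch of a bounded wait for the PID file. It is not the script's real wait_for_pid implementation; the function name wait_for_pid_file and the 30-second default timeout are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical sketch: wait until the PID file exists and is non-empty,
# but give up after a fixed number of seconds instead of looping forever.
wait_for_pid_file() {
    local pid_file="$1"
    local timeout="${2:-30}"   # seconds; the default is an assumed value
    local waited=0

    while [ ! -s "$pid_file" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for '$pid_file'" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # Print the PID recorded by mysqld so the caller can use it.
    cat "$pid_file"
}
```

Because the path is quoted, a bogus value such as "/var/run/mysqld/mysql.pid /var/run/mysqld/mysql.pid" would simply never match a file, and the function would report a timeout that names the bad path.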

David Kedves (kedazo) on 2015-09-01
description: updated
David Kedves (kedazo) wrote :

Hi again. I have now hit a similar issue (I had 'datadir' defined under both the [mysqld] and [mysqld_safe] sections of my 'my.cnf'),
and this time systemctl start mysql@bootstrap failed in the 'pre' step:

[root@testbox115 lib]# bash -x /usr/bin/mysql-systemd start-pre; echo $?
# and the problematic line because of the parsing issue:
+ /usr/bin/mysql_install_db --rpm '--datadir=/var/lib/mysql /var/lib/mysql' --user=mysql

Used package version on CentOS7:
Percona-XtraDB-Cluster-server-55-5.5.41-25.11.853.el7.x86_64

So '/usr/bin/mysql-systemd' really must be fixed to properly handle the case where the same option is defined under multiple groups in my.cnf.

Dave (akxws32zf6g92mu3-dave) wrote :

We also had hassles with the systemd scripts.

The server appeared to be crashing after 900 seconds, but there were no errors in the logs other than a normal shutdown message.

We managed to trace this back to systemd restarting MySQL because the post section had timed out. Reading through the very crude ping script allowed us to identify the issue: it just performs a very basic ping.

We were using a custom socket path, which meant the ping failed. Surely this script needs to take the configuration options of MySQL into account?
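A minimal sketch of a ping that honours a configured socket path might look like the following. The function name build_ping_cmd and the PING_CMD array are hypothetical and are not taken from the shipped script; only mysqladmin's ping command and its --socket option are assumed to exist as documented.

```shell
#!/bin/bash
# Hypothetical helper: build a mysqladmin ping command line that uses the
# configured socket path when one is given, instead of the built-in default.
# The result is stored in the global array PING_CMD.
build_ping_cmd() {
    local socket="$1"
    PING_CMD=(mysqladmin)
    if [ -n "$socket" ]; then
        PING_CMD+=("--socket=$socket")
    fi
    PING_CMD+=(ping)
}

# Assumed usage, with the socket value read from my.cnf by the caller:
#   build_ping_cmd "$socket"
#   "${PING_CMD[@]}" >/dev/null 2>&1 && echo "server is up"
```

Keeping the command in an array means a socket path containing spaces is still passed to mysqladmin as a single argument.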

Percona Server 5.7 cannot start after an upgrade from 5.6.28 on Ubuntu 15.04 because of this or a similar bug. It just hangs on startup for 900 seconds. Can someone check this and fix the scripts?

Tomislav Plavcic (tplavcic) wrote :
no longer affects: percona-server