[library] Timeout of deployment is exceeded: running of p_mysql_start timed out

Bug #1365541 reported by Artem Panchenko
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Invalid
Importance: High
Assigned to: Sergii Golovatiuk
Milestone: none listed

Bug Description

This issue was caught during system tests on CI:

http://jenkins-product.srt.mirantis.net:8080/view/0_master_swarm/job/master_fuelmain.system_test.centos.thread_4/153/testReport/(root)/ha_flat_scalability/ha_flat_scalability/

Here are the steps to reproduce:

1. Create a new cluster: CentOS, HA, NovaFlat, Cinder LVM for volumes.
2. Add 3 controllers and deploy changes.

Expected result:

 - cluster successfully deployed

Actual:

 - deployment timed out on 2 nodes (node-4, node-5)

According to the Puppet log on node-5, its catalog run took about 109 minutes, and it seems to have spent most of that time trying to start Pacemaker resources:

[root@node-5 ~]# tail -1 /var/log/puppet.log
Thu Sep 04 03:48:49 +0000 2014 Puppet (notice): Finished catalog run in 6581.26 seconds
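
As a quick check of the duration quoted above, here is a minimal Python sketch that converts the "Finished catalog run" line into minutes. The helper name catalog_run_minutes and the regular expression are illustrative assumptions; only the log line itself comes from node-5.

```python
import re

# Matches the summary line Puppet writes at the end of a catalog run.
FINISHED_RE = re.compile(r"Finished catalog run in ([0-9]+(?:\.[0-9]+)?) seconds")


def catalog_run_minutes(log_line):
    """Return the catalog run duration in minutes, or None if the line does not match."""
    match = FINISHED_RE.search(log_line)
    if match is None:
        return None
    return float(match.group(1)) / 60.0


# Log line copied from /var/log/puppet.log on node-5.
line = ("Thu Sep 04 03:48:49 +0000 2014 Puppet (notice): "
        "Finished catalog run in 6581.26 seconds")

minutes = catalog_run_minutes(line)
print(f"{minutes:.1f} minutes")  # prints "109.7 minutes"
```

6581.26 seconds is roughly 109.7 minutes, which matches the figure above.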

Here is part of the Pacemaker log (node-5):

http://paste.openstack.org/show/105806/

and the MySQL log (node-5):

http://paste.openstack.org/show/105809/

Also, I found that MySQL on node-1 (a ready controller) was stopped by Pacemaker:

<27>Sep 4 01:59:57 node-1 pengine[14303]: error: clone_color: p_mysql:0 is running on node-1.test.domain.local which isn't allowed
<29>Sep 4 01:59:57 node-1 pengine[14303]: notice: LogActions: Stop p_mysql:0 (node-1.test.domain.local)
<27>Sep 4 02:00:00 node-1 pengine[14303]: error: clone_color: p_mysql:1 is running on node-1.test.domain.local which isn't allowed
<29>Sep 4 02:00:00 node-1 pengine[14303]: notice: LogActions: Stop p_mysql:1 (node-1.test.domain.local)
<29>Sep 4 02:00:00 node-1 crmd[14304]: notice: te_rsc_command: Initiating action 22: stop p_mysql_stop_0 on node-1.test.domain.local (local)

<27>Sep 4 02:00:00 node-1 mysqld: 2014-09-04 02:00:00 22028 [Note] /usr/sbin/mysqld: Normal shutdown

and it was not started again afterwards.
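
To find the same pattern on other controllers, something like the following Python sketch could list every resource that pengine scheduled to stop. The function name scheduled_stops and the regular expression are assumptions based only on the log lines quoted in this report, not on a documented Pacemaker log format.

```python
import re

# Assumed shape of the pengine "LogActions: Stop" lines, taken from the excerpt in this report.
STOP_RE = re.compile(r"LogActions:\s+Stop\s+(\S+)\s+\((\S+)\)")


def scheduled_stops(lines):
    """Return (resource, node) pairs for every 'LogActions: Stop' line."""
    stops = []
    for line in lines:
        match = STOP_RE.search(line)
        if match:
            stops.append((match.group(1), match.group(2)))
    return stops


# Lines copied from the node-1 log excerpt in this report.
sample = [
    "<27>Sep 4 01:59:57 node-1 pengine[14303]: error: clone_color: p_mysql:0 is running on node-1.test.domain.local which isn't allowed",
    "<29>Sep 4 01:59:57 node-1 pengine[14303]: notice: LogActions: Stop p_mysql:0 (node-1.test.domain.local)",
    "<27>Sep 4 02:00:00 node-1 pengine[14303]: error: clone_color: p_mysql:1 is running on node-1.test.domain.local which isn't allowed",
    "<29>Sep 4 02:00:00 node-1 pengine[14303]: notice: LogActions: Stop p_mysql:1 (node-1.test.domain.local)",
    "<29>Sep 4 02:00:00 node-1 crmd[14304]: notice: te_rsc_command: Initiating action 22: stop p_mysql_stop_0 on node-1.test.domain.local (local)",
]

for resource, node in scheduled_stops(sample):
    print(f"{resource} scheduled to stop on {node}")
```

The crmd line is deliberately not matched: it reports the stop action being initiated, not the policy engine's decision.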

Artem Panchenko (apanchenko-8) wrote :
Changed in fuel:
status: New → Confirmed
importance: Undecided → High
assignee: Fuel Library Team (fuel-library) → Sergii Golovatiuk (sgolovatiuk)
summary: - [libabry] Timeout of deployment is exceeded: running of p_mysql_start
+ [library] Timeout of deployment is exceeded: running of p_mysql_start
timed out
Vladimir Kuklin (vkuklin) wrote :

We tried to reproduce the bug without success. It seems to be an environmental issue, as it has not been reproduced anywhere else, even in the swarm tests.

Changed in fuel:
status: Confirmed → Incomplete
Changed in fuel:
status: Incomplete → Invalid