mysql-wss can potentially stop mysqld during SST

Bug #1572239 reported by Dmitry Nikishov
This bug affects 5 people
Affects              Status          Importance   Assigned to       Milestone
Fuel for OpenStack   Fix Committed   High         Bogdan Dobrelya
  6.1.x              Won't Fix       High         MOS Maintenance
  7.0.x              Won't Fix       High         MOS Maintenance
  8.0.x              Won't Fix       High         MOS Maintenance
  Mitaka             Fix Released    High         Bogdan Dobrelya
  Newton             Fix Committed   High         Bogdan Dobrelya

Bug Description

During deployment of a Galera cluster, the p_mysql Pacemaker resource starts before the database is actually configured. mysql_status() and mysql_monitor() interact in such a way that mysql_monitor() can consider a MySQL server unresponsive when it is really in the process of syncing with the master server (SST).

We are occasionally experiencing this in a CI environment.
There are 3 servers in the MySQL/Galera cluster. The sequence of events (as seen in the logs):
- mysql 1 (primary) has been deployed
- mysql 2 deploying
- mysql 2 syncing with 1
- The OCF script on 2 runs mysql_monitor(), which tries to connect to the DB and execute certain queries. This fails, because Puppet waits for the sync to finish before it configures the "clustercheck" user.
- failure count for p_mysql reaches a threshold
- mysql 2 in JOINED state
- pacemaker/corosync on 2 issues a restart command for p_mysql
- mysql 3 deploying
- mysql 3 syncing with 2
- mysql 2 stops (due to a stop command for p_mysql)
- mysql 3 crashes
- mysql 2 starts

The galera cluster is not able to recover from this, which leads to a failed deployment.
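
The first half of the sequence above can be illustrated with a toy simulation. This is only a sketch: the threshold value, the number of monitor runs, and the function name unguarded_monitor are assumptions made for illustration and are not taken from the real mysql-wss agent or from the Pacemaker configuration.

```shell
#!/bin/sh
# Toy simulation, not the real agent: count monitor failures while a
# joining node is still in SST and compare them against a threshold.

THRESHOLD=3      # hypothetical failure threshold for p_mysql
SST_POLLS=5      # hypothetical number of monitor runs during SST

# Unguarded monitor: argument 1 means "SST is running".
# During SST the "clustercheck" user does not exist yet, so the
# connection attempt fails and the monitor reports an error.
unguarded_monitor() {
    sst_running=$1
    [ "$sst_running" -eq 0 ]
}

failcount=0
i=0
while [ "$i" -lt "$SST_POLLS" ]; do
    unguarded_monitor 1 || failcount=$((failcount + 1))
    i=$((i + 1))
done

if [ "$failcount" -ge "$THRESHOLD" ]; then
    echo "restart scheduled after $failcount failures"
else
    echo "no restart"
fi
```

With these assumed values every monitor run during SST fails, so the counter passes the threshold and a restart is scheduled even though the node is healthy and merely syncing.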

Possible solution
mysql_status() checks whether "/var/lib/mysql/sst_in_progress" exists. If the file exists, it immediately returns "OCF_SUCCESS".
https://github.com/openstack/fuel-library/blob/master/files/fuel-ha-utils/ocf/mysql-wss#L456-L458
mysql_monitor() calls mysql_status(), but it does not distinguish the case where SST is in progress.
https://github.com/openstack/fuel-library/blob/master/files/fuel-ha-utils/ocf/mysql-wss#L498

So the solution could be to move the SST check into mysql_monitor(), so that it does not try to connect to the MySQL server while SST is in progress.
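
A minimal sketch of that idea, assuming a simplified agent: only the function name mysql_monitor() and the marker path come from this report. The helpers sst_in_progress and check_db_health, and the SST_MARKER and HEALTH_CMD variables, are hypothetical and exist only to keep the example self-contained. This is not the code of the actual fix.

```shell
#!/bin/sh
# Sketch only: illustrative bodies, not copied from the mysql-wss agent.

# Standard OCF return codes.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# Marker file named in the bug description; overridable for testing.
SST_MARKER="${SST_MARKER:-/var/lib/mysql/sst_in_progress}"

# Hypothetical helper: succeeds while a state transfer is running.
sst_in_progress() {
    [ -e "$SST_MARKER" ]
}

# Hypothetical stand-in for the real health queries against the DB.
# HEALTH_CMD is an assumption; it defaults to a failing command.
check_db_health() {
    ${HEALTH_CMD:-false}
}

mysql_monitor() {
    # Guard first: during SST the node cannot answer queries yet, so a
    # failed connection must not be counted as a resource failure.
    if sst_in_progress; then
        return "$OCF_SUCCESS"
    fi
    if check_db_health; then
        return "$OCF_SUCCESS"
    fi
    return "$OCF_ERR_GENERIC"
}
```

Placing the guard before any connection attempt means the monitor never reaches the database while the marker file is present, so no failures accumulate during the sync.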

Dmitry Klenov (dklenov) wrote :

@Dmitry, can you please provide more data:
1) Which version of Fuel is affected?
2) Can you please collect a diagnostic snapshot? It would help a lot in troubleshooting the issue.

tags: added: area-library
Bogdan Dobrelya (bogdando) wrote :

Logs are not required. The assumption is correct: monitor should exit immediately with SUCCESS if it discovers that an SST is in flight.

tags: added: galera
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/308232

Changed in fuel:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/309033

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/308232
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=c45eaa298549a2b03d1f3aead8ca92115c817f81
Submitter: Jenkins
Branch: master

commit c45eaa298549a2b03d1f3aead8ca92115c817f81
Author: Bogdan Dobrelya <email address hidden>
Date: Wed Apr 20 11:39:44 2016 +0200

    Fix SST check for MySQL OCF RA

    Make sure start/monitor actions will exit OK,
    if SST's in progress

    Closes-bug: #1572239

    Change-Id: I1a82f383ce0deba1e4b3d0db0634329a74f03ce6
    Signed-off-by: Bogdan Dobrelya <email address hidden>

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/309033
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=aaebffd70689cbd9c45ad3b3e81e25e96ba57185
Submitter: Jenkins
Branch: stable/mitaka

commit aaebffd70689cbd9c45ad3b3e81e25e96ba57185
Author: Bogdan Dobrelya <email address hidden>
Date: Wed Apr 20 11:39:44 2016 +0200

    Fix SST check for MySQL OCF RA

    Make sure start/monitor actions will exit OK,
    if SST's in progress

    Closes-bug: #1572239

    Change-Id: I1a82f383ce0deba1e4b3d0db0634329a74f03ce6

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/6.1)

Fix proposed to branch: stable/6.1
Review: https://review.openstack.org/315989

tags: added: tech-debt
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/316802

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/317978

tags: added: on-verification
Andrey Sledzinskiy (asledzinskiy) wrote :

Verified on 9.0-mos-404.

tags: removed: on-verification
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (stable/8.0)

Change abandoned by Bogdan Dobrelya (<email address hidden>) on branch: stable/8.0
Review: https://review.openstack.org/317978

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (stable/7.0)

Change abandoned by Bogdan Dobrelya (<email address hidden>) on branch: stable/7.0
Review: https://review.openstack.org/316802

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (stable/6.1)

Change abandoned by Bogdan Dobrelya (<email address hidden>) on branch: stable/6.1
Review: https://review.openstack.org/315989

Vitaly Sedelnik (vsedelnik) wrote :

Won't Fix for 6.1-, 7.0- and 8.0-updates, as this is a pretty big change.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/374219

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/7.0)

Reviewed: https://review.openstack.org/374219
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=f9a2d479f3687157d2b17a927a09ce5f995522d6
Submitter: Jenkins
Branch: stable/7.0

commit f9a2d479f3687157d2b17a927a09ce5f995522d6
Author: Denis Puchkin <email address hidden>
Date: Wed Sep 21 17:38:54 2016 +0300

    Backport mysql OCF from stable/mitaka

    backport mysql ocf script from stable/mitaka

    Closes-bug: #1524826
    Closes-bug: #1542256
    Closes-bug: #1572239
    Closes-bug: #1572557
    Closes-bug: #1572601
    Closes-bug: #1574747
    Closes-bug: #1574497
    Closes-bug: #1576244
    Closes-bug: #1574999
    Closes-bug: #1578278
    Closes-bug: #1388779
    Closes-bug: #1574999
    Closes-bug: #1576244
    Closes-bug: #1583173
    Closes-bug: #1585125

    Change-Id: I1cc6f95884a8fbd5c3418ede89bdf9ec6864bdc8

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/377597

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/8.0)

Reviewed: https://review.openstack.org/377597
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=b3873f5f5a0bb1526b1269f163223ae48d6e21f5
Submitter: Jenkins
Branch: stable/8.0

commit b3873f5f5a0bb1526b1269f163223ae48d6e21f5
Author: Denis Puchkin <email address hidden>
Date: Tue Sep 27 13:20:25 2016 +0300

    Backport mysql OCF from stable/mitaka

    backport mysql ocf script from stable/mitaka

    Closes-bug: #1524826
    Closes-bug: #1542256
    Closes-bug: #1572239
    Closes-bug: #1572557
    Closes-bug: #1572601
    Closes-bug: #1574747
    Closes-bug: #1574497
    Closes-bug: #1576244
    Closes-bug: #1574999
    Closes-bug: #1578278
    Closes-bug: #1388779
    Closes-bug: #1574999
    Closes-bug: #1576244
    Closes-bug: #1583173
    Closes-bug: #1585125

    Change-Id: I1cc6f95884a8fbd5c3418ede89bdf9ec6864bdc8
