mysql restart handler logic is incorrect

Bug #1533126 reported by Hugh Saunders
Affects            Status        Importance  Assigned to      Milestone
OpenStack-Ansible  Fix Released  Medium      Hugh Saunders
Kilo               Fix Released  Medium      Jesse Pretorius
Liberty            Fix Released  Medium      Jesse Pretorius
Trunk              Fix Released  Medium      Hugh Saunders

Bug Description

Currently the mysql restart handler attempts to restart mysql, then notifies a cleanup task and a second restart.
The problem is that if the first restart fails, the cleanup and second restart won't be triggered, even though failed_when is set.

Changed in openstack-ansible:
assignee: nobody → Hugh Saunders (hughsaunders)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-galera_server (master)

Fix proposed to branch: master
Review: https://review.openstack.org/266265

OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (liberty)

Reviewed: https://review.openstack.org/265915
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=d839e2e4f950b08b6fb28169da16f56449797fb8
Submitter: Jenkins
Branch: liberty

commit d839e2e4f950b08b6fb28169da16f56449797fb8
Author: Jesse Pretorius <email address hidden>
Date: Mon Jan 11 16:36:49 2016 +0000

    Resolve MariaDB/Galera cluster startup/logging issues

    This patch ensures that MariaDB is given adequate time to start on a
    resource-constrained system (180s versus the default of 30s),
    ensures that the error log is appropriately populated and also
    provides a fallback restart in the case where there may be a corrupt
    sst directory.

    In the handler changes:
     - the environment variable "MYSQLD_STARTUP_TIMEOUT" is now being
       passed into the init script because the defaults are not being
       sourced at the init script runtime.
     - the temporary "sst" directory is cleaned up should the handler
       restart fail. This ensures that a node is in a clean state if a
       leftover sst directory was on the disk which would cause a node
       to fail to join a cluster or bootstrap.

    In the task changes, a new configuration file that is part of the
    mariadb package is being removed because it contains unforeseen
    options that cause no logs to be created.

    The default option "galera_innodb_additional_mem_pool_size" was removed
    because it's no longer valid within MariaDB 10 and we'd never caught that
    error message until now.

    This patch is based on:
     - https://review.openstack.org/256016
     - https://review.openstack.org/266265

    Closes-Bug: #1532761
    Closes-Bug: #1533126
    Change-Id: I16af30c660790656fc2d59f9943c172b88098905
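
The two handler changes described in this commit message could look roughly like the sketch below. It is illustrative only and not copied from the merged change: the task names and the sst path are assumptions, and how the cleanup task gets triggered is left out here.

```yaml
# Illustrative handlers; names and paths are assumptions, not the merged code.
- name: Restart mysql
  service:
    name: mysql
    state: restarted
  environment:
    # Passed explicitly because the init script does not source its
    # defaults at runtime; 180s replaces the 30s default.
    MYSQLD_STARTUP_TIMEOUT: "180"

- name: Cleanup mysql temp sst directory
  file:
    path: /var/lib/mysql/.sst   # hypothetical location of the temporary sst directory
    state: absent
```

Removing the directory with `state: absent` is idempotent, so the cleanup does nothing on a node that has no leftover sst directory.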

OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-galera_server (master)

Reviewed: https://review.openstack.org/266265
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-galera_server/commit/?id=5d1f8bf4e1f0a628cbb4ca41cc60999e4664bc8a
Submitter: Jenkins
Branch: master

commit 5d1f8bf4e1f0a628cbb4ca41cc60999e4664bc8a
Author: Hugh Saunders <email address hidden>
Date: Tue Jan 12 09:41:00 2016 +0000

    Ensure fallback galera restarts are notified

    Notifies are only fired when the result of a task is "changed". In this
    case we want the fallback handlers to be notified when the initial
    handler fails so we set changed_when: result|failed.

    Change-Id: Ib12e8de961d9c55ed3701cc883a00de878211c27
    Closes-Bug: #1533126
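
A minimal sketch of the pattern this commit message describes, assuming a registered variable called `galera_restart` and hypothetical fallback handler names. `failed_when: false` keeps the play running when the first restart fails, and `changed_when` reports that failure as "changed" so the notify actually fires.

```yaml
# Sketch of the notify-on-failure pattern; variable and handler names are assumptions.
- name: Restart mysql
  service:
    name: mysql
    state: restarted
  register: galera_restart
  # Do not abort the play if the first restart fails.
  failed_when: false
  # Handlers are only notified on "changed", so report a failed
  # restart as a change in order to trigger the fallback handlers.
  changed_when: galera_restart | failed
  notify:
    - Cleanup mysql temp sst directory
    - Restart mysql (fall back)
```

Both fallback handlers must be defined after this one in the handlers file, since handlers run in the order they are defined rather than the order they are notified. The `| failed` filter form matches Ansible releases of that period; later releases express the same check as `galera_restart is failed`.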

Changed in openstack-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (kilo)

Fix proposed to branch: kilo
Review: https://review.openstack.org/268975

OpenStack Infra (hudson-openstack) wrote : Change abandoned on openstack-ansible (kilo)

Change abandoned by Jesse Pretorius (<email address hidden>) on branch: kilo
Review: https://review.openstack.org/265910
Reason: This is included in https://review.openstack.org/268975

OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (kilo)

Reviewed: https://review.openstack.org/268975
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=4a401125e46ca28d3d8848ea194737f1e17f6992
Submitter: Jenkins
Branch: kilo

commit 4a401125e46ca28d3d8848ea194737f1e17f6992
Author: Jesse Pretorius <email address hidden>
Date: Mon Jan 11 16:24:38 2016 +0000

    Resolve MariaDB/Galera cluster startup/logging issues

    This patch ensures that MariaDB is given adequate time to start on a
    resource-constrained system (180s versus the default of 30s),
    ensures that the error log is appropriately populated and also
    provides a fallback restart in the case where there may be a corrupt
    sst directory.

    In the handler changes:
     - the environment variable "MYSQLD_STARTUP_TIMEOUT" is now being
       passed into the init script because the defaults are not being
       sourced at the init script runtime.
     - the temporary "sst" directory is cleaned up should the handler
       restart fail. This ensures that a node is in a clean state if a
       leftover sst directory was on the disk which would cause a node
       to fail to join a cluster or bootstrap.

    In the task changes:
     - a new configuration file (part of the mariadb package) is being
       removed because it contains unforeseen options that cause no
       logs to be created.
     - a mysql ping check is implemented to verify that the service is
       responding after the restart handler is fired.

    This patch is based on:
     - https://review.openstack.org/256016
     - https://review.openstack.org/266265
     - https://review.openstack.org/268707

    Closes-Bug: #1532761
    Closes-Bug: #1533126
    Change-Id: I16af30c660790656fc2d59f9943c172b88098905

    Wait for galera to respond after restarts

    Add a mysql ping check to verify the service is responding
    after a restart handler is fired.

    Change-Id: Idfc1e1a1113ab0ffa221e4c0a4cc074df23fe89a
    (cherry picked from commit f6fb63f3477e7cdada1a1be8d670755a0e4e6f0b)
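
The ping check mentioned in this commit could be sketched as follows. This is an assumption about its shape, not the merged task: the task name, retry count and delay are hypothetical, and it assumes `mysqladmin` can authenticate on the host (for example through a client config file).

```yaml
# Sketch of a post-restart health check; names and timings are assumptions.
- name: Run pending handlers so mysql is restarted before the check
  meta: flush_handlers

- name: Wait for mysql to respond
  # mysqladmin ping exits 0 when the server answers and non-zero otherwise.
  command: mysqladmin ping
  register: mysql_ping
  until: mysql_ping.rc == 0
  retries: 6
  delay: 10
  # A health check never changes the host.
  changed_when: false
```

Polling with `until` lets a slow node finish starting instead of failing on the first unanswered ping.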

Davanum Srinivas (DIMS) (dims-v) wrote : Fix included in openstack/openstack-ansible 11.2.11

This issue was fixed in the openstack/openstack-ansible 11.2.11 release.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/openstack-ansible 12.0.8

This issue was fixed in the openstack/openstack-ansible 12.0.8 release.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/openstack-ansible 11.2.12

This issue was fixed in the openstack/openstack-ansible 11.2.12 release.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/openstack-ansible 12.0.9

This issue was fixed in the openstack/openstack-ansible 12.0.9 release.

Davanum Srinivas (DIMS) (dims-v) wrote : Fix included in openstack/openstack-ansible 12.0.11

This issue was fixed in the openstack/openstack-ansible 12.0.11 release.

Davanum Srinivas (DIMS) (dims-v) wrote : Fix included in openstack/openstack-ansible 11.2.14

This issue was fixed in the openstack/openstack-ansible 11.2.14 release.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/openstack-ansible 11.2.15

This issue was fixed in the openstack/openstack-ansible 11.2.15 release.
