During the upgrade M->N mysql doesn't start properly

Bug #1595143 reported by Vitaliy Nogin
This bug affects 2 people
Affects            Status        Importance  Assigned to      Milestone
OpenStack-Ansible  Fix Released  Medium      Jesse Pretorius

Bug Description

Detailed bug description:
During the Mitaka -> Newton upgrade, MySQL does not start properly:

TASK [galera_server : Start MySQL] *********************************************
fatal: [infra1_galera_container-7b8c247e]: FAILED! => {"changed": true, "cmd": "for i in {1..3}; do\n /etc/init.d/mysql start || true\n if pgrep mysqld; then\n exit 0\n else\n sleep 2\n fi\n done\n echo \"Service failed to start\"\n exit 1", "delta": "0:30:18.315926", "end": "2016-06-22 12:38:01.590688", "failed": true, "rc": 1, "start": "2016-06-22 12:07:43.274762", "stderr": "", "stdout": " * Starting MariaDB database server mysqld\n ...fail!\nService failed to start", "stdout_lines": [" * Starting MariaDB database server mysqld", " ...fail!", "Service failed to start"], "warnings": []}
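
For readability, the loop embedded in the task's "cmd" field above can be unescaped and sketched as a standalone function. This is a minimal sketch, not the role's actual task: the function name start_with_retries and its parameters are hypothetical, and the start and check commands are passed in as arguments so the logic can be exercised without a real MariaDB service.

```shell
# Sketch of the retry logic from the failing task (hypothetical helper).
# Usage: start_with_retries ATTEMPTS DELAY_SECONDS START_CMD CHECK_CMD
start_with_retries() {
  attempts=$1
  delay=$2
  start_cmd=$3
  check_cmd=$4
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Ignore the start command's exit status, as the original "|| true" does.
    sh -c "$start_cmd" || true
    # Success is decided by the check command alone.
    if sh -c "$check_cmd" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Service failed to start"
  return 1
}

# Mirrors the task: three attempts, two seconds apart.
# start_with_retries 3 2 "/etc/init.d/mysql start" "pgrep mysqld"
```

Because the start command's exit status is discarded, the only failure signal is the final message, which matches the "Service failed to start" line in the task output.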

It seems that the galera_upgrade_post.yml playbook should be reviewed.

Steps to reproduce:
Run the upgrade from Mitaka to Newton.

Expected results:
MySQL should start without errors.

Actual result:
The "Start MySQL" task fails with an error.

Reproducibility:
100%

Workaround:
If I run the following command while the playbook is running:
openstack-ansible galera-install.yml --tags galera-bootstrap
then MySQL starts.

Impact:
-

Vitaliy Nogin (vnogin)
description: updated
Revision history for this message
Jay Pipes (jaypipes) wrote :

MySQL should not need to be restarted during an upgrade unless the MySQL binary itself is being upgraded (which isn't recommended for normal OpenStack upgrade procedures). The bug here is that the MySQL service is being restarted at all :)

Revision history for this message
Jean-Philippe Evrard (jean-philippe-evrard) wrote :

There is indeed a documentation bug here:
The upgrade documentation and script mention "-e galera_upgrade=true" where they shouldn't.

Could you tell us how you triggered this procedure?
If you simply ran setup-infrastructure without this flag, then a separate issue should be created: after an M->N upgrade, MySQL can't be restarted.

Changed in openstack-ansible:
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Vitaliy Nogin (vnogin) wrote :

Actually the following script was used: https://github.com/openstack/openstack-ansible/blob/master/scripts/run-upgrade.sh#L151
As you can see at line 151, the flag mentioned above is used there.

From my point of view, an additional check should be added so that this option is triggered during the upgrade only if the MySQL version has changed.
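
A rough sketch of such a check, assuming the installed and target versions are already known as plain strings. The function name galera_upgrade_args and the way the versions are obtained are hypothetical; only the "-e galera_upgrade=true" flag comes from this thread.

```shell
# Hypothetical helper: print the upgrade flag only when the version changes.
# Usage: galera_upgrade_args INSTALLED_VERSION TARGET_VERSION
galera_upgrade_args() {
  installed=$1
  target=$2
  if [ "$installed" != "$target" ]; then
    printf '%s\n' '-e galera_upgrade=true'
  fi
}

# Example call (the version strings are made up for illustration):
# openstack-ansible galera-install.yml $(galera_upgrade_args "10.0.21" "10.0.27")
```

When the versions match, the function prints nothing, so the playbook would run without the flag and should not trigger the upgrade path.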

Revision history for this message
Jesse Pretorius (jesse-pretorius) wrote :

I think that there may be a number of issues to look into here:

1 - I agree that the upgrade of the DB (and of RabbitMQ, in fact) should be handled as a separate step. I expect that perhaps this should actually be done prior to upgrading anything else.

2 - If the binary is not being upgraded, we need to ensure that the MariaDB service isn't restarted. If the container is restarted due to a container config change, then we need to ensure that it's done in serial to prevent cluster down-time.

3 - The failed bootstrap is likely due to the environment only having one MariaDB container. While this is not a typical deployment scenario for OSA deployers, it is common for test environments and it would be trivial to make the upgrade process automatically force a bootstrap if there is only one MariaDB container in the environment.

Thanks for registering the bug!

Revision history for this message
Jesse Pretorius (jesse-pretorius) wrote :

FYI for single node environments, https://review.openstack.org/382683 was recently merged (and backported to Newton) to handle the use-case for ensuring that the cluster bootstraps if there is only one node.

Does this resolve the original bug?

Revision history for this message
Vitaliy Nogin (vnogin) wrote :

Hi Jesse,

Thanks a lot for the update. I'll try to check it.

Regards,
Vitaliy

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-galera_server (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/394545

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-galera_server (stable/mitaka)

Reviewed: https://review.openstack.org/394545
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-galera_server/commit/?id=528bb2271cdb2c222f65df14fbded53f27415bd2
Submitter: Jenkins
Branch: stable/mitaka

commit 528bb2271cdb2c222f65df14fbded53f27415bd2
Author: Jimmy McCrory <email address hidden>
Date: Wed Oct 5 15:59:11 2016 -0700

    On single nodes use an empty cluster address

    When there is only one galera node, configure galera with an empty
    cluster address. Each time the mysql service starts on this node it will
    automatically create a new cluster.

    Closes-Bug: #1624327
    Closes-Bug: #1595143
    Change-Id: If653b1aacbd446a4ea5bb806a839dad40011b5b8
    (cherry picked from commit 21885c1f37fd28290f5fc04f40d5ad3581a80ed1)
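
The behaviour described in this commit message can be illustrated with a small helper that builds the value for Galera's wsrep_cluster_address setting. This is a sketch and not the role's actual template: the function name build_cluster_address and the example addresses are hypothetical. An empty "gcomm://" address tells the node to form a new cluster when it starts.

```shell
# Hypothetical helper: build a wsrep_cluster_address value from node addresses.
# Usage: build_cluster_address NODE_ADDRESS [NODE_ADDRESS...]
build_cluster_address() {
  if [ "$#" -le 1 ]; then
    # Single node: no peers to join, so use an empty cluster address.
    printf 'gcomm://\n'
  else
    # Join the nodes with commas in a subshell so the caller's IFS is untouched.
    (IFS=,; printf 'gcomm://%s\n' "$*")
  fi
}

# build_cluster_address 10.0.0.1
# build_cluster_address 10.0.0.1 10.0.0.2 10.0.0.3
```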

tags: added: in-stable-mitaka
Changed in openstack-ansible:
assignee: nobody → Jesse Pretorius (jesse-pretorius)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-galera_server (master)

Reviewed: https://review.openstack.org/395118
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-galera_server/commit/?id=ff4e9c6ece0849d08f36dda93315181fe5a75339
Submitter: Jenkins
Branch: master

commit ff4e9c6ece0849d08f36dda93315181fe5a75339
Author: Jesse Pretorius <email address hidden>
Date: Tue Nov 8 18:51:39 2016 +0000

    Allow a single-node MariaDB cluster to restart properly

    With the implementation of https://review.openstack.org/382683 a single
    MariaDB node has no peers configured, so there's no need to bootstrap
    the cluster on restart.

    This patch removes the condition in the handler which previously was
    needed to handle the re-bootstrap during a single node cluster service
    restart.

    Closes-Bug: #1595143
    Closes-Bug: #1639900
    Related-Bug: #1624327
    Change-Id: I599bbf0efa4e3d5abdf6d95c95d7983c464b3ae5

Changed in openstack-ansible:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-galera_server (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/397764

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-galera_server (stable/newton)

Reviewed: https://review.openstack.org/397764
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-galera_server/commit/?id=cc58790ebefff19133b8114ad7d263409fe1ccc5
Submitter: Jenkins
Branch: stable/newton

commit cc58790ebefff19133b8114ad7d263409fe1ccc5
Author: Jesse Pretorius <email address hidden>
Date: Tue Nov 8 18:51:39 2016 +0000

    Allow a single-node MariaDB cluster to restart properly

    With the implementation of https://review.openstack.org/382683 a single
    MariaDB node has no peers configured, so there's no need to bootstrap
    the cluster on restart.

    This patch removes the condition in the handler which previously was
    needed to handle the re-bootstrap during a single node cluster service
    restart.

    Closes-Bug: #1595143
    Closes-Bug: #1639900
    Related-Bug: #1624327
    Change-Id: I599bbf0efa4e3d5abdf6d95c95d7983c464b3ae5
    (cherry picked from commit ff4e9c6ece0849d08f36dda93315181fe5a75339)

tags: added: in-stable-newton
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-galera_server (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/398385

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-galera_server (stable/mitaka)

Reviewed: https://review.openstack.org/398385
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-galera_server/commit/?id=a9627447c0d5504d717b63f6677ba254fc86f421
Submitter: Jenkins
Branch: stable/mitaka

commit a9627447c0d5504d717b63f6677ba254fc86f421
Author: Jesse Pretorius <email address hidden>
Date: Tue Nov 8 18:51:39 2016 +0000

    Allow a single-node MariaDB cluster to restart properly

    With the implementation of https://review.openstack.org/382683 a single
    MariaDB node has no peers configured, so there's no need to bootstrap
    the cluster on restart.

    This patch removes the condition in the handler which previously was
    needed to handle the re-bootstrap during a single node cluster service
    restart.

    Closes-Bug: #1595143
    Closes-Bug: #1639900
    Related-Bug: #1624327
    Change-Id: I599bbf0efa4e3d5abdf6d95c95d7983c464b3ae5
    (cherry picked from commit cc58790ebefff19133b8114ad7d263409fe1ccc5)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-galera_server 15.0.0.0b1

This issue was fixed in the openstack/openstack-ansible-galera_server 15.0.0.0b1 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-galera_server 14.0.3

This issue was fixed in the openstack/openstack-ansible-galera_server 14.0.3 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-galera_server 13.3.9

This issue was fixed in the openstack/openstack-ansible-galera_server 13.3.9 release.
