primary controller database resyncs at the end of the cluster formation

Bug #1592401 reported by Michael Polenchuk
Affects             Status         Importance  Assigned to   Milestone
Fuel for OpenStack  Fix Committed  Medium      Alex Schultz
Mitaka              Fix Released   Medium      Alex Schultz

Bug Description

Detailed bug description:
 During galera cluster formation, the primary-controller database is restarted after the other databases join the cluster. This shows up as a keystone db task failure because the mysql instance may restart while it is being written to, causing the task to fail.

 Please see all info at https://ci.fuel-infra.org/job/10.0-community.main.ubuntu.bvt_2/240/

Steps to reproduce:
 1. Create cluster with Neutron
 2. Add 3 nodes with controller role
 3. Add 3 nodes with compute and ceph-osd roles
 4. Deploy the cluster
 ....

Expected results:
 Deployment will be successful

Actual result:
 Failed tasks: Task[primary-keystone/2]
 (/Stage[main]/Keystone::Db::Sync/Exec[keystone-manage db_sync])
 Failed to call refresh: keystone-manage db_sync returned 1 instead of one of [0]

Description of the environment:
 See link above

Revision history for this message
Dmitry Klenov (dklenov) wrote :

Raising to critical as it is a BVT blocker.

Changed in fuel:
milestone: none → 10.0
importance: Undecided → Critical
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
tags: added: swarm-blocker
Changed in fuel:
status: New → Confirmed
Dmitry Klenov (dklenov)
tags: removed: swarm-blocker
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Maksim Malchuk (mmalchuk)
Revision history for this message
Maksim Malchuk (mmalchuk) wrote :

https://ci.fuel-infra.org/job/10.0-community.main.ubuntu.bvt_2/242/ - green
Lowering to High because a race condition is in effect.

Changed in fuel:
importance: Critical → High
tags: added: area-library team-bugfix
Revision history for this message
Sergii Golovatiuk (sgolovatiuk) wrote :

It's a race condition: while the other nodes are syncing data, in some rare cases the database will not be available for a couple of seconds. As we discussed, wait_for_backend should be moved to its own task, and all other tasks should depend on it so they only run once all DBs are synced.
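
For illustration, a readiness gate along these lines could poll Galera's status variables until the local node reports a synced, Primary component before any DB-consuming task runs (a minimal sketch only, not the fuel-library task itself; the script name, credential handling, and timeout are assumptions):

    #!/bin/bash
    # wait_for_backend.sh (hypothetical): block until the local MySQL/Galera
    # node is part of the Primary component and fully synced.
    # Assumes credentials are available via ~/.my.cnf or socket auth.

    TIMEOUT=${1:-300}   # seconds to wait before giving up
    INTERVAL=5

    elapsed=0
    while [ "$elapsed" -lt "$TIMEOUT" ]; do
        ready=$(mysql -Nse "SHOW STATUS LIKE 'wsrep_ready'" 2>/dev/null | awk '{print $2}')
        status=$(mysql -Nse "SHOW STATUS LIKE 'wsrep_cluster_status'" 2>/dev/null | awk '{print $2}')
        state=$(mysql -Nse "SHOW STATUS LIKE 'wsrep_local_state_comment'" 2>/dev/null | awk '{print $2}')

        if [ "$ready" = "ON" ] && [ "$status" = "Primary" ] && [ "$state" = "Synced" ]; then
            echo "Galera backend is synced and part of the Primary component."
            exit 0
        fi

        sleep "$INTERVAL"
        elapsed=$((elapsed + INTERVAL))
    done

    echo "Timed out waiting for the Galera backend to become available." >&2
    exit 1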

Changed in fuel:
assignee: Maksim Malchuk (mmalchuk) → Alex Schultz (alex-schultz)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/330254

Changed in fuel:
status: Confirmed → In Progress
Revision history for this message
Alex Schultz (alex-schultz) wrote : Re: [bvt] keystone-manage db_sync failed

I believe this is related to the way we determine the master when the cluster is initially being formed. The logs at http://paste.openstack.org/show/516381/ show node-2 thinking it is in a split-brain around the same time the keystone db task runs. This causes mysql to be restarted, which then breaks the db sync process.

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

The thing is that one and only one node shall be running with --wsrep-new-cluster, and the log snippet shows there are a few:
2016-06-14T06:30:23.444037+00:00 info: INFO: p_mysqld: check_if_galera_pc(): My neighbour is Primary Component with GTID: e6750b10-31f8-11e6-9424-523f05ea251b:23
2016-06-14T06:30:23.460831+00:00 err: ERROR: p_mysqld: check_if_galera_pc(): But I'm running a new cluster, PID:16029, this is a split-brain!

This is considered harmful to the cluster. I'm not sure we can fix anything here other than retrying deployment tasks to avoid "stabilization" corner cases for the DB cluster.
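
As a reminder of the invariant described here, in a plain Galera setup only the single bootstrap node is ever started with --wsrep-new-cluster; every other node joins the existing component (a generic sketch, not the OCF-managed flow Fuel actually uses; hostnames are placeholders):

    # On the one bootstrap node only (creates a new Primary Component):
    mysqld_safe --wsrep-new-cluster &

    # On every other node: join the existing cluster, never bootstrap.
    # wsrep_cluster_address must point at the running peers.
    mysqld_safe --wsrep_cluster_address="gcomm://node-1,node-2,node-3" &

    # Sanity check after joining: the cluster size should match the node count.
    mysql -Nse "SHOW STATUS LIKE 'wsrep_cluster_size'"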

Revision history for this message
Alex Schultz (alex-schultz) wrote :

Bogdan, there's an issue with the ocf script. The logs only show that the primary controller had --wsrep-new-cluster. The other bug was fixed with a hack; I have proposed a fix for the underlying issue here, which is that during cluster formation the primary controller ends up getting resynced to the cluster after the other nodes join. I'm going to remove the duplicate marking and use this bug to fix the underlying issue.

summary: - [bvt] keystone-manage db_sync failed
+ primary controller database resyncs at the end of the cluster formation
description: updated
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

I believe that's a medium bug. Once we fixed the db sync retries, a single node being restarted at the end of the cluster bootstrap is not a high issue; it brings no downtime, as there are at least two more nodes.

Changed in fuel:
importance: High → Medium
Revision history for this message
Alex Schultz (alex-schultz) wrote :

That's not completely true: when the database goes away, any connections to that mysql instance get interrupted. We added the retry to the keystone db task, but that won't prevent this from coming up again. We should not needlessly restart mysql during initial setup. It's fine to lower it for now, but I foresee this becoming a high issue in the future.
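
For context, the retry mentioned above only lets the task ride out a brief MySQL interruption; conceptually it amounts to something like the following (a hedged shell sketch; the actual fuel-library task is a Puppet exec with retries, and the retry count and sleep interval here are arbitrary):

    # Re-run keystone's schema sync a few times to survive a short
    # MySQL/Galera interruption (e.g. the primary resyncing).
    MAX_TRIES=5
    for attempt in $(seq 1 "$MAX_TRIES"); do
        if keystone-manage db_sync; then
            echo "db_sync succeeded on attempt $attempt"
            exit 0
        fi
        echo "db_sync failed (attempt $attempt/$MAX_TRIES), retrying in 10s..." >&2
        sleep 10
    done
    exit 1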

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

I inspected the logs, and here is a snippet of the split-brain being detected: http://pastebin.com/4FykZKQs
@Alex, what do you think the event flow would/should be *with* the patch?

My view is that all nodes but one and only one shall eventually step down with the latter message; otherwise I cannot predict how things would progress, but it would be pretty destructive if a few nodes kept running with --wsrep-new-cluster and ended up forming several clusters.

Revision history for this message
Alex Schultz (alex-schultz) wrote :

Why are we even using wsrep-new-cluster after the systems are bootstrapped together? Given that we determine the master by looking at the most common GTID, why do we even use this flag? We should use the 0'd GTID to determine whether the node needs to be bootstrapped. The issue we have is that when all the GTIDs are identical, why should we restart mysql when we've already got a consistent cluster?

If we're looking for a better way to do master elections within the ocf script, why don't we use the notify command to let galera tell us which host is the master and use that information, rather than trying to determine it ourselves? http://galeracluster.com/documentation-webpages/notificationcmd.html
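
For reference, the notification mechanism linked above works by having mysqld invoke a script (wsrep_notify_cmd) on every membership or node state change; a handler could record the current component view so a resource agent reads it instead of re-deriving the master itself. A rough sketch of such a handler (the script name, state file path, and exact argument handling are assumptions):

    #!/bin/bash
    # galera_notify.sh (hypothetical): set as wsrep_notify_cmd in my.cnf.
    # mysqld calls it with --status/--uuid/--primary/--members/--index
    # whenever cluster membership or the local node state changes.

    STATE_FILE=/var/run/galera_notify.state

    while [ $# -gt 0 ]; do
        case "$1" in
            --status)  STATUS="$2";  shift 2 ;;
            --uuid)    UUID="$2";    shift 2 ;;
            --primary) PRIMARY="$2"; shift 2 ;;
            --members) MEMBERS="$2"; shift 2 ;;
            --index)   INDEX="$2";   shift 2 ;;
            *)         shift ;;
        esac
    done

    # Persist the last notification so other tooling (e.g. an OCF agent)
    # can read the cluster view instead of guessing from GTIDs.
    {
        echo "status=${STATUS:-}"
        echo "uuid=${UUID:-}"
        echo "primary=${PRIMARY:-}"
        echo "index=${INDEX:-}"
        echo "members=${MEMBERS:-}"
    } > "$STATE_FILE"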

Based on that paste, we should do nothing because the cluster is consistent, and we should not restart a node's mysql. We should only ever end up with one node still running with wsrep-new-cluster at the end of the cluster bootstrap, and that is the primary controller. We can introduce a restart if you don't want any nodes to be running with it at the end, but then we need to add a wait/check within the deployment until the cluster matches this expectation, and not start using the cluster until that condition is met. With my proposed change we end up with: 1 bootstraps, 2 syncs from 1, 3 syncs from 2, and the cluster = 1+2+3; the cluster is up with no service disruption. Without the change: 1 bootstraps, 2 syncs from 1, 3 syncs from 2, then 1 desyncs and resyncs from 3 (with partial service disruption during 1's resync), and the cluster = 1+2+3.
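
The difference described here, and in the fix merged below, comes down to which node wins when several report the same seqno: keep the first node with the highest seqno seen so far and stop once nothing newer appears, instead of letting the last node with an equal seqno win. A simplified, self-contained sketch of that selection (the node list and seqno values are placeholders; the real OCF script gathers this state from each node rather than from a literal string):

    # Pick the master candidate: the FIRST node holding the highest seqno.
    # With identical seqnos everywhere this keeps the original primary
    # instead of demoting it in favour of the last joiner.

    candidates="node-1 23
    node-2 23
    node-3 23"

    best_node=""
    best_seqno=-1

    while read -r node seqno; do
        # Strictly greater: an equal seqno never replaces the earlier pick.
        if [ "$seqno" -gt "$best_seqno" ]; then
            best_node="$node"
            best_seqno="$seqno"
        fi
    done <<< "$candidates"

    echo "elected master: $best_node (seqno $best_seqno)"   # -> node-1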

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/330254
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=26f1690d10a5c8f401033dff00166c14bf77d4ac
Submitter: Jenkins
Branch: master

commit 26f1690d10a5c8f401033dff00166c14bf77d4ac
Author: Alex Schultz <email address hidden>
Date: Wed Jun 15 16:25:16 2016 -0600

    Stop looking for master once latest seqno found

    This change updates how we look for a master by stopping once we have
    found a service with the largest seqno. Previously if all servers
    had the same seqno then we would return the last server as a master
    rather than the first. This had the side effect that during bootstrap
    when the first server was the original master, it would be demoted for
    each new server that was added to the group. This should prevent the
    ocf script from shuffling masters if they have the same seqno. We will
    always pick the first server rather than the last.

    Change-Id: Iacbd2e2ec403985a1ff52880669b1bec62dbbaba
    Closes-Bug: #1592401
    Related-Bug: #1592819

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/337717

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/337717
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=878ad54a04ff5bcb0b8a2650b55ac3f1498dca76
Submitter: Jenkins
Branch: stable/mitaka

commit 878ad54a04ff5bcb0b8a2650b55ac3f1498dca76
Author: Alex Schultz <email address hidden>
Date: Wed Jun 15 16:25:16 2016 -0600

    Stop looking for master once latest seqno found

    This change updates how we look for a master by stopping once we have
    found a service with the largest seqno. Previously if all servers
    had the same seqno then we would return the last server as a master
    rather than the first. This had the side effect that during bootstrap
    when the first server was the original master, it would be demoted for
    each new server that was added to the group. This should prevent the
    ocf script from shuffling masters if they have the same seqno. We will
    always pick the first server rather than the last.

    Change-Id: Iacbd2e2ec403985a1ff52880669b1bec62dbbaba
    Closes-Bug: #1592401
    Related-Bug: #1592819
    (cherry picked from commit 26f1690d10a5c8f401033dff00166c14bf77d4ac)

tags: added: on-verification
Revision history for this message
Andrey Lavrentyev (alavrentyev) wrote :

Wasn't able to reproduce it on 9.1 snapshot #76. Deployment was a success.

[root@nailgun ~]# shotgun2 short-report
cat /etc/fuel_build_id:
 495
cat /etc/fuel_build_number:
 495
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-library9.0-9.0.0-1.mos8495.noarch
 rubygem-astute-9.0.0-1.mos753.noarch
 fuel-release-9.0.0-1.mos6349.noarch
 fuel-bootstrap-cli-9.0.0-1.mos285.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8748.noarch
 shotgun-9.0.0-1.mos90.noarch
 python-fuelclient-9.0.0-1.mos325.noarch
 fuel-9.0.0-1.mos6349.noarch
 fuel-openstack-metadata-9.0.0-1.mos8748.noarch
 python-packetary-9.0.0-1.mos142.noarch
 nailgun-mcagents-9.0.0-1.mos753.noarch
 fuel-utils-9.0.0-1.mos8495.noarch
 fuel-setup-9.0.0-1.mos6349.noarch
 network-checker-9.0.0-1.mos74.x86_64
 fuel-agent-9.0.0-1.mos285.noarch
 fuel-ui-9.0.0-1.mos2717.noarch
 fuel-misc-9.0.0-1.mos8495.noarch
 fuel-ostf-9.0.0-1.mos938.noarch
 fuel-notify-9.0.0-1.mos8495.noarch
 fuel-nailgun-9.0.0-1.mos8748.noarch
 fuelmenu-9.0.0-1.mos274.noarch
 fuel-mirror-9.0.0-1.mos142.noarch
 fuel-migrate-9.0.0-1.mos8495.noarch

MOS_CENTOS_OS_MIRROR_ID: os-2016-06-23-135731
MOS_CENTOS_PROPOSED_MIRROR_ID: proposed-2016-07-29-200321
MOS_CENTOS_UPDATES_MIRROR_ID: updates-2016-06-23-135916
MOS_CENTOS_SECURITY_MIRROR_ID: security-2016-06-23-140002
MOS_CENTOS_HOLDBACK_MIRROR_ID: holdback-2016-06-23-140047
MOS_CENTOS_HOTFIX_MIRROR_ID: hotfix-2016-07-18-162958
MOS_UBUNTU_MIRROR_ID: 9.0-2016-07-29-200321
UBUNTU_MIRROR_ID: ubuntu-2016-07-30-170657
CENTOS_MIRROR_ID: centos-7.2.1511-2016-05-31-083834

tags: removed: on-verification