clustercheck service should be in database role and not in controlleropenstack role

Bug #1715847 reported by Michele Baldessari
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Michele Baldessari

Bug Description

The clustercheck service is currently in the ControllerOpenstack role, which represents a controller without the DB. Because the clustercheck service/container always talks to the SQL server over a localhost connection, it *has* to run on the same node that hosts the DB.
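
The co-location constraint described above can be expressed as a small check over a role-to-services mapping. This is only a sketch: the service identifiers and the role contents below are assumptions modeled on TripleO naming, not copied from the actual templates.

```python
# Assumed service identifiers (modeled on TripleO naming; adjust to your templates).
CLUSTERCHECK = "OS::TripleO::Services::Clustercheck"
MYSQL = "OS::TripleO::Services::MySQL"


def roles_with_orphaned_clustercheck(roles):
    """Return the names of roles that list clustercheck without the DB service."""
    return sorted(
        name
        for name, services in roles.items()
        if CLUSTERCHECK in services and MYSQL not in services
    )


# Illustrative layouts only: clustercheck on the DB-less controller (the bug),
# then moved next to the database (the fix).
before = {
    "ControllerOpenstack": [CLUSTERCHECK],
    "Database": [MYSQL],
}
after = {
    "ControllerOpenstack": [],
    "Database": [MYSQL, CLUSTERCHECK],
}

print(roles_with_orphaned_clustercheck(before))  # ['ControllerOpenstack']
print(roles_with_orphaned_clustercheck(after))   # []
```

A non-empty result means clustercheck would try to reach a local SQL server that is not there.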

In a containerized deployment, this error shows up as db syncs simply hanging: the clustercheck service on port 9200 cannot talk to mysql locally, so haproxy stops serving port 3306.
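
To see how a failing check on port 9200 translates into an unavailable backend, the probe can be approximated by hand. The sketch below assumes the check endpoint answers plain HTTP and that only a 200 response means "healthy"; the function name `backend_is_healthy` is hypothetical, and the real haproxy check configuration may differ.

```python
import http.client


def backend_is_healthy(host, port=9200, timeout=2.0):
    """Return True only if an HTTP GET to host:port answers with status 200.

    Any other status (for example 503), a refused connection, or a timeout
    is treated as unhealthy, which is roughly how a load balancer health
    check would decide to stop routing traffic to a backend.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    # Run on a node that is expected to host the DB; 127.0.0.1 is an assumption.
    print(backend_is_healthy("127.0.0.1"))
```

If this returns False on every DB node, no backend is considered up, which is consistent with connections to port 3306 being dropped.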

First observed via:
https://bugzilla.redhat.com/show_bug.cgi?id=1486037

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/502031

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/502175

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/502031
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=1760079dfe5905f2e696b9fc5c729cffa44554ae
Submitter: Jenkins
Branch: master

commit 1760079dfe5905f2e696b9fc5c729cffa44554ae
Author: Michele Baldessari <email address hidden>
Date: Fri Sep 8 12:31:18 2017 +0200

    Move the clustercheck service to the DB role

    The clustercheck service is currently in the ControllerOpenstack role
    which represents a controller without the DB. Since the clustercheck
    service/container always talks to the SQL server via a localhost
    connection it *has* to run on the very same node that hosts the DB.

    In a containerized deployment this error shows up with db syncs simply
    hanging because haproxy will stop serving port 3306 because the
    clustercheck service on port 9200 cannot talk to mysql locally.

    Errors like this will be logged when trying to connect to the DB VIP:
    mysql -u heat -h 172.17.1.13 -p3UazsaeTC64V9UvEcJ3GZ9rbd
    ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 0

    Fix this by making sure that the clustercheck service runs on
    the DB role.

    Change-Id: Iec4c9678d8b8d44e002c1e53110dedc0674359fb
    Closes-Bug: #1715847

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/pike)

Reviewed: https://review.openstack.org/502175
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=f166254ad85db888faae54a4f96c62819f56fd75
Submitter: Jenkins
Branch: stable/pike

commit f166254ad85db888faae54a4f96c62819f56fd75
Author: Michele Baldessari <email address hidden>
Date: Fri Sep 8 12:31:18 2017 +0200

    Move the clustercheck service to the DB role

    The clustercheck service is currently in the ControllerOpenstack role
    which represents a controller without the DB. Since the clustercheck
    service/container always talks to the SQL server via a localhost
    connection it *has* to run on the very same node that hosts the DB.

    In a containerized deployment this error shows up with db syncs simply
    hanging because haproxy will stop serving port 3306 because the
    clustercheck service on port 9200 cannot talk to mysql locally.

    Errors like this will be logged when trying to connect to the DB VIP:
    mysql -u heat -h 172.17.1.13 -p3UazsaeTC64V9UvEcJ3GZ9rbd
    ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 0

    Fix this by making sure that the clustercheck service runs on
    the DB role.

    Change-Id: Iec4c9678d8b8d44e002c1e53110dedc0674359fb
    Closes-Bug: #1715847
    (cherry picked from commit 1760079dfe5905f2e696b9fc5c729cffa44554ae)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 7.0.1

This issue was fixed in the openstack/tripleo-heat-templates 7.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 8.0.0.0b1

This issue was fixed in the openstack/tripleo-heat-templates 8.0.0.0b1 development milestone.
