docker clustercheck service overrides docker mysql firewall rules

Bug #1728918 reported by Michele Baldessari
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Michele Baldessari

Bug Description

When deploying a composable HA overcloud with the database role split off to separate nodes, the deployment failed because galera never started up properly.

The cause was that the host received the firewall rules for the baremetal (BM) galera service instead of the rules for the galera bundle, which include the extra control port (3123) for the bundle. E.g. we would see the following on the host:
tripleo.mysql.firewall_rules: {
  104 mysql galera: {
    dport: [ 873, 3306, 4444, 4567, 4568, 9200 ]

Instead of the correct mysql bundle firewall rules:
tripleo.mysql.firewall_rules:
  104 mysql galera-bundle:
    dport: [ 873, 3123, 3306, 4444, 4567, 4568, 9200 ]

The reason for this is the following piece of code in https://github.com/openstack/tripleo-heat-templates/blob/master/docker/services/pacemaker/clustercheck.yaml#L62:
...
  MysqlPuppetBase:
    type: ../../../puppet/services/pacemaker/database/mysql.yaml
    properties:
      EndpointMap: {get_param: EndpointMap}
      ServiceData: {get_param: ServiceData}
      ServiceNetMap: {get_param: ServiceNetMap}
      DefaultPasswords: {get_param: DefaultPasswords}
      RoleName: {get_param: RoleName}
      RoleParameters: {get_param: RoleParameters}

outputs:
  role_data:
    description: Containerized service clustercheck using composable services.
    value:
      service_name: clustercheck
      config_settings: {get_attr: [MysqlPuppetBase, role_data, config_settings]}
      logging_source: {get_attr: [MysqlPuppetBase, role_data, logging_source]}
...

Depending on the ordering of the clustercheck service within the role (before or after the mysql service), the above code will override the tripleo.mysql.firewall_rules with the wrong rules.
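
The ordering dependence can be illustrated with a minimal Python sketch. It assumes that the per-service config_settings of a role are combined with a shallow, last-one-wins merge (as Heat's map_merge does), so whichever service comes last decides the value of the shared tripleo.mysql.firewall_rules key. The function and variable names below are hypothetical and only model the behaviour; they are not taken from tripleo-heat-templates.

```python
# Hypothetical model of a shallow, last-wins merge of per-service settings.
def map_merge(maps):
    """Merge dicts left to right; later top-level keys replace earlier ones."""
    merged = {}
    for m in maps:
        merged.update(m)
    return merged


# Port lists copied from the bug description.
BM_RULES = {
    "104 mysql galera": {"dport": [873, 3306, 4444, 4567, 4568, 9200]},
}
BUNDLE_RULES = {
    "104 mysql galera-bundle": {
        "dport": [873, 3123, 3306, 4444, 4567, 4568, 9200],
    },
}

# The docker mysql service emits the bundle rules.
mysql_bundle_settings = {"tripleo.mysql.firewall_rules": BUNDLE_RULES}
# clustercheck re-exports the config_settings of the puppet (BM) mysql service,
# so it carries the BM rules under the same key.
clustercheck_settings = {"tripleo.mysql.firewall_rules": BM_RULES}


def opens_control_port(settings, port=3123):
    """Return True if any merged mysql firewall rule lists the given port."""
    rules = settings["tripleo.mysql.firewall_rules"]
    return any(port in rule["dport"] for rule in rules.values())


# clustercheck listed after mysql: the BM rules replace the bundle rules.
clustercheck_last = map_merge([mysql_bundle_settings, clustercheck_settings])
# clustercheck listed before mysql: the bundle rules survive.
clustercheck_first = map_merge([clustercheck_settings, mysql_bundle_settings])

print(opens_control_port(clustercheck_last))
print(opens_control_port(clustercheck_first))
```

Because the merge is shallow, the two rule maps are never combined; the whole value of the key is replaced, which is why the bundle control port disappears when clustercheck comes last.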

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/516649

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/516649
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=3df6a4204a85b119cd67ccf176d5b72f9e550da6
Submitter: Zuul
Branch: master

commit 3df6a4204a85b119cd67ccf176d5b72f9e550da6
Author: Michele Baldessari <email address hidden>
Date: Tue Oct 31 13:23:17 2017 +0100

    Fix iptables rules override bug in clustercheck docker service

    When deploying a composable HA overcloud with a database role split off
    to separate nodes we could observe a deployment failure due to galera
    never starting up properly.

    The reason for this was that instead of having the firewall rules for
    the galera bundle applied (i.e. those with the extra control-port for
    the bundle), we would see the firewall rules for the BM galera service.
    E.g. we would see the following on the host:

    tripleo.mysql.firewall_rules: {
      104 mysql galera: {
        dport: [ 873, 3306, 4444, 4567, 4568, 9200 ]

    Instead of the correct mysql bundle firewall rules:
    tripleo.mysql.firewall_rules:
      104 mysql galera-bundle:
        dport: [ 873, 3123, 3306, 4444, 4567, 4568, 9200 ]

    The reason for this is the following piece of code in
    https://github.com/openstack/tripleo-heat-templates/blob/master/docker/services/pacemaker/clustercheck.yaml#L62:
    ...
      MysqlPuppetBase:
        type: ../../../puppet/services/pacemaker/database/mysql.yaml
        properties:
          EndpointMap: {get_param: EndpointMap}
          ServiceData: {get_param: ServiceData}
          ServiceNetMap: {get_param: ServiceNetMap}
          DefaultPasswords: {get_param: DefaultPasswords}
          RoleName: {get_param: RoleName}
          RoleParameters: {get_param: RoleParameters}

    outputs:
      role_data:
        description: Containerized service clustercheck using composable services.
        value:
          service_name: clustercheck
          config_settings: {get_attr: [MysqlPuppetBase, role_data, config_settings]}
          logging_source: {get_attr: [MysqlPuppetBase, role_data, logging_source]}
    ...

    Depending on the ordering of the clustercheck service within the role
    (before or after the mysql service), the above code will override the
    tripleo.mysql.firewall_rules with the wrong rules because we derive from
    puppet/services/... which contain the BM firewall rules.

    Let's just switch to derive from the docker service so we do not risk
    getting the wrong firewall rules during the map_merge.

    Tested this change successfully on a composable HA with split-off DB
    nodes.

    Change-Id: Ie87b327fe7981d905f8762d3944a0e950dbd0bfa
    Closes-Bug: #1728918

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/517576

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/pike)

Reviewed: https://review.openstack.org/517576
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=ba80b2bec5bcfc638c670debf11a98bd491ec996
Submitter: Zuul
Branch: stable/pike

commit ba80b2bec5bcfc638c670debf11a98bd491ec996
Author: Michele Baldessari <email address hidden>
Date: Tue Oct 31 13:23:17 2017 +0100

    Fix iptables rules override bug in clustercheck docker service

    Change-Id: Ie87b327fe7981d905f8762d3944a0e950dbd0bfa
    Closes-Bug: #1728918
    (cherry picked from commit 3df6a4204a85b119cd67ccf176d5b72f9e550da6)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 7.0.4

This issue was fixed in the openstack/tripleo-heat-templates 7.0.4 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 8.0.0.0b2

This issue was fixed in the openstack/tripleo-heat-templates 8.0.0.0b2 development milestone.
