number_of_replicas is not configured when scaling up the cluster

Bug #1542300 reported by Swann Croiset
This bug affects 1 person
Affects: StackLight
Status: Fix Released
Importance: Medium
Assigned to: Swann Croiset
Milestone: (none listed)

Bug Description

The number_of_replicas setting (defined in the ES templates) is not updated when the cluster is scaled from 1 to 3 nodes.

Diagnostic
========
The Puppet resource silently fails to delete and then re-import the modified template, probably because the 1st ES instance is reconfigured to wait for at least 2 nodes in the cluster before it actually starts the cluster, whereas the 2nd and 3rd instances are only configured about 10 minutes later during the scale-up.

The task reconfiguring the ES templates should be executed at the post_deployment stage.
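
To illustrate the condition described in the diagnostic, here is a minimal sketch that checks whether the cluster already reports enough nodes before any template work is attempted. It is not the fix that was merged for this bug (the fix moves the task to the post_deployment stage); the function names nodes_from_health and has_min_nodes are hypothetical, and the sketch assumes the pretty-printed reply of the _cluster/health API is piped in on stdin.

```shell
#!/bin/sh
# Read a pretty-printed _cluster/health reply on stdin and print the
# value of the "number_of_nodes" field (nothing if the field is absent).
nodes_from_health() {
    sed -n 's/.*"number_of_nodes"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed only when the cluster reports at least MIN nodes.
# An empty or unparsable reply counts as 0 nodes.
# usage: has_min_nodes MIN < health.json
has_min_nodes() {
    nodes=$(nodes_from_health)
    [ "${nodes:-0}" -ge "$1" ]
}

# Example against the node used in this report:
#   curl -s 'http://10.109.31.5:9200/_cluster/health?pretty=true' | has_min_nodes 2
```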

Steps to reproduce
===============
1/ Deploy an environment with one elasticsearch_kibana node
2/ Add 2 more elasticsearch_kibana nodes

Result
=====
Verify the template setting:
curl -s 'http://10.109.31.5:9200/_template/log?pretty=true'|grep index.number_of_replicas
      "index.number_of_replicas" : "0",

Expected result
==============
curl -s 'http://10.109.31.5:9200/_template/log?pretty=true'|grep index.number_of_replicas
      "index.number_of_replicas" : "2",
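
The comparison between the actual and the expected value can be scripted. The following is a minimal sketch, not part of the original report: the function names replicas_from_template and check_replicas are hypothetical, and the expected value is supplied by the caller rather than derived from the cluster size.

```shell
#!/bin/sh
# Read a pretty-printed template document on stdin and print the value
# of index.number_of_replicas (nothing if the setting is absent).
replicas_from_template() {
    sed -n 's/.*"index\.number_of_replicas"[[:space:]]*:[[:space:]]*"\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Compare the value found in the template with the expected one.
# Returns 0 on a match, 1 otherwise.
# usage: check_replicas EXPECTED < template.json
check_replicas() {
    expected=$1
    actual=$(replicas_from_template)
    if [ "$actual" = "$expected" ]; then
        echo "OK: number_of_replicas=$actual"
    else
        echo "MISMATCH: number_of_replicas=${actual:-unset}, expected $expected"
        return 1
    fi
}

# Example against the node used in this report:
#   curl -s 'http://10.109.31.5:9200/_template/log?pretty=true' | check_replicas 2
```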

Puppet logs:

The import of the template is missing from the current run. For comparison, a run that does import the template logs the following:
2016-02-03 20:10:16 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[insert_template_log]/returns (debug): Exec try 1/6
2016-02-03 20:10:16 +0000 Exec[insert_template_log](provider=posix) (debug): Executing 'curl -sL -w "%{http_code}\n" -XPUT http://10.109.31.5:9200/_template/log -d @/usr/share/elasticsearch/templates_import/elasticsearch-template-log.json -o /dev/null | egrep "(200|201)" > /dev/null'
2016-02-03 20:10:16 +0000 Puppet (debug): Executing 'curl -sL -w "%{http_code}\n" -XPUT http://10.109.31.5:9200/_template/log -d @/usr/share/elasticsearch/templates_import/elasticsearch-template-log.json -o /dev/null | egrep "(200|201)" > /dev/null'

Current log:
2016-02-04 16:35:41 +0000 Elasticsearch::Template[log] (info): Starting to evaluate the resource
2016-02-04 16:35:41 +0000 Elasticsearch::Template[log] (info): Evaluated in 0.00 seconds
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/File[/usr/share/elasticsearch/templates_import/elasticsearch-template-log.json] (info): Starting to evaluate the resource
2016-02-04 16:35:41 +0000 Puppet (info): Computing checksum on file /usr/share/elasticsearch/templates_import/elasticsearch-template-log.json
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/File[/usr/share/elasticsearch/templates_import/elasticsearch-template-log.json] (info): Filebucketed /usr/share/elasticsearch/templates_import/elasticsearch-template-log.json to puppet with sum 37fa6a5efffe1231289303b72f10dddc
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/File[/usr/share/elasticsearch/templates_import/elasticsearch-template-log.json]/content (notice): content changed '{md5}37fa6a5efffe1231289303b72f10dddc' to '{md5}8c18c739526997e49256640d8fd70c3e'
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/File[/usr/share/elasticsearch/templates_import/elasticsearch-template-log.json] (info): Scheduling refresh of Exec[delete_template_log]
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/File[/usr/share/elasticsearch/templates_import/elasticsearch-template-log.json] (debug): The container Elasticsearch::Template[log] will propagate my refresh event
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/File[/usr/share/elasticsearch/templates_import/elasticsearch-template-log.json] (info): Evaluated in 0.02 seconds
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[delete_template_log] (info): Starting to evaluate the resource
2016-02-04 16:35:41 +0000 Exec[delete_template_log](provider=posix) (debug): Executing check 'test $(curl -s 'http://10.109.31.5:9200/_template/log?pretty=true' | wc -l) -gt 1'
2016-02-04 16:35:41 +0000 Puppet (debug): Executing 'test $(curl -s 'http://10.109.31.5:9200/_template/log?pretty=true' | wc -l) -gt 1'
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[delete_template_log]/returns (debug): Exec try 1/6
2016-02-04 16:35:41 +0000 Exec[delete_template_log](provider=posix) (debug): Executing 'curl -s -XDELETE http://10.109.31.5:9200/_template/log'
2016-02-04 16:35:41 +0000 Puppet (debug): Executing 'curl -s -XDELETE http://10.109.31.5:9200/_template/log'
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[delete_template_log] (notice): Triggered 'refresh' from 1 events
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[delete_template_log] (info): Scheduling refresh of Exec[insert_template_log]
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[delete_template_log] (debug): The container Elasticsearch::Template[log] will propagate my refresh event
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[delete_template_log] (info): Evaluated in 0.04 seconds
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[insert_template_log] (info): Starting to evaluate the resource
2016-02-04 16:35:41 +0000 Exec[insert_template_log](provider=posix) (debug): Executing check 'test $(curl -s 'http://10.109.31.5:9200/_template/log?pretty=true' | wc -l) -gt 1'
2016-02-04 16:35:41 +0000 Puppet (debug): Executing 'test $(curl -s 'http://10.109.31.5:9200/_template/log?pretty=true' | wc -l) -gt 1'
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[insert_template_log] (notice): Triggered 'refresh' from 1 events
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[insert_template_log] (debug): The container Elasticsearch::Template[log] will propagate my refresh event
2016-02-04 16:35:41 +0000 /Stage[main]/Main/Lma_logging_analytics::Es_template[log]/Elasticsearch::Template[log]/Exec[insert_template_log] (info): Evaluated in 0.02 seconds
2016-02-04 16:35:41 +0000 Elasticsearch::Template[log] (info): Starting to evaluate the resource
2016-02-04 16:35:41 +0000 Elasticsearch::Template[log] (debug): The container Lma_logging_analytics::Es_template[log] will propagate my refresh event
2016-02-04 16:35:41 +0000 Elasticsearch::Template[log] (info): Evaluated in 0.00 seconds
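
One possible reading of the current log, offered as an assumption rather than a confirmed diagnosis: the check that precedes insert_template_log acts as a guard that skips the import while the template still exists. Because curl -s -XDELETE exits with status 0 even when the server rejects the request, a DELETE that has no effect would leave the old template in place, and the import would then be skipped without any error. The sketch below reproduces only the line-count test seen in the log; insert_decision is a hypothetical helper, and the sample replies in the usage comments are made up for illustration.

```shell
#!/bin/sh
# Same test as in the log: a pretty-printed reply longer than one line
# is taken to mean that the template exists. Reads the reply on stdin.
template_exists() {
    test $(wc -l) -gt 1
}

# Hypothetical helper: report whether the insert exec would run its
# command, assuming the check works as a "skip if the template exists" guard.
insert_decision() {
    if template_exists; then
        echo "skip insert"
    else
        echo "run insert"
    fi
}

# A multi-line reply (template still present) leads to a skipped import:
#   printf '{\n  "log" : {\n  }\n}\n' | insert_decision
# A one-line reply (template gone) lets the import run:
#   printf '{ }\n' | insert_decision
```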

Swann Croiset (swann-w)
Changed in lma-toolchain:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-plugin-elasticsearch-kibana (master)

Fix proposed to branch: master
Review: https://review.openstack.org/276708

Changed in lma-toolchain:
assignee: nobody → Swann Croiset (swann-w)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-plugin-elasticsearch-kibana (master)

Reviewed: https://review.openstack.org/276708
Committed: https://git.openstack.org/cgit/openstack/fuel-plugin-elasticsearch-kibana/commit/?id=b3795fdd938e27fa2a99045cdfa22f024b95ccad
Submitter: Jenkins
Branch: master

commit b3795fdd938e27fa2a99045cdfa22f024b95ccad
Author: Swann Croiset <email address hidden>
Date: Fri Feb 5 13:39:42 2016 +0100

    Provision services at post_deployment stage

    Fixes-bug: #1542300
    Fixes-bug: #1540951

    Change-Id: I9566b638b1636ce50b2fbc5733e1698807b71c22

Changed in lma-toolchain:
status: In Progress → Fix Committed
Changed in lma-toolchain:
status: Fix Committed → Fix Released