StackLight

Activity log for bug #1552772

Date	Who	What changed	Old value	New value	Message
2016-03-03 16:03:28	Swann Croiset	bug			added bug
2016-03-03 16:03:28	Swann Croiset	attachment added		bug_dash_apache_nagios_200nodes.png https://bugs.launchpad.net/bugs/1552772/+attachment/4587528/+files/bug_dash_apache_nagios_200nodes.png
2016-03-03 16:05:16	Swann Croiset	lma-toolchain: assignee		LMA-Toolchain Fuel Plugins (mos-lma-toolchain)
2016-03-08 12:07:09	Swann Croiset	lma-toolchain: importance	Undecided	Medium
2016-05-10 17:37:29	Swann Croiset	description	MOS 8.0 build 589, ElasticSearch from origin/master. Environment: 3 controllers; 193 compute nodes (20 of them are also Ceph nodes); 3 Elasticsearch nodes; 3 InfluxDB nodes; 1 infrastructure alerting node (apache2/nagios3). How to reproduce: deploy the environment described above. Actual result: some service statuses are "UNKNOWN: No data received for at least 130 seconds" and flap (OK -> UNKNOWN -> OK ...), so the operator receives false alerts. Also observed: 100% CPU usage and a high fork rate (~110/s). Expected result: service statuses stay OK, or at least remain stable. Diagnostic: Apache cannot handle the load. All nodes send their status (AFD) directly to Nagios through CGI, and the aggregator sends the cluster status (GSE). There are 1109 AFD/GSE entries, each posting a message to Apache every 10 seconds: ~111 req/s.	MOS 8.0 build 589, Infrastructure Alerting plugin from origin/master. Environment: 3 controllers; 193 compute nodes (20 of them are also Ceph nodes); 3 Elasticsearch nodes; 3 InfluxDB nodes; 1 infrastructure alerting node (apache2/nagios3). How to reproduce: deploy the environment described above. Actual result: some service statuses are "UNKNOWN: No data received for at least 130 seconds" and flap (OK -> UNKNOWN -> OK ...), so the operator receives false alerts. Also observed: 100% CPU usage and a high fork rate (~110/s). Expected result: service statuses stay OK, or at least remain stable. Diagnostic: Apache cannot handle the load. All nodes send their status (AFD) directly to Nagios through CGI, and the aggregator sends the cluster status (GSE). There are 1109 AFD/GSE entries, each posting a message to Apache every 10 seconds: ~111 req/s.
2016-05-12 08:46:05	OpenStack Infra	lma-toolchain: status	Confirmed	In Progress
2016-05-12 08:46:05	OpenStack Infra	lma-toolchain: assignee	LMA-Toolchain Fuel Plugins (mos-lma-toolchain)	Swann Croiset (swann-w)
2016-05-13 16:23:35	OpenStack Infra	lma-toolchain: status	In Progress	Fix Committed
2016-06-17 11:45:18	Patrick Petit	lma-toolchain: milestone		0.10.0
2016-07-26 09:09:25	Simon Pasquier	lma-toolchain: status	Fix Committed	Fix Released
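
Note on the load figure quoted in the description entry: the ~111 req/s estimate follows from dividing the number of AFD/GSE entries by the posting interval. The snippet below is only a sketch of that arithmetic; the function name `request_rate` and its parameters are hypothetical and do not come from the plugin code.

```python
def request_rate(num_entries: int, interval_s: float) -> float:
    """Average requests per second if every entry posts once per interval.

    Hypothetical helper for illustration; the values passed below are the
    ones quoted in the bug description (1109 entries, 10-second interval).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return num_entries / interval_s


rate = request_rate(1109, 10)
print(f"{rate:.1f} req/s")  # 110.9 req/s, i.e. roughly 111 req/s
```

Assuming each CGI request spawns one process, a rate of this size would also be consistent with the reported fork rate of about 110/s.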