heka_monitoring_filter out of memory
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StackLight | Fix Released | High | guillaume thouvenin |
Bug Description
The problem was reproduced after this https:/
From the logs:
2016/02/15 13:55:47 Plugin 'heka_monitorin
2016/02/15 13:55:47 Plugin 'heka_monitorin
2016/02/15 13:55:47 Plugin 'heka_monitorin
So, about 10 minutes after restarting LMA, the heka_monitoring filter was terminated.
Design explanation:
"The problem is that we reached a limit on the amount of memory used by a filter plugin. The heka_monitoring filter grows too big and, as a result, no data is sent. That is the first effect. The second effect is that there is a bug in our code: we keep data that cannot be sent while continuing to add new data, so the heka_monitoring filter consumes more and more memory. At some point it is killed by Heka. So the fact that the filter ran out of memory is not linked to the issue with Elasticsearch; it is a separate bug."
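The accumulation bug described in the quote can be sketched as follows. This is an illustrative Python model, not Heka's actual Lua sandbox API; the class and function names are hypothetical, and the cap value stands in for whatever memory limit Heka enforces per sandbox.

```python
from collections import deque

MAX_BUFFERED = 1000  # assumed cap; Heka enforces a per-sandbox memory limit


class UnboundedFilter:
    """Buggy pattern: data that cannot be sent is kept forever, so the
    buffer (and memory use) grows until the plugin is killed."""

    def __init__(self):
        self.pending = []

    def process(self, payload, send):
        self.pending.append(payload)
        if send(self.pending):
            self.pending = []
        # on send failure, pending keeps growing without bound


class BoundedFilter:
    """Fixed pattern: cap the buffer so the oldest unsent entries are
    evicted instead of accumulating indefinitely."""

    def __init__(self, limit=MAX_BUFFERED):
        self.pending = deque(maxlen=limit)  # deque drops oldest at capacity

    def process(self, payload, send):
        self.pending.append(payload)
        if send(list(self.pending)):
            self.pending.clear()
```

With a permanently failing `send` (as when the Elasticsearch backend is unreachable), the unbounded variant retains every payload, while the bounded one never holds more than `limit` entries.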
Problem observation:
Open the Grafana "LMA self-monitoring" dashboard for any controller and check the "ENCODER PLUGINS" row: metric collection there stops after the plugin crashes.
Environment:
3 controllers
15 compute + ceph nodes
1 elasticsearch node
1 influxdb node
Changed in lma-toolchain:
status: New → Confirmed
assignee: nobody → LMA-Toolchain Fuel Plugins (mos-lma-toolchain)
importance: Undecided → Medium

Changed in lma-toolchain:
assignee: LMA-Toolchain Fuel Plugins (mos-lma-toolchain) → guillaume thouvenin (guillaume-thouvenin)

Changed in lma-toolchain:
status: In Progress → Fix Committed

Changed in lma-toolchain:
status: Fix Committed → Fix Released
This was found on Fuel 8.0 build 552, with the LMA toolchain from origin/master.