Ceilometer

Comment 10 for bug 1666072

Sagi (Sergey) Shnaidman (sshnaidm) wrote on 2017-02-21:

Attachment: invest.txt (17.1 KiB, text/plain)

After investigating, I found a problem with the Apache worker processes on the controller: http://paste.openstack.org/show/599888/ (the same output is attached as invest.txt)

It seems that all 32 available Apache prefork workers (ServerLimit is 32) are busy with the request GET /v1/capabilities/ HTTP/1.1, which is sent to gnocchi_wsgi:
/var/log/httpd/gnocchi_wsgi_access.log:192.168.24.11 - - [21/Feb/2017:18:27:26 +0000] "GET /v1/capabilities/ HTTP/1.1" 500 531 "-" "ceilometer-agent-notification keystoneauth1/2.18.0 python-requests/2.11.1 CPython/2.7.5"
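
To check how many such requests pile up, the access log can be tallied by method, path and status. This is only a rough sketch: it assumes the log lines look like the one above (optionally with a grep-style "filename:" prefix), and the function name count_requests is made up for illustration.

```python
import re
from collections import Counter

# Matches the fields shown in the log line above: client IPv4 address,
# timestamp, request line, status code and response size.
# re.search is used so that a grep-style "filename:" prefix is skipped.
LOG_RE = re.compile(
    r'(?P<client>\d{1,3}(?:\.\d{1,3}){3}) \S+ \S+ '
    r'\[(?P<ts>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def count_requests(lines):
    """Count (method, path, status) triples; lines that do not match are ignored."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match is None:
            continue
        key = (match.group("method"), match.group("path"), int(match.group("status")))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Path taken from the log line above; adjust for your environment.
    with open("/var/log/httpd/gnocchi_wsgi_access.log") as log:
        for key, n in count_requests(log).most_common(5):
            print(n, key)
```

A large count for ("GET", "/v1/capabilities/", 500) would be consistent with the behaviour described here, though it does not by itself show that the workers are blocked.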

While these requests are being processed in parallel, Apache cannot fork any more workers because it has reached the limit, so no new connections are accepted, including the requests that pingtest makes to nova, neutron, etc. Restarting Apache usually fixes it.
I ran the same jobs with gnocchi disabled and they passed:
https://review.openstack.org/#/c/436497/
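
The saturation described above can be illustrated with a toy model. This is not Apache code: the WorkerPool class and its methods are hypothetical, and real Apache leaves excess connections waiting in the listen backlog rather than returning an explicit refusal. The sketch only shows why stuck requests on one WSGI app starve every other service behind the same server.

```python
class WorkerPool:
    """Toy model of a prefork server with a hard cap on worker processes."""

    def __init__(self, server_limit):
        self.server_limit = server_limit
        self.busy = 0  # workers currently tied up with a request

    def accept(self, request):
        """Give the request to a free worker; return False if none is left."""
        if self.busy >= self.server_limit:
            return False
        self.busy += 1
        return True

    def finish(self):
        """A worker completed its request and is free again."""
        if self.busy > 0:
            self.busy -= 1

    def restart(self):
        """Restarting the server drops all in-flight requests."""
        self.busy = 0


pool = WorkerPool(server_limit=32)

# 32 gnocchi requests that never complete occupy every worker.
stuck = [pool.accept("GET /v1/capabilities/") for _ in range(32)]

# A request for any other service on the same server now gets no worker.
other_before = pool.accept("request from pingtest to another service")

# After a restart the workers are free again.
pool.restart()
other_after = pool.accept("request from pingtest to another service")

print(all(stuck), other_before, other_after)
```

In this model the only ways out are the ones listed below: make the stuck requests finish, raise the cap, or use a server model where one slow request does not pin a whole process.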

So the problem is most likely in gnocchi or in its interaction with ceilometer.
I see a few possible solutions:
1) Investigate and fix it in the gnocchi/ceilometer projects.
2) Increase the Apache ServerLimit above 32 in the low-memory-usage template: https://github.com/openstack/tripleo-heat-templates/blob/c99c48b84e20925b4f4b728e9b103d6c8bcb3d11/environments/low-memory-usage.yaml#L15-L15
3) Configure Apache to use mpm_event instead of mpm_prefork.