check_rabbitmq_cluster partition check is not enabled by default (due to management_plugin=false)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack RabbitMQ Server Charm | In Progress | Undecided | Trent Lloyd |
Bug Description
The check_rabbitmq_
= Justification/
Partitions are a frequent source of problems in deployments, especially since the default cluster_
I looked into why this check depends on the management_plugin: the nagios checks run as the 'nrpe' user, which does not have permission to run 'rabbitmqctl cluster_status'. The Management API provides an HTTP endpoint through which the unprivileged user can get the same information. Someone did contribute an alternative that runs cluster_status from cron and writes the output to a file that the nrpe check then reads, but it was abandoned and never reviewed (Bug #1548679, https:/
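To illustrate the approach described above, here is a minimal sketch of how an unprivileged check could ask the Management API for partition information. It is not the charm's actual check_rabbitmq_cluster script. It assumes the management plugin's /api/nodes endpoint, whose node objects carry a "partitions" list, and the function names, URL, and credentials are hypothetical placeholders.

```python
import base64
import json
import urllib.request


def find_partitions(nodes):
    """Map node name -> peers it reports as partitioned (non-empty lists only)."""
    return {
        node.get("name", "unknown"): node["partitions"]
        for node in nodes
        if node.get("partitions")
    }


def fetch_nodes(base_url, user, password, timeout=10):
    """GET /api/nodes from the management API using HTTP basic auth.

    base_url, user and password are assumptions supplied by the caller;
    a user with only the 'monitoring' tag should be sufficient.
    """
    request = urllib.request.Request(base_url.rstrip("/") + "/api/nodes")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def check(nodes):
    """Return a Nagios-style (exit_code, message) pair for a list of node objects."""
    partitions = find_partitions(nodes)
    if partitions:
        detail = json.dumps(partitions, sort_keys=True)
        return 2, "CRITICAL: partitions detected: " + detail
    return 0, "OK: no partitions reported by %d node(s)" % len(nodes)
```

A check wrapper might call `check(fetch_nodes("http://localhost:15672", "nagios-user", "secret"))`, print the message, and exit with the returned code. The host, port, and credentials here are placeholders, not values taken from the charm.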
The management API is also useful to have enabled generally: it provides statistics and information that help with support cases, such as which queues and users are busy, and we have sometimes wanted it in the course of a support case. The API is available to the network over HTTP and currently does not have SSL support (at least not in the charm). However, it does set up authenticated users with a random password, the user only has 'monitoring' access, and no users with the administrator tag are created by default.
So I think it's safe and sensible to enable by default. It will result in an extra network service appearing after a charm upgrade, which should be considered, but overall I think it would be a positive change, especially as the API is otherwise useful and the check is really quite critical.
Changed in charm-rabbitmq-server:
assignee: nobody → Trent Lloyd (lathiat)
I'm happy to propose the merge request to make the change, but wanted to hear first whether there are any objections to doing so.