check_octavia_loadbalancers needs more controls to reduce undesired alerts
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
charm-openstack-service-checks | Fix Released | Medium | Unassigned | |
Bug Description
The current Octavia load balancer checks don't allow for controlling which load balancers we monitor. Monitoring is either on for every load balancer (along with all the other Octavia checks, via the check-octavia config option), or off (again, along with all the other Octavia checks).
A few use cases to consider:
* Load balancers may be used by end users without being necessary for the correct functioning of the cloud. We may want to turn off monitoring of such user-created load balancers, to avoid alerting when a load balancer goes into a degraded state due to user actions, e.g. adding members without removing members that are no longer used.
* Certain load balancers may be critical for overall cloud functionality, e.g. the load balancers associated with k8s instances. We may want to allow filtering based upon domain, project, and/or specific load balancer IDs so we can monitor those load balancers without having alerts fire for other non-critical load balancers.
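The second use case could be sketched roughly as follows. This is only an illustration, not the charm's implementation: the function name `select_monitored`, its parameters, and the sample data are hypothetical, load balancers are represented as plain dicts carrying the Octavia `id` and `project_id` fields, and combining the two filters with AND is an assumption.

```python
def select_monitored(loadbalancers, project_ids=None, lb_ids=None):
    """Return only the load balancers that match the configured filters.

    An empty or missing filter means "do not filter on this field".
    When both filters are given, a load balancer must match both.
    """
    project_ids = set(project_ids or [])
    lb_ids = set(lb_ids or [])
    selected = []
    for lb in loadbalancers:
        if project_ids and lb.get("project_id") not in project_ids:
            continue
        if lb_ids and lb.get("id") not in lb_ids:
            continue
        selected.append(lb)
    return selected


# Hypothetical sample data: one infrastructure LB, one user-created LB.
lbs = [
    {"id": "lb-k8s", "project_id": "infra", "operating_status": "ONLINE"},
    {"id": "lb-user", "project_id": "tenant-a", "operating_status": "DEGRADED"},
]

# Monitor only the load balancers owned by the "infra" project.
monitored = select_monitored(lbs, project_ids=["infra"])
print([lb["id"] for lb in monitored])
```

With a filter like this, the degraded user-created load balancer would never reach the alerting logic, while the k8s load balancer would still be checked.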
Related branches
- Chris Sanders (community): Approve
Diff: 802 lines (+370/-138), 9 files modified
README.md (+41/-0)
config.yaml (+20/-0)
files/plugins/check_octavia.py (+137/-89)
lib/lib_openstack_service_checks.py (+37/-15)
tests/unit/conftest.py (+4/-11)
tests/unit/test_check_cinder_services.py (+1/-5)
tests/unit/test_check_contrail_analytics_alarms.py (+12/-14)
tests/unit/test_check_nova_services.py (+1/-4)
tests/unit/test_check_octavia.py (+117/-0)
Changed in charm-openstack-service-checks:
assignee: nobody → Adam Dyess (addyess)
Changed in charm-openstack-service-checks:
status: Confirmed → Fix Committed
Changed in charm-openstack-service-checks:
milestone: none → 20.08
Changed in charm-openstack-service-checks:
assignee: Adam Dyess (addyess) → nobody
./files/plugins/check_octavia.py doesn't do any such filtering of bad loadbalancers.
It also only signals the first load_balancer with an issue.
I propose that ALL of the crit|warn loadbalancers be listed in the check, and that a filtering process then be applied by configuration, similar to contrail_ignored_alarms.
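A minimal sketch of that proposal, for illustration only: it collects every problematic load balancer instead of stopping at the first, then drops the ones named in an ignore list. The function name `check_loadbalancers`, the `ignored_ids` parameter, and the mapping of Octavia operating statuses to critical or warning are all assumptions, not the plugin's actual behaviour. The return codes follow the usual Nagios convention (0 OK, 1 WARNING, 2 CRITICAL).

```python
# Assumed severity mapping; the real plugin may classify statuses differently.
CRITICAL_STATUSES = {"ERROR", "OFFLINE"}
WARNING_STATUSES = {"DEGRADED", "NO_MONITOR"}


def check_loadbalancers(loadbalancers, ignored_ids=None):
    """Return (exit_code, messages) covering ALL problematic load balancers.

    ignored_ids is a hypothetical config-driven list of load balancer IDs
    whose problems should not raise an alert.
    """
    ignored = set(ignored_ids or [])
    crit, warn = [], []
    for lb in loadbalancers:
        if lb["id"] in ignored:
            continue
        status = lb["operating_status"]
        message = "loadbalancer {} operating_status is {}".format(lb["id"], status)
        if status in CRITICAL_STATUSES:
            crit.append("CRITICAL: " + message)
        elif status in WARNING_STATUSES:
            warn.append("WARNING: " + message)
    if crit:
        return 2, crit + warn  # report warnings alongside criticals
    if warn:
        return 1, warn
    return 0, ["OK: no load balancer problems found"]


# Hypothetical sample data.
sample = [
    {"id": "lb-1", "operating_status": "ONLINE"},
    {"id": "lb-2", "operating_status": "DEGRADED"},
    {"id": "lb-3", "operating_status": "ERROR"},
    {"id": "lb-4", "operating_status": "ERROR"},
]

code, messages = check_loadbalancers(sample, ignored_ids=["lb-4"])
print(code)
for line in messages:
    print(line)
```

Here lb-4 is ignored by configuration, while lb-3 and lb-2 are both reported in a single check result rather than only the first one found.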