Provide ease of use for responding to IPMI alerts
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
hw-health-charm | Won't Fix | Wishlist | Unassigned | | |
Bug Description
When investigating IPMI alerts from Nagios, we can use the charm's ops actions, such as:
juju run-action --wait <hw-health/X> [show-sel|
However, determining the proper hw-health unit name from the alert takes the operator several steps. First, they must determine which juju machine matches the hostname (which is not included in juju machines output for machines deployed to older juju models). Then they must query the status of the charms on that machine and find the hw-health subordinate of one of the principal applications deployed to that machine number.
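The manual lookup described above could be scripted along these lines. This is only a sketch, not something the charm provides: the function names and `SAMPLE_STATUS` are hypothetical, and the machine keys checked for the host name (`hostname`, `display-name`, `dns-name`) are assumptions about what `juju status --format=json` exposes, which varies by Juju version and provider.

```python
import json
import subprocess


def find_hw_health_units(status, hostname):
    """Return hw-health subordinate unit names on machines matching hostname."""
    # Assumption: the host name may appear under any of these machine keys.
    machine_ids = {
        machine_id
        for machine_id, machine in (status.get("machines") or {}).items()
        if hostname in (
            machine.get("hostname"),
            machine.get("display-name"),
            machine.get("dns-name"),
        )
    }
    units = []
    for app in (status.get("applications") or {}).values():
        # Subordinate applications have no "units" of their own; their units
        # are listed under the "subordinates" of each principal unit.
        for unit in (app.get("units") or {}).values():
            if unit.get("machine") not in machine_ids:
                continue
            for sub_name in unit.get("subordinates") or {}:
                if sub_name.startswith("hw-health/"):
                    units.append(sub_name)
    return sorted(units)


def load_status():
    """Fetch live status from the current model (requires the juju CLI)."""
    result = subprocess.run(
        ["juju", "status", "--format=json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


# Hypothetical, trimmed-down status used only to illustrate the lookup.
SAMPLE_STATUS = {
    "machines": {
        "0": {"dns-name": "10.0.0.1", "display-name": "node-a"},
        "1": {"dns-name": "10.0.0.2", "display-name": "node-b"},
    },
    "applications": {
        "nova-compute": {
            "units": {
                "nova-compute/0": {
                    "machine": "0",
                    "subordinates": {"hw-health/0": {}, "nrpe/0": {}},
                },
                "nova-compute/1": {
                    "machine": "1",
                    "subordinates": {"hw-health/1": {}, "nrpe/1": {}},
                },
            }
        },
        "hw-health": {"subordinate-to": ["nova-compute"]},
    },
}
```

With a live model, `find_hw_health_units(load_status(), "<hostname from the alert>")` would return the candidate unit names to pass to `juju run-action`. On older models where none of the assumed keys carries the host name, the list comes back empty and the manual steps are still needed.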
It would be very helpful to operations if either the hw-health alerts included the unit name (while this is possible with nrpe configurations, it's not commonly used for physical nrpe monitors) or the juju status of the unit showed the hostname of the machine being monitored.
For example:
hw-health/0 active idle 10.0.0.1 ready
could become:
hw-health/0 active idle 10.0.0.1 ready - Monitoring $(hostname)
or the alert status output could provide the hw-health unit name in the ipmi.py output.
For example:
OK: IPMI Status: OK | '*somedata here*'
could become:
OK: hw-health/0 IPMI Status: OK
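The two proposed formats could be produced roughly as follows. This is an illustrative sketch, not the charm's actual code: `unit_status_message` and `nagios_status_line` are hypothetical helpers, and how the unit name would reach the check script (for example, written into the check configuration when the charm renders it) is left as an assumption.

```python
import socket


def unit_status_message(base="ready", hostname=None):
    """Build a juju status message that names the monitored host."""
    # Fall back to the local host name when none is supplied.
    hostname = hostname or socket.gethostname()
    return f"{base} - Monitoring {hostname}"


def nagios_status_line(state, message, unit_name=None, perfdata=None):
    """Build a Nagios plugin status line, optionally prefixed with a unit name."""
    prefix = f"{unit_name} " if unit_name else ""
    line = f"{state}: {prefix}{message}"
    if perfdata:
        # Performance data follows the pipe, per Nagios plugin convention.
        line += f" | {perfdata}"
    return line
```

For example, `nagios_status_line("OK", "IPMI Status: OK", unit_name="hw-health/0")` yields the second proposed line, and omitting `unit_name` keeps the current format.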
summary: |
- Add hostname of server being monitored to juju status
+ Provide ease of use for responding to IPMI alerts |
Changed in charm-hw-health: | |
importance: | Undecided → Wishlist |
This charm is no longer being actively maintained. Please consider using the new hardware-observer-operator instead (https://github.com/canonical/hardware-observer-operator).
This issue is not critical, so I am marking it as "Won't Fix".