IPMI sensors alert on Add-On slots missing cards
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
hw-health-charm | | High | Unassigned | |
Bug Description
The updated IPMI checks appear to alert whenever a Presence sensor reports "Entity Absent".
When investigating this issue, I found that the presence sensors report on every PCI slot, whether or not a card is installed.
As an example:
The following sensor line raised an alert on every unit of a cloud that did not have PCI slot 3 filled with a card (which is a valid config):
ubuntu@
ID | Name | Type | Reading | Units | Event
50 | Presence | Entity Presence | N/A | N/A | 'Entity Absent'
ubuntu@
ID | Name | Type | Reading | Units | Event
50 | Add-in Card 3 Presence | Entity Presence | N/A | N/A | 'Entity Absent'
By contrast, a slot with a card present does not alert:
ubuntu@
ID | Name | Type | Reading | Units | Event
49 | Add-in Card 1 Presence | Entity Presence | N/A | N/A | 'Entity Present'
As a workaround, the sensor can be ignored with:
ipmi_check_
This specifically tells the ipmi-sensors command to ignore sensor number 50. This works, but maybe "Entity Absent" reports on Presence sensors should be automatically binned with "--ignore-
Drew Freiberger (afreiberger) wrote : | #1 |
Changed in charm-hw-health: | |
importance: | Undecided → High |
Changed in charm-hw-health: | |
assignee: | nobody → David O Neill (dmzoneill) |
status: | New → In Progress |
Changed in charm-hw-health: | |
assignee: | David O Neill (dmzoneill) → nobody |
status: | In Progress → New |
Xav Paice (xavpaice) wrote : | #2 |
It's possible to work around this with:
juju config hw-health ipmi_check_
Changed in charm-hw-health: | |
status: | New → Triaged |
The code may need to pass "--noentityabsent" to avoid false positives in every environment where PCI slots have no card installed.
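To illustrate the kind of filtering being asked for, here is a minimal Python sketch that parses sensor output in the pipe-separated layout quoted in the bug description and skips Entity Presence sensors reporting 'Entity Absent'. It is not the charm's or the check plugin's actual code: the function names (`parse_sensors`, `is_empty_slot`, `sensors_to_check`) are hypothetical, and the sample rows are copied from the report above.

```python
# Sample rows copied from the bug description (pipe-separated sensor output).
SAMPLE = """\
ID | Name | Type | Reading | Units | Event
49 | Add-in Card 1 Presence | Entity Presence | N/A | N/A | 'Entity Present'
50 | Add-in Card 3 Presence | Entity Presence | N/A | N/A | 'Entity Absent'
"""


def parse_sensors(text):
    """Turn pipe-separated sensor output into a list of dicts keyed by header."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [col.strip() for col in lines[0].split("|")]
    rows = []
    for line in lines[1:]:
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != len(header):
            continue  # skip anything that is not a well-formed sensor row
        rows.append(dict(zip(header, fields)))
    return rows


def is_empty_slot(row):
    """True for a presence sensor that only reports an unpopulated slot."""
    return (
        row["Type"] == "Entity Presence"
        and row["Event"].strip("'") == "Entity Absent"
    )


def sensors_to_check(rows):
    """Drop empty-slot presence sensors so they cannot raise an alert."""
    return [row for row in rows if not is_empty_slot(row)]


rows = parse_sensors(SAMPLE)
kept = sensors_to_check(rows)
ignored_ids = [row["ID"] for row in rows if is_empty_slot(row)]
print("ignored sensor IDs:", ignored_ids)
print("sensors still checked:", [row["Name"] for row in kept])
```

Filtering on the sensor type and event, rather than on a fixed sensor number such as 50, means the same rule would apply on hardware where the empty slot has a different ID.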