IPMI sensors alert on Add-On slots missing cards
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
hw-health-charm | Won't Fix | Low | Unassigned | | |
Bug Description
The updated IPMI checks alert whenever a Presence sensor reports "Entity Absent".
On investigation, I found that there is a presence sensor for every PCI slot, and each one reports whether or not a card is installed, so an empty slot raises an alert.
For example, this line raised an error on every unit of a cloud that did not have a card in PCI slot 3 (which is a valid configuration):
ubuntu@
ID | Name | Type | Reading | Units | Event
50 | Presence | Entity Presence | N/A | N/A | 'Entity Absent'
ubuntu@
ID | Name | Type | Reading | Units | Event
50 | Add-in Card 3 Presence | Entity Presence | N/A | N/A | 'Entity Absent'
By contrast, a slot with a card present does not alert:
ubuntu@
ID | Name | Type | Reading | Units | Event
49 | Add-in Card 1 Presence | Entity Presence | N/A | N/A | 'Entity Present'
As a workaround, this sensor can be ignored with:
ipmi_check_
This specifically tells the ipmi-sensors command to ignore sensor number 50. This works, but maybe "Entity Absent" reports on Presence sensors should be automatically binned with "--ignore-
Changed in charm-hw-health: | |
importance: | Undecided → High |
Changed in charm-hw-health: | |
assignee: | nobody → David O Neill (dmzoneill) |
status: | New → In Progress |
Changed in charm-hw-health: | |
assignee: | David O Neill (dmzoneill) → nobody |
status: | In Progress → New |
Changed in charm-hw-health: | |
status: | New → Triaged |
tags: | added: bseng-481 |
The code may need to pass "--noentityabsent" so that PCI slots without cards do not produce false positives in any environment.
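To illustrate the kind of filtering being discussed, here is a minimal Python sketch that picks out Presence sensors reporting "Entity Absent" from pipe-separated output in the format shown in the bug description. It is not the charm's code: the function names (`parse_rows`, `is_absent_presence_sensor`, `absent_presence_ids`) and the sample data are assumptions made for illustration, and it assumes the first pipe-separated line is the column header.

```python
from typing import Dict, List

# Sample rows copied from the format shown in the bug description.
SAMPLE_OUTPUT = """\
ID | Name | Type | Reading | Units | Event
49 | Add-in Card 1 Presence | Entity Presence | N/A | N/A | 'Entity Present'
50 | Add-in Card 3 Presence | Entity Presence | N/A | N/A | 'Entity Absent'
"""


def parse_rows(output: str) -> List[Dict[str, str]]:
    """Parse pipe-separated sensor output into one dict per sensor row.

    Hypothetical helper: assumes the first line containing '|' is the header.
    """
    lines = [ln for ln in output.splitlines() if "|" in ln]
    if not lines:
        return []
    header = [col.strip() for col in lines[0].split("|")]
    rows = []
    for ln in lines[1:]:
        cells = [cell.strip() for cell in ln.split("|")]
        # Skip malformed rows rather than guessing at their layout.
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def is_absent_presence_sensor(row: Dict[str, str]) -> bool:
    """True for an Entity Presence sensor whose event is 'Entity Absent'."""
    event = row.get("Event", "").strip("'")
    return row.get("Type") == "Entity Presence" and event == "Entity Absent"


def absent_presence_ids(output: str) -> List[str]:
    """Return the IDs of presence sensors that report an empty slot."""
    return [row["ID"] for row in parse_rows(output) if is_absent_presence_sensor(row)]


if __name__ == "__main__":
    print(absent_presence_ids(SAMPLE_OUTPUT))  # prints ['50']
```

Matching on both the sensor type and the event keeps the filter narrow, so an "Entity Absent" style event on any other sensor type would still be reported. Whether empty slots should be ignored everywhere is a policy decision for the charm, not something this sketch settles.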