cron_ipmi_sensors.py can get blocked if PID file not removed

Bug #1838562 reported by Zachary Zehring
This bug affects 1 person
Affects: hw-health-charm
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Brought up 18 new machines with the charm installed. 3 of the 18 servers failed to generate the /var/lib/nagios/ipmi_sensors.out file and raised alerts in Nagios. Found that cron_ipmi_sensors.py was no longer completing its runs on those servers because a PID file already existed at /var/run/nagios/check_ipmi_sensors.pid, even though no corresponding process was running. Removing the PID file manually got it working again. It looks like if an exception is raised in the try block in cron_ipmi_sensors.py, os.remove(CHECK_IPMI_PID) is never called.
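
For illustration, one way to avoid the stale PID file is to do the cleanup in a finally clause, so it runs even when the work inside the try block raises. This is only a minimal sketch and not the charm's actual code or the released fix: run_with_pid_file, pid_path and job are hypothetical names, and the real script's structure may differ.

```python
import os


def run_with_pid_file(pid_path, job):
    """Run job() while holding pid_path; always remove the file afterwards.

    Hypothetical helper for illustration only, not the charm's code.
    Returns False without running job() if the PID file already exists.
    """
    if os.path.exists(pid_path):
        # Another run appears to be in progress, so skip this one.
        return False
    with open(pid_path, "w") as pid_file:
        pid_file.write(str(os.getpid()))
    try:
        job()
    finally:
        # Runs whether job() returned normally or raised, so a failed
        # run cannot leave a stale PID file behind.
        os.remove(pid_path)
    return True
```

With this shape, an exception from job() still propagates to the caller, but the PID file is gone by the time it does, so the next cron run is not blocked.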

Xav Paice (xavpaice)
Changed in hw-health-charm:
importance: Undecided → High
Xav Paice (xavpaice)
Changed in hw-health-charm:
status: New → Confirmed
Xav Paice (xavpaice)
Changed in hw-health-charm:
assignee: nobody → Xav Paice (xavpaice)
Xav Paice (xavpaice)
Changed in hw-health-charm:
status: Confirmed → In Progress
Xav Paice (xavpaice)
Changed in hw-health-charm:
status: In Progress → Fix Committed
assignee: Xav Paice (xavpaice) → nobody
Joe Guo (guoqiao)
Changed in charm-hw-health:
assignee: nobody → Xav Paice (xavpaice)
assignee: Xav Paice (xavpaice) → nobody
Andrea Ieri (aieri) wrote :

This was released in cs:hw-health-1 (20.05).

Changed in charm-hw-health:
status: Fix Committed → Fix Released