nagios3 crashes with livestatus and downtimed checks

Bug #1507471 reported by Brad Marshall
This bug affects 3 people
Affects: nagios3 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Issue
-------
When nagios3 is configured with livestatus (from check-mk-livestatus) as a broker module and checks have a downtime applied to them, nagios crashes when the logs rotate. The crash shows up in /var/log/nagios3/nagios.log as:

   [1445238000] Caught SIGSEGV, shutting down...

Steps to reproduce
-----------------------------
* Install nagios3 and check-mk-livestatus

* Edit /etc/nagios3/nagios.cfg to enable livestatus:

    broker_module=/usr/lib/check_mk/livestatus.o /var/lib/nagios3/livestatus/socket

* To speed up testing, edit /etc/nagios3/nagios.cfg to set:

    log_rotation_method=h

This will cause log rotation to occur hourly, rather than weekly or daily.

* Restart nagios to apply these changes.

* Schedule a downtime on any host or service that lasts until the top of the next hour.

* Wait until that time, and see that nagios crashes with the SIGSEGV error.
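
The downtime step above can be scripted rather than clicked through the web UI. The following is a minimal sketch, not part of the original report: the helper names `next_hour` and `downtime_cmd`, the author `tester`, and the comment text are made up for illustration. It builds a fixed `SCHEDULE_HOST_DOWNTIME` external command that ends at the top of the next hour. It assumes a timezone with a whole-hour UTC offset, so that epoch hour boundaries coincide with local ones.

```shell
#!/bin/sh
# next_hour EPOCH: print the epoch time of the top of the next hour.
# Assumes the local UTC offset is a whole number of hours.
next_hour() {
    echo $(( ($1 / 3600 + 1) * 3600 ))
}

# downtime_cmd HOST NOW: print a Nagios external command that puts HOST
# into fixed downtime from NOW until the top of the next hour.
# "tester" and the comment text are placeholder values.
downtime_cmd() {
    host=$1
    now=$2
    end=$(next_hour "$now")
    # Fields: host;start;end;fixed=1;trigger_id=0;duration;author;comment
    printf '[%s] SCHEDULE_HOST_DOWNTIME;%s;%s;%s;1;0;%s;%s;%s\n' \
        "$now" "$host" "$now" "$end" "$(( end - now ))" \
        "tester" "reproduce SIGSEGV on log rotation"
}

# Example use (the command file path is an assumption; check the
# command_file setting in /etc/nagios3/nagios.cfg, and note that
# check_external_commands must be enabled):
#   downtime_cmd localhost "$(date +%s)" > /var/lib/nagios3/rw/nagios.cmd
```

Once the command has been written to the command file, the downtime should appear in the web UI, and the crash is expected at the next hourly log rotation.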

Solution
-------------
After some searching I found a patch at http://lists.mathias-kettner.de/pipermail/checkmk-en/2013-December/011087.html. I applied it to the nagios source and uploaded the result to a PPA at ppa:brad-marshall/nagios for testing. Repeating the steps above with the patched package no longer produces the crash.

Testing was done on a fully updated Trusty system, specifically:

$ lsb_release -d
Description: Ubuntu 14.04.3 LTS

$ dpkg-query -W nagios3
nagios3 3.5.1-1ubuntu1

$ dpkg-query -W check-mk-livestatus
check-mk-livestatus 1.2.2p3-1

Please let us know if you need any further information, or any testing done.

Robie Basak (racb) wrote :

Patching nagios3 is the wrong thing to do and may cause other things to break by changing nagios' ABI. The bug is in check-mk, which assumes an incorrect nagios3 ABI. See duplicate bug.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nagios3 (Ubuntu):
status: New → Confirmed