nrpe removes its PID when scanned by nmap

Reported by Simon Déziel on 2013-02-16
This bug affects 3 people
Affects                Status         Importance   Assigned to   Milestone
nagios-nrpe (Debian)   Fix Released   Unknown
nagios-nrpe (Ubuntu)   Fix Released   High         Unassigned
  Precise              Fix Released   Undecided    Unassigned

Bug Description

[Impact]

When NRPE's port is scanned with nmap, the NRPE server removes its PID file and logs an error. This causes the init script to lose track of the daemon, because the script relies on the PID file to find it. This behavior is probably behind the many bug reports about "Network server bind failure (98: Address already in use)".

This behavior is especially annoying when combined with vulnerability scanners that port-scan an entire datacenter (e.g. OpenVAS).

The proposed fix (backported from Saucy) makes sure the daemon does not remove its PID file when a TCP connection does not complete.
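
As a rough illustration of that behavior (this is not the actual patch, which is C code inside NRPE), the Python sketch below shows a per-connection handler that treats a failed getpeername() as a dropped connection and returns without touching any daemon-wide state such as the PID file. The name handle_connection is hypothetical and chosen only for this example.

```python
import errno
import socket


def handle_connection(conn):
    """Return the peer address, or None if the peer is already gone.

    Hypothetical helper: on ENOTCONN the handler only closes its own
    socket and returns. It deliberately performs no daemon-wide cleanup
    (no PID file removal, no "Daemon shutdown" log line).
    """
    try:
        peer = conn.getpeername()
    except OSError as exc:
        if exc.errno != errno.ENOTCONN:
            raise  # unrelated socket errors are not swallowed
        conn.close()
        return None
    return peer
```

The point of the design is that only the long-running parent should ever own the PID file; a short-lived child that fails on a half-open connection just goes away.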

[Test Case]

1. Make sure NRPE is installed and running
  sudo apt-get install nagios-nrpe-server
2. Run nmap TCP Connect scan on the NRPE port from *another machine in the same LAN*
  sudo apt-get install nmap
  sudo nmap <target IP> -p 5666 -sT -PN
3. Notice these messages in the target's syslog
  May 30 17:20:22 log01 nrpe[19313]: Error: Network server getpeername() failure (107: Transport endpoint is not connected)
  May 30 17:20:22 log01 nrpe[19313]: Daemon shutdown

Note: the "Daemon shutdown" message is wrong, as the daemon is still running.
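
To check a syslog for this signature across several hosts, something like the following Python sketch may help. It assumes the log line format shown in step 3; the function name spurious_shutdown_pids is made up for this example and is not part of NRPE or Nagios.

```python
import re

# Matches e.g. "... nrpe[19313]: Error: Network server getpeername() ..."
LINE_RE = re.compile(r"nrpe\[(\d+)\]: (.*)$")


def spurious_shutdown_pids(lines):
    """Return PIDs that logged a getpeername() failure and then
    "Daemon shutdown", which is the pattern described in this bug."""
    failed = set()
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        pid, message = int(match.group(1)), match.group(2)
        if "getpeername() failure" in message:
            failed.add(pid)
        elif message.startswith("Daemon shutdown") and pid in failed:
            hits.append(pid)
    return hits
```

For example, passing it the lines of /var/log/syslog (the path is an assumption, adjust for your setup) returns one entry per occurrence, so an empty list suggests the host has not hit the problem in that log.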

[Regression Potential]

The proposed fix allows child processes to gracefully handle partial/incomplete TCP connections. The modified code is not used during normal operation so regular Nagios monitoring shouldn't be impacted.

The patch was also tested to work well on Precise so regression risk is fairly low.

[Other Info]

Along with the graceful handling of incomplete TCP connections, the hardening flags passed to the linker are corrected in the merge proposal. This other fix has very low regression risk as it is included in Ubuntu since Quantal (see LP: #1000379) and was meant to be included in Precise. It is a typo fix (with some side effects).

--- original bug report ---

During an nmap scan, NRPE logs this error and removes its PID file:

Feb 15 22:35:05 pm nrpe[2917]: Error: Network server getpeername() failure (107: Transport endpoint is not connected)
Feb 15 22:35:05 pm nrpe[2917]: Daemon shutdown

Despite what it logs, the daemon is still running, but since the PID file is gone, the init script stops working:

# ps aux| grep nrpe
nagios 2908 0.0 0.3 25344 1144 ? Ss 22:34 0:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d

# /etc/init.d/nagios-nrpe-server status
 * nagios-nrpe is not running
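
The status output above is what a PID-file-based check reports once the file is gone, even though the process is alive. The Python sketch below approximates that kind of check; it is not the init script's real logic, and pidfile_status is a hypothetical name used only here.

```python
import os


def pidfile_status(pidfile):
    """Classify a daemon's state using only its PID file.

    Returns "missing" if the file does not exist, "invalid" if it does
    not contain a number, "stale" if no such process exists, and
    "running" otherwise.
    """
    try:
        with open(pidfile) as handle:
            pid = int(handle.read().strip())
    except FileNotFoundError:
        return "missing"
    except ValueError:
        return "invalid"
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence
    except ProcessLookupError:
        return "stale"
    except PermissionError:
        return "running"  # process exists but belongs to another user
    return "running"
```

With the PID file from the logs above, pidfile_status('/var/run/nagios/nrpe.pid') would return "missing" after a scan, which is why the daemon then has to be located by other means such as ps.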

Scanning the NRPE port again with nmap further confirms the PID removal behaviour (the PID logged is incremented too?):

Feb 15 22:36:19 pm nrpe[2922]: Error: Network server getpeername() failure (107: Transport endpoint is not connected)
Feb 15 22:36:19 pm nrpe[2922]: Cannot remove pidfile '/var/run/nagios/nrpe.pid' - check your privileges.
Feb 15 22:36:19 pm nrpe[2922]: Daemon shutdown

# ps aux| grep nrpe
nagios 2908 0.0 0.3 25344 1144 ? Ss 22:34 0:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d

This problematic behaviour was confirmed on Lucid, Precise, Quantal and Raring.

Simon Déziel (sdeziel) wrote :

Here is a debdiff to apply the patch from Hiren Patel to Raring's version. I confirmed this to fix the bug.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nagios-nrpe (Ubuntu):
status: New → Confirmed

The attachment "nagios-nrpe-lp1126890.debdiff" of this bug report has been identified as being a patch in the form of a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-sponsors team please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Brian Murray (brian-murray) wrote :

I was trying to determine where this patch came from and this is the most likely thing I can find:

http://comments.gmane.org/gmane.network.nagios.devel/6774

Is that mailing list where you found it? Additionally, it does not look like this patch made it upstream, so it'd be good if you could remind them about it. Then we could add some information to the patch about where it came from and in which upstream version it is fixed, so that we can more easily drop the patch in the future. Thanks!

Changed in nagios-nrpe (Ubuntu):
status: Confirmed → Incomplete
importance: Undecided → High
Iain Lane (laney) wrote :

I'm going to unsubscribe ubuntu-sponsors from the bug. Once you have provided the requested information, please resubscribe the team so that someone can look at this bug again. Thanks! :-)

Simon Déziel (sdeziel) wrote :

Hi Brian,

Yes, the patch I submitted is based on the thread you found. I tried posting on the nagios-devel mailing list as you suggested but my membership was not accepted in 2 weeks and my non-member post never made it either. I tried subscribing with 2 different email addresses and no dice. Got no reply after emailing the mailing list admin contact and hanging around on #nagios and #nagios-devel didn't help.

A frustrating experience, but I'd really like to see the patch included in Ubuntu. I'd appreciate your help with this. Thanks

Simon Déziel (sdeziel) wrote :

Finally got my nagios-devel mailing list subscription activated so I sent the patch for NRPE 2.14 in this thread http://sourceforge.net/mailarchive/forum.php?thread_name=517ED7B3.1040607%40gmail.com&forum_name=nagios-devel

Changed in nagios-nrpe (Ubuntu):
status: Incomplete → Confirmed
Jamon Camisso (jamon) wrote :

This bug breaks monitoring for me on over 30 production 12.04 servers. The patch should be reviewed for SRU inclusion since nagios3 (in main) and icinga (universe) are both affected by it. It requires manual intervention to bring the service back on each individual host, so this patch is fairly important to have backported.

Michael Terry (mterry) wrote :

I sponsored an upload to precise by Simon Déziel for this. Subscribing ubuntu-sru.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nagios-nrpe - 2.13-3ubuntu2

---------------
nagios-nrpe (2.13-3ubuntu2) saucy; urgency=low

  * debian/patches/09_noremove_pid.dpatch:
    - Do not remove the PID file after a connection error
      (original patch from Hiren Patel). (LP: #1126890)
 -- Michael Terry <email address hidden> Fri, 24 May 2013 17:01:05 -0400

Changed in nagios-nrpe (Ubuntu):
status: Confirmed → Fix Released
Martin Pitt (pitti) on 2013-05-27
Changed in nagios-nrpe (Ubuntu Precise):
status: New → In Progress

Thanks for uploading the fix for this bug report to -proposed. However, when reviewing the package in -proposed and the details of this bug report I noticed that the bug description is missing information required for the SRU process. You can find full details at http://wiki.ubuntu.com/StableReleaseUpdates#Procedure but essentially this bug is missing some of the following: a statement of impact, a test case and details regarding the regression potential. Thanks in advance!

Simon Déziel (sdeziel) wrote :

Brian, I've added the SRU information. Let me know if I missed anything, thanks.

description: updated

Hello Simon, or anyone else affected,

Accepted nagios-nrpe into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/nagios-nrpe/2.12-5ubuntu1.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nagios-nrpe (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Simon Déziel (sdeziel) wrote :

The precise-proposed package works well, many thanks!

tags: added: verification-done
removed: verification-needed

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nagios-nrpe - 2.12-5ubuntu1.2

---------------
nagios-nrpe (2.12-5ubuntu1.2) precise; urgency=low

  * Do not remove the PID file after a connection error
    (original patch from Hiren Patel). (LP: #1126890)
  * Fixed compiler hardening configuration. (LP: #1000379)
 -- Simon Deziel <email address hidden> Wed, 22 May 2013 10:03:21 -0400

Changed in nagios-nrpe (Ubuntu Precise):
status: Fix Committed → Fix Released
Changed in nagios-nrpe (Debian):
status: Unknown → New
Changed in nagios-nrpe (Debian):
status: New → Fix Committed
Changed in nagios-nrpe (Debian):
status: Fix Committed → Fix Released