nrpe plugin in bionic fails with "Error - Could not complete SSL handshake"

Bug #1782650 reported by andrei caraman
This bug affects 19 people
Affects                Status         Importance   Assigned to   Milestone
nagios-nrpe (Debian)   Fix Released   Unknown
nagios-nrpe (Ubuntu)   Won't Fix      Medium       Unassigned
Bionic                 Won't Fix      Medium       Unassigned

Bug Description

On a new install of 18.04/bionic, check_nrpe fails with

CHECK_NRPE: (ssl_err != 5) Error - Could not complete SSL handshake with 192.168.2.3: 1

on any destination machine that is not also Ubuntu/bionic. I have tried destinations running Ubuntu 18.04, 16.04, 14.04, Debian and Red Hat. Everything fails except the 18.04 destinations.

The nrpe server on an Ubuntu 16.04 destination logs this:

Jul 19 16:31:11 a0 nrpe[24029]: Connection from 192.168.3.2 port 28397
Jul 19 16:31:11 a0 nrpe[24029]: Host address is in allowed_hosts
Jul 19 16:31:11 a0 nrpe[24029]: Handling the connection...
Jul 19 16:31:11 a0 nrpe[24029]: Error: Could not complete SSL handshake. 1

(Everything works fine the other way around: check_nrpe running on 16.04 against an 18.04 destination. In fact, check_nrpe on 16.04 works fine with all destinations I've tried, suggesting the problem occurred somewhere between xenial's NRPE v2.15 and bionic's v3.2.1.)

Regards,
adc

Tags: bionic
Christian Ehrhardt  (paelzer) wrote :

Hi,
I'm certainly not a nagios-nrpe expert, but isn't that just item (1) of [1]?
To quote: "Newer versions of NRPE are usually not backward compatible with older versions."
I expected something like it and search engines immediately returned just such cases.

So could it just be that?

[1]: https://nazeems.wordpress.com/2012/04/20/correcting-ssl-handshake-error-in-nagios/

Changed in nagios-nrpe (Ubuntu):
status: New → Incomplete
tags: added: bionic
Launchpad Janitor (janitor) wrote :

[Expired for nagios-nrpe (Ubuntu) because there has been no activity for 60 days.]

Changed in nagios-nrpe (Ubuntu):
status: Incomplete → Expired
John Smith (random534) wrote :

Hi,

I've got the same problem here. Indeed, there's a difference between 18.04 (nrpe v3) and older releases, which ship nrpe v2. However, the nrpe client has an option to force backward compatibility: check_nrpe -2, which should work with older servers. Even with that option, the error stays exactly the same.

Changed in nagios-nrpe (Ubuntu):
status: Expired → New
John Smith (random534) wrote :

I've done some more digging.

It's definitely related to the upgrade from v2 to v3. My syslog from the nagios server reports errors such as:
check_nrpe: Error: (!log_opts) Could not complete SSL handshake with xxx.xxx.xxx.xxx: dh key too small

This page describes the compatibility of v3: https://support.nagios.com/kb/article/nrpe-v3-compatibility-with-previous-versions-516.html. It states:

"A 2048-bit DH key is used instead of a 512-bit key"

which is very likely the cause of the issue. The same page provides a workaround:

"Force the plugin to send v2 packets
Using the -2 argument will force the plugin to connect with v2 packets
/usr/local/nagios/libexec/check_nrpe -2 -H centos12"

This workaround doesn't work on 18.04. I also tried with -P 1024 as suggested in some other places, to no avail.
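
The two error strings quoted so far in this report can be told apart mechanically, which helps when sorting many failing hosts. A minimal sketch, assuming log lines are passed in as text; the function name classify_nrpe_error and its output labels are hypothetical, and only the strings quoted in this thread are matched:

```shell
#!/bin/sh
# Hypothetical helper: classify a check_nrpe / nrpe log line using only the
# error strings quoted in this bug report.
classify_nrpe_error() {
    case "$1" in
        *"dh key too small"*)
            # The client refused the DH parameters offered by the server.
            echo "dh-key-too-small" ;;
        *"Could not complete SSL handshake"*)
            # The handshake failed, but the line does not say why.
            echo "ssl-handshake-failed" ;;
        *)
            echo "other" ;;
    esac
}

# Example: classify every syslog line that mentions nrpe.
# The log path is an assumption; adjust it for your system.
#   grep -h nrpe /var/log/syslog | while IFS= read -r line; do
#       classify_nrpe_error "$line"
#   done
```

The more specific pattern is tested first because the "dh key too small" line also contains the generic handshake message.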

John Peach (john-launchpad) wrote :

I've just been bitten by this one and it's practically a show-stopper. I've tried all the suggestions I can find without success and the idea of building nrpe server for multiple other platforms does not seem like any kind of solution. I guess I'll need to downgrade check_nrpe.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nagios-nrpe (Ubuntu):
status: New → Confirmed
Simon Déziel (sdeziel) wrote :

It looks like the Bionic TLS client rejects the DH parameters picked by the server (512 bits) as being too small. We can see this at work in the attached pcap, where 172.22.30.2 is the Xenial/TLS server/NRPE server and 172.22.30.66 is the Bionic/TLS client/check_nrpe.

Simon Déziel (sdeziel) wrote :

It seems Bionic's OpenSSL version will always reject the small DH params proposed by the Xenial side, so the only workaround I can think of for now is to disable TLS on both sides.

Andreas Hasenack (ahasenack) wrote :

Thanks for reporting this bug and troubleshooting it this far. It looks like a real issue and we have added it to the server team backlog. Maybe there could be an ssl option to have the bionic client accept the smaller DH key.

Changed in nagios-nrpe (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
Ken Bowley (kbowley) wrote :

I'll add another "me too" on this bug since I had forgotten about the NRPE incompatibility and broke a bunch of nagios checks by upgrading our production nagios system to 18.04.

Jesse Goodier (jessegoodier) wrote :

Does anyone have instructions for disabling TLS for nagios?

Simon Déziel (sdeziel) wrote :

@jessegoodier, you will need to change your Nagios checks from check_nrpe to check_nrpe_nossl and tell the old target NRPE server to run without SSL. This can be done by setting DAEMON_OPTS="-n" in /etc/default/nagios-nrpe-server.
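
A minimal sketch of the server-side change described above, assuming GNU sed and the defaults file path named in the comment. The function name set_nrpe_nossl is hypothetical. Note that it overwrites any options already set in DAEMON_OPTS, and that traffic is unencrypted afterwards. Whether a check_nrpe_nossl command is already defined depends on your Nagios configuration; on the plugin side, it is check_nrpe's -n flag that turns SSL off.

```shell
#!/bin/sh
# Sketch: set DAEMON_OPTS="-n" in the NRPE defaults file on the OLD target.
# set_nrpe_nossl is a hypothetical name; the default path is the one named
# in the comment above.
set_nrpe_nossl() {
    defaults_file="${1:-/etc/default/nagios-nrpe-server}"
    if grep -q '^DAEMON_OPTS=' "$defaults_file" 2>/dev/null; then
        # Replace an existing (uncommented) DAEMON_OPTS line.
        sed -i 's/^DAEMON_OPTS=.*/DAEMON_OPTS="-n"/' "$defaults_file"
    else
        # No active DAEMON_OPTS line yet: append one.
        echo 'DAEMON_OPTS="-n"' >> "$defaults_file"
    fi
}

# On the old target (as root), then restart the daemon:
#   set_nrpe_nossl
#   service nagios-nrpe-server restart
#
# On the Nagios server the check must skip SSL as well
# (the host address is a placeholder):
#   /usr/lib/nagios/plugins/check_nrpe -n -H 192.168.2.3
```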

Stefan Hegedüs (stefan+ubuntuone) wrote :

I would recommend downgrading to the version shipped with Ubuntu 14.04 (check_nrpe 2.15).
I replaced check_nrpe and now I can monitor successfully again. Better bad encryption than no encryption at all...

Paride Legovini (paride) wrote :

Hello Stefan,

So you replaced the check_nrpe binary *only*, without downgrading the whole package and (what I'm mostly interested in) without downgrading openssl?

Stefan Hegedüs (stefan+ubuntuone) wrote :

@legovini

Yes, that's exactly what I did.

Ian Gibbs (realflash-uk) wrote :

Here's the full workaround:

 - Go to a machine running an Ubuntu release older than 18.04
 - apt install nagios-nrpe-plugin
 - Copy /usr/lib/nagios/plugins/check_nrpe over to your Nagios server
 - On the server:
    - mv /usr/lib/nagios/plugins/check_nrpe /usr/lib/nagios/plugins/check_nrpe.v3
    - mv <wherever_you_copied_check_nrpe_to> /usr/lib/nagios/plugins/check_nrpe

This is quite hacky; if you try to verify nagios-nrpe-plugin on the server it will tell you that check_nrpe isn't what it should be; if you upgrade nagios-nrpe-plugin the hack will be wiped out; etc. etc. This is just enough to keep you going while you upgrade the clients to NRPE v3/Ubuntu 18.04.

You don't need to restart anything, your checks will all just start to come good.
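
The two server-side steps of this workaround could be wrapped so that a second run does not clobber the saved v3 binary. This is a sketch only: swap_check_nrpe and its arguments are hypothetical names, the default plugin path is the one used above, and it uses cp instead of mv so the copied file stays in place.

```shell
#!/bin/sh
# Sketch of the binary swap described above.
swap_check_nrpe() {
    old_binary="$1"   # the v2 check_nrpe copied from the older machine
    plugin="${2:-/usr/lib/nagios/plugins/check_nrpe}"

    # Refuse to run without a replacement binary.
    [ -f "$old_binary" ] || { echo "missing: $old_binary" >&2; return 1; }

    # Keep the packaged v3 binary, but never overwrite an earlier backup.
    if [ ! -e "$plugin.v3" ]; then
        mv "$plugin" "$plugin.v3" || return 1
    fi

    # Install the older binary under the name Nagios calls.
    cp "$old_binary" "$plugin" && chmod 755 "$plugin"
}

# Usage on the Nagios server (as root); the source path is a placeholder:
#   swap_check_nrpe /tmp/check_nrpe
# Undo:
#   mv /usr/lib/nagios/plugins/check_nrpe.v3 /usr/lib/nagios/plugins/check_nrpe
```

Running apt-mark hold nagios-nrpe-plugin would keep a package upgrade from overwriting the swapped binary, at the cost of holding back updates to that package.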

Bryce Harrington (bryce) wrote :

This seems to be the same / similar issue, discussed upstream:

https://support.nagios.com/forum/viewtopic.php?f=7&t=50342

Not sure if there's a recommended fix upstream other than upgrading (or downgrading) to keep v3 (or v2) clients and servers in sync.

Changed in nagios-nrpe (Ubuntu Bionic):
status: New → Triaged
importance: Undecided → Medium
Antonio Oliveira (antonion) wrote :

Had the same issue after upgrading from 14.04 to 16.04.

Even though we plan to upgrade all our Ubuntu servers to the latest LTS (18.04),
I needed to monitor disk usage from the Nagios server in the meantime, since this is a critical file storage service.

My Nagios server runs on FreeBSD 12 Stable with the latest NRPE version (check_nrpe3).

It looks like the nrpe package in the official 16.04 repositories (2.x.x) is a lot older than the version Nagios currently supports (3.x).

I fixed my issue by compiling NRPE manually.
The process is very well documented by the Nagios support team:
https://support.nagios.com/kb/article/nrpe-how-to-install-nrpe-8.html

-> Don't forget to check and validate your backups in case something goes wrong.

1 - sudo apt remove nagios-nrpe-server
2 - sudo apt autoremove
3 - cd /tmp
4 - sudo wget http://assets.nagios.com/downloads/nagiosxi/agents/linux-nrpe-agent.tar.gz
5 - sudo tar xzf linux-nrpe-agent.tar.gz
6 - cd linux-nrpe-agent
7 - sudo ./fullinstall

The ./fullinstall script will warn you that it should be run on a clean system.
After it finishes compiling and setting up nrpe and xinetd, it will also ask you to authorize your Nagios server's IP to query the local NRPE.

After that, you can go to /usr/local/etc/nagios/nrpe.cfg, adjust your checks, and restart xinetd
(service xinetd restart).

Wait for your Nagios server to re-check the monitored services, or force a re-check, and it should work as expected.

Changed in nagios-nrpe (Debian):
status: Unknown → Fix Released
Christian Ehrhardt  (paelzer) wrote :

From https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914489#12

> After updating nagios-nrpe-plugin in my monitoring host to
> 3.2.1-1~bpo9+1 most of my monitored instances fail to be checked.

That is due to changes in openssl, we have no control over that.
For machines with an old openssl you need to disable SSL with -n.

---

So it is an admin task to be done if nodes have old SSL versions.
For now I am following Debian on this and marking it as Won't Fix, as it is a local config change and not a bug to be fixed in the package (unless we want to add way more logic).

Changed in nagios-nrpe (Ubuntu):
status: Triaged → Won't Fix
Changed in nagios-nrpe (Ubuntu Bionic):
status: Triaged → Won't Fix