nrpe plugin in bionic fails with "Error - Could not complete SSL handshake"

Bug #1782650 reported by andrei caraman on 2018-07-19
This bug affects 10 people
Affects: nagios-nrpe (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

On a new install of 18.04/bionic, check_nrpe fails with

CHECK_NRPE: (ssl_err != 5) Error - Could not complete SSL handshake with 192.168.2.3: 1

on any destination machine that is not also Ubuntu/bionic. I have tried destinations running Ubuntu 18.04, 16.04, 14.04, Debian and Red Hat. Everything fails except the 18.04 destinations.

The nrpe server on an Ubuntu 16.04 destination logs this:

Jul 19 16:31:11 a0 nrpe[24029]: Connection from 192.168.3.2 port 28397
Jul 19 16:31:11 a0 nrpe[24029]: Host address is in allowed_hosts
Jul 19 16:31:11 a0 nrpe[24029]: Handling the connection...
Jul 19 16:31:11 a0 nrpe[24029]: Error: Could not complete SSL handshake. 1

(Everything works fine the other way around: check_nrpe running on the 16.04 machine against an 18.04 destination. In fact the 16.04 check_nrpe works fine with all destinations I've tried, suggesting the problem was introduced somewhere between xenial's NRPE v2.15 and bionic's v3.2.1.)

Regards,
adc

Hi,
I'm certainly not a nagios-nrpe expert, but isn't that just case (1) of [1]?
To quote: "Newer versions of NRPE are usually not backward compatible with older versions."
I expected something like it and search engines immediately returned just such cases.

So could it just be that?

[1]: https://nazeems.wordpress.com/2012/04/20/correcting-ssl-handshake-error-in-nagios/

Changed in nagios-nrpe (Ubuntu):
status: New → Incomplete
tags: added: bionic
Launchpad Janitor (janitor) wrote :

[Expired for nagios-nrpe (Ubuntu) because there has been no activity for 60 days.]

Changed in nagios-nrpe (Ubuntu):
status: Incomplete → Expired
John Smith (random534) wrote :

Hi,

I've got the same problem here. Indeed, there's a difference between 18.04 (nrpe v3) and older releases, which ship nrpe v2. However, the nrpe client has an option to force backward compatibility: check_nrpe -2, which should work with older servers. The error stays exactly the same nevertheless.

Changed in nagios-nrpe (Ubuntu):
status: Expired → New
John Smith (random534) wrote :

I've done some more digging.

It's definitely related to the upgrade from v2 to v3. My syslog from the nagios server reports errors such as:
check_nrpe: Error: (!log_opts) Could not complete SSL handshake with xxx.xxx.xxx.xxx: dh key too small

This page describes the compatibility of v3: https://support.nagios.com/kb/article/nrpe-v3-compatibility-with-previous-versions-516.html. It states:

"A 2048-bit DH key is used instead of a 512-bit key"

which is very likely the cause of the issue. The same page provides a workaround:

"Force the plugin to send v2 packets
Using the -2 argument will force the plugin to connect with v2 packets
/usr/local/nagios/libexec/check_nrpe -2 -H centos12"

This workaround doesn't work on 18.04. I also tried with -P 1024 as suggested in some other places, to no avail.
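
For anyone wanting to confirm they are hitting the same failure, here is a minimal sketch that counts matching lines in a log file. The function name count_dh_failures is made up for illustration, and the pattern assumes the message wording quoted above; your log file location may differ.

```shell
# Count check_nrpe handshake failures caused by a too-small DH key.
# Usage (on the Nagios host): count_dh_failures /var/log/syslog
count_dh_failures() {
    # grep -c prints the number of matching lines (0 if none match)
    grep -c 'Could not complete SSL handshake.*dh key too small' "$1"
}
```

A non-zero count suggests the client is rejecting the server's DH parameters rather than failing for some other reason (for example allowed_hosts).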

John Peach (john-launchpad) wrote :

I've just been bitten by this one and it's practically a show-stopper. I've tried all the suggestions I can find without success and the idea of building nrpe server for multiple other platforms does not seem like any kind of solution. I guess I'll need to downgrade check_nrpe.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nagios-nrpe (Ubuntu):
status: New → Confirmed
Simon Déziel (sdeziel) wrote :

It looks like the Bionic TLS client rejects the DH parameter picked by the server (512 bits) as being too small. We can see this at work in the attached pcap, where 172.22.30.2 is the Xenial/TLS server/NRPE server and 172.22.30.66 is the Bionic/TLS client/check_nrpe.

Simon Déziel (sdeziel) wrote :

It seems Bionic's OpenSSL version will always reject the small DH params proposed by the Xenial side, so the only workaround I can think of for now is to disable TLS on both sides.

Andreas Hasenack (ahasenack) wrote :

Thanks for reporting this bug and troubleshooting it this far. It looks like a real issue and we have added it to the server team backlog. Maybe there could be an ssl option to have the bionic client accept the smaller DH key.

Changed in nagios-nrpe (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
Ken Bowley (kbowley) wrote :

I'll add another "me too" on this bug since I had forgotten about the NRPE incompatibility and broke a bunch of nagios checks by upgrading our production nagios system to 18.04.

Jesse Goodier (jessegoodier) wrote :

Does anyone have instructions for disabling TLS for nagios?

Simon Déziel (sdeziel) wrote :

@jessegoodier, you will need to change your Nagios checks from check_nrpe to check_nrpe_nossl and tell the old target NRPE server to run without SSL. This can be done by setting DAEMON_OPTS="-n" in /etc/default/nagios-nrpe-server.
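
A rough sketch of those two changes, assuming the stock Ubuntu plugin path and service name, and using 192.168.2.3 from the original report as a stand-in for the old target. The -n flag tells both nrpe and check_nrpe to skip SSL. Adjust to your own setup.

```shell
# On the old NRPE server (e.g. Xenial): run the daemon without SSL.
# In /etc/default/nagios-nrpe-server set:
#   DAEMON_OPTS="-n"
# then restart the daemon so the option takes effect:
sudo service nagios-nrpe-server restart

# On the Bionic Nagios host: test by hand without SSL before switching
# the checks over to check_nrpe_nossl (192.168.2.3 is an example target).
/usr/lib/nagios/plugins/check_nrpe -n -H 192.168.2.3
```

If the manual test prints the remote NRPE version instead of the handshake error, the checks for that host can be switched to check_nrpe_nossl.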

I would recommend downgrading to the version shipped with Ubuntu 14.04 (check_nrpe 2.15).
I replaced check_nrpe and now I can monitor successfully again. Better bad encryption than no encryption at all...

Paride Legovini (legovini) wrote :

Hello Stefan,

So you replaced the check_nrpe binary *only*, without downgrading the whole package and (what I'm mostly interested in) without downgrading openssl?

@legovini

Yes, that's exactly what I did.
