Keepalived update from 1.2.19 to 1.2.24 breaks support for /dev/tcp health check

Bug #1795420 reported by Nathan Neulinger
This bug affects 1 person
Affects               Status         Importance   Assigned to   Milestone
keepalived (Ubuntu)   Fix Released   High         Unassigned
  Xenial              Won't Fix      High         Unassigned
  Bionic              Fix Released   High         Unassigned
  Cosmic              Won't Fix      High         Unassigned

Bug Description

Previous configuration that works fine:

vrrp_script chk_trigger_port {
    script "</dev/tcp/127.0.0.1/53"
    interval 5
    weight 30
}

This syntax is documented all over and worked properly in 1.2.19, but the update to the 1.2.24 package broke it.

Resulting symptom is that keepalived will continuously fail with a FATAL error and respawn, and will not ever come online fully.

Tracing the service shows it getting a "No such file or directory" error on the /dev/tcp script. I have not looked at the code changes, but I suspect some optimization stopped using bash, or changed how the script is executed, breaking support for that syntax.

I was able to work around with:

    script "/bin/bash -c '</dev/tcp/127.0.0.1/53'"

but that still leaves all of the incorrect documentation.
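
For illustration, here is a minimal sketch of why the explicit bash wrapper helps, assuming bash is installed and built with /dev/tcp support. The helper name tcp_port_open is hypothetical and not part of keepalived:

```shell
#!/bin/sh
# Hypothetical helper (not part of keepalived): succeeds when HOST:PORT
# accepts a TCP connection.
tcp_port_open() {
    # /dev/tcp/HOST/PORT is not a file on disk; bash itself interprets the
    # path when it appears in a redirection. Running the redirection through
    # an explicit "bash -c" therefore does not depend on what /bin/sh is or
    # on how keepalived executes the script string.
    bash -c '< "/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

tcp_port_open 127.0.0.1 53 && echo "port 53 open" || echo "port 53 closed"
```

Anything that tries to open /dev/tcp/127.0.0.1/53 as an ordinary path, rather than handing it to bash, will get "No such file or directory", which matches the traced symptom.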

It looks like this may already have been corrected upstream. Please backport this fix to the Ubuntu 16.04 package:

https://github.com/acassen/keepalived/commit/5cd5fff78de11178c51ca245ff5de61a86b85049

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Thanks for filing this bug in Ubuntu, and providing a link to a patch. This will need some backporting, but it's an excellent start.

Looks like even cosmic is affected, at version 1.3.9.

Changed in keepalived (Ubuntu):
status: New → Triaged
importance: Undecided → High
tags: added: regression-update
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

This wasn't touched for quite a while :-/
I realized it was missing the tag that puts it on the common list we go through regularly. It doesn't seem we are waiting on anything, right Andreas?

It would be Karl's task to look at this, since he checks the HA stack in general.
For now I'm assigning it to him; let's take a look or reconsider later on.

Changed in keepalived (Ubuntu):
assignee: nobody → Karl Stenerud (kstenerud)
tags: added: server-next
Changed in keepalived (Ubuntu):
assignee: Karl Stenerud (kstenerud) → nobody
Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Still in the queue.

Revision history for this message
Bryce Harrington (bryce) wrote :

The aforementioned patch was included in release 1.4.3 (2018-04-09).

 1:1.2.2-3ubuntu1 | precise
 1:1.2.2-3ubuntu1.1 | precise-updates
 1:1.2.7-1ubuntu1 | trusty
 1:1.2.13-1~ubuntu14.04.1 | trusty-backports
 1:1.2.19-1 | xenial
 1:1.2.24-1ubuntu0.16.04.2 | xenial-security
 1:1.2.24-1ubuntu0.16.04.2 | xenial-updates
 1:1.3.9-1build1 | bionic
 1:1.3.9-1ubuntu0.18.04.2 | bionic-security
 1:1.3.9-1ubuntu0.18.04.2 | bionic-updates
 1:1.3.9-1ubuntu1 | cosmic
 1:1.3.9-1ubuntu1.1 | cosmic-security
 1:1.3.9-1ubuntu1.1 | cosmic-updates
 1:2.0.10-1 | disco
 1:2.0.10-1 | eoan

Looks like backports would be needed for cosmic-updates, bionic-updates, and (presumably) xenial-updates.

Changed in keepalived (Ubuntu Xenial):
status: New → Triaged
Changed in keepalived (Ubuntu Bionic):
status: New → Triaged
Changed in keepalived (Ubuntu Cosmic):
status: New → Triaged
importance: Undecided → High
Changed in keepalived (Ubuntu Bionic):
importance: Undecided → High
Changed in keepalived (Ubuntu Xenial):
importance: Undecided → High
Changed in keepalived (Ubuntu):
status: Triaged → Fix Released
summary: - Keepalived update from 1.2.19 to 1.2.24 breaks support for /dev/tcp
- health check
+ [SRU] Keepalived update from 1.2.19 to 1.2.24 breaks support for
+ /dev/tcp health check
Revision history for this message
Bryce Harrington (bryce) wrote : Re: [SRU] Keepalived update from 1.2.19 to 1.2.24 breaks support for /dev/tcp health check

Cosmic is out of service, so I'm canceling that task. The bug is otherwise still valid for an SRU to xenial/bionic if/when someone has time to package it and finish the SRU form. Meanwhile, if anyone is interested in helping push this issue towards a fix, the SRU process is documented at https://wiki.ubuntu.com/StableReleaseUpdates.

Changed in keepalived (Ubuntu Cosmic):
status: Triaged → Invalid
Paride Legovini (paride)
Changed in keepalived (Ubuntu Bionic):
assignee: nobody → Paride Legovini (paride)
Revision history for this message
Paride Legovini (paride) wrote :

Hi, I did some digging here and I think we're dealing with two different issues.

(1) I can reproduce the FATAL error given in the original bug description on a Xenial system, but not on Bionic, where keepalived understands the syntax.

(2) However in Bionic I get the following error when trying to use the 'script </dev/tcp' syntax:

  Script </dev/tcp/127.0.0.1/80 cannot be accessed - No such file or directory

I think *this* is the problem that the upstream commit pointed to in the bug description fixes (see the upstream issue it fixes: https://github.com/acassen/keepalived/issues/817).

As Xenial is now in ESM the fix for the FATAL error won't land there. The fix for the "No such file or directory" error could instead be SRUd on Bionic, but before proceeding I'd like an ACK on my reasoning.

Changed in keepalived (Ubuntu Bionic):
status: Triaged → Incomplete
Changed in keepalived (Ubuntu Cosmic):
status: Invalid → Won't Fix
Paride Legovini (paride)
Changed in keepalived (Ubuntu Xenial):
status: Triaged → Won't Fix
Paride Legovini (paride)
summary: - [SRU] Keepalived update from 1.2.19 to 1.2.24 breaks support for
- /dev/tcp health check
+ Keepalived update from 1.2.19 to 1.2.24 breaks support for /dev/tcp
+ health check
Revision history for this message
Paride Legovini (paride) wrote :

This specific issue (the FATAL error) is fixed on >= Bionic and therefore I'm marking the Bionic task as Fix Released.

However the

  script "</dev/tcp/127.0.0.1/53"

syntax doesn't currently work on any supported version of Ubuntu, but for a different reason: it relies on /bin/sh being Bash. I filed a separate LP bug for this problem with more details:

  https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/1934300

For the moment, my suggestion is to use an alternative way to check whether a TCP port is open, e.g. something like:

    script "bash -c '</dev/tcp/127.0.0.1/80'"
    script "nc -z 127.0.0.1 80"
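
As a rough sketch, the two suggestions above could also be combined in a small wrapper that prefers nc and falls back to bash. The function name check_tcp and the idea of installing this as a standalone script are assumptions for illustration, not something keepalived ships:

```shell
#!/bin/sh
# Hypothetical wrapper script; the name check_tcp is an assumption.
check_tcp() {
    host=$1
    port=$2
    if command -v nc >/dev/null 2>&1; then
        # -z: only test whether the port accepts connections, send no data.
        nc -z "$host" "$port" >/dev/null 2>&1
    else
        # Fall back to bash's built-in /dev/tcp redirection handling.
        bash -c '< "/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
    fi
}

# When called with HOST and PORT, the exit status reports the result:
# 0 means the port is open, non-zero means the check failed.
if [ "$#" -eq 2 ]; then
    check_tcp "$1" "$2"
fi
```

Saved under a path of your choosing and made executable, it could then be referenced from the script line of a vrrp_script block with the host and port as arguments. Neither variant relies on /bin/sh being Bash.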

Changed in keepalived (Ubuntu Bionic):
status: Incomplete → Fix Released
assignee: Paride Legovini (paride) → nobody
tags: removed: server-next