rsyslog-relp: imrelp module leaves sockets in CLOSE_WAIT state which leads to file descriptor leak
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
librelp (Ubuntu) | Fix Released | Medium | Matthew Ruffell |
Focal | Fix Released | Medium | Matthew Ruffell |
Groovy | Fix Released | Medium | Matthew Ruffell |
Hirsute | Fix Released | Medium | Matthew Ruffell |
rsyslog (Ubuntu) | Fix Released | Medium | Matthew Ruffell |
Focal | Won't Fix | Medium | Matthew Ruffell |
Groovy | Fix Released | Medium | Matthew Ruffell |
Hirsute | Fix Released | Medium | Matthew Ruffell |
Bug Description
[Impact]
In recent versions of rsyslog and librelp, the imrelp module leaks file descriptors because it does not close sockets correctly and instead leaves them in the CLOSE_WAIT state.
This causes rsyslogd on busy servers to eventually hit the limit of maximum open files allowed, which locks rsyslogd up until it is restarted.
A workaround is to restart rsyslogd every month or so to manually close all of the open sockets.
Only users of the imrelp module are affected, and not rsyslog users in general.
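Rather than restarting on a fixed schedule, it can help to watch how close rsyslogd is to its open-file limit. The following is a minimal sketch, assuming a Linux /proc filesystem; fd_usage and its optional second argument are hypothetical names used for illustration and are not part of rsyslog.

```shell
# fd_usage PID [PROC_ROOT]: print "<open fds> <soft limit>" for a process.
# Hypothetical helper; PROC_ROOT defaults to /proc and exists only so the
# function can be pointed at a different directory tree.
fd_usage() {
    pid=$1
    proc=${2:-/proc}
    # Each entry under <proc>/<pid>/fd is one open file descriptor.
    open=$(ls "$proc/$pid/fd" | wc -l | tr -d ' ')
    # In the "Max open files" row, the 4th field is the soft limit.
    limit=$(awk '/^Max open files/ {print $4}' "$proc/$pid/limits")
    echo "$open $limit"
}
```

Run it as root against the rsyslogd PID. If the first number keeps climbing toward the second, the leak is in effect and a restart will soon be needed.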
[Testcase]
Install the rsyslog-relp module like so:
$ sudo apt install rsyslog rsyslog-relp
Next, create a working directory, and make a config file that loads the relp module.
$ sudo mkdir /workdir
$ cat << EOF >> ./spool.conf
\$LocalHostName spool
\$AbortOnUnclea
\$PreserveFQDN on
global(
workDirecto
maxMessageS
)
main_queue(
module(
input(
type="imrelp"
name="imrelp"
port="601"
ruleset="spool"
MaxDataSize
)
ruleset(
}
# Just so rsyslog doesn't whine that we do not have outputs
ruleset(name="noop" queue.type=
action(
)
}
EOF
Verify that the config is valid, then start an rsyslog server.
$ sudo rsyslogd -f ./spool.conf -N9
$ sudo rsyslogd -f ./spool.conf -i /workdir/
Fetch the rsyslogd PID and check for open files.
$ RLOGPID=$(cat /workdir/
$ sudo ls -l /proc/$RLOGPID/fd
total 0
lr-x------ 1 root root 64 Dec 17 01:22 0 -> /dev/urandom
lrwx------ 1 root root 64 Dec 17 01:22 1 -> 'socket:[41228]'
lrwx------ 1 root root 64 Dec 17 01:22 3 -> 'socket:[41222]'
lrwx------ 1 root root 64 Dec 17 01:22 4 -> 'socket:[41223]'
lrwx------ 1 root root 64 Dec 17 01:22 7 -> 'anon_inode:
We have 3 sockets open by default. Next, use netcat to open 100 connections:
$ for i in {1..100} ; do nc -z 127.0.0.1 601 ; done
Now check for open file descriptors, and there will be an extra 100 sockets in the list:
$ sudo ls -l /proc/$RLOGPID/fd
https:/
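Comparing the two listings by eye gets tedious, so the socket descriptors can be counted instead. This is a minimal sketch, assuming a Linux /proc filesystem where socket descriptors are symlinks of the form socket:[inode], as in the listing earlier; count_socket_fds and its optional second argument are hypothetical names.

```shell
# count_socket_fds PID [PROC_ROOT]: count descriptors that refer to sockets.
# Hypothetical helper; PROC_ROOT defaults to /proc.
count_socket_fds() {
    pid=$1
    proc=${2:-/proc}
    n=0
    for fd in "$proc/$pid/fd"/*; do
        # Socket descriptors are symlinks whose target looks like "socket:[inode]".
        case $(readlink "$fd" 2>/dev/null) in
            socket:*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}
```

Run as root with the rsyslogd PID before and after the netcat loop; on an affected system the count should grow by the number of connections opened.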
We can check the state of these sockets with:
$ ss -t
https:/
The rsyslogd (server-side) sockets will be in CLOSE-WAIT, and the netcat sockets will be in FIN-WAIT-2.
$ ss -t | grep CLOSE-WAIT | wc -l
100
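A plain grep counts any line that mentions CLOSE-WAIT, whichever port it belongs to. As a minimal sketch, the count can be narrowed to one state and one local port by filtering on columns. It assumes the column layout printed by ss -tn (state in the first column, local address:port in the fourth); count_state is a hypothetical helper name.

```shell
# count_state STATE [PORT]: read `ss -tn` output on stdin and count the
# sockets in STATE, optionally only those whose local port is PORT.
# Hypothetical helper, not part of ss or rsyslog.
count_state() {
    awk -v state="$1" -v port="$2" '
        $1 == state && (port == "" || $4 ~ (":" port "$")) { n++ }
        END { print n + 0 }'
}

# Usage (numeric ports via -n, so 601 is not shown as a service name):
#   ss -tn | count_state CLOSE-WAIT 601
```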
A test package is available in the following PPA:
https:/
With the test package installed, connections opened with netcat are closed properly, and the file descriptor leak no longer occurs.
[Where problems could occur]
If a regression were to occur, it would be limited to users of the imrelp module, which is part of the rsyslog-relp package and depends on librelp.
rsyslog-relp is not part of a default installation of rsyslog; it is opt-in, and imrelp must be enabled by changing a configuration file.
The changes to rsyslog implement a testcase which exercises the problematic code to ensure things are working as expected; this
can be enabled manually on build, and has been verified to pass (#7).
[Other]
Upstream bug list:
https:/
https:/
https:/
https:/
The following commits fix the problem:
rsyslogd
========
commit baee0bd54206493
Author: Andre lorbach <email address hidden>
Date: Thu Apr 9 13:00:35 2020 +0200
Subject: testbench: Add test for imrelp to check broken session handling.
Link: https:/
librelp
=======
commit 7907c9c57f6ed94
Author: Andre lorbach <email address hidden>
Date: Mon May 11 14:59:55 2020 +0200
Subject: fix memory leak on session break.
Link: https:/
commit 4a6ad8637c244fd
Author: Andre lorbach <email address hidden>
Date: Wed Apr 8 15:55:32 2020 +0200
Subject: replsess: fix double free of sendbuf in some cases.
Link: https:/
commit 3797944fb62273f
Author: Ognyan Kulev <email address hidden>
Date: Mon Jun 15 14:10:08 2020 +0300
Subject: Fix FD leak when socket shutdown is one-sided
Link: https:/
Changed in librelp (Ubuntu Focal):
status: New → In Progress
Changed in librelp (Ubuntu Groovy):
status: New → In Progress
Changed in librelp (Ubuntu Hirsute):
status: New → In Progress
Changed in rsyslog (Ubuntu Focal):
status: New → In Progress
Changed in rsyslog (Ubuntu Groovy):
status: New → In Progress
Changed in rsyslog (Ubuntu Hirsute):
status: New → In Progress
Changed in librelp (Ubuntu Focal):
importance: Undecided → Medium
Changed in librelp (Ubuntu Groovy):
importance: Undecided → Medium
Changed in librelp (Ubuntu Hirsute):
importance: Undecided → Medium
Changed in rsyslog (Ubuntu Focal):
importance: Undecided → Medium
Changed in rsyslog (Ubuntu Groovy):
importance: Undecided → Medium
Changed in rsyslog (Ubuntu Hirsute):
importance: Undecided → Medium
Changed in librelp (Ubuntu Focal):
assignee: nobody → Matthew Ruffell (mruffell)
Changed in librelp (Ubuntu Groovy):
assignee: nobody → Matthew Ruffell (mruffell)
Changed in librelp (Ubuntu Hirsute):
assignee: nobody → Matthew Ruffell (mruffell)
Changed in rsyslog (Ubuntu Focal):
assignee: nobody → Matthew Ruffell (mruffell)
Changed in rsyslog (Ubuntu Groovy):
assignee: nobody → Matthew Ruffell (mruffell)
Changed in rsyslog (Ubuntu Hirsute):
assignee: nobody → Matthew Ruffell (mruffell)
tags: added: sts
description: updated
Changed in rsyslog (Ubuntu Hirsute):
status: In Progress → Fix Released
Changed in rsyslog (Ubuntu Groovy):
status: In Progress → Fix Released
Changed in rsyslog (Ubuntu Focal):
status: In Progress → Won't Fix
description: updated
tags: added: sts-sponsor
tags: added: sts-sponsor-mfo
tags: added: verification-done removed: verification-needed
Only librelp needs to change to fix the problem, so we won't SRU the testcase to Focal's rsyslog: there is no need to risk a regression for a testcase, which would effectively be a no-change rebuild.
The testcase is already present in rsyslog on Groovy and Hirsute, so I am marking those as Fix Released.
I found an extra commit we need for librelp, which is "Fix FD leak when socket shutdown is one-sided". This is needed for the netcat testcase and for basic heartbeat programs such as load balancers, which open a connection to the rsyslog RELP port but don't necessarily speak the RELP protocol over it.
Building test packages now, since my internal tests pass once I include "Fix FD leak when socket shutdown is one-sided".