systemd-logind: memory leaks on session's connections (trusty-only)
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
systemd (Ubuntu) |
Fix Released
|
Medium
|
Guilherme G. Piccoli | ||
Trusty |
Fix Released
|
Medium
|
Guilherme G. Piccoli | ||
Xenial |
Fix Released
|
Medium
|
Guilherme G. Piccoli | ||
Artful |
Fix Released
|
Medium
|
Guilherme G. Piccoli | ||
Bionic |
Fix Released
|
Medium
|
Guilherme G. Piccoli |
Bug Description
Below the SRU request form. Please refer to the Original Description to a more comprehensive explanation of the problem observed.
[Impact]
* systemd-logind tool is leaking memory at each session connected. The
issues happens in systemd from Trusty (14.04) only.
* Three issues observed:
- systemd-logind is leaking entire sessions, i.e, the sessions are not
feeed after they're closed. In order to fix that, we proactively add
the sessions to systemd garbage collector (gc) when they are closed.
Also, part of the fix is to make cgmanager package a dependency. Refer
to comment #1 to a more thorough explanation of the issue and the fix.
- a small memory leak was observed in the session creation logic of
systemd-logind. The fix for that is the addition of an appropriate
free() call. Refer to comment #2 to more details on the issue and fix.
- another small memory leak was observed in the cgmanager glue code of
systemd-logind - this code is only present in this specific Ubuntu
release of the package, due to necessary compatibility layer with
upstart init system. The fix is to properly call free() in 2
functions. Refer to comment #3 to a deep exposition of the issue and
the fix.
[Test Case]
* The basic test-case is to run the following loop from a remote machine:
while true; do ssh <hostname-target> "whoami"; done
* It's possible to watch the increase in memory consumption from
"systemd-logind" process in the target machine. One can use the
"ps uax" command to verify the RSS of the process, or count its
anonymous pages from /proc/<
[Regression Potential]
* Since the fixes are small and not intrusive, the potential for
regressions are low. More regression considerations on comments #1, #2
and #3 for each fix.
* A potential small regressson is performance-wise, since now we add
sessions to garbage collector proactively.
Changed in systemd (Ubuntu): | |
importance: | Undecided → Medium |
status: | New → In Progress |
Changed in systemd (Ubuntu Trusty): | |
assignee: | nobody → Guilherme G. Piccoli (gpiccoli) |
importance: | Undecided → Medium |
status: | New → In Progress |
Changed in systemd (Ubuntu Xenial): | |
assignee: | nobody → Guilherme G. Piccoli (gpiccoli) |
importance: | Undecided → Medium |
status: | New → Fix Released |
Changed in systemd (Ubuntu Bionic): | |
status: | In Progress → Fix Released |
Changed in systemd (Ubuntu Artful): | |
assignee: | nobody → Guilherme G. Piccoli (gpiccoli) |
importance: | Undecided → Medium |
status: | New → Fix Released |
description: | updated |
tags: | added: sts-sru-needed |
After some debug, it was observed that the only path in which a session
is freed (other than in user/seat free or manager free) is in garbage
collecting, specifically in the function manager_gc().
It means a closing session should be somehow added to gc in order gc has
a chance to validate if the session is ready to go, and then performs
the clean-up.
To include a session in gc, the function session_ add_to_ gc_queue( ) should
be called. This function is currently called from 3 paths:
* from session_stop();
* from manager startup routines;
* and from the empty cgroup notification handler.
Since we are not using systemd as init system in Trusty, the empty cgroup
notifier is not reliable. We rely on cgmanager to perform cgroup
interaction on Trusty, and its release-agent also does not notify when
cgroups are empty. Finally, commit
efdb023 ("core: unified cgroup hierarchy support")
made a refactor in cgroup management and it claims for the first time,
cgroup empty notifications are reliable, meaning until this commit we
certainly couldn't rely on it.
The startup path of manager is executed only in logind start; session_stop()
could execute from Terminate dbus events (like loginctl terminate-session),
and from user/seat stopping routines. Meaning there's no way a session is
added to gc queue except the cgroup empty notification (if it worked) in
this version of systemd.
The simple test-case mentioned in the first comment leaks entire sessions
and logind consumes an increasing amount of memory.
--
In order to prevent that, I wrote a small patch that proactively puts a remove_ fifo(). This function is always called in
session in gc on session_
the event of Release a session - when the ssh ends, for example.
This way, we guarantee closing sessions have a chance to be cleared.
But also, it was noticed that kernel may take a small time window to
clear the tasks' cgroup tracking for the session - hence, the gc
might fail in freeing the session, so the patch adds a check on gc too
in order to re-add the session on gc in case session is on closing state.
This way stopping sessions are always freed and no leak is observed.
The patch:
diff --git a/src/login/ logind- session. c b/src/login/ logind- session. c logind- session. c logind- session. c remove_ fifo(Session *s) {
free( s->fifo_ path);
s->fifo_ path = NULL; add_to_ gc_queue( s);
index 662273b..1a7389a 100644
--- a/src/login/
+++ b/src/login/
@@ -951,6 +952,7 @@ void session_
}
+ session_
}
int session_ check_gc( Session *s, bool drop_not_started) { logind. c b/src/login/ logind. c logind. c logind. c check_gc( session, drop_not_started) == 0) {
session_ stop(session) ;
session_ free(session) ; get_state( session) == SESSION_CLOSING) { add_to_ gc_queue( session) ;
diff --git a/src/login/
index 333a779..0b9e165 100644
--- a/src/login/
+++ b/src/login/
@@ -1495,6 +1495,8 @@ void manager_gc(Manager *m, bool drop_not_started) {
if (session_
+ } else if (session_
+ session_
}
}
This diverges from upstream since we have refactors in systemd code,
specially...