Comment 1 for bug 1750013

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

After some debug, it was observed that the only path in which a session
is freed (other than in user/seat free or manager free) is in garbage
collecting, specifically in the function manager_gc().
It means a closing session should be somehow added to gc in order gc has
a chance to validate if the session is ready to go, and then performs
the clean-up.

To include a session in gc, the function session_add_to_gc_queue() should
be called. This function is currently called from 3 paths:

* from session_stop();
* from manager startup routines;
* and from the empty cgroup notification handler.

Since we are not using systemd as init system in Trusty, the empty cgroup
notifier is not reliable. We rely on cgmanager to perform cgroup
interaction on Trusty, and its release-agent also does not notify when
cgroups are empty. Finally, commit
efdb023 ("core: unified cgroup hierarchy support")
made a refactor in cgroup management and it claims for the first time,
cgroup empty notifications are reliable, meaning until this commit we
certainly couldn't rely on it.

The startup path of manager is executed only in logind start; session_stop()
could execute from Terminate dbus events (like loginctl terminate-session),
and from user/seat stopping routines. Meaning there's no way a session is
added to gc queue except the cgroup empty notification (if it worked) in
this version of systemd.

The simple test-case mentioned in the first comment leaks entire sessions
and logind consumes an increasing amount of memory.
--

In order to prevent that, I wrote a small patch that proactively puts a
session in gc on session_remove_fifo(). This function is always called in
the event of Release a session - when the ssh ends, for example.
This way, we guarantee closing sessions have a chance to be cleared.

But also, it was noticed that kernel may take a small time window to
clear the tasks' cgroup tracking for the session - hence, the gc
might fail in freeing the session, so the patch adds a check on gc too
in order to re-add the session on gc in case session is on closing state.
This way stopping sessions are always freed and no leak is observed.

The patch:

diff --git a/src/login/logind-session.c b/src/login/logind-session.c
index 662273b..1a7389a 100644
--- a/src/login/logind-session.c
+++ b/src/login/logind-session.c
@@ -951,6 +952,7 @@ void session_remove_fifo(Session *s) {
                 free(s->fifo_path);
                 s->fifo_path = NULL;
         }
+ session_add_to_gc_queue(s);
 }

 int session_check_gc(Session *s, bool drop_not_started) {
diff --git a/src/login/logind.c b/src/login/logind.c
index 333a779..0b9e165 100644
--- a/src/login/logind.c
+++ b/src/login/logind.c
@@ -1495,6 +1495,8 @@ void manager_gc(Manager *m, bool drop_not_started) {
                 if (session_check_gc(session, drop_not_started) == 0) {
                         session_stop(session);
                         session_free(session);
+ } else if (session_get_state(session) == SESSION_CLOSING) {
+ session_add_to_gc_queue(session);
                 }
         }

This diverges from upstream since we have refactors in systemd code,
specially in session shutdown logic on commit 5f41d1f
("logind: rework session shutdown logic"). The simple approach of this
patch is less expensive then bring multiple patches to the current code
of Trusty's systemd.

As possible side effects, adding a session to gc has none other than a
possible (non-measured) minimal performance penalty.

Finally, it was observed that "cgmanager" package is necessary to perform
the full clean-up of a session - since we don't handle cgroups with systemd
in Trusty, cgmanager is the responsible one to clear the cgroup directory
for each cleaned-up session.

After testing the solution above (with cgmanager), memory consumption was
drastically reduced, since systemd-logind is not "leaking" entire
sessions anymore.

More small leaks were observed though, so investigation continues.