Memory leak (/run file system filling up)

Bug #1687507 reported by Martin Winter
This bug affects 1 person
Affects           Status    Importance   Assigned to   Milestone
snapd             Expired   Undecided    Unassigned
systemd (Ubuntu)  Expired   Undecided    Unassigned

Bug Description

(see also related discussion on forum:
https://forum.snapcraft.io/t/memory-run-memory-file-system-leaking-with-snap-install-remove/429 )

I have a CI system which tests my snap and does a lot of installing and removing of snap packages as part
of its operation on some small virtual systems. My /run is 200MB.

This is on the latest Ubuntu Server 16.04 (all packages updated), with the alternative 4.8 kernel
from the Ubuntu repo.

At this time, all packages are updated to the latest version. I have had this bug for a long time (since I started building up a CI infrastructure for the snap, approx. 6 months ago), so this is
not a new bug; only the more frequent snap testing made it obvious.

On my system the /run filesystem fills up: within approx. 2 to 7 days I'm out of space on it.
All the space is used up under /run/udev/data/, and approx. 95% of the files (around 45,000)
start with the +cgroup prefix.
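
To check how the entries under /run/udev/data/ break down by prefix, something like the following Python sketch could be used. It is only an illustration: the function names are hypothetical, and it assumes that entries beginning with "+" are named "+<subsystem>:<name>" (as with the +cgroup files described above), while all other entries are grouped by their first character.

```python
import os
from collections import Counter


def prefix_of(name):
    """Return a grouping prefix for one file name from the udev data directory.

    Assumption: names starting with "+" look like "+<subsystem>:<rest>",
    so the prefix is everything before the first ":". Any other name is
    grouped by its first character only.
    """
    if name.startswith("+"):
        return name.split(":", 1)[0]
    return name[:1]


def count_prefixes(names):
    """Count how many of the given names fall under each prefix."""
    return Counter(prefix_of(name) for name in names)


def count_directory(directory="/run/udev/data"):
    """Count the entries of a directory by prefix (reads the directory listing only)."""
    return count_prefixes(os.listdir(directory))
```

Calling count_directory() on an affected machine and printing the result of .most_common() would show whether +cgroup entries really dominate the listing.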

I can provide access to a VM in this state if requested (IPv6 only, contact me with SSH key)

Here is some current output (not yet out of space... may need another day)

root@ci-comp17-dut:~# df
Filesystem     1K-blocks    Used Available Use% Mounted on
udev             1002892       0   1002892   0% /dev
tmpfs             204796  148196     56600  73% /run
/dev/vda1        6060608 3462628   2267076  61% /
tmpfs            1023976       0   1023976   0% /dev/shm
tmpfs               5120       0      5120   0% /run/lock
tmpfs            1023976       0   1023976   0% /sys/fs/cgroup
/dev/loop0         80256   80256         0 100% /snap/core/1577
/dev/loop1         77056   77056         0 100% /snap/core/1337
/dev/loop2         80256   80256         0 100% /snap/core/1441
tmpfs             204796       0    204796   0% /run/user/0
/dev/loop3         14592   14592         0 100% /snap/frr/x1
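
As a cross-check of the Use% column, the percentage can be recomputed from the Used and Available columns. The sketch below assumes that df reports used / (used + available) rounded up to a whole percent; that assumption matches the /run and / rows in the output above. The function name is hypothetical.

```python
def use_percent(used_kb, available_kb):
    """Recompute a df-style Use% value from the Used and Available columns.

    Assumption: the percentage is used / (used + available), rounded up
    to the next whole percent.
    """
    total = used_kb + available_kb
    if total == 0:
        return 0
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-100 * used_kb // total)


# Values taken from the /run row of the df output.
run_percent = use_percent(148196, 56600)
```

With the /run figures this gives 73, so roughly 55MB of the 200MB tmpfs was still free when the output was captured.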

Attached is a full dir output of /run/udev/data (ls_run_udev_data.log)

Martin Winter (mwinter-osr) wrote :
Oliver Grawert (ogra) wrote :

looking at the listing as well as the fact that the syslog excerpt on the forum is full of:

Apr 28 09:36:48 ci-comp11-dut systemd[1]: Started Session 1816 of user root.

this is either a systemd bug or a bug with the way systemd is used in the CI ... moving it to systemd to have a systemd maintainer take a look.
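
To put a number on how many of these session lines a syslog excerpt contains, a small Python sketch like the one below could be used. The regular expression is an assumption based only on the single example line quoted above, and the function name is hypothetical.

```python
import re

# Pattern derived from the one example line quoted in this comment; other
# systemd versions or locales may format the message differently.
SESSION_RE = re.compile(r"systemd\[\d+\]: Started Session (\d+) of user (\S+)\.")


def count_started_sessions(lines):
    """Return a dict mapping user name to the number of 'Started Session' lines."""
    counts = {}
    for line in lines:
        match = SESSION_RE.search(line)
        if match:
            user = match.group(2)
            counts[user] = counts.get(user, 0) + 1
    return counts
```

Feeding it the lines of a syslog file (for example via open(path) on a copy of the log) gives a per-user count that can be compared against the number of CI runs in the same period.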

Changed in snapd:
status: New → Invalid
summary: - Memory leak (/tmp file system filling up)
+ Memory leak (/run file system filling up)
Oliver Grawert (ogra) wrote :

expanding on: "looking at the listing"

i meant to say, there are only very few related snap bits in there, many are simply from the OS itself including apt updates and the like that are completely unrelated to snaps.

Dimitri John Ledkov (xnox) wrote :

Are these leaks from snaps onto the host system, generated by mounting/unmounting of snaps? Does snapd ensure that all sessions started for each snap are closed/stopped?

If normal snapd operations result in pam_logind creating sessions which are never ended, and these sessions leak onto the host system, this certainly may result in data leaks.

Is this an Ubuntu core or classic system?

Dimitri John Ledkov (xnox) wrote :

Please provide steps to reproduce the issue.

Changed in snapd:
status: Invalid → Incomplete
Changed in systemd (Ubuntu):
status: New → Incomplete
Martin Winter (mwinter-osr) wrote :

Sorry for the slow response - I'm still trying to figure out the steps to reproduce this reliably.
It does happen when I use my snap in the CI system (installing, testing, removing).

It happens reliably with my CI system testing my snap (frr) when I run it against a commercial protocol compliance suite. The challenge is to figure out which specific part of the executed commands causes the problem.

This is seen on Ubuntu 16.04 server (classic). Normally testing with the alternative 4.8.0-49
kernel (every other package updated to latest). I'm running the alternative 4.8 kernel as I need something >= 4.5 for my MPLS networking tests.

At this time it does look like the issue is related to the 4.8 kernel as I'm currently not able
to reproduce the same issue with the standard 4.4 kernel (testing 4.4.0-77)

I'll update again in a few days when I've done enough tests to be certain that it depends on the kernel.

Launchpad Janitor (janitor) wrote :

[Expired for snapd because there has been no activity for 60 days.]

Changed in snapd:
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for systemd (Ubuntu) because there has been no activity for 60 days.]

Changed in systemd (Ubuntu):
status: Incomplete → Expired