Memory leaking when running kubernetes cronjobs

Bug #1792349 reported by Daniel McGinnes on 2018-09-13
This bug affects 2 people
Affects               Importance  Assigned to
linux (Ubuntu)        High        Unassigned
  Bionic              High        Unassigned
  Cosmic              High        Unassigned
linux-azure (Ubuntu)  High        Unassigned

Bug Description

We are using Kubernetes v1.8.15 with Docker 18.03.1-ce.
We schedule 50 Kubernetes cronjobs to run every 5 minutes. Each cronjob will create a simple busybox container, echo hello world, then terminate.
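
As a rough sketch of the workload described above, the following Python generates 50 CronJob manifests as JSON. The job names, the `batch/v1beta1` API version (assumed here to be the CronJob API version for Kubernetes 1.8), the `busybox` image reference and the `restartPolicy` are assumptions for illustration, not the manifests actually used in this test.

```python
import json


def make_cronjob(index: int) -> dict:
    """Build one CronJob manifest; names and image are illustrative assumptions."""
    return {
        "apiVersion": "batch/v1beta1",  # assumed CronJob API version for Kubernetes 1.8
        "kind": "CronJob",
        "metadata": {"name": f"hello-{index}"},  # hypothetical name
        "spec": {
            "schedule": "*/5 * * * *",  # every 5 minutes
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "containers": [
                                {
                                    "name": "hello",
                                    "image": "busybox",
                                    "command": ["echo", "hello world"],
                                }
                            ],
                            "restartPolicy": "OnFailure",
                        }
                    }
                }
            },
        },
    }


def make_cronjob_list(count: int = 50) -> dict:
    """Wrap the manifests in a v1 List so they can be applied in one call."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [make_cronjob(i) for i in range(count)],
    }


if __name__ == "__main__":
    print(json.dumps(make_cronjob_list(), indent=2))
```

Saved under a hypothetical name such as gen_cronjobs.py, the output could be piped to `kubectl apply -f -`.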

In the data attached to the bug I let this run for 1 hour, and in this time the available memory (MemAvailable) dropped from 31256704 kB to 30461224 kB, a loss of 795480 kB (about 776 MB). In previous, longer runs we observed that the available memory continues to drop.
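
The loss figure can be checked with a few lines of Python. This is only a sketch: the helper name `mem_available_kb` is made up for illustration, and it assumes the figures come from the MemAvailable line of the attached meminfo snapshots.

```python
def mem_available_kb(meminfo_text: str) -> int:
    """Return the MemAvailable value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


# The two values quoted in the report, in kB.
before_kb = 31256704
after_kb = 30461224

lost_kb = before_kb - after_kb
print(f"lost {lost_kb} kB ({lost_kb / 1024:.1f} MiB)")
# prints: lost 795480 kB (776.8 MiB)
```

To compare two of the attached snapshots, `mem_available_kb()` could be called on the contents of, for example, meminfo.600 and meminfo.4200.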

There don't appear to be any processes left behind, or any growth in other processes, to explain where the memory has gone.

echo 3 > /proc/sys/vm/drop_caches causes some of the memory to be returned, but the majority remains leaked, and the only way to free it appears to be to reboot the system.

We are currently running Ubuntu 4.15.0-32.35-generic 4.15.18, and have previously observed similar issues on Ubuntu 16.04 with kernel 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 and on Debian 9.4 running 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02).

The leak was more severe on the Debian system, and investigations there showed leaks in pcpu_get_vm_areas that were related to memory cgroups. Running kernel 4.17 on Debian showed a leak at a similar rate to what we now observe on Ubuntu 18.04. This leak causes us issues because we need to run the cronjobs regularly and want the systems to remain up for months.
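
One way to look for growth attributed to pcpu_get_vm_areas is to total the sizes in /proc/vmallocinfo by caller and compare snapshots. The sketch below assumes the usual line layout (address range, size in bytes, then caller symbol) and the function name is hypothetical; lines that do not match that layout are skipped.

```python
from collections import Counter


def vmalloc_bytes_by_caller(vmallocinfo_text: str) -> Counter:
    """Sum allocation sizes per caller from /proc/vmallocinfo-style text.

    Assumes each line looks like:
      <start>-<end> <size in bytes> <caller>+<offset>/<length> ...
    """
    totals = Counter()
    for line in vmallocinfo_text.splitlines():
        fields = line.split()
        # Skip lines without a numeric size and a caller field.
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        caller = fields[2].split("+")[0]  # drop the +offset/length suffix
        totals[caller] += int(fields[1])
    return totals


def caller_growth(before_text: str, after_text: str, caller: str) -> int:
    """Bytes gained by one caller between two snapshots."""
    before = vmalloc_bytes_by_caller(before_text)
    after = vmalloc_bytes_by_caller(after_text)
    return after[caller] - before[caller]
```

For example, `caller_growth(open("vmallocinfo.600").read(), open("vmallocinfo.4200").read(), "pcpu_get_vm_areas")` would report the growth between two of the attached snapshots, assuming they are in the current directory.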

Kubernetes will create a new cgroup each time the cronjob runs, but these are removed when the job completes (which takes a few seconds). If I use systemd-cgtop I don't see any increase in cgroups over time - but if I monitor /proc/cgroups over time I can see num_cgroups for memory increases.
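
The num_cgroups value can be pulled out of /proc/cgroups with a small helper like the one below. This is a sketch that assumes the standard four-column layout (subsys_name, hierarchy, num_cgroups, enabled); the function names are made up for illustration.

```python
def num_cgroups(proc_cgroups_text: str, subsystem: str = "memory") -> int:
    """Return the num_cgroups column for one subsystem from /proc/cgroups text."""
    for line in proc_cgroups_text.splitlines():
        if line.startswith("#"):
            continue  # header line: #subsys_name hierarchy num_cgroups enabled
        fields = line.split()
        if len(fields) >= 3 and fields[0] == subsystem:
            return int(fields[2])
    raise KeyError(subsystem)


def read_num_cgroups(path: str = "/proc/cgroups", subsystem: str = "memory") -> int:
    """Read the file once and return the current count for the subsystem."""
    with open(path) as f:
        return num_cgroups(f.read(), subsystem)
```

Calling `read_num_cgroups()` every few minutes while the cronjobs run and logging the result shows whether the memory count keeps climbing even though systemd-cgtop shows no growth.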

For the duration of the test I collected slabinfo, meminfo, vmallocinfo & cgroups - which I will attach to the bug. Each file is suffixed with the number of seconds since the start.
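
A collection of this kind could be scripted roughly as follows. This is a sketch, not the script used for the attached data: the function name and output directory are hypothetical, and reading /proc/slabinfo normally requires root.

```python
from pathlib import Path

# The four files named in the report.
DEFAULT_SOURCES = (
    "/proc/slabinfo",
    "/proc/meminfo",
    "/proc/vmallocinfo",
    "/proc/cgroups",
)


def snapshot(out_dir, elapsed_seconds, sources=DEFAULT_SOURCES):
    """Copy each source file to <out_dir>/<name>.<elapsed_seconds>."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sources:
        src_path = Path(src)
        dest = out / f"{src_path.name}.{elapsed_seconds}"
        # Read and rewrite the text rather than copying by size,
        # since /proc files report a size of 0.
        dest.write_text(src_path.read_text())
        written.append(dest)
    return written
```

Calling `snapshot("leak-data", 600)` would produce files such as leak-data/meminfo.600, matching the suffix convention described above.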

The *.0 and *.600 files were taken before the test was started. The test was stopped shortly after the *.4200 files were generated, and I then left the system idle for 10 minutes. After *.4800 was generated I ran echo 3 > /proc/sys/vm/drop_caches. This seemed to free ~240MB, but that still leaves ~500MB lost. I then left the system idle for a further 20 minutes, and MemAvailable didn't seem to be increasing significantly.

Note, the data attached is from running on kernel 4.18.7-041807-generic #201809090930 SMP Sun Sep 9 09:33:16 UTC 2018 (which I ran to verify the issue still exists in latest kernel) - however I was unable to run ubuntu-bug linux on this kernel as it complained about:
*** Problem in linux-image-4.18.7-041807-generic

The problem cannot be reported:

This report is about a package that is not installed.

So I switched back to 4.15.0-32.35-generic to raise the bug.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-32-generic 4.15.0-32.35
ProcVersionSignature: Ubuntu 4.15.0-32.35-generic 4.15.18
Uname: Linux 4.15.0-32-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Sep 13 08:55 seq
 crw-rw---- 1 root audio 116, 33 Sep 13 08:55 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
Date: Thu Sep 13 08:55:46 2018
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Xen HVM domU
PciMultimedia:

ProcEnviron:
 LANG=C.UTF-8
 SHELL=/bin/bash
 TERM=xterm
 PATH=(custom, no user)
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-32-generic root=UUID=6a84f0e4-8522-41cd-8ecb-d4a6fbecef8a ro earlyprintk
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-32-generic N/A
 linux-backports-modules-4.15.0-32-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
WifiSyslog:

dmi.bios.date: 08/13/2018
dmi.bios.vendor: Xen
dmi.bios.version: 4.7.5-1.21
dmi.chassis.type: 1
dmi.chassis.vendor: Xen
dmi.modalias: dmi:bvnXen:bvr4.7.5-1.21:bd08/13/2018:svnXen:pnHVMdomU:pvr4.7.5-1.21:cvnXen:ct1:cvr:
dmi.product.name: HVM domU
dmi.product.version: 4.7.5-1.21
dmi.sys.vendor: Xen

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: xenial
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.19 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.19-rc3

tags: added: kernel-da-key
Daniel McGinnes (djmcginnes) wrote :

I re-ran on 4.19.0-041900rc3-generic #201809120832 SMP Wed Sep 12 12:35:08 UTC 2018 and am still seeing the leak.

tags: added: kernel-bug-exists-upstream
Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report[0]? That will allow the upstream Developers to examine the issue, and may provide a quicker resolution to the bug.

Please follow the instructions on the wiki page[0]. The first step is to email the appropriate mailing list. If no response is received, then a bug may be opened on bugzilla.kernel.org.

Once this bug is reported upstream, please add the tag: 'kernel-bug-reported-upstream'.

[0] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
status: Incomplete → Triaged
tags: added: kernel-bug-reported-upstream
Daniel McGinnes (djmcginnes) wrote :

Bug reported - link to email is here ->
https://www.spinics.net/lists/cgroups/msg20593.html

I got a pretty positive response:

Thank you for the very detailed and full report!
We've experienced the same (or very similar problem), when memory cgroups
were staying in the dying state for a long time, so that the number of
dying cgroups grew steadily with time.

I've investigated the issue and found several problems in the memory
reclaim and accounting code.

The following commits from the next tree are solving the problem in our case:

010cb21d4ede math64: prevent double calculation of DIV64_U64_ROUND_UP() arguments
f77d7a05670d mm: don't miss the last page because of round-off error
d18bf0af683e mm: drain memcg stocks on css offlining
71cd51b2e1ca mm: rework memcg kernel stack accounting
f3a2fccbce15 mm: slowly shrink slabs with a relatively small number of objects

Changed in linux (Ubuntu Bionic):
status: New → Triaged
importance: Undecided → Medium
Changed in linux (Ubuntu):
importance: Medium → High
Changed in linux (Ubuntu Bionic):
importance: Medium → High
tags: added: kernel-key
removed: kernel-da-key
tags: added: kernel-da-key
removed: kernel-key
Daniel McGinnes (djmcginnes) wrote :

Hi, as per this update -> https://www.spinics.net/lists/cgroups/msg20660.html

I have a set of patches on top of kernel 4.19-rc3 that appear to resolve the issue. What is the process for getting these backported to a 4.15 kernel build for Ubuntu 18.04?

The list of patches is:

https://lkml.org/lkml/2018/10/7/84

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce7ea4af0838ffd4667ecad4eb5eec7a25342f1e

https://marc.info/?l=linux-netdev&m=153900037804969

010cb21d4ede math64: prevent double calculation of DIV64_U64_ROUND_UP() arguments
f77d7a05670d mm: don't miss the last page because of round-off error
d18bf0af683e mm: drain memcg stocks on css offlining
71cd51b2e1ca mm: rework memcg kernel stack accounting
f3a2fccbce15 mm: slowly shrink slabs with a relatively small number of objects

Daniel McGinnes (djmcginnes) wrote :

Hi, is there any update on what needs to happen to get these patches backported to a 4.15 kernel build for Ubuntu 18.04?

Joseph Salisbury (jsalisbury) wrote :

Because there are several patches and they affect the memory management layer, it might be best to submit them to the Ubuntu Kernel Team mailing list for feedback.

<email address hidden>

Joshua R. Poulson (jrp) wrote :

This issue also exists on the Linux-azure kernel series.

Dexuan Cui (decui) wrote :

More patches are required: https://lkml.org/lkml/2018/11/2/182
It looks like we'll have to wait for some time before the kernel stabilizes...

Changed in linux-azure (Ubuntu):
status: New → Triaged
Changed in linux-azure (Ubuntu Bionic):
status: New → Triaged
Changed in linux-azure (Ubuntu Cosmic):
status: New → Triaged
no longer affects: linux-azure (Ubuntu Cosmic)
no longer affects: linux-azure (Ubuntu Bionic)
Changed in linux-azure (Ubuntu):
importance: Undecided → High