Memory resource controller oom killing not functioning

Bug #1303683 reported by Glyn Normington
This bug affects 2 people
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Invalid  High        Unassigned

Bug Description

This problem reproduces on Ubuntu 13.10 with a 3.11 kernel but, for comparison, works ok on RHEL 7 with a 3.10 kernel.

Steps to reproduce:

1. Make a directory, for example /home/glyn/cgh1.
2. Switch to root user.
3. Attach the memory resource controller subsystem to a cgroup hierarchy:

# mount -t cgroup -o memory none /home/glyn/cgh1

4. cd /home/glyn/cgh1

5. Create a child cgroup:

# mkdir example

6. cd example

7. Check oom killing is enabled:

# cat memory.oom_control
oom_kill_disable 0
under_oom 0

8. Set a memory limit for the example cgroup of approx. 1 MB:

# echo 1000000 > memory.limit_in_bytes
# cat memory.limit_in_bytes
1003520

9. Move the current process into the example cgroup and check it has moved:

# echo $$ > tasks
# cat tasks
7357
8449
# cat tasks
7357
8450

10. Run a process which will exceed 1 MB of memory:

# perl -e 'for ($i = 0; $i < 10; $i++) { $foo .= "A" x (1024 * 1024); }'
#

This terminates successfully whereas it should have been killed. On RHEL 7 (and on earlier Ubuntu versions I have tried), the results look like this:

# perl -e 'for ($i = 0; $i < 10; $i++) { $foo .= "A" x (1024 * 1024); }'
killed
#
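
As an aside on step 8: the value read back (1003520) differs from the value written (1000000). This is consistent with the kernel rounding the limit up to a whole number of pages, assuming a 4096-byte page size on this x86_64 machine. A minimal sketch of that arithmetic follows; the function name round_up_to_page is made up for illustration and is not part of any cgroup tooling:

```shell
# round_up_to_page BYTES [PAGE_SIZE]
# Hypothetical helper: round BYTES up to the next multiple of PAGE_SIZE.
# PAGE_SIZE defaults to the running system's page size (an assumption about
# how the limit is aligned, inferred from the value read back in step 8).
round_up_to_page() {
    bytes="$1"
    page_size="${2:-$(getconf PAGESIZE)}"
    echo $(( (bytes + page_size - 1) / page_size * page_size ))
}

round_up_to_page 1000000 4096   # prints 1003520
```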

I will attempt to reproduce this bug on an upstream kernel, but don't want to lose the bug report during the upgrade, hence I am filing the report now.

ProblemType: Bug
DistroRelease: Ubuntu 13.10
Package: linux-image (not installed)
ProcVersionSignature: Ubuntu 3.11.0-17.31-generic 3.11.10.3
Uname: Linux 3.11.0-17-generic x86_64
ApportVersion: 2.12.5-0ubuntu2.2
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: glyn 1587 F.... pulseaudio
Date: Mon Apr 7 10:16:24 2014
HibernationDevice: RESUME=UUID=18692506-8752-46bb-8b44-1fb2f0ba5e03
InstallationDate: Installed on 2014-03-05 (32 days ago)
InstallationMedia: Ubuntu 13.10 "Saucy Salamander" - Release amd64 (20131016.1)
IwConfig:
 eth0 no wireless extensions.

 lo no wireless extensions.

 docker0 no wireless extensions.
Lsusb:
 Bus 001 Device 004: ID 80ee:0021 VirtualBox USB Tablet
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: innotek GmbH VirtualBox
MarkForUpload: True
ProcEnviron:
 LANGUAGE=en_GB:en
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.11.0-17-generic root=UUID=24214f2e-a703-46ea-9fdd-0f64e110b628 ro quiet splash vt.handoff=7
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-3.11.0-17-generic N/A
 linux-backports-modules-3.11.0-17-generic N/A
 linux-firmware 1.116.2
RfKill:

SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 12/01/2006
dmi.bios.vendor: innotek GmbH
dmi.bios.version: VirtualBox
dmi.board.name: VirtualBox
dmi.board.vendor: Oracle Corporation
dmi.board.version: 1.2
dmi.chassis.type: 1
dmi.chassis.vendor: Oracle Corporation
dmi.modalias: dmi:bvninnotekGmbH:bvrVirtualBox:bd12/01/2006:svninnotekGmbH:pnVirtualBox:pvr1.2:rvnOracleCorporation:rnVirtualBox:rvr1.2:cvnOracleCorporation:ct1:cvr:
dmi.product.name: VirtualBox
dmi.product.version: 1.2
dmi.sys.vendor: innotek GmbH

Revision history for this message
Glyn Normington (gnormington) wrote :
Revision history for this message
Glyn Normington (gnormington) wrote :

Note that the above instructions assume swapping is off; otherwise memory.memsw.limit_in_bytes would also need to be set to 1000000.

The above testing was done on a 3.11.0-17-generic kernel. Re-testing on an upstream kernel, 3.14.0-031400-generic ("3.14-trusty"), showed the same problem: the process which should have been killed by the oom killer again ran to completion.

Revision history for this message
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.14 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.14-trusty/

Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Incomplete
Revision history for this message
Glyn Normington (gnormington) wrote :

Hi Joseph

As per comment #2, this bug reproduced on the latest upstream kernel.

Regards,
Glyn

tags: added: kernel-bug-exists-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Glyn Normington (gnormington) wrote :

It occurred to me that the behaviour observed could have been due to an optimisation in a later version of perl, but Ubuntu 13.10 is running perl v5.14.2 whereas RHEL 7 is running v5.16.3. It seems unlikely, therefore, that the version of perl is significant to this problem.

Changed in linux (Ubuntu):
status: Confirmed → Invalid
Revision history for this message
Glyn Normington (gnormington) wrote :

This is not a bug.

It turns out the behaviour I was seeing was due to the fact that swap accounting is disabled by default and so the perl process was swapping rather than being killed when its RAM limit was reached.

If I turn swap off (swapoff -a), the process is killed correctly. Similarly, if I inhibit swapping by setting memory.swappiness to 0 in the "example" cgroup, the process is killed correctly.
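
For anyone hitting the same symptom, a quick way to tell whether swap could be masking the limit is to look at /proc/swaps before running the test. The following is only a sketch; the function name swap_in_use is hypothetical, and it assumes the usual /proc/swaps layout of one header line followed by one line per active swap area:

```shell
# swap_in_use [FILE]
# Hypothetical helper: succeeds (exit status 0) if FILE, which is expected to
# be in /proc/swaps format, lists at least one swap area after the header line.
# FILE defaults to /proc/swaps.
swap_in_use() {
    swaps_file="${1:-/proc/swaps}"
    # Skip the header line and count the remaining non-empty lines.
    [ "$(tail -n +2 "$swaps_file" | grep -c .)" -gt 0 ]
}
```

Running `swap_in_use && echo "swap is active"` as root or as a normal user before step 10 would show whether the perl process has somewhere to swap to.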

Alternatively, if I turn swap accounting on (by setting `GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"` in /etc/default/grub, running update-grub, and rebooting), then memory.memsw.* files appear in the "example" cgroup. If I set memory.limit_in_bytes and memory.memsw.limit_in_bytes both to 3000000, then the perl process is killed correctly.
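
The last variant can be sketched as a small helper that sets both limits when swap accounting is available and warns when it is not. The function name set_cgroup_limit is invented for this sketch, and the ordering (memory limit first, then memory+swap limit) reflects my understanding that the memsw limit cannot be set below the memory limit:

```shell
# set_cgroup_limit CGROUP_DIR BYTES
# Hypothetical helper: write BYTES to memory.limit_in_bytes and, if swap
# accounting is enabled (the memory.memsw.* files exist), also to
# memory.memsw.limit_in_bytes so that swapping cannot bypass the limit.
set_cgroup_limit() {
    cg_dir="$1"
    bytes="$2"
    # Assumed ordering: lower the memory limit before the memory+swap limit.
    echo "$bytes" > "$cg_dir/memory.limit_in_bytes" || return 1
    if [ -e "$cg_dir/memory.memsw.limit_in_bytes" ]; then
        echo "$bytes" > "$cg_dir/memory.memsw.limit_in_bytes" || return 1
    else
        echo "warning: swap accounting is off; only the RAM limit was set" >&2
    fi
}
```

With the hierarchy from the steps above, this would be invoked as root as `set_cgroup_limit /home/glyn/cgh1/example 3000000`.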
