bug of "memory.kmem.limit_in_bytes" and "memory.kmem.usage_in_bytes"

Bug #1568592 reported by Wei Tsui
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Invalid  Undecided   Unassigned

Bug Description

In Ubuntu xenial with Linux kernel 4.4,

"memory.kmem.limit_in_bytes" cannot be written,

and the value of "memory.kmem.usage_in_bytes" is always 0.

This issue doesn't happen in Ubuntu wily (kernel 4.2).

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-generic 4.4.0.18.19
ProcVersionSignature: Ubuntu 4.4.0-18.34-generic 4.4.6
Uname: Linux 4.4.0-18-generic x86_64
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
AlsaVersion: Advanced Linux Sound Architecture Driver Version k4.4.0-18-generic.
ApportVersion: 2.20.1-0ubuntu1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: ghostplant 2724 F.... pulseaudio
 /dev/snd/controlC1: ghostplant 2724 F.... pulseaudio
CurrentDesktop: GNOME-Flashback:Unity
Date: Mon Apr 11 03:24:49 2016
JournalErrors:
 Error: command ['journalctl', '-b', '--priority=warning', '--lines=1000'] failed with exit code 1: Hint: You are currently not seeing messages from other users and the system.
       Users in the 'systemd-journal' group can see all messages. Pass -q to
       turn off this notice.
 No journal files were opened due to insufficient permissions.
MachineType: Micro-Star International Co., Ltd. GE60 2PG
PciMultimedia:

ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-18-generic root=UUID=9b9acac2-b16b-4328-abc9-407fe4ce4e4d ro quiet swapaccount=1
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-18-generic N/A
 linux-backports-modules-4.4.0-18-generic N/A
 linux-firmware 1.157
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
WifiSyslog:

dmi.bios.date: 12/01/2014
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: E16GFIMS.626
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: MS-16GF
dmi.board.vendor: Micro-Star International Co., Ltd.
dmi.board.version: REV:0.B
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrE16GFIMS.626:bd12/01/2014:svnMicro-StarInternationalCo.,Ltd.:pnGE602PG:pvrREV1.0:rvnMicro-StarInternationalCo.,Ltd.:rnMS-16GF:rvrREV0.B:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: GE60 2PG
dmi.product.version: REV:1.0
dmi.sys.vendor: Micro-Star International Co., Ltd.

Wei Tsui (ghostplant) wrote :
description: updated
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1568592

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Wei Tsui (ghostplant) wrote :

A fresh installation from "http://cdimage.ubuntu.com/daily-live/current/xenial-desktop-amd64.iso" will definitely have this issue; it is not a problem with my PC.

After installing "http://cdimage.ubuntu.com/daily-live/current/xenial-desktop-amd64.iso",

# echo 1000000000 > /sys/fs/cgroup/memory/system.slice/lxc-net.service/memory.kmem.limit_in_bytes

returns "bash: echo: write error: Device or resource busy"

# cat /sys/fs/cgroup/memory/memory.kmem.limit_in_bytes

returns "0"
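
The write attempt above can also be wrapped in a small script that reports the errno name instead of a bare shell message. This is a minimal sketch, not part of the original report: the helper name `try_write_limit` is hypothetical, and it assumes a cgroup v1 memory controller mounted under /sys/fs/cgroup/memory, as in the paths quoted above.

```python
import errno
import os


def try_write_limit(path, value):
    """Write `value` to a cgroup control file.

    Returns None on success, or the symbolic errno name (for example
    "EBUSY") if the open or write fails. `try_write_limit` is a
    hypothetical helper, not an existing API.
    """
    try:
        # Use a raw file descriptor so a kernel error surfaces directly
        # on write(), the same way it does for `echo` in the shell.
        fd = os.open(path, os.O_WRONLY)
        try:
            os.write(fd, str(value).encode())
        finally:
            os.close(fd)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

Called as root with the lxc-net.service path and the value 1000000000 from the command above, this would be expected to return "EBUSY", which is the errno behind "Device or resource busy".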

Wei Tsui (ghostplant) wrote :

This bug is critical because docker 1.10 will crash when running "docker run --kernel-memory 1G ubuntu bash"

Seth Forshee (sforshee) wrote :

I tried to reproduce in a freshly installed and fully updated xenial vm (admittedly not a desktop installation). Starting docker containers with the --kernel-memory argument works fine, and the value in /sys/fs/cgroup/memory/memory.kmem.limit_in_bytes is as expected. I get the EBUSY error writing to /sys/fs/cgroup/memory/system.slice/lxc-net.service/memory.kmem.limit_in_bytes, but I strongly suspect this is unrelated.

My kernel is:

$ uname -a
Linux lp1568592 4.4.0-18-generic #34-Ubuntu SMP Wed Apr 6 14:01:02 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Can you update, then see if your docker problem remains? If so, does it work without the --kernel-memory argument? Thanks.

Seth Forshee (sforshee) wrote :

Btw, attaching logs does more than identify your hardware; it attaches a lot of logs from the system. If you still have the problem after updating, please do go ahead and run the apport-collect command from comment #2.

Wei Tsui (ghostplant) wrote :

Hi, after full upgrade, "memory.kmem.limit_in_bytes" and "memory.kmem.usage_in_bytes" works not after executing docker with "--kernel-memory" argument.

--

But if you run a command like "docker run -it --rm ubuntu bash" without --kernel-memory argument,

are you able to get the correct value of "cat /sys/fs/cgroup/memory/docker/e4ed03caa58b3035ea028de5c17015012aea476f166d004440eccb266f208c4a/memory.kmem.usage_in_bytes" ?

Also, "echo 1G > /sys/fs/cgroup/memory/docker/e4ed03caa58b3035ea028de5c17015012aea476f166d004440eccb266f208c4a/memory.kmem.usage_in_bytes" will also be denied.

Wei Tsui (ghostplant) wrote :

Sorry, "not after executing docker with" should be "only after executing docker with"

Seth Forshee (sforshee) wrote : Re: [Bug 1568592] Re: bug of "memory.kmem.limit_in_bytes" and "memory.kmem.usage_in_bytes"

On Mon, Apr 11, 2016 at 03:44:37PM -0000, Cui Wei wrote:
> Hi, after full upgrade, "memory.kmem.limit_in_bytes" and
> "memory.kmem.usage_in_bytes" works not after executing docker with
> "--kernel-memory" argument.

So your problem then is that you want to start a container without the
--kernel-memory argument, then change it later by writing to
memory.kmem.usage_in_bytes directly? That wasn't clear from your earlier
comments.

> But if you run a command like "docker run -it --rm ubuntu bash" without
> --kernel-memory argument,
>
> are you able to get the correct value of "cat
> /sys/fs/cgroup/memory/docker/e4ed03caa58b3035ea028de5c17015012aea476f166d004440eccb266f208c4a/memory.kmem.usage_in_bytes"
> ?

Well, I get 9223372036854771712 or 0x7FFFFFFFFFFFF000. The default value
in pages is (LONG_MAX / PAGE_SIZE), or 0x7FFFFFFFFFFFF. Multiply that by
PAGE_SIZE (i.e. 4096) and you get the number I read, so yes, this seems
correct.
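
The arithmetic can be checked with a few lines of Python. This is only a worked example; it assumes a 64-bit signed long and a 4096-byte page, as in the figures above, and the variable names are illustrative.

```python
# Assumptions: 64-bit signed long and 4 KiB pages (x86_64), as described above.
LONG_MAX = 2**63 - 1
PAGE_SIZE = 4096

# Default limit expressed in pages (integer division, as in C).
default_limit_pages = LONG_MAX // PAGE_SIZE
# Converting back to bytes rounds LONG_MAX down to a page boundary.
default_limit_bytes = default_limit_pages * PAGE_SIZE

print(hex(default_limit_pages))   # 0x7ffffffffffff
print(default_limit_bytes)        # 9223372036854771712
print(hex(default_limit_bytes))   # 0x7ffffffffffff000
```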

> Also, "echo 1G >
> /sys/fs/cgroup/memory/docker/e4ed03caa58b3035ea028de5c17015012aea476f166d004440eccb266f208c4a/memory.kmem.usage_in_bytes"
> will also be denied.

I'm pretty sure that what you're hitting here is this code in
memcg_activate_kmem() from the kernel:

        /*
         * For simplicity, we won't allow this to be disabled. It also can't
         * be changed if the cgroup has children already, or if tasks had
         * already joined.
         *
         * If tasks join before we set the limit, a person looking at
         * kmem.usage_in_bytes will have no way to determine when it took
         * place, which makes the value quite meaningless.
         *
         * After it first became limited, changes in the value of the limit are
         * of course permitted.
         */
        mutex_lock(&memcg_create_mutex);
        if (cgroup_is_populated(memcg->css.cgroup) ||
            (memcg->use_hierarchy && memcg_has_children(memcg)))
                err = -EBUSY;
        mutex_unlock(&memcg_create_mutex);
        if (err)
                goto out;

So the rules seem to be:

 1. If the limit is set before any tasks join the cgroup, allow the
    limit to be changed, even after tasks are added.
 2. If tasks join the cgroup before the limit is changed it can never be
    changed.

So, in the case where you pass --kernel-memory docker must set up the
limit before adding any tasks, therefore it is allowed and future
updates are also allowed. Without --kernel-memory it seems docker does
not set anything up in the cgroup before adding tasks so any updates to
the limits are not allowed.
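
The two rules can be illustrated with a toy model. This is a sketch of the behavior as described above, not kernel code: the class `ToyMemcg` and its fields are hypothetical, and it only mirrors the populated/children check quoted from memcg_activate_kmem().

```python
import errno


class ToyMemcg:
    """Toy model of the kmem activation check; not kernel code."""

    def __init__(self, use_hierarchy=False):
        self.use_hierarchy = use_hierarchy
        self.tasks = 0           # tasks attached to the cgroup
        self.children = 0        # child cgroups
        self.kmem_active = False
        self.kmem_limit = None   # None means no limit has been set yet

    def attach_task(self):
        self.tasks += 1

    def set_kmem_limit(self, limit):
        if not self.kmem_active:
            # First activation is refused once the cgroup is populated, or
            # has children while use_hierarchy is set (the quoted check).
            if self.tasks > 0 or (self.use_hierarchy and self.children > 0):
                raise OSError(errno.EBUSY, "Device or resource busy")
            self.kmem_active = True
        # After activation, later changes to the limit are permitted.
        self.kmem_limit = limit


# Case 1: limit set before any task joins (like --kernel-memory).
with_flag = ToyMemcg()
with_flag.set_kmem_limit(1 << 30)
with_flag.attach_task()
with_flag.set_kmem_limit(2 << 30)      # still allowed

# Case 2: tasks join first, then a limit is attempted.
without_flag = ToyMemcg()
without_flag.attach_task()
try:
    without_flag.set_kmem_limit(1 << 30)
except OSError as exc:
    print(errno.errorcode[exc.errno])  # EBUSY
```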

All of this seems to be intentional and unchanged since kernel version
4.2. So if it works in wily then I have to wonder whether it's actually
caused by a change in docker and not a change in the kernel. What
happens if you install the wily 4.2 kernel in xenial and try the same
thing?

Wei Tsui (ghostplant) wrote :

Thanks, but one strange thing is that LXC behaves differently from Docker. Can you try the following?

# sudo -i
# lxc-start -n ubuntu1
# echo 1g > /sys/fs/cgroup/memory/lxc/ubuntu1/memory.limit_in_bytes
# echo 1g > /sys/fs/cgroup/memory/lxc/ubuntu1/memory.memsw.limit_in_bytes
# lxc-freeze -n ubuntu1
# echo 0 > /sys/fs/cgroup/memory/lxc/ubuntu1/memory.limit_in_bytes
# cat /sys/fs/cgroup/memory/lxc/ubuntu1/memory.usage_in_bytes
0

So why does the last command return "0"? Where is the kmem? It makes me think that kmem has been dropped. Do you know why all of the memory, including kmem, is moved to swap, or is this a bug?

Wei Tsui (ghostplant) wrote :

Until now, I had always assumed the following:

1) memory.usage_in_bytes = memory.kmem.usage_in_bytes + "Non-Kernel Memory Data"

2) the data of "memory.kmem.usage_in_bytes" cannot be moved to swap

This implies that memory.usage_in_bytes should always be > 0, since memory.kmem.usage_in_bytes should always be > 0. So how does the result in my last comment come about? Or is there something wrong with my understanding?
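
One way to inspect assumption 1) is to read both counters and look at the difference. This is a minimal sketch: the helper names `read_counter` and `usage_breakdown` are hypothetical, it assumes the cgroup v1 file names used in this report, and it does not confirm that the assumption holds.

```python
import os


def read_counter(cgroup_dir, name):
    """Return the integer stored in one cgroup v1 memory control file."""
    with open(os.path.join(cgroup_dir, name)) as f:
        return int(f.read().strip())


def usage_breakdown(cgroup_dir):
    """Return (total, kmem, total - kmem) for a memory cgroup directory.

    The third value is the "Non-Kernel Memory Data" term only if
    assumption 1) above actually holds.
    """
    total = read_counter(cgroup_dir, "memory.usage_in_bytes")
    kmem = read_counter(cgroup_dir, "memory.kmem.usage_in_bytes")
    return total, kmem, total - kmem
```

For the container from the previous comment, the directory to pass would be /sys/fs/cgroup/memory/lxc/ubuntu1.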

Wei Tsui (ghostplant) wrote :

So I think there is still a bug in memory.kmem.limit_in_bytes.

If I don't limit the kmem of the LXC container at the beginning, "memory.limit_in_bytes" can be set to zero by the steps above.

If I limit the kmem of the LXC container at the beginning, "memory.limit_in_bytes" cannot be set to zero by the same steps.

Do you think it is reasonable?

Seth Forshee (sforshee) wrote :

The behavior seems odd on the face of it, but I'm not an expert on the memory controller cgroup implementation. Our implementation is the same as in upstream Linux, so if you feel there are problems then it's probably best to raise these with the upstream developers directly. If they make changes then we would consider porting these changes back to xenial.

For now I'm going to close this bug as invalid. If there are upstream changes in the future that you'd like to see backported to xenial, please open a new bug and assign it to me. Thanks!

Changed in linux (Ubuntu):
status: Incomplete → Invalid