starting any container with umask 007 breaks host system shutdown. lxc-stop just hangs.

Bug #1642767 reported by Forest on 2016-11-17
This bug affects 1 person

Affects: lxc (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

If I have umask 007 (or any other value that masks the world-execute bit) when I run lxc-start for the first time after logging in, my host system enters a state with the following problems:

* lxc-stop hangs forever instead of stopping any container, even one that wasn't started with umask 007.
* lxc-stop --kill --nolock hangs in the same way.
* Attempts to reboot or shut down the host system fail, requiring a hard reset to recover.
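For reference, a hypothetical reproduction sketch; "mycontainer" is a placeholder for any existing unprivileged container, and this should only be run in a disposable VM, since recovering from the hang requires a hard reset:

```shell
# Reproduction sketch (placeholder container name "mycontainer").
# umask 007 clears the world rwx bits on newly created files/directories,
# including the world-execute bit that triggers this bug.
umask 007
lxc-start -n mycontainer   # first container start of the login session
lxc-stop -n mycontainer    # hangs forever
```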

When lxc-stop hangs, messages like these appear in syslog every couple of minutes:

Nov 17 01:22:11 hostbox kernel: [ 3360.091624] INFO: task systemd:12179 blocked for more than 120 seconds.
Nov 17 01:22:11 hostbox kernel: [ 3360.091629] Tainted: P OE 4.4.0-47-generic #68-Ubuntu
Nov 17 01:22:11 hostbox kernel: [ 3360.091631] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 17 01:22:11 hostbox kernel: [ 3360.091633] systemd D ffff8800c6febb58 0 12179 12168 0x00000104
Nov 17 01:22:11 hostbox kernel: [ 3360.091638] ffff8800c6febb58 ffff8800d318d280 ffff88040c649b80 ffff8800d318d280
Nov 17 01:22:11 hostbox kernel: [ 3360.091641] ffff8800c6fec000 ffff8800345bc088 ffff8800345bc070 ffffffff00000000
Nov 17 01:22:11 hostbox kernel: [ 3360.091644] fffffffe00000001 ffff8800c6febb70 ffffffff81830f15 ffff8800d318d280
Nov 17 01:22:11 hostbox kernel: [ 3360.091647] Call Trace:
Nov 17 01:22:11 hostbox kernel: [ 3360.091653] [<ffffffff81830f15>] schedule+0x35/0x80
Nov 17 01:22:11 hostbox kernel: [ 3360.091657] [<ffffffff81833b62>] rwsem_down_write_failed+0x202/0x350
Nov 17 01:22:11 hostbox kernel: [ 3360.091662] [<ffffffff812899a0>] ? kernfs_sop_show_options+0x40/0x40
Nov 17 01:22:11 hostbox kernel: [ 3360.091666] [<ffffffff81403fa3>] call_rwsem_down_write_failed+0x13/0x20
Nov 17 01:22:11 hostbox kernel: [ 3360.091669] [<ffffffff8183339d>] ? down_write+0x2d/0x40
Nov 17 01:22:11 hostbox kernel: [ 3360.091672] [<ffffffff812104a0>] grab_super+0x30/0xa0
Nov 17 01:22:11 hostbox kernel: [ 3360.091674] [<ffffffff81210a32>] sget_userns+0x152/0x450
Nov 17 01:22:11 hostbox kernel: [ 3360.091677] [<ffffffff81289a20>] ? kernfs_sop_show_path+0x50/0x50
Nov 17 01:22:11 hostbox kernel: [ 3360.091680] [<ffffffff81289c8e>] kernfs_mount_ns+0x7e/0x230
Nov 17 01:22:11 hostbox kernel: [ 3360.091685] [<ffffffff811187ab>] cgroup_mount+0x2eb/0x7f0
Nov 17 01:22:11 hostbox kernel: [ 3360.091687] [<ffffffff81211af8>] mount_fs+0x38/0x160
Nov 17 01:22:11 hostbox kernel: [ 3360.091691] [<ffffffff8122db57>] vfs_kern_mount+0x67/0x110
Nov 17 01:22:11 hostbox kernel: [ 3360.091694] [<ffffffff81230329>] do_mount+0x269/0xde0
Nov 17 01:22:11 hostbox kernel: [ 3360.091698] [<ffffffff812311cf>] SyS_mount+0x9f/0x100
Nov 17 01:22:11 hostbox kernel: [ 3360.091701] [<ffffffff81834ff2>] entry_SYSCALL_64_fastpath+0x16/0x71

When system shutdown hangs, similar messages appear on the console every couple of minutes.

I can reproduce this at will with a freshly-installed and fully-updated host OS in VirtualBox, and with either an old-ish container or a new one.

I'm running lxc 2.0.5-0ubuntu1~ubuntu16.04.2 on xubuntu 16.04.1 LTS amd64.

My containers are all unprivileged.

My umask at container creation time does not seem to matter. As far as I have seen, my umask only matters the first time I start a container in my login session.

I can work around the bug by manually setting my umask to something more permissive before I start my first container of the day, and then setting it back again, but that's rather a hassle. (Even worse, it's very easy to forget this workaround and be left with containers that can't be stopped and a host system that won't shut down cleanly.)
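A sketch of that workaround, with "mycontainer" again as a placeholder name; running lxc-start from a subshell keeps the temporary umask from leaking into the rest of the session, so there is nothing to set back afterwards:

```shell
# Start the first container of the session with a permissive umask.
# The umask change is local to the subshell and does not affect the
# calling shell.
(umask 022 && lxc-start -n mycontainer)
umask   # outer shell's umask is unchanged
```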

Forest (foresto) wrote :

Possibly related: when the problem is triggered, I notice that my guest instances start with no /etc/resolv.conf and no inet address.

Forest (foresto) on 2016-11-22
description: updated
Christian Brauner (cbrauner) wrote :

This sounds like a kernel bug to me. Can you please provide the output of:

uname -a

and try to reproduce this on a newer kernel version and report back?

Forest (foresto) wrote :

Linux xenialbox 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Forest (foresto) wrote :

I reproduced it with the latest mainline kernel as well:

Linux xenialbox 4.9.0-040900rc6-generic #201611201731 SMP Sun Nov 20 22:33:21 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Here's the slightly different syslog output:

Nov 22 14:36:55 xenialbox kernel: [ 484.506570] INFO: task systemd:3086 blocked for more than 120 seconds.
Nov 22 14:36:55 xenialbox kernel: [ 484.506578] Not tainted 4.9.0-040900rc6-generic #201611201731
Nov 22 14:36:55 xenialbox kernel: [ 484.506579] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 22 14:36:55 xenialbox kernel: [ 484.506582] systemd D 0 3086 3076 0x00000104
Nov 22 14:36:55 xenialbox kernel: [ 484.506589] ffff8eaa1810bc00 ffff8eaa58f4d800 ffff8eaa57f59d00 ffff8eaa5fc19340
Nov 22 14:36:55 xenialbox kernel: [ 484.506593] ffff8eaa54bec880 ffff9bab80f17b78 ffffffff8bc87183 ffffffff8c244480
Nov 22 14:36:55 xenialbox kernel: [ 484.506596] 00ff8eaa57f59d00 ffff8eaa5fc19340 0000000000000000 ffff8eaa57f59d00
Nov 22 14:36:55 xenialbox kernel: [ 484.506600] Call Trace:
Nov 22 14:36:55 xenialbox kernel: [ 484.506622] [<ffffffff8bc87183>] ? __schedule+0x233/0x6e0
Nov 22 14:36:55 xenialbox kernel: [ 484.506626] [<ffffffff8bc87666>] schedule+0x36/0x80
Nov 22 14:36:55 xenialbox kernel: [ 484.506629] [<ffffffff8bc8a4da>] rwsem_down_write_failed+0x20a/0x380
Nov 22 14:36:55 xenialbox kernel: [ 484.506634] [<ffffffff8b463f3e>] ? kvm_sched_clock_read+0x1e/0x30
Nov 22 14:36:55 xenialbox kernel: [ 484.506643] [<ffffffff8b6ba5e0>] ? kernfs_sop_show_options+0x40/0x40
Nov 22 14:36:55 xenialbox kernel: [ 484.506651] [<ffffffff8b8265a7>] call_rwsem_down_write_failed+0x17/0x30
Nov 22 14:36:55 xenialbox kernel: [ 484.506655] [<ffffffff8bc89b1d>] down_write+0x2d/0x40
Nov 22 14:36:55 xenialbox kernel: [ 484.506658] [<ffffffff8b639fa0>] grab_super+0x30/0xa0
Nov 22 14:36:55 xenialbox kernel: [ 484.506661] [<ffffffff8b63a58f>] sget_userns+0x18f/0x4d0
Nov 22 14:36:55 xenialbox kernel: [ 484.506663] [<ffffffff8b6ba670>] ? kernfs_sop_show_path+0x50/0x50
Nov 22 14:36:55 xenialbox kernel: [ 484.506666] [<ffffffff8b6ba89e>] kernfs_mount_ns+0x7e/0x230
Nov 22 14:36:55 xenialbox kernel: [ 484.506674] [<ffffffff8b5230b8>] cgroup_mount+0x328/0x840
Nov 22 14:36:55 xenialbox kernel: [ 484.506679] [<ffffffff8b6036f5>] ? alloc_pages_current+0x95/0x140
Nov 22 14:36:55 xenialbox kernel: [ 484.506682] [<ffffffff8b63b578>] mount_fs+0x38/0x150
Nov 22 14:36:55 xenialbox kernel: [ 484.506686] [<ffffffff8b659177>] vfs_kern_mount+0x67/0x110
Nov 22 14:36:55 xenialbox kernel: [ 484.506688] [<ffffffff8b65baf1>] do_mount+0x1e1/0xcb0
Nov 22 14:36:55 xenialbox kernel: [ 484.506691] [<ffffffff8b6330df>] ? __check_object_size+0xff/0x1d6
Nov 22 14:36:55 xenialbox kernel: [ 484.506695] [<ffffffff8b60efe7>] ? kmem_cache_alloc_trace+0xd7/0x190
Nov 22 14:36:55 xenialbox kernel: [ 484.506697] [<ffffffff8b65c8d3>] SyS_mount+0x83/0xd0
Nov 22 14:36:55 xenialbox kernel: [ 484.506700] [<ffffffff8b403b6b>] do_syscall_64+0x5b/0xc0
Nov 22 14:36:55 xenialbox kernel: [ 484.506702] [<ffffffff8bc8c46f>] entry_SYSCALL64_slow_path+0x25/0x25

Forest (foresto) wrote :

Problem still exists in Ubuntu 16.10.

summary:
- starting any container with umask 007 breaks lxc-stop and prevents host system shutdown
+ starting any container with umask 007 breaks host system shutdown. lxc-stop just hangs.

I was able to reproduce this bug on Debian unstable (lxc=2.0.7-2, libpam-cgfs=2.0.6-1, systemd=232-22, linux-image-4.9.0-2-amd64=4.9.18-1 or even 4.11.0-rc6-1; libpam-cgm and cgmanager not installed) with a Debian Jessie unprivileged container (created using the download template [1]). Systemd version in the container: 215-17+deb8u6.

In addition to the three symptoms listed in the bug description, here is another: in the container, "/sys/fs/cgroup/systemd" isn't mounted (the systemctl command fails, and any attempt to mount it manually hangs forever).

It appears there are two problems:

1. The lxc cgroup directories are created according to the umask. As a workaround, create them manually before running lxc-start:

$ mkdir /sys/fs/cgroup/systemd/user.slice/user-$UID.slice/session-$XDG_SESSION_ID.scope/lxc
$ mkdir /sys/fs/cgroup/{freezer,memory}/user/$USER/0/lxc

# replace <subgid>
$ sudo chgrp <subgid> /sys/fs/cgroup/systemd/user.slice/user-$UID.slice/session-$XDG_SESSION_ID.scope/lxc
$ sudo chgrp <subgid> /sys/fs/cgroup/{freezer,memory}/user/$USER/0/lxc

$ chmod g+x /sys/fs/cgroup/systemd/user.slice/user-$UID.slice/session-$XDG_SESSION_ID.scope/lxc
$ chmod g+x /sys/fs/cgroup/{memory,freezer}/user/$USER/0/lxc

Start the container; systemd will then be able to mount /sys/fs/cgroup/systemd/:

$ lxc-start -n <name>

lxc-stop then works, and the host is able to reboot without a hard reset.

2. Regarding the kernel-related problem: systemd tries to mount "/sys/fs/cgroup/systemd/" twice ([2]: mount_table and mount_setup): first with the options "none,name=systemd,xattr", then, if that attempt fails, again with "none,name=systemd". The first attempt returns "permission denied", and systemd then gets stuck on the second one.

Without manually creating the lxc cgroups, I was able to reproduce this problem with an unprivileged Alpine edge container (Alpine doesn't use systemd):

$ lxc-attach -n alpine_container --clear-env
# mount -t tmpfs tmpfs /sys/fs/cgroup
# mkdir /sys/fs/cgroup/systemd
# mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
-> mount command hangs, lxc-stop hangs, host needs a hard reset

[1] http://images.linuxcontainers.org/
[2] https://github.com/systemd/systemd/blob/1b59cf04aee20525179f81928f1e1794ce970551/src/core/mount-setup.c#L104

My previous comment is unclear; the two problems are:

1. The 'lxc' directories below '/sys/fs/cgroup/' are created according to the umask setting.

2. Mounting '/sys/fs/cgroup/systemd' in the container then hangs (and attempts to reboot or shut down the host system fail; a hard reset is required).
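A quick check for problem 1, assuming the systemd user-session cgroup layout used in the workaround commands above (the path may differ on other setups): with a restrictive umask the directory comes out mode 770 and owned by the user's own group, so the container's mapped uids cannot traverse it, whereas with umask 022 it would be 755.

```shell
# Inspect the session's lxc cgroup directory (hypothetical path; adjust
# session ID and cgroup layout to your setup). A mode without an execute
# bit usable by the container's mapped uids means cgroup mounts will hang.
d="/sys/fs/cgroup/systemd/user.slice/user-$UID.slice/session-$XDG_SESSION_ID.scope/lxc"
stat -c '%a %U:%G %n' "$d"
```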
