starting any container with umask 007 breaks host system shutdown. lxc-stop just hangs.

Bug #1642767 reported by Forest
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

If I have umask 007 (or any other value that masks the world-execute bit) when I run lxc-start for the first time after logging in, my host system enters a state with the following problems:

* lxc-stop hangs forever instead of stopping any container, even one that wasn't started with umask 007.
* lxc-stop --kill --nolock hangs in the same way.
* Attempts to reboot or shut down the host system fail, requiring a hard reset to recover.
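
For reference, the trigger is roughly this (the container name is just an example):

$ umask 007
$ lxc-start -n mycontainer
$ lxc-stop -n mycontainer
-> lxc-stop hangs here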

When lxc-stop hangs, messages like these appear in syslog every couple of minutes:

Nov 17 01:22:11 hostbox kernel: [ 3360.091624] INFO: task systemd:12179 blocked for more than 120 seconds.
Nov 17 01:22:11 hostbox kernel: [ 3360.091629] Tainted: P OE 4.4.0-47-generic #68-Ubuntu
Nov 17 01:22:11 hostbox kernel: [ 3360.091631] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 17 01:22:11 hostbox kernel: [ 3360.091633] systemd D ffff8800c6febb58 0 12179 12168 0x00000104
Nov 17 01:22:11 hostbox kernel: [ 3360.091638] ffff8800c6febb58 ffff8800d318d280 ffff88040c649b80 ffff8800d318d280
Nov 17 01:22:11 hostbox kernel: [ 3360.091641] ffff8800c6fec000 ffff8800345bc088 ffff8800345bc070 ffffffff00000000
Nov 17 01:22:11 hostbox kernel: [ 3360.091644] fffffffe00000001 ffff8800c6febb70 ffffffff81830f15 ffff8800d318d280
Nov 17 01:22:11 hostbox kernel: [ 3360.091647] Call Trace:
Nov 17 01:22:11 hostbox kernel: [ 3360.091653] [<ffffffff81830f15>] schedule+0x35/0x80
Nov 17 01:22:11 hostbox kernel: [ 3360.091657] [<ffffffff81833b62>] rwsem_down_write_failed+0x202/0x350
Nov 17 01:22:11 hostbox kernel: [ 3360.091662] [<ffffffff812899a0>] ? kernfs_sop_show_options+0x40/0x40
Nov 17 01:22:11 hostbox kernel: [ 3360.091666] [<ffffffff81403fa3>] call_rwsem_down_write_failed+0x13/0x20
Nov 17 01:22:11 hostbox kernel: [ 3360.091669] [<ffffffff8183339d>] ? down_write+0x2d/0x40
Nov 17 01:22:11 hostbox kernel: [ 3360.091672] [<ffffffff812104a0>] grab_super+0x30/0xa0
Nov 17 01:22:11 hostbox kernel: [ 3360.091674] [<ffffffff81210a32>] sget_userns+0x152/0x450
Nov 17 01:22:11 hostbox kernel: [ 3360.091677] [<ffffffff81289a20>] ? kernfs_sop_show_path+0x50/0x50
Nov 17 01:22:11 hostbox kernel: [ 3360.091680] [<ffffffff81289c8e>] kernfs_mount_ns+0x7e/0x230
Nov 17 01:22:11 hostbox kernel: [ 3360.091685] [<ffffffff811187ab>] cgroup_mount+0x2eb/0x7f0
Nov 17 01:22:11 hostbox kernel: [ 3360.091687] [<ffffffff81211af8>] mount_fs+0x38/0x160
Nov 17 01:22:11 hostbox kernel: [ 3360.091691] [<ffffffff8122db57>] vfs_kern_mount+0x67/0x110
Nov 17 01:22:11 hostbox kernel: [ 3360.091694] [<ffffffff81230329>] do_mount+0x269/0xde0
Nov 17 01:22:11 hostbox kernel: [ 3360.091698] [<ffffffff812311cf>] SyS_mount+0x9f/0x100
Nov 17 01:22:11 hostbox kernel: [ 3360.091701] [<ffffffff81834ff2>] entry_SYSCALL_64_fastpath+0x16/0x71

When system shutdown hangs, similar messages appear on the console every couple of minutes.

I can reproduce this at will with a freshly installed and fully updated host OS in VirtualBox, and with either an old-ish container or a new one.

I'm running lxc 2.0.5-0ubuntu1~ubuntu16.04.2 on xubuntu 16.04.1 LTS amd64.

My containers are all unprivileged.

My umask at container creation time does not seem to matter. As far as I have seen, my umask only matters the first time I start a container in my login session.

I can work around the bug by manually setting my umask to something more permissive before I start my first container of the day, and then setting it back again, but that's rather a hassle. (Even worse, it's very easy to forget this workaround and be left with containers that can't be stopped and a host system that won't shut down cleanly.)
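
A sketch of that workaround (the container name and the more permissive value are just examples):

$ umask 022
$ lxc-start -n mycontainer
$ umask 007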

Revision history for this message
Forest (foresto) wrote :

Possibly related: when the problem is triggered, I notice that my guest instances start with no /etc/resolv.conf and no inet address.

Revision history for this message
Christian Brauner (cbrauner) wrote :

This sounds like a kernel bug to me. Can you please provide the output of:

uname -a

and try to reproduce this on a newer kernel version and report back?

Revision history for this message
Forest (foresto) wrote :

Linux xenialbox 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Revision history for this message
Forest (foresto) wrote :

I reproduced it with the latest mainline kernel as well:

Linux xenialbox 4.9.0-040900rc6-generic #201611201731 SMP Sun Nov 20 22:33:21 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Here's the slightly different syslog output:

Nov 22 14:36:55 xenialbox kernel: [ 484.506570] INFO: task systemd:3086 blocked for more than 120 seconds.
Nov 22 14:36:55 xenialbox kernel: [ 484.506578] Not tainted 4.9.0-040900rc6-generic #201611201731
Nov 22 14:36:55 xenialbox kernel: [ 484.506579] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 22 14:36:55 xenialbox kernel: [ 484.506582] systemd D 0 3086 3076 0x00000104
Nov 22 14:36:55 xenialbox kernel: [ 484.506589] ffff8eaa1810bc00 ffff8eaa58f4d800 ffff8eaa57f59d00 ffff8eaa5fc19340
Nov 22 14:36:55 xenialbox kernel: [ 484.506593] ffff8eaa54bec880 ffff9bab80f17b78 ffffffff8bc87183 ffffffff8c244480
Nov 22 14:36:55 xenialbox kernel: [ 484.506596] 00ff8eaa57f59d00 ffff8eaa5fc19340 0000000000000000 ffff8eaa57f59d00
Nov 22 14:36:55 xenialbox kernel: [ 484.506600] Call Trace:
Nov 22 14:36:55 xenialbox kernel: [ 484.506622] [<ffffffff8bc87183>] ? __schedule+0x233/0x6e0
Nov 22 14:36:55 xenialbox kernel: [ 484.506626] [<ffffffff8bc87666>] schedule+0x36/0x80
Nov 22 14:36:55 xenialbox kernel: [ 484.506629] [<ffffffff8bc8a4da>] rwsem_down_write_failed+0x20a/0x380
Nov 22 14:36:55 xenialbox kernel: [ 484.506634] [<ffffffff8b463f3e>] ? kvm_sched_clock_read+0x1e/0x30
Nov 22 14:36:55 xenialbox kernel: [ 484.506643] [<ffffffff8b6ba5e0>] ? kernfs_sop_show_options+0x40/0x40
Nov 22 14:36:55 xenialbox kernel: [ 484.506651] [<ffffffff8b8265a7>] call_rwsem_down_write_failed+0x17/0x30
Nov 22 14:36:55 xenialbox kernel: [ 484.506655] [<ffffffff8bc89b1d>] down_write+0x2d/0x40
Nov 22 14:36:55 xenialbox kernel: [ 484.506658] [<ffffffff8b639fa0>] grab_super+0x30/0xa0
Nov 22 14:36:55 xenialbox kernel: [ 484.506661] [<ffffffff8b63a58f>] sget_userns+0x18f/0x4d0
Nov 22 14:36:55 xenialbox kernel: [ 484.506663] [<ffffffff8b6ba670>] ? kernfs_sop_show_path+0x50/0x50
Nov 22 14:36:55 xenialbox kernel: [ 484.506666] [<ffffffff8b6ba89e>] kernfs_mount_ns+0x7e/0x230
Nov 22 14:36:55 xenialbox kernel: [ 484.506674] [<ffffffff8b5230b8>] cgroup_mount+0x328/0x840
Nov 22 14:36:55 xenialbox kernel: [ 484.506679] [<ffffffff8b6036f5>] ? alloc_pages_current+0x95/0x140
Nov 22 14:36:55 xenialbox kernel: [ 484.506682] [<ffffffff8b63b578>] mount_fs+0x38/0x150
Nov 22 14:36:55 xenialbox kernel: [ 484.506686] [<ffffffff8b659177>] vfs_kern_mount+0x67/0x110
Nov 22 14:36:55 xenialbox kernel: [ 484.506688] [<ffffffff8b65baf1>] do_mount+0x1e1/0xcb0
Nov 22 14:36:55 xenialbox kernel: [ 484.506691] [<ffffffff8b6330df>] ? __check_object_size+0xff/0x1d6
Nov 22 14:36:55 xenialbox kernel: [ 484.506695] [<ffffffff8b60efe7>] ? kmem_cache_alloc_trace+0xd7/0x190
Nov 22 14:36:55 xenialbox kernel: [ 484.506697] [<ffffffff8b65c8d3>] SyS_mount+0x83/0xd0
Nov 22 14:36:55 xenialbox kernel: [ 484.506700] [<ffffffff8b403b6b>] do_syscall_64+0x5b/0xc0
Nov 22 14:36:55 xenialbox kernel: [ 484.506702] [<ffffffff8bc8c46f>] entry_SYSCALL64_slow_path+0x25/0x25

Revision history for this message
Forest (foresto) wrote :

Problem still exists in Ubuntu 16.10.

summary:
- starting any container with umask 007 breaks lxc-stop and prevents host system shutdown
+ starting any container with umask 007 breaks host system shutdown. lxc-stop just hangs.
Revision history for this message
Pierre-Louis Bonicoli (pierre-louis-bonicoli) wrote :

I was able to reproduce this bug on Debian unstable (lxc=2.0.7-2, libpam-cgfs=2.0.6-1, systemd=232-22, linux-image-4.9.0-2-amd64=4.9.18-1 or even 4.11.0-rc6-1; libpam-cgm not installed, cgmanager not installed) with a Debian Jessie unprivileged container (created using the download template [1]). The systemd version in the container is 215-17+deb8u6.

In addition to the three symptoms listed in the bug description, here is another: in the container, "/sys/fs/cgroup/systemd" isn't mounted (the systemctl command fails, and any attempt to mount it manually hangs forever).

It appears there are two problems:

1. When using a restrictive umask, create the lxc cgroups manually before running lxc-start:

$ mkdir /sys/fs/cgroup/systemd/user.slice/user-$UID.slice/session-$XDG_SESSION_ID.scope/lxc
$ mkdir /sys/fs/cgroup/{freezer,memory}/user/$USER/0/lxc

# replace <subgid>
$ sudo chgrp <subgid> /sys/fs/cgroup/systemd/user.slice/user-$UID.slice/session-$XDG_SESSION_ID.scope/lxc
$ sudo chgrp <subgid> /sys/fs/cgroup/{freezer,memory}/user/$USER/0/lxc

$ chmod g+x /sys/fs/cgroup/systemd/user.slice/user-$UID.slice/session-$XDG_SESSION_ID.scope/lxc
$ chmod g+x /sys/fs/cgroup/{memory,freezer}/user/$USER/0/lxc

Start the container; systemd will then be able to mount /sys/fs/cgroup/systemd/:

$ lxc-start -n <name>

And lxc-stop works; the host is able to reboot without a hard reset.

2. About the kernel-related problem: systemd tries to mount "/sys/fs/cgroup/systemd/" twice ([2]: mount_table and mount_setup): first with the options "none,name=systemd,xattr", then, if that attempt fails, again with "none,name=systemd". The first attempt returns "permission denied", and systemd then gets stuck on the second one.
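
Run by hand inside the container, those two attempts would look roughly like this (same option strings as above):

# mount -t cgroup -o none,name=systemd,xattr cgroup /sys/fs/cgroup/systemd
-> fails with "permission denied"
# mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
-> hangs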

Without manually creating the lxc cgroups, I was able to reproduce this problem with an unprivileged Alpine edge container (Alpine doesn't use systemd):

$ lxc-attach -n alpine_container --clear-env
# mount -t tmpfs tmpfs /sys/fs/cgroup
# mkdir /sys/fs/cgroup/systemd
# mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
-> mount command hangs, lxc-stop hangs, host needs a hard reset

[1] http://images.linuxcontainers.org/
[2] https://github.com/systemd/systemd/blob/1b59cf04aee20525179f81928f1e1794ce970551/src/core/mount-setup.c#L104

Revision history for this message
Pierre-Louis Bonicoli (pierre-louis-bonicoli) wrote :

My previous comment was unclear; the two problems are:

1. the 'lxc' directories below '/sys/fs/cgroup/' are created according to the umask setting (see the illustration below), and

2. mounting '/sys/fs/cgroup/systemd' in the container then hangs (and attempts to reboot or shut down the host system fail; a hard reset is required).
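
For illustration, this is what umask 007 does to a freshly created directory (the path, owner and timestamp are just placeholders):

$ umask 007
$ mkdir /tmp/lxc-demo
$ ls -ld /tmp/lxc-demo
drwxrwx--- 2 user user 4096 Nov 22 14:36 /tmp/lxc-demo

The execute bit is missing for "other", which appears to be what keeps the container's mapped ids from traversing the cgroup directories; hence the chgrp/chmod g+x steps in the workaround above.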

Revision history for this message
geez (geez) wrote :

I'm running umask 027 (set in `.bashrc` for my user, which is copied to my root session after using `su`) and can report that this issue still occurs for unprivileged containers under Debian 9 (stretch, stable) using LXC 2.0.7-2+deb9u2 and kernel 4.19.0-0.bpo.4-amd64. I can also confirm that the umask at container creation time does not matter. Another symptom is that once the issue has occurred, `sync` (both the utility and the system call) and `update-initramfs` (which calls `sync`) start hanging.

The comments on this Launchpad bug by the original reporter, and [this thread](https://discuss.linuxcontainers.org/t/cannot-stop-unprivileged-container-not-even-kill-9-its-systemd-process-on-host/1079), contain a few discussions regarding possible reasons this occurs. Also, is #2277 perhaps related?

It took a long time before I tracked this down to the umask, because the symptoms are bewildering. My original issue:
https://superuser.com/questions/1439108/lxc-start-stop-hangs-and-filesystem-sync-hangs/1440273

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lxc (Ubuntu):
status: New → Confirmed
Revision history for this message
Stéphane Graber (stgraber) wrote :

Moving this over to the kernel: a userspace process shouldn't be able to cause such a hang regardless of what it does, so this looks like a kernel bug (lock-related, by the looks of it).

affects: lxc (Ubuntu) → linux (Ubuntu)