Unable to install core snap in an ephemeral boot: cannot create namespace group directory /run/snapd/ns: Bad file descriptor

Bug #1665808 reported by Mike Pontillo
This bug affects 2 people
Affects    Status    Importance  Assigned to  Milestone
AppArmor   New       Undecided   Unassigned
snapd      Triaged   High        Unassigned

Bug Description

After trying to install a snap inside an ephemeral boot (MAAS commissioning), I see the following error:

$ sudo snap install core
error: cannot perform the following tasks:
- Run configure hook of "core" snap if present (cannot create namespace group directory /run/snapd/ns: Bad file descriptor)

Here's some more context; I tried with the current snap-confine and snapd in xenial-updates, then tried the package from 17.04 in case a fix had already been uploaded:

$ snap info core
name: core
summary: "snapd runtime environment"
publisher: canonical
description: |
  The core runtime environment for snapd
type: core
channels:
  stable: 16.04.1 (1222) 79MB -
  candidate: 16.04.1 (1222) 79MB -
  beta: 16.04.1 (1222) 79MB -
  edge: 16.04.1 (1222) 79MB -

$ apt-cache policy snapd
snapd:
  Installed: 2.22.3+17.04
  Candidate: 2.22.3+17.04
  Version table:
 *** 2.22.3+17.04 100
        100 /var/lib/dpkg/status
     2.22.2 500
        500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
     2.0.2 500
        500 http://us.archive.ubuntu.com/ubuntu xenial/main amd64 Packages

$ uname -a
Linux maas-ephemeral 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

$ sudo cat /proc/self/mountinfo
18 25 0:17 / /sys rw,nosuid,nodev,noexec,relatime shared:10 - sysfs sysfs rw
19 25 0:4 / /proc rw,nosuid,nodev,noexec,relatime shared:16 - proc proc rw
20 25 0:6 / /dev rw,nosuid,relatime shared:4 - devtmpfs udev rw,size=4055524k,nr_inodes=1013881,mode=755
21 20 0:14 / /dev/pts rw,nosuid,noexec,relatime shared:5 - devpts devpts rw,gid=5,mode=620,ptmxmode=000
22 25 0:18 / /run rw,nosuid,noexec,relatime shared:8 - tmpfs tmpfs rw,size=815132k,mode=755
23 25 8:32 / /media/root-ro ro,relatime shared:2 - squashfs /dev/disk/by-path/ip-172.24.25.1:3260-iscsi-iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-amd64-ga-16.04-xenial-daily-lun-1 ro
24 25 0:19 / /media/root-rw rw,relatime shared:3 - tmpfs tmpfs-root rw
25 0 0:20 / / rw,relatime shared:1 - overlay overlayroot rw,lowerdir=/media/root-ro,upperdir=/media/root-rw/overlay,workdir=/media/root-rw/overlay-workdir/_
28 25 0:21 / /lib/modules rw,relatime shared:7 - tmpfs copymods rw
29 18 0:12 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:11 - securityfs securityfs rw
30 20 0:22 / /dev/shm rw,nosuid,nodev shared:6 - tmpfs tmpfs rw
31 22 0:23 / /run/lock rw,nosuid,nodev,noexec,relatime shared:9 - tmpfs tmpfs rw,size=5120k
32 18 0:24 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:12 - tmpfs tmpfs ro,mode=755
33 32 0:25 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd
34 18 0:26 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:14 - pstore pstore rw
35 18 0:27 / /sys/firmware/efi/efivars rw,nosuid,nodev,noexec,relatime shared:15 - efivarfs efivarfs rw
36 32 0:28 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,net_cls,net_prio
37 32 0:29 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,freezer
38 32 0:30 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,cpuset
39 32 0:31 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:20 - cgroup cgroup rw,memory
40 32 0:32 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,pids
41 32 0:33 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:22 - cgroup cgroup rw,devices
42 32 0:34 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,perf_event
43 32 0:35 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,blkio
44 32 0:36 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:25 - cgroup cgroup rw,cpu,cpuacct
45 32 0:37 / /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime shared:26 - cgroup cgroup rw,hugetlb
47 20 0:38 / /dev/hugepages rw,relatime shared:27 - hugetlbfs hugetlbfs rw
48 20 0:16 / /dev/mqueue rw,relatime shared:28 - mqueue mqueue rw
49 18 0:7 / /sys/kernel/debug rw,relatime shared:29 - debugfs debugfs rw
50 49 0:9 / /sys/kernel/debug/tracing rw,relatime shared:30 - tracefs tracefs rw
51 18 0:39 / /sys/fs/fuse/connections rw,relatime shared:31 - fusectl fusectl rw
46 25 0:41 / /var/lib/lxcfs rw,nosuid,nodev,relatime shared:81 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
53 22 0:42 / /run/user/1000 rw,nosuid,nodev,relatime shared:83 - tmpfs tmpfs rw,size=815132k,mode=700,uid=1000,gid=1000
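
The entry for `/` in the output above shows that the root filesystem is an `overlay` mount. A minimal sketch for reading that out of mountinfo-formatted text, assuming the standard layout in which the filesystem type is the field right after the lone `-` separator. `mount_fstype` is a hypothetical helper name used only for illustration; it is not part of snapd or AppArmor.

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem type of the mount at a given
# mount point. Reads mountinfo-formatted text from the file named in $2
# (default: /proc/self/mountinfo). Prints nothing if no mount matches.
mount_fstype() {
    mount_point=$1
    mountinfo=${2:-/proc/self/mountinfo}
    awk -v mp="$mount_point" '
        $5 == mp {
            # Fields 1-6 are fixed; optional fields follow and end at a
            # lone "-". The filesystem type is the field after it.
            for (i = 7; i <= NF; i++) {
                if ($i == "-") { fstype = $(i + 1); break }
            }
        }
        # If several entries share the mount point, the last one wins.
        END { if (fstype != "") print fstype }
    ' "$mountinfo"
}
```

Calling `mount_fstype /` with no second argument reads the live `/proc/self/mountinfo`; passing a saved copy of the output as the second argument inspects that file instead.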

sudo strace snap install core: http://paste.ubuntu.com/24016829/

Revision history for this message
Michael Vogt (mvo) wrote :

Could you please attach the syslog file after the error happens? The strace is nice, but unfortunately we would need the strace of the server (snapd) side of the operation. If you want to do that, please make sure to use strace -f, as Go will spawn a lot of processes - and it's going to be very chatty :) The syslog is probably sufficient to get a first idea.

Changed in snappy:
status: New → Triaged
importance: Undecided → High

Mike Pontillo (mpontillo) wrote :

Well, that's interesting. The behavior has changed slightly, and I'm not sure why. Now when I install the core snap, I get:

    core 16-2 from 'canonical' installed

(I used to see the same error when installing the core snap.)

Next I tried installing the snap I *really* wanted to install (via `sudo snap install --devmode --edge sonic-lool`), and it still fails. I see a syslog message very similar to the original error (see line 45 in the paste):

    http://paste.ubuntu.com/24045264/

Based on the syslog message, it looks like a problem with the AppArmor profile's lack of consideration for /overlay/... as a full path in the ephemeral environment. (Not that it should have been obvious!)

Zygmunt Krynicki (zyga) wrote : Re: [Bug 1665808] Unable to install core snap in an ephemeral boot: cannot create namespace group directory /run/snapd/ns: Bad file descriptor

> Message written by Mike Pontillo <email address hidden> on 22.02.2017, at 08:57:
>
> Well, that's interesting. The behavior has changed slightly, and I'm not
> sure why. Now when I install the core snap, I get:
>
> core 16-2 from 'canonical' installed
This is most likely caused by the removal of the configure hook.

The error only happens when you run any hook/application.

The root of the error is: Feb 22 07:50:32 maas-commission snap[11789]: cannot create namespace group directory /run/snapd/ns: Bad file descriptor

I strongly suspect that snap-confine doesn’t work on top of overlayfs. I can investigate this but I cannot offer you any immediate fixes.

>
> (I used to see the same error when installing the core snap.)
>
> Next I tried installing the snap I *really* wanted to install (via `sudo
> snap install --devmode --edge sonic-lool`) and it still fails; and I see
> a syslog message very similar to the original error as follows (see line
> 45 in the paste):
>
> http://paste.ubuntu.com/24045264/
>
> Based on the syslog message, it looks like a problem with the AppArmor
> profile's lack of consideration for /overlay/... as a full path in the

overlayfs doesn’t support apparmor at all, it is not something that we can trivially fix.

> ephemeral environment. (Not that it should have been obvious!)

Mike Pontillo (mpontillo) wrote :

Thanks for the reply.

As this is an ephemeral environment (and will as such be destroyed very soon after its creation) I think an acceptable workaround would be to simply disable AppArmor (or portions of it) during this operation.

If I don't hear any additional ideas on this bug report, I'll look into doing that tomorrow.

Mike Pontillo (mpontillo) wrote :

The following steps allowed me to work around the issue:

sudo apt-get install -yu apparmor-utils
sudo aa-complain /usr/lib/snapd/snap-confine

I'm happy with the workaround, though it would be nice if AppArmor was more flexible, or could improve its UX in some way (such as logging a warning instead of raising an error if someone tries to use it on overlayfs).

John Johansen (jjohansen) wrote :

UX-wise, something that might help is running aa-notify from the apparmor-notify package, which provides notifications for AppArmor denials.

Mike Pontillo (mpontillo) wrote :

Forgive my ignorance, but isn't aa-notify how we figured out via syslog that AppArmor was causing the problem? Or was that some other mechanism?

I see three options for a better user experience, in order of preference (again, speaking somewhat from ignorance about what's possible here):

(1) Provide a feature (or just make it a default setting, if deemed to be safe) that lets me ignore portions of a profile which can be determined to be unsupported. That way I can keep portions of a profile that will work, but ignore portions that don't make sense for the underlying filesystem.

(2) Allow AppArmor to [log lots of scary warnings and] operate in a warn-only mode if it encounters a filesystem it cannot support, such as overlayfs.

(3) Provide a mechanism to allow me to determine if a particular AppArmor profile will not work because the filesystem is not supported. That way I could write a script to determine if I should invoke the `aa-complain` workaround and disable the profile (which is certain to be hopeless at protecting anything anyway).
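
In the absence of such a mechanism, the decision in option (3) can be approximated from the root filesystem type alone. The following is a minimal sketch, not an AppArmor feature: it assumes that an `overlay` root is the only case that needs the `aa-complain` workaround (the only case observed in this report), and the function names `snap_confine_mode_for` and `apply_snap_confine_mode` are hypothetical.

```shell
#!/bin/sh
# Hypothetical policy: choose an AppArmor mode for snap-confine based on
# the filesystem type of the root mount. Treating only "overlay" as
# unsupported is an assumption drawn from this report.
snap_confine_mode_for() {
    case "$1" in
        overlay) echo complain ;;
        *)       echo enforce ;;
    esac
}

# Hypothetical wrapper: look up the root filesystem type with findmnt and
# apply the workaround from this report only when the policy asks for it.
apply_snap_confine_mode() {
    fstype=$(findmnt -n -o FSTYPE /)
    mode=$(snap_confine_mode_for "$fstype")
    if [ "$mode" = complain ]; then
        echo "root is $fstype: putting snap-confine into complain mode" >&2
        sudo aa-complain /usr/lib/snapd/snap-confine
    else
        echo "root is $fstype: leaving the snap-confine profile enforced" >&2
    fi
}
```

Keeping the policy in its own function means the list of filesystem types can be adjusted without touching the part that changes the profile. `apply_snap_confine_mode` needs the apparmor-utils package installed, as in the workaround above.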

Michael Hudson-Doyle (mwhudson) wrote :

We have observed snapd working with aufs on zesty, which is a functionally similar environment -- but is the difference the use of aufs or the use of zesty (most likely the kernel)?

Michael Vogt (mvo)
affects: snappy → snapd