Using snaps in lxd can be problematic during refresh (using squashfuse)

Bug #1668659 reported by Charles Butler
This bug affects 1 person
Affects   Status         Importance   Assigned to        Milestone
snapd     Fix Released   Undecided    Zygmunt Krynicki

Bug Description

I recently learned of a work-around to get snaps working in lxd by installing squashfuse. This worked out fantastically well at first glance but it appears it may still need some apparmor tweaking.

I deployed an in-development etcd charm with snap support and attempted to switch channels. The result is the same whether the refresh is driven by Juju or performed manually:

snap install etcd --channel=2.3/stable
snap refresh etcd --channel=3.0/stable

You should see the following in the host's syslog:

Feb 28 09:23:02 makoto kernel: [ 3343.766075] audit: type=1400 audit(1488295382.430:355): apparmor="STATUS" operation="profile_replace" label="lxd-juju-eb2e38-1_</var/lib/lxd>//&:lxd-juju-eb2e38-1_<var-lib-lxd>://unconfined" name="snap.etcd.etcd" pid=9478 comm="apparmor_parser"
Feb 28 09:23:02 makoto kernel: [ 3343.773237] audit: type=1400 audit(1488295382.434:356): apparmor="STATUS" operation="profile_replace" label="lxd-juju-eb2e38-1_</var/lib/lxd>//&:lxd-juju-eb2e38-1_<var-lib-lxd>://unconfined" name="snap.etcd.etcdctl" pid=9480 comm="apparmor_parser"

It looks as though AppArmor is blocking the operation during profile replacement, which leaves the snap appearing broken. However, a clean placement (install) rather than a replacement (refresh) succeeds.

$ snap list
Name Version Rev Developer Notes
canonical-livepatch 7 21 canonical -
charm 2.2 11 charms classic
conjure-up 2.2-dev 104 canonical classic
core 16-2 1337 canonical -
juju-crashdump 1.0.0 4 johnsca classic
rclone current 60 fireeye -
snapcraft 2.27 3 canonical classic
telegram-sergiusens 1.0.14 23 sergiusens -

$ dpkg --list lxd
ii lxd 2.9.3-0ubuntu2~ubun amd64

$ uname -ar
Linux makoto 4.8.0-39-generic #42~16.04.1-Ubuntu SMP Mon Feb 20 15:06:07 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

description: updated
Revision history for this message
Stéphane Graber (stgraber) wrote :

Both of those are "STATUS" messages, indicating the replacement of an apparmor profile. They are not "DENIED" messages so nothing actually got blocked.
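
To tell the two kinds of audit record apart, the log can be filtered on the `apparmor=` field. The following is only a minimal sketch: the function name `apparmor_denials` is made up for illustration, and the log path in the comment is an assumption that may differ on your system.

```shell
#!/bin/sh
# Print only the AppArmor audit records that were actually denied.
# Reads log lines on stdin; STATUS records (profile loads and
# replacements) are dropped. "|| true" keeps the exit status at 0
# when there are no denials to report.
apparmor_denials() {
    grep 'apparmor="DENIED"' || true
}

# Example (path is an assumption):
#   apparmor_denials < /var/log/syslog
```

If this prints nothing for the time window of the refresh, AppArmor did not block anything and the cause lies elsewhere.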

Revision history for this message
Stéphane Graber (stgraber) wrote :

Looking into this again now, I can't reproduce any kind of failure here:

```
root@snapcraft:~# apt install squashfuse
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  squashfuse
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 17.4 kB of archives.
After this operation, 54.3 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 squashfuse amd64 0.1.100-0ubuntu1~ubuntu16.04.1 [17.4 kB]
Fetched 17.4 kB in 0s (75.3 kB/s)
Selecting previously unselected package squashfuse.
(Reading database ... 35208 files and directories currently installed.)
Preparing to unpack .../squashfuse_0.1.100-0ubuntu1~ubuntu16.04.1_amd64.deb ...
Unpacking squashfuse (0.1.100-0ubuntu1~ubuntu16.04.1) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up squashfuse (0.1.100-0ubuntu1~ubuntu16.04.1) ...
root@snapcraft:~# snap install etcd --channel=2.3/stable
etcd (2.3/stable) 2.3.8 from 'canonical' installed
root@snapcraft:~# snap refresh etcd --channel=3.0/stable
etcd (3.0/stable) 3.0.17 from 'canonical' refreshed
```

This does however reproduce the two mentioned STATUS messages.

Revision history for this message
Charles Butler (lazypower) wrote :

Thanks for taking a look Stéphane.

My mistake in presuming it was blocked. You are correct, it's just a status message. Did you run the etcd.etcdctl command after the upgrade?

The failure doesn't manifest until you actually try to use the command that was just updated. The only indication that anything happened is those profile replacements in syslog, which look fine.

# snap list
Name Version Rev Developer Notes
core 16-2 1337 canonical -
etcd 2.3.8 22 canonical -

# sudo snap refresh etcd --channel=3.0/stable
etcd (3.0/stable) 3.0.17 from 'canonical' refreshed
# /snap/bin/etcd.etcdctl cluster-health
cannot snap-exec: cannot read info for "etcd": cannot find installed snap "etcd" at revision 24

description: updated
Revision history for this message
Stéphane Graber (stgraber) wrote :

Ok, tracked this down.

The problem has to do with snap-confine re-using mount namespaces. Specifically, it's re-attaching to the previous mount namespace which lacks the mount entry for the new version of the snap.

This may have to do with the way mount propagation works with fuse filesystems vs normal filesystems.

Revision history for this message
Stéphane Graber (stgraber) wrote :

1966 25 7:3 / /snap/canonical-livepatch/21 rw,relatime shared:43 - squashfs /dev/loop3 ro

^ that's for a snap on a host

714 672 0:125 / /snap/lxd/1416 ro,relatime - fuse.squashfuse squashfuse ro,user_id=0,group_id=0,allow_other
^ that's for a snap in a container

the problem is the missing MS_SHARED flag: the container entry has no shared:N field, so its mounts are not propagated
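
The difference between the two entries can be checked mechanically. The sketch below reads mountinfo-formatted lines and reports whether a given mount point carries a `shared:N` optional field. The function name `mount_propagation` and the `not-shared` label are illustrative choices, not anything snapd or LXD provides; mount points containing spaces (which mountinfo escapes) are not handled.

```shell
#!/bin/sh
# Report whether a mount point is marked shared, based on mountinfo.
# $1: mount point, e.g. /snap/lxd/1416
# $2: mountinfo file (defaults to /proc/self/mountinfo)
# Prints "shared" if the entry has a shared:N optional field and
# "not-shared" otherwise; returns 1 if the mount point is not listed.
mount_propagation() {
    awk -v target="$1" '
        $5 == target {
            found = 1
            state = "not-shared"
            # Optional fields run from field 7 up to the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++)
                if ($i ~ /^shared:/) state = "shared"
        }
        END {
            if (!found) exit 1
            print state
        }
    ' "${2:-/proc/self/mountinfo}"
}
```

Run against the two lines quoted above, this reports `shared` for the host entry and `not-shared` for the container entry.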

Revision history for this message
Stéphane Graber (stgraber) wrote :

So that's effectively a regression in snapd from when they added namespace re-using to snap-confine...

Revision history for this message
Stéphane Graber (stgraber) wrote :

Right, so the problem is that the filesystem root in containers isn't marked rshared.

That's deliberate as a lot of Linux distributions still misbehave when it's marked rshared and we instead expect those who do need it to simply mark the root rshared after the fact.

Looks like snapd does require the root to be rshared so that mount propagation works when re-using its mount namespaces, but it doesn't appear to be making any checks for it.

I've confirmed that if I run "sudo mount --make-rshared /" prior to installing any snap package, things work as expected.

Makes me wonder how snap refreshes work on Ubuntu 14.04 then as upstart/mountall don't set the root to be shared (neither does sysvinit or any init system other than systemd).

Revision history for this message
Stéphane Graber (stgraber) wrote :

Ah, on Ubuntu 14.04 /lib/systemd/upstart/snap.mount.service calls make-rshared to work around this issue.

I think the right thing to do overall is to have snapd check if /snap is on a MS_SHARED mount point. If not, then bind-mount /snap onto itself and set MS_SHARED on the resulting mountpoint. That will then cause all mounts under /snap (and ONLY /snap) to be propagated to the snap mount namespaces.
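
As a rough illustration of that check-then-bind idea (a sketch of the proposal as worded here, not what snapd does), the logic could look like the following. It assumes a util-linux `findmnt` that supports the `PROPAGATION` column; the function name `ensure_shared` and the `RUN` variable are hypothetical, with `RUN=echo` giving a dry run that only prints the commands.

```shell
#!/bin/sh
# If the directory is not on a shared mount, bind-mount it onto itself
# and mark the resulting mount point shared.
# RUN is a hypothetical hook: leave it unset to execute the commands
# (needs root), or set RUN=echo to print them instead.
ensure_shared() {
    dir=$1
    # Propagation of the mount that contains $dir, e.g. "shared" or "private".
    state=$(findmnt -n -o PROPAGATION --target "$dir")
    case "$state" in
        *shared*) return 0 ;;   # already propagating, nothing to do
    esac
    ${RUN:-} mount --bind "$dir" "$dir" &&
    ${RUN:-} mount --make-shared "$dir"
}

# Dry run:
#   RUN=echo; ensure_shared /snap
```

Because only the bind mount is marked shared, propagation is limited to mounts under that directory and the propagation setting of / is left alone.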

Revision history for this message
Stéphane Graber (stgraber) wrote :

This would be a no-op for systems that have their / MS_SHARED and would fix every other system to work properly without interfering with the wishes of the administrator (if they turned off MS_SHARED for /).

This solution would also allow for snap.mount.service to be removed from the 14.04 package.

Revision history for this message
Stéphane Graber (stgraber) wrote :

Hmm, that approach isn't going to work since snaps will be mounted before snapd starts, so the systemd unit approach may be the best one.

Revision history for this message
Stéphane Graber (stgraber) wrote :

root@snapcraft:~# cat /lib/systemd/system/snap.mount
[Unit]
Description=Setup /snap mountpoint

[Mount]
What=/snap
Where=/snap
Type=none
Options=bind,shared

[Install]
WantedBy=multi-user.target

Revision history for this message
Stéphane Graber (stgraber) wrote :

That makes sure that /snap is created and mounted onto itself as a shared mountpoint. This works just fine regardless of whether your host is using rshared itself or not.

Revision history for this message
Stéphane Graber (stgraber) wrote :

I'd recommend snapd ships with the unit above which will also let you drop the Ubuntu 14.04 workaround unit (/lib/systemd/upstart/snap.mount.service).

I've just tested it on a Xenial host, in a Xenial container and on a Trusty host, things work as expected on all of those.

Revision history for this message
Michael Vogt (mvo) wrote :
Changed in snapd:
status: New → In Progress
Revision history for this message
Zygmunt Krynicki (zyga) wrote :

I'll have a look at making this work. The trivial approach of shipping this on 16.04 is causing regressions.

Changed in snapd:
assignee: nobody → Zygmunt Krynicki (zyga)
Revision history for this message
Michael Vogt (mvo) wrote :
Revision history for this message
Stéphane Graber (stgraber) wrote :

Pretty sure this is in current snapd, marking as fix released.

Changed in snapd:
status: In Progress → Fix Released