snapd is not autofs aware and fails with nfs home dir

Bug #1784774 reported by Andrew Conway
This bug affects 20 people
Affects            Status         Importance   Assigned to        Milestone
snapd              Fix Released   Medium       Zygmunt Krynicki
firefox (Ubuntu)   Confirmed      High         Unassigned
snapd (Ubuntu)     Confirmed      High         Unassigned

Bug Description

This is similar to bugs 1662552 and 1782873. In 1782873, jdstrand asked me to open a new bug for this specific issue.

In 1662552, snapd fails for NFS-mounted home directories because network permissions are not enabled. A workaround was implemented that works if the mount is done via a /home mount at boot. However, this does not work if people mount home directories via autofs. This is probably the fundamental problem for 1782873, although there may be other issues.

[ Why use autofs? It helps when some but not all users want to use NFS homes. In particular, I have a local user on all my machines that does not require the NFS server to be up, the Kerberos server to be up, Kerberos working on the client machines, etc. It is very useful when something goes wrong. It means I mount /home/user rather than /home (for several users). ]

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in snapd (Ubuntu):
status: New → Confirmed
Santiago Castro (bryant1410) wrote :

I created a patch to work around this problem: https://github.com/bryant1410/snapd/commit/a9831da29aba7a4905647ff582061ac399a3b240 It basically hardcodes a value. I'm not sure what the correct behavior should be for all the cases where /home is autofs, or whether it should check that /home/user is NFS instead.

So, as a workaround, you can clone the repo, check out that commit, build from source as described in the repo, place the snapd executable in /usr/local/bin, and edit the systemd service file (/lib/systemd/system/snapd.service) to point to that executable instead.

Santiago Castro (bryant1410) wrote :

With snap and snapd 2.35.2, it seems to be working for me now.

Gabriel Devenyi (ace-staticwave) wrote :

This is still broken on bionic: AutoFS-mounted home directories are not detected.

John Lenton (chipaca) wrote :

snapd will not work properly with users' homes that are unavailable if the user is not logged in. As a workaround, schedule refreshes to only happen when all the local users are logged in.

Gabriel Devenyi (ace-staticwave) wrote :

The issue is that NFS detection only happens at snapd start, and on boot no NFS shares are mounted yet.

If an AutoFS NFS user logs in and snapd is then restarted, NFS support is enabled.
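
To illustrate why timing matters, here is a minimal Python sketch of a startup-time check that scans mountinfo text for NFS mounts under /home. It is not snapd's actual code: the function name `has_nfs_home`, the `home_root` parameter, and the sample lines are assumptions made for illustration. Before any user has logged in, an autofs setup exposes only an autofs placeholder entry, so a check like this finds nothing.

```python
def has_nfs_home(mountinfo_text, home_root="/home"):
    """Return True if any nfs/nfs4 mount sits at or under home_root.

    Hypothetical helper: expects text in /proc/self/mountinfo format, where
    the mount point is the 5th field and the filesystem type is the first
    field after the " - " separator.
    """
    for line in mountinfo_text.splitlines():
        left, sep, right = line.partition(" - ")
        if not sep:
            continue  # not a mountinfo line
        pre, post = left.split(), right.split()
        if len(pre) < 5 or not post:
            continue
        mount_point, fstype = pre[4], post[0]
        under_home = (mount_point == home_root
                      or mount_point.startswith(home_root + "/"))
        if under_home and fstype in ("nfs", "nfs4"):
            return True
    return False


# Hypothetical sample data: state at boot (autofs placeholder only) and
# state after a user's home has been automounted.
AT_BOOT = "137 29 0:50 / /home rw,relatime shared:87 - autofs /etc/auto.master.d/home rw,indirect"
AFTER_LOGIN = AT_BOOT + "\n210 137 0:55 / /home/arc rw,relatime shared:120 - nfs4 nfs.example.org:/home/arc rw,vers=4.2"
```

With this sketch, `has_nfs_home(AT_BOOT)` is false and `has_nfs_home(AFTER_LOGIN)` is true, which matches the observation that a restart after login changes the outcome.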

Zygmunt Krynicki (zyga)
Changed in snapd (Ubuntu):
assignee: nobody → Zygmunt Krynicki (zyga)
assignee: Zygmunt Krynicki (zyga) → nobody
Changed in snapd:
assignee: nobody → Zygmunt Krynicki (zyga)
status: New → Triaged
importance: Undecided → Medium
Andrew Conway (acubuntuone) wrote :

I just tested restarting snapd while logged in via Kerberos with an autofs home directory. It doesn't seem to help. In particular, I tried launching System Monitor (which uses snap) unsuccessfully. Using 18.04 with Kerberos, and /home/<my-user-name> mounted via autofs.

Checking that /home is autofs will not solve the problem if it is /home/user that is autofs, a setup that is useful when a local user has a home directory in the standard place.

Philippe Clérié (pclerie) wrote :

I'd like to confirm Andrew Conway's report. Restarting snapd does not change anything.

Could you at least raise the priority of the bug? After all, it's been there for quite a while, it's been confirmed, and people have worked and are working on it. Just a little push to solve it. Pretty please! :-)

Zygmunt Krynicki (zyga) wrote :

I'm sorry for not fixing this yet. A few higher priority bugs kept me busy lately. I will look at fixing it next Monday, with a bit of hope it may be easy and I can get it done without shuffling other planned work.

Marc Kolly (makuser) wrote :

@zyga any update on this?

Zygmunt Krynicki (zyga) wrote :

With the additional information in https://bugs.launchpad.net/snapd/+bug/1821193 I think we can fix this issue quickly.

1) The mountinfo parser will now look for an automount point that mentions autofs, an example line is mentioned here:

137 29 0:50 / /home rw,relatime shared:87 - autofs /etc/auto.master.d/home rw,fd=7,pgrp=22588,timeout=300,minproto=5,maxproto=5,indirect,pipe_ino=173399

2) We could optionally look at the referenced file (/etc/auto.master.d/home) and parse it, in this case it contains this line (among others)

* -fstype=nfs,vers=4,rw,soft,rsize=8192,wsize=8192 prodpeda-samba.domain.fr:/home/&

With this information we can enable the workaround reliably.

I will not do 2) at first, unless reviewers deem it necessary.
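
The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the snapd implementation: the function names `find_autofs_mounts` and `map_entry_is_nfs` are hypothetical, and treating a map entry without an explicit fstype as NFS when its location looks like host:/path is an assumption based on the autofs map format.

```python
def find_autofs_mounts(mountinfo_text, home_root="/home"):
    """Step 1: return (mount_point, source) for autofs mounts at or under home_root."""
    found = []
    for line in mountinfo_text.splitlines():
        left, sep, right = line.partition(" - ")
        if not sep:
            continue  # not a mountinfo line
        pre, post = left.split(), right.split()
        if len(pre) < 5 or len(post) < 2:
            continue
        mount_point, fstype, source = pre[4], post[0], post[1]
        under_home = (mount_point == home_root
                      or mount_point.startswith(home_root + "/"))
        if fstype == "autofs" and under_home:
            found.append((mount_point, source))
    return found


def map_entry_is_nfs(entry):
    """Step 2 (optional): guess whether one autofs map entry mounts NFS."""
    fields = entry.split()
    if len(fields) < 2:
        return False
    # Options, when present, are the second field and start with "-".
    opts = fields[1][1:].split(",") if fields[1].startswith("-") else []
    for opt in opts:
        if opt.startswith("fstype="):
            return opt.split("=", 1)[1] in ("nfs", "nfs4")
    # Assumption: no explicit fstype and a host:/path location means NFS.
    host, sep, path = fields[-1].partition(":")
    return bool(sep and host and path.startswith("/"))


MOUNTINFO_LINE = ("137 29 0:50 / /home rw,relatime shared:87 - autofs "
                  "/etc/auto.master.d/home rw,fd=7,pgrp=22588,timeout=300,"
                  "minproto=5,maxproto=5,indirect,pipe_ino=173399")
MAP_LINE = "* -fstype=nfs,vers=4,rw,soft,rsize=8192,wsize=8192 prodpeda-samba.domain.fr:/home/&"
```

Applied to the two example lines quoted in this comment, `find_autofs_mounts(MOUNTINFO_LINE)` yields the /home mount together with its map file path, and `map_entry_is_nfs(MAP_LINE)` reports an NFS entry.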

Zygmunt Krynicki (zyga) wrote :
Changed in snapd:
status: Triaged → In Progress
milestone: none → 2.46
Zygmunt Krynicki (zyga)
Changed in snapd:
status: In Progress → Fix Committed
Andrew Conway (acubuntuone) wrote :

Thanks for fixing this! Much appreciated.

I tried to check that it worked, but possibly it has not gotten into updates yet. How would I check?

[ running snap-store from the command line in home dir causes the error "cannot open path of the current working directory: Permission denied". Running from the GUI has no effect. ]

While I am here, this is probably unrelated, but a couple of days after the above commit, nfs home directories on the current kernel caused the machine to freeze shortly after logging in. I have put a link to that report below on the off chance you can think of a reason that this change could cause it.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1886775

Thanks,
Andrew.

Changed in snapd:
status: Fix Committed → Fix Released
Changed in snapd (Ubuntu):
status: Confirmed → Fix Released
Rohde Fischer (rohdef) wrote :

I'm sorry to be that guy. But the fix does not seem to work for me :(

rohdef@ubuntu ~> snap version
snap 2.49.1
snapd 2.49.1
series 16
ubuntu 20.10
kernel 5.8.0-1019-raspi

rohdef@ubuntu ~> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu Hirsute Hippo (development branch)
Release: 21.04
Codename: hirsute

The nfs mount in question is for the home dir as others have described, and the login is performed using ldap (don't know if that creates a significant difference?)

Furthermore, I'm actually wondering about the sanity of using the home dir for snap resources. It might be my (admittedly) limited knowledge of snap, but doesn't the usage of the home dir raise some potentially troublesome questions, for instance when a home dir (which this issue proves happens) is a shared resource across multiple systems? There might even be relevant use cases where that share is used in multiple places at the same time (e.g. a multicluster MicroK8S); won't that also cause some issues?

Again, maybe my considerations show my lack of knowledge about how snap actually works, but if I'm right in my understanding, wouldn't an inherently local directory be the most sensible solution?

Adam Weremczuk (adamwms) wrote :

Still an issue in:

$ snap version
snap 2.54.3+20.04.1ubuntu0.2
snapd 2.54.3+20.04.1ubuntu0.2
series 16
ubuntu 20.04
kernel 5.13.0-30-generic

$ pycharm-community
cannot create user data directory: /home/adam/snap/pycharm-community/267: Stale file handle

$ brackets
cannot create user data directory: /home/adam/snap/brackets/138: Stale file handle

Andrew Conway (acubuntuone) wrote :

I never got it to work in 20.04, so I don't know whether your fix ever made it in.

I have just installed Jammy Jellyfish (22.04), and can confirm snaps don't work in it when using autofs and nfs mounted home directories.

The prior workaround was simply never to use any snap applications, which was OK as nothing important was in snaps prior to 22.04.

This is harder in 22.04 as firefox is distributed as a snap, and so firefox doesn't work in 22.04 if you have autofs NFS home directories.

The workaround is to use a different source for firefox, https://ubuntuhandbook.org/index.php/2022/04/install-firefox-deb-ubuntu-22-04/ but I don't know whether that will get security updates as quickly, so this is a serious problem.

Andrew Conway (acubuntuone) wrote :

I did some more investigating, and I think there are two independent problems here:
(1) The problem as believed so far, network access permissions
(2) New insight: Kerberos doesn't work with snaps.
This explains why fixing (1) didn't help me (or Adam).

Background: Kerberos is the authentication mechanism used for NFS. Assuming you are using authentication (as almost everyone does), then when you access NFS contents, you need to provide kerberos credentials. These are stored outside of your home directory (after all, home directories are one of the most common reasons to use NFS, so you can't store them there). I believe snaps restrict access to just your home directory, so you can't access the Kerberos key and therefore can't access your home directory.

This is supported by various bugs like https://bugs.launchpad.net/ubuntu/+source/chromium-browser/+bug/1849346 (unresolved) which is a different but relevant issue - people who don't use NFS but do use Kerberos features in Firefox found they don't work post snap conversion.

Markus Kuhn (markus-kuhn) wrote :

It may also be worth noting that Kerberos/GSSAPI authentication for NFS works a bit differently from Kerberos/GSSAPI authentication for applications, because NFS happens in the kernel, and therefore the kernel needs to get access to the Kerberos tickets (credentials). The kernel's NFS client does so via a helper process called rpc.gssd, which is usually started at boot time as a daemon by systemd. It listens on /run/rpc_pipefs for kernel requests (upcalls) to authenticate an NFS RPCSEC_GSS session via Kerberos, executes the Kerberos protocol in user space, and passes the resulting session key back to the RPCSEC_GSS code in the kernel, which then uses it to cryptographically protect the NFS RPC requests that go over the network. While doing this, the rpc.gssd process takes on the effective UID of the user application that made the NFS file I/O request, and as such reads the Kerberos ticket, traditionally from the user's credential cache at /tmp/krb5cc_$UID or (depending on the value of $KRB5CCNAME) from some other credential cache, e.g. the kernel keyring.

So the question is: if rpc.gssd does take over the control flow, UID, and environment variables of an NFS I/O request made by a snap application, does it effectively run, from AppArmor's point of view, as the snap application that tried to access a file via Kerberized NFS? Does the AppArmor profile of the snap application then also apply to rpc.gssd, and therefore have to allow reading the Kerberos ticket, so that rpc.gssd can inherit that right as well?

If you run AppArmor in complain mode rather than enforce mode, does it report in the log that it would have prevented rpc.gssd from doing its thing (reading the ticket, using the network, etc.)?

KRB5CCNAME syntax: https://web.mit.edu/kerberos/krb5-1.17/doc/basic/ccache_def.html
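
As a rough illustration of the credential cache naming described above, the following Python sketch resolves a cache name the way the comment outlines: use $KRB5CCNAME if set, otherwise fall back to the traditional /tmp/krb5cc_$UID file. The function name `resolve_ccache` is hypothetical, this is not how rpc.gssd itself locates tickets, and real systems may configure a different default in krb5.conf.

```python
import os


def resolve_ccache(uid, env=None):
    """Return (cache_type, residual) for a user's Kerberos credential cache.

    Hypothetical helper. KRB5CCNAME uses the form TYPE:residual; a value
    without a type prefix is treated as a FILE cache path.
    """
    env = os.environ if env is None else env
    name = env.get("KRB5CCNAME")
    if not name:
        # Traditional default location, as mentioned in the comment above.
        return ("FILE", "/tmp/krb5cc_%d" % uid)
    prefix, sep, residual = name.partition(":")
    if sep and len(prefix) > 1 and prefix.isalpha():
        return (prefix, residual)
    return ("FILE", name)
```

For example, `resolve_ccache(1000, {"KRB5CCNAME": "KEYRING:persistent:1000"})` gives a KEYRING cache, which is not a file a confined process could read from /tmp at all.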

Miles (mubuntuone) wrote :

Running the snap 'hello' gives the following message in /var/log/kern.log:

kernel: [25832.721379] audit: type=1400 audit(1650864430.846:486): apparmor="ALLOWED" operation="open" profile="/usr/sbin/sssd" name="/proc/26293/cmdline" pid=827 comm="sssd_nss" requested_mask="r" denied_mask="r" fsuid=0 ouid=0

We disabled AppArmor with the same result.

Hello gave the error message:

cannot open path of the current working directory: Permission denied

Andrew Conway (acubuntuone) wrote :

I got exactly the same errors as Miles above; a simple permission denied error stopping things before AppArmor got involved.

I.e., the answer to Markus Kuhn's question is no, in fact even in enforce mode there are no denied apparmor complaints.

I don't know whether this is because the gating problem is not being able to read the ticket in /tmp, or whether being in the kernel solves some of the apparmor issues, but the greater pickiness of kerberos user definition is an issue. Do snaps run as a different uid?
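
For anyone repeating this check, a small Python sketch like the one below can filter kernel audit lines by AppArmor verdict, so that "ALLOWED" entries (complain mode) and "DENIED" entries (enforce mode) can be told apart quickly. The helper names `parse_audit_fields` and `apparmor_events` are made up for illustration, and the parsing assumes the key="value" layout seen in the log lines quoted in this thread.

```python
import re

# Matches key="quoted value" or key=bare_value pairs in an audit line.
_FIELD = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_audit_fields(line):
    """Return a dict of key/value pairs found in one audit log line."""
    fields = {}
    for match in _FIELD.finditer(line):
        key, quoted, bare = match.groups()
        fields[key] = quoted if quoted is not None else bare
    return fields


def apparmor_events(log_text, verdict="DENIED"):
    """Return parsed fields for every line whose apparmor= value equals verdict."""
    events = []
    for line in log_text.splitlines():
        fields = parse_audit_fields(line)
        if fields.get("apparmor") == verdict:
            events.append(fields)
    return events


SAMPLE = ('kernel: [25832.721379] audit: type=1400 audit(1650864430.846:486): '
          'apparmor="ALLOWED" operation="open" profile="/usr/sbin/sssd" '
          'name="/proc/26293/cmdline" pid=827 comm="sssd_nss" '
          'requested_mask="r" denied_mask="r" fsuid=0 ouid=0')
```

On the sample line from Miles's comment, `apparmor_events(SAMPLE)` returns an empty list, while `apparmor_events(SAMPLE, "ALLOWED")` returns one event for the sssd profile, consistent with there being no denials involved.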

Sebastien Bacher (seb128) wrote :

reopening for snapd since it seems it's not firefox specific and still an issue

Changed in snapd (Ubuntu):
status: Fix Released → New
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in firefox (Ubuntu):
status: New → Confirmed
Changed in snapd (Ubuntu):
status: New → Confirmed
Alberto Mardegan (mardy) wrote :

I believe that this is a duplicate of bug 1884299.

A few questions to people affected by this bug:

1) are you using autofs?

2) can you please try running "sudo systemctl restart snapd" after logging in (so that your $HOME is mounted), and then try running a snap again?

3) If that still fails, can you paste the journal logs here (from the time that snapd was restarted)?

Changed in snapd (Ubuntu):
status: Confirmed → Incomplete
Erik Meitner (eamuwmath) wrote :

Yes, we are using Autofs.

thisisme@jammy:~$ cat /etc/auto.staff
* -rw,nosuid nfshome.domain.edu:/nfshome/staff/&

thisisme@jammy:~$ cat /etc/auto.master
/fac auto.fac --timeout=120
/staff auto.staff --timeout=120

thisisme@jammy:~$ pwd
/staff/thisisme

thisisme@jammy:~$ mount|grep staff
nfshome.domain.edu:/nfshome/staff/thisisme on /staff/thisisme type nfs4 (rw,nosuid,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=XX.YY.ZZ.66,local_lock=none,addr=XX.YY.ZZ.16)

thisisme@jammy:~$ snap list
Name Version Rev Tracking Publisher Notes
bare 1.0 5 latest/stable canonical✓ base
chromium 101.0.4951.64 1993 latest/stable canonical✓ -
core20 20220329 1434 latest/stable canonical✓ base
gnome-3-38-2004 0+git.1f9014a 99 latest/stable canonical✓ -
gtk-common-themes 0.1-59-g7bca6ae 1519 latest/stable canonical✓ -
snapd 2.55.3 15534 latest/stable canonical✓ snapd

thisisme@jammy:~$ chromium-browser
cannot open path of the current working directory: Permission denied

root@jammy:~# journalctl -f
May 13 08:58:31 jammy snapd[128770]: main.go:155: Exiting on terminated signal.
May 13 08:58:31 jammy snapd[128770]: overlord.go:504: Released state lock file
May 13 08:58:31 jammy systemd[1]: Stopping Snap Daemon...
May 13 08:58:31 jammy systemd[1]: snapd.service: Deactivated successfully.
May 13 08:58:31 jammy systemd[1]: Stopped Snap Daemon.
May 13 08:58:31 jammy systemd[1]: Starting Snap Daemon...
May 13 08:58:31 jammy snapd[128952]: AppArmor status: apparmor is enabled and all features are available
May 13 08:58:31 jammy snapd[128952]: overlord.go:263: Acquiring state lock file
May 13 08:58:31 jammy snapd[128952]: overlord.go:268: Acquired state lock file
May 13 08:58:31 jammy snapd[128952]: daemon.go:247: started snapd/2.55.3+22.04ubuntu1 (series 16; classic) ubuntu/22.04 (amd64) linux/5.15.0-30-generic.
May 13 08:58:31 jammy kernel: loop6: detected capacity change from 0 to 8
May 13 08:58:31 jammy systemd[1]: tmp-sanity\x2dmountpoint\x2d2266021507.mount: Deactivated successfully.
May 13 08:58:31 jammy snapd[128952]: daemon.go:340: adjusting startup timeout by 1m0s (pessimistic estimate of 30s plus 5s per snap)
May 13 08:58:31 jammy systemd[1]: Started Snap Daemon.
May 13 08:58:31 jammy dbus-daemon[495]: [system] Activating via systemd: service name='org.freedesktop.timedate1' unit='dbus-org.freedesktop.timedate1.service' requested by ':1.3815' (uid=0 pid=128952 comm="/usr/lib/snapd/snapd " label="unconfined")
May 13 08:58:31 jammy systemd[1]: Starting Time & Date Service...
May 13 08:58:31 jammy dbus-daemon[495]: [system] Successfully activated service 'org.freedesktop.timedate1'
May 13 08:58:31 jammy systemd[1]: Started Time & Date Service.
May 13 08:58:34 jammy systemd[127902]: Started snap.chromium.chromium.7948c287-207b-4d96-b9af-02061a62addc.scope.
May 13 08:58:34 jammy audit[128990]: AVC apparmor="DENIED" operation="sendmsg" profile="/usr/lib/snapd/snap-confine" pid=128990 comm="snap-confine" laddr=XX.YY.ZZ.66 lport...


Andrew Conway (acubuntuone) wrote :

Using NFSv4, Kerberos authenticated, mounted by autofs:

arc@andrewshoreham:~$ hello
cannot open path of the current working directory: Permission denied

[ Then as user with sudo privs, sudo systemctl restart snapd ]

arc@andrewshoreham:~$ hello
cannot open path of the current working directory: Permission denied

Logs since just before restarting snapd

syslog
------
May 15 14:54:09 andrewshoreham kernel: [12319.195323] audit: type=1400 audit(1652590449.676:183): apparmor="ALLOWED" operation="open" profile="/usr/sbin/sssd" name="/proc/24886/cmdline" pid=910 comm="sssd_nss" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
May 15 14:54:09 andrewshoreham systemd[1]: Stopping Snap Daemon...
May 15 14:54:09 andrewshoreham snapd[726]: main.go:155: Exiting on terminated signal.
May 15 14:54:09 andrewshoreham snapd[726]: overlord.go:504: Released state lock file
May 15 14:54:09 andrewshoreham systemd[1]: snapd.service: Deactivated successfully.
May 15 14:54:09 andrewshoreham systemd[1]: Stopped Snap Daemon.
May 15 14:54:09 andrewshoreham systemd[1]: snapd.service: Consumed 2.753s CPU time.
May 15 14:54:09 andrewshoreham systemd[1]: Starting Snap Daemon...
May 15 14:54:09 andrewshoreham snapd[24890]: AppArmor status: apparmor is enabled and all features are available
May 15 14:54:09 andrewshoreham snapd[24890]: overlord.go:263: Acquiring state lock file
May 15 14:54:09 andrewshoreham snapd[24890]: overlord.go:268: Acquired state lock file
May 15 14:54:09 andrewshoreham snapd[24890]: daemon.go:247: started snapd/2.55.3+22.04 (series 16; classic) ubuntu/22.04 (amd64) linux/5.15.0-25-generic.
May 15 14:54:09 andrewshoreham kernel: [12319.270748] loop11: detected capacity change from 0 to 8
May 15 14:54:09 andrewshoreham snapd[24890]: daemon.go:340: adjusting startup timeout by 1m10s (pessimistic estimate of 30s plus 5s per snap)
May 15 14:54:09 andrewshoreham systemd[1]: tmp-sanity\x2dmountpoint\x2d2760788470.mount: Deactivated successfully.
May 15 14:54:09 andrewshoreham snapd[24890]: backend.go:133: snapd enabled NFS support, additional implicit network permissions granted
May 15 14:54:10 andrewshoreham kernel: [12319.549118] audit: type=1400 audit(1652590450.028:184): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/snapd/snap-confine" pid=24926 comm="apparmor_parser"
May 15 14:54:10 andrewshoreham kernel: [12319.578896] audit: type=1400 audit(1652590450.060:185): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=24926 comm="apparmor_parser"
May 15 14:54:10 andrewshoreham kernel: [12319.969313] audit: type=1400 audit(1652590450.448:186): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/snap/snapd/15534/usr/lib/snapd/snap-confine" pid=24946 comm="apparmor_parser"
May 15 14:54:10 andrewshoreham kernel: [12319.983029] audit: type=1400 audit(1652590450.464:187): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/snap/snapd/15534/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=24946 comm="apparmor_parser"
May 15 14:54:10 andrewshoreham kernel: [12320.165228] audit: type=...


Alberto Mardegan (mardy) wrote :

Thanks Erik and Andrew for the logs!

It looks like we have more than one bug here:

1) Andrew's logs show that after restarting the snapd service, NFS was correctly detected and there are no more AppArmor denials on that. But on Erik's machine, for some reason, that did not happen.

2) Even if the network rules are added, snaps still fail to start; I believe that the root cause is the same as in https://bugs.launchpad.net/bugs/1973321 (despite the fact that that bug is about sshfs)

I will continue investigating the first issue here. Whereas, for the second one, Andrew, can you please check if snaps start once you "cd" to another directory, like "/snap"?
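
A quick way to tell the two situations apart in a saved journal excerpt is to look for the snapd message that announces NFS support. The Python sketch below does only that; the function name `nfs_support_enabled` is hypothetical, and the marker string is copied from the log line in Andrew's comment above.

```python
NFS_MARKER = "snapd enabled NFS support"


def nfs_support_enabled(journal_text):
    """Return True if a snapd log line reports that NFS support was enabled."""
    return any(
        "snapd[" in line and NFS_MARKER in line
        for line in journal_text.splitlines()
    )


ANDREW_LINE = ("May 15 14:54:09 andrewshoreham snapd[24890]: backend.go:133: "
               "snapd enabled NFS support, additional implicit network permissions granted")
ERIK_LINE = ("May 13 08:58:31 jammy snapd[128952]: daemon.go:340: adjusting startup "
             "timeout by 1m0s (pessimistic estimate of 30s plus 5s per snap)")
```

Here `nfs_support_enabled(ANDREW_LINE)` is true and `nfs_support_enabled(ERIK_LINE)` is false, mirroring the difference between the two sets of logs.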

Alberto Mardegan (mardy) wrote :

Hi Erik, I see now what's the problem (this is specific to your setup -- this probably does not concern other people who commented here): your home directories are under /staff, whereas currently snaps only support the traditional /home/<user>/ scheme.

Please subscribe to https://bugs.launchpad.net/snapd/+bug/1776800, we are working on it :-)
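
To check whether a given account is affected by this particular limitation, a simple test of the home directory path is enough. The Python sketch below is a simplification and not snapd's own logic; the function name `home_under_default_root` and the `root` parameter are assumptions for illustration.

```python
import posixpath


def home_under_default_root(home_dir, root="/home"):
    """Return True if home_dir is a path strictly below root (default /home)."""
    home = posixpath.normpath(home_dir)
    prefix = root.rstrip("/") + "/"
    return home.startswith(prefix)
```

With this sketch, `home_under_default_root("/staff/thisisme")` is false, while `home_under_default_root("/home/arc")` is true.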

Andrew Conway (acubuntuone) wrote :

Thanks Alberto. I tried running "hello" in a different directory, and you were correct:

arc@andrewfairfield:~$ hello
cannot open path of the current working directory: Permission denied
arc@andrewfairfield:~$ cd /
arc@andrewfairfield:/$ hello
Hello, world!
arc@andrewfairfield:/$

[ This is in 20.04, not 22.04 ]
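
The same experiment can be scripted. The Python sketch below starts a command with a chosen working directory, which is what the "cd /" step does by hand; the helper name `run_from` is hypothetical, and "hello" stands for whichever snap command is being tested.

```python
import subprocess


def run_from(cwd, argv):
    """Run argv with its working directory set to cwd and capture the output."""
    return subprocess.run(
        argv,
        cwd=cwd,               # the child process starts in this directory
        capture_output=True,
        text=True,
        check=False,           # leave exit-code handling to the caller
    )


# Usage (assumes the "hello" snap is installed):
#   result = run_from("/", ["hello"])
#   print(result.returncode, result.stdout, result.stderr)
```

Comparing the result for a local directory such as "/" with the result for the NFS home directory shows whether the working directory alone decides the outcome.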

Yay! that is the first time I have seen a snap actually work with my normal user account.

This feels like significant progress in working out what is going on!

Of course firefox needs access to the home directory to load the profile and store downloads. Is the whole process run as some other user (a la sudo) or is there just some starting stub running as some other user doing something that returns to the actual user after doing something that thinks it needs access to the current directory but could get by without it?

Actually, I can sort of answer that - I tried running "musescore" as a snap, starting from /
It successfully ran. I tried saving something, and it sort of did... but in a new, empty "home" directory in a /home/arc/snap/musescore/216/ that the save file dialog went to when I pressed the home button. Is this normal behaviour for a snap? Regardless of the inconvenience of the subdirectory, that is running over nfs successfully. I can close Musescore and load it again. But not with cwd=/home/arc.

So that is fairly strong evidence supporting your idea that it is the same root cause as https://bugs.launchpad.net/bugs/1973321 . I will add a comment there.

Thanks for the insight Alberto!

Changed in snapd (Ubuntu):
status: Incomplete → Confirmed
importance: Undecided → High
Changed in firefox (Ubuntu):
importance: Undecided → High
Alberto Mardegan (mardy) wrote :

I'm closing this bug and setting it as a duplicate of https://bugs.launchpad.net/snapd/+bug/1917348

There are several issues related to NFS and remote filesystems in general, and we will be tackling them one at a time.
