Invalid pipefs-directory prevents rpc-gssd.service from starting

Bug #1971935 reported by Anders Larsson
This bug affects 5 people
Affects              Status         Importance  Assigned to       Milestone
nfs-utils (Debian)   Fix Released   Unknown
nfs-utils (Ubuntu)   Fix Released   Medium      Andreas Hasenack
  Jammy              Triaged        Medium      Andreas Hasenack
  Kinetic            Fix Released   Medium      Andreas Hasenack

Bug Description

Ubuntu 22.04 Server
Package version: 1:2.6.1-1ubuntu1

Package nfs-common/nfs-utils provides /etc/nfs.conf and /lib/systemd/system/rpc-gssd.service.
/etc/nfs.conf (which seems to be copied from /usr/share/nfs-common/conffiles/nfs.conf) has the configuration:
...
[general]
pipefs-directory=/run/rpc_pipefs
...

When attempting to start rpc-gssd it gives the following error:
...
ERROR: opendir(/run/rpc_pipefs) failed: No such file or directory
...

There is a systemd unit called var-lib-nfs-rpc_pipefs.mount which mounts this directory at /var/lib/nfs/rpc_pipefs. However, this does not match the configuration in nfs.conf.

It's worth mentioning that sometimes a systemd unit (run-rpc_pipefs.mount) is generated, which ensures /run/rpc_pipefs is created, and then everything works as expected. Whether this happens seems to be random.
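
A minimal sketch of the mismatch described above: it reads pipefs-directory from nfs.conf-style text and compares it with the path of the static mount unit. It assumes Python's configparser is a close enough stand-in for the real nfs.conf parser (it is not the same parser), and the function names are hypothetical.

```python
import configparser

# Mount point of the packaged var-lib-nfs-rpc_pipefs.mount unit (from this report).
STATIC_UNIT_WHERE = "/var/lib/nfs/rpc_pipefs"


def configured_pipefs_dir(nfs_conf_text):
    """Return [general] pipefs-directory from nfs.conf-style text, or None if unset."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(nfs_conf_text)
    return parser.get("general", "pipefs-directory", fallback=None)


def matches_static_unit(nfs_conf_text):
    """True when the configured directory is the one the static unit mounts."""
    configured = configured_pipefs_dir(nfs_conf_text)
    if configured is None:
        # Nothing configured: the built-in default applies, which is the static path.
        return True
    # Assumption for this sketch: a trailing slash is treated as insignificant.
    return configured.rstrip("/") == STATIC_UNIT_WHERE
```

With the packaged configuration shown above, matches_static_unit() returns False, which is the situation in which rpc-gssd looks for /run/rpc_pipefs while the static unit would mount /var/lib/nfs/rpc_pipefs.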

description: updated
Revision history for this message
Anders Larsson (anderslarsson) wrote :

After restarting the system /run/rpc_pipefs exists because /run/systemd/generator/run-rpc_pipefs.mount is created and rpc-gssd works correctly.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nfs-utils (Ubuntu):
status: New → Confirmed
Andreas Hasenack (ahasenack) wrote :

Can you elaborate a bit on your scenario? Since you are trying to start gssd, I assume you have a kerberos environment, and the server has a keytab in /etc/krb5.keytab (which is a condition for the gssd service to start).

In my tests, I also usually reboot the server after installing all the nfs and kerberos-client bits, exactly because the services have dependencies between them and all these conditions on files.

I'll check the incorrect path you spotted, see if it's supposed to be the same thing indeed.

Andreas Hasenack (ahasenack) wrote :

The commit that introduced the generator has this explanation:
commit 3892174834ea1a4729348f0ecd3078cc1d5458e4
Author: Scott Mayhew <email address hidden>
Date: Mon Apr 10 07:10:45 2017 -0400

    systemd: add a generator for the rpc_pipefs mountpoint

    The nfs.conf has config options for the rpc_pipefs mountpoint.
    Currently, changing these from the default also requires manually
    overriding the systemd unit files that are hard-coded to mount the
    filesystem on /var/lib/nfs/rpc_pipefs.

    This patch adds a generator that creates a mount unit file for the
    rpc_pipefs when a non-default value is specified in /etc/nfs.conf, as
    well as a target unit file to override the dependencies for the systemd
    units using the rpc_pipefs. The blkmapd, idmapd, and gssd service unit
    files have been modified to define their dependencies on the rpc_pipefs
    mountpoint indirectly via the rpc_pipefs target unit file. Since both
    rpc-pipefs-generator.c and nfs-server-generator.c need to convert path
    names to unit file names, that functionality has been moved to
    systemd.c.

    This patch also removes the dependency on the rpc_pipefs from the
    rpc-svcgssd.service unit file. rpc.svcgssd uses the sunrpc cache
    mechanism to exchange data with the kernel, not the rpc_pipefs.

I guess one way to avoid this would be for us to ship the var-lib-... mount unit with a path matching the default we have in /etc/nfs.conf...
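
The path-to-unit-name conversion mentioned in the commit message can be sketched as follows. This is an approximation of systemd's documented path escaping written for illustration, not the code from systemd.c in nfs-utils, and the function name is hypothetical.

```python
import string

# Characters systemd leaves unescaped in unit names ("." is handled separately).
_ALLOWED = set(string.ascii_letters + string.digits + ":_")


def path_to_mount_unit(path):
    """Approximate systemd path escaping and append the '.mount' suffix."""
    parts = [p for p in path.split("/") if p]  # drops leading, trailing and doubled slashes
    if not parts:
        return "-.mount"  # the root directory
    joined = "/".join(parts)
    out = []
    for i, ch in enumerate(joined):
        if ch == "/":
            out.append("-")
        elif ch in _ALLOWED or (ch == "." and i > 0):
            out.append(ch)
        else:
            # Everything else, including "-" and a leading ".", becomes \xXX per byte.
            out.extend("\\x%02x" % b for b in ch.encode("utf-8"))
    return "".join(out) + ".mount"
```

Under these rules /var/lib/nfs/rpc_pipefs maps to var-lib-nfs-rpc_pipefs.mount and /run/rpc_pipefs maps to run-rpc_pipefs.mount, the two unit names that appear in this report.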

Andreas Hasenack (ahasenack) wrote :

But I would still like to understand how you arrived at gssd complaining about the pipefs mount. Were you just trying to start it manually?

Changed in nfs-utils (Ubuntu):
status: Confirmed → Incomplete
Anders Larsson (anderslarsson) wrote :

Hello,

You're correct. The use-case is to mount Kerberos secured NFS mount points. All other configuration is working as expected since after a reboot (as mentioned earlier) the system in question is able to successfully mount with krb5.

Thanks.

Anders Larsson (anderslarsson) wrote :

We're using Puppet to configure our systems and one of the steps is to ensure rpc-gssd.service is up and running.

Andreas Hasenack (ahasenack) wrote :

> We're using Puppet to configure our systems and one of the steps is to ensure rpc-gssd.service is
> up and running.

Can you come up with the ordering in which this nfs host is provisioned, in terms of nfs and kerberos configuration and such? I concur that having to reboot is not ideal.

Anders Larsson (anderslarsson) wrote :

I can add more information on Monday when I'm working again. I suspect we found this issue because the setup was developed and tested manually to ensure that everything works, that is, without going through the normal provisioning chain.

As part of our provisioning chain one step is usually to restart the system (at least when the system is initially provisioned). From that perspective it would not be an issue.

Stefan Staeglich (staeglis) wrote :

We are also affected by this issue. I would like to avoid scheduling a reboot in my automation script: it maintains the whole sssd config and can therefore run multiple times, so a reboot step would cause unexpected reboots.

Stefan Staeglich (staeglis) wrote :

Adjusting the nfs.conf mitigates the issue as there is a mount unit var-lib-nfs-rpc_pipefs.mount:
[general]
pipefs-directory=/var/lib/nfs/rpc_pipefs/

So either the package postinst script should call the generator, or the default of pipefs-directory should match the default mount unit.

Anders Larsson (anderslarsson) wrote :

Sorry for the late response. From what I can see we don't really implement any dependency chain when configuring NFS/Kerberos on Ubuntu. We haven't seen this issue in earlier releases of Ubuntu (20.04 and earlier), though. It seems the process worked differently back then, since /etc/nfs.conf didn't exist.

Changed in nfs-utils (Ubuntu):
status: Incomplete → Confirmed
Changed in nfs-utils (Ubuntu):
assignee: nobody → Andreas Hasenack (ahasenack)
tags: added: server-todo
Changed in nfs-utils (Ubuntu):
status: Confirmed → In Progress
Andreas Hasenack (ahasenack) wrote :

I'm trying to reproduce this. It's clear that there is a discrepancy in rpc_pipefs mountpoints between the generator and the var-lib-* mount unit, but the generator should have kicked in right after installation of the package via the systemctl daemon-reload call that all packages do in postinst. And the generator, in turn, checks if the rpc_pipefs config is different from the default, and only then generates the new mount unit.

The default is:
#define RPC_PIPEFS_DEFAULT NFS_STATEDIR "/rpc_pipefs"

And NFS_STATEDIR is the value of --with-statedir given to ./configure at build time, and defaults to /var/lib/nfs (ubuntu/debian's case). So RPC_PIPEFS_DEFAULT is /var/lib/nfs/rpc_pipefs, and this is what we get in the var-lib-*.mount unit:

# grep Where /lib/systemd/system/var-lib-nfs-rpc_pipefs.mount
Where=/var/lib/nfs/rpc_pipefs

The generator will just exit silently if the nfs.conf config matches that default:

    conf_init_file(NFS_CONFFILE);
    s = conf_get_str("general", "pipefs-directory");
    if (!s)
        exit(0);
    if (strlen(s) == strlen(RPC_PIPEFS_DEFAULT) &&
            strcmp(s, RPC_PIPEFS_DEFAULT) == 0)
        exit(0);

In the ubuntu/debian case, it won't match:
# nfsconf --get general pipefs-directory
/run/rpc_pipefs

So the generator kicks in:
# grep Where /run/systemd/generator/run-rpc_pipefs.mount
Where=/run/rpc_pipefs

and I get the mount:
# mount -t rpc_pipefs
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)

So while we shouldn't be needing the generator, it's doing its job when needed. I still suspect there is some ordering issue that is triggering this bug, but I haven't found it yet.

Still looking.
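
A small Python model of the generator check quoted above, included only to make the three outcomes explicit. It mirrors the C excerpt and nothing more; the function name is hypothetical.

```python
# Built-in default: NFS_STATEDIR "/rpc_pipefs" with NFS_STATEDIR=/var/lib/nfs.
RPC_PIPEFS_DEFAULT = "/var/lib/nfs/rpc_pipefs"


def generator_mount_path(pipefs_directory):
    """Return the path a mount unit would be generated for, or None on silent exit."""
    if pipefs_directory is None:
        # No value could be read (for example, /etc/nfs.conf does not exist yet).
        return None
    if pipefs_directory == RPC_PIPEFS_DEFAULT:
        # Same as the built-in default: the static unit is expected to handle it.
        return None
    return pipefs_directory
```

generator_mount_path("/run/rpc_pipefs") returns the path, matching the generated run-rpc_pipefs.mount above, while a missing setting or the default value returns None. The comparison is an exact string match, as in the excerpt.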

Andreas Hasenack (ahasenack) wrote :

When this happens, do you guys have rpc_pipefs mounted somewhere, or not at all? I'm really struggling to reproduce this, as the generator kicks in right after nfs-common is installed if the nfs.conf config specifies a pipefs mountpoint other than /var/lib/nfs/rpc_pipefs.

Could you have something that is preventing systemctl daemon-reload from running, or some services from starting automatically? Perhaps a https://people.debian.org/~hmh/invokerc.d-policyrc.d-specification.txt file/script? Could puppet be doing something in that regard, perhaps trying to optimize an installation and group postinst scripts together at the end?

Stefan, are you also seeing this during some provisioning, or with manual installs?
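
For scripted checks of where (or whether) rpc_pipefs is mounted, a sketch along these lines could be used in place of running `mount -t rpc_pipefs` by hand. It parses text in the /proc/self/mountinfo format; the function name is hypothetical, and octal escapes in mount points (such as \040 for a space) are not decoded.

```python
def rpc_pipefs_mounts(mountinfo_text):
    """Return the mount points of all rpc_pipefs filesystems in mountinfo-style text."""
    mounts = []
    for line in mountinfo_text.splitlines():
        # Each line is "<mount fields> - <fstype> <source> <super options>".
        left, sep, right = line.partition(" - ")
        if not sep:
            continue
        left_fields = left.split()
        right_fields = right.split()
        if len(left_fields) < 5 or not right_fields:
            continue
        if right_fields[0] == "rpc_pipefs":
            mounts.append(left_fields[4])  # fifth field is the mount point
    return mounts
```

On a live system the input would be the contents of /proc/self/mountinfo. An empty list means rpc_pipefs is not mounted at all, and two entries would correspond to the double mount described for Debian later in this report.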

Changed in nfs-utils (Ubuntu):
status: In Progress → Incomplete
Anders Larsson (anderslarsson) wrote :

No. I did a reinstall on the test system and configured the system up until just before the step to install and manage the NFS stuff, and rpc_pipefs is not mounted anywhere before starting to configure NFS+Kerberos with Puppet for the first time.

Some info on what the File_line resources in the output below do:
File_line[NFS_SECURITY_GSS]: Ensures NEED_GSSD="yes" is set in /etc/default/nfs-common
File_line[GSSD_OPTIONS]: Ensures GSSD_OPTIONS="-k /path/to/keytab" is set in /etc/default/nfs-common

Below are the relevant lines that Puppet outputs (with debug enabled):
----
Notice: /Stage[main]/Nfsclient/File_line[NFS_SECURITY_GSS]/ensure: created
Info: /Stage[main]/Nfsclient/File_line[NFS_SECURITY_GSS]: Scheduling refresh of Service[rpcbind_service]
Info: /Stage[main]/Nfsclient/File_line[NFS_SECURITY_GSS]: Scheduling refresh of Service[rpc-gssd]
Debug: /Stage[main]/Nfsclient/File_line[NFS_SECURITY_GSS]: The container Class[Nfsclient] will propagate my refresh event
Notice: /Stage[main]/Nfsclient/File_line[GSSD_OPTIONS]/ensure: created
Info: /Stage[main]/Nfsclient/File_line[GSSD_OPTIONS]: Scheduling refresh of Service[rpcbind_service]
Info: /Stage[main]/Nfsclient/File_line[GSSD_OPTIONS]: Scheduling refresh of Service[rpc-gssd]
Debug: /Stage[main]/Nfsclient/File_line[GSSD_OPTIONS]: The container Class[Nfsclient] will propagate my refresh event
Debug: Executing: '/usr/bin/systemctl is-active -- rpcbind'
Debug: Executing: '/usr/bin/systemctl is-enabled -- rpcbind'
Debug: Executing: '/usr/bin/systemctl is-active -- rpcbind'
Debug: Executing: '/usr/bin/systemctl show --property=NeedDaemonReload -- rpcbind'
Debug: Executing: '/usr/bin/systemctl restart -- rpcbind'
Notice: /Service[rpcbind_service]: Triggered 'refresh' from 2 events
Debug: /Service[rpcbind_service]: The container Class[Rpcbind] will propagate my refresh event
Debug: Class[Rpcbind]: The container Stage[main] will propagate my refresh event
Debug: Executing: '/usr/bin/systemctl is-active -- rpc-gssd'
Debug: Executing: '/usr/bin/systemctl is-enabled -- rpc-gssd'
Debug: Unable to enable or disable static service rpc-gssd
Debug: Executing: '/usr/bin/systemctl is-active -- rpc-gssd'
Debug: Executing: '/usr/bin/systemctl show --property=NeedDaemonReload -- rpc-gssd'
Debug: Executing: '/usr/bin/systemctl restart -- rpc-gssd'
Debug: Running journalctl command to get logs for systemd restart failure: journalctl -n 50 --since '5 minutes ago' -u rpc-gssd --no-pager ...


Andreas Hasenack (ahasenack) wrote :

> up until just before the step to install and manage the NFS stuff and rpc_pipefs is not mounted
> anywhere before starting to configure NFS+Kerberos with Puppet for the first time.

Are you saying that on a jammy system, after "sudo apt install nfs-common", you don't get rpc_pipefs mounted?

Right after installing nfs-common, I have it:

(...)
Setting up nfs-common (1:2.6.1-1ubuntu1) ...

Creating config file /etc/idmapd.conf with new version

Creating config file /etc/nfs.conf with new version
Adding system user `statd' (UID 112) ...
Adding new user `statd' (UID 112) with group `nogroup' ...
Not creating home directory `/var/lib/nfs'.
Created symlink /etc/systemd/system/multi-user.target.wants/nfs-client.target → /lib/systemd/system/nfs-client.target.
Created symlink /etc/systemd/system/remote-fs.target.wants/nfs-client.target → /lib/systemd/system/nfs-client.target.
auth-rpcgss-module.service is a disabled or a static unit, not starting it.
nfs-idmapd.service is a disabled or a static unit, not starting it.
nfs-utils.service is a disabled or a static unit, not starting it.
proc-fs-nfsd.mount is a disabled or a static unit, not starting it.
rpc-gssd.service is a disabled or a static unit, not starting it.
rpc-statd-notify.service is a disabled or a static unit, not starting it.
rpc-statd.service is a disabled or a static unit, not starting it.
rpc-svcgssd.service is a disabled or a static unit, not starting it.
rpc_pipefs.target is a disabled or a static unit, not starting it.
var-lib-nfs-rpc_pipefs.mount is a disabled or a static unit, not starting it.
Processing triggers for man-db (2.10.2-1) ...
Processing triggers for libc-bin (2.35-0ubuntu3) ...

ubuntu@j-nfs:~$ mount -t rpc_pipefs
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)

And it was the generator that did it, as expected:

ubuntu@j-nfs:~$ systemctl status run-rpc_pipefs.mount
● run-rpc_pipefs.mount - RPC Pipe File System
     Loaded: loaded (/run/systemd/generator/run-rpc_pipefs.mount; generated)
     Active: active (mounted) since Fri 2022-07-01 12:20:29 UTC; 4min 45s ago
      Where: /run/rpc_pipefs
       What: sunrpc
      Tasks: 0 (limit: 1119)
     Memory: 20.0K
        CPU: 1ms
     CGroup: /system.slice/run-rpc_pipefs.mount

Jul 01 12:20:29 j-nfs systemd[1]: Mounting RPC Pipe File System...
Jul 01 12:20:29 j-nfs systemd[1]: Mounted RPC Pipe File System.

Andreas Hasenack (ahasenack) wrote :

Could puppet classes be trying to handle the nfs services individually, and perhaps focusing on the var-lib-nfs-rpc_pipefs.mount unit, instead of letting the system start the dependencies as needed?

After I install nfs-common I seem to have the exact opposite of you: run-rpc_pipefs.mount is activated, and var-lib-nfs-rpc_pipefs.mount is not:

root@j-nfs:~# systemctl status var-lib-nfs-rpc_pipefs.mount
○ var-lib-nfs-rpc_pipefs.mount - RPC Pipe File System
     Loaded: loaded (/lib/systemd/system/var-lib-nfs-rpc_pipefs.mount; static)
     Active: inactive (dead)
      Where: /var/lib/nfs/rpc_pipefs
       What: sunrpc

root@j-nfs:~# systemctl status run-rpc_pipefs.mount
● run-rpc_pipefs.mount - RPC Pipe File System
     Loaded: loaded (/run/systemd/generator/run-rpc_pipefs.mount; generated)
     Active: active (mounted) since Fri 2022-07-01 12:35:05 UTC; 2min 17s ago
      Where: /run/rpc_pipefs
       What: sunrpc
      Tasks: 0 (limit: 1119)
     Memory: 20.0K
        CPU: 2ms
     CGroup: /system.slice/run-rpc_pipefs.mount

Jul 01 12:35:05 j-nfs systemd[1]: Mounting RPC Pipe File System...
Jul 01 12:35:05 j-nfs systemd[1]: Mounted RPC Pipe File System.

Anders Larsson (anderslarsson) wrote :

You're correct. If I install nfs-common manually it will automatically start run-rpc_pipefs.mount and the mount point exists. I'm not sure what is happening.

I did some additional testing, and we're not installing nfs-common directly: we install autofs, which pulls in nfs-common. After testing this I ran `apt-get purge autofs nfs-common` followed by `apt-get install autofs`, and run-rpc_pipefs.mount was not automatically started. This could of course be a side effect of the packages having been installed and then purged. I'll reinstall the system again, try installing autofs, and see what happens.

Anders Larsson (anderslarsson) wrote :

Was able to reproduce the issue by installing autofs on a newly installed system:

# apt-get install autofs
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  keyutils libnfsidmap1 nfs-common
Suggested packages:
  watchdog
The following NEW packages will be installed:
  autofs keyutils libnfsidmap1 nfs-common
0 upgraded, 4 newly installed, 0 to remove and 0 not upgraded.
Need to get 620 kB of archives.
After this operation, 2299 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
[......


Andreas Hasenack (ahasenack) wrote :

Interesting, that gives me something to work on. Same happened here, and guess where rpc_pipefs is mounted in the end...

$ sudo apt install autofs -y
(...)
Not creating home directory `/var/lib/nfs'.
Created symlink /etc/systemd/system/multi-user.target.wants/nfs-client.target → /lib/systemd/system/nfs-client.target.
Created symlink /etc/systemd/system/remote-fs.target.wants/nfs-client.target → /lib/systemd/system/nfs-client.target.
auth-rpcgss-module.service is a disabled or a static unit, not starting it.
nfs-idmapd.service is a disabled or a static unit, not starting it.
nfs-utils.service is a disabled or a static unit, not starting it.
proc-fs-nfsd.mount is a disabled or a static unit, not starting it.
rpc-gssd.service is a disabled or a static unit, not starting it.
rpc-statd-notify.service is a disabled or a static unit, not starting it.
rpc-statd.service is a disabled or a static unit, not starting it.
rpc-svcgssd.service is a disabled or a static unit, not starting it.
rpc_pipefs.target is a disabled or a static unit, not starting it.
var-lib-nfs-rpc_pipefs.mount is a disabled or a static unit, not starting it.
Processing triggers for man-db (2.10.2-1) ...
Processing triggers for libc-bin (2.35-0ubuntu3) ...

$ mount -t rpc_pipefs
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)

$ sudo systemctl status run-rpc_pipefs.mount
○ run-rpc_pipefs.mount - RPC Pipe File System
     Loaded: loaded (/run/systemd/generator/run-rpc_pipefs.mount; generated)
     Active: inactive (dead)
      Where: /run/rpc_pipefs
       What: sunrpc

$ sudo systemctl status var-lib-nfs-rpc_pipefs.mount
● var-lib-nfs-rpc_pipefs.mount - RPC Pipe File System
     Loaded: loaded (/proc/self/mountinfo; static)
     Active: active (mounted) since Tue 2022-07-05 12:45:11 UTC; 33s ago
      Where: /var/lib/nfs/rpc_pipefs
       What: sunrpc
      Tasks: 0 (limit: 1119)
     Memory: 20.0K
        CPU: 1ms
     CGroup: /system.slice/var-lib-nfs-rpc_pipefs.mount

Jul 05 12:45:11 j-autofs-nfs-common systemd[1]: Mounting RPC Pipe File System...
Jul 05 12:45:11 j-autofs-nfs-common systemd[1]: Mounted RPC Pipe File System.

Changed in nfs-utils (Ubuntu):
status: Incomplete → Confirmed
Andreas Hasenack (ahasenack) wrote :

I spent quite some time gathering information about this today, and narrowed it down to this:

If autofs is set up before nfs-common, we get the incorrect rpc_pipefs mount point. If nfs-common is set up first, then it's fine.

Furthermore, for some reason if I purge nfs-common and autofs, and reinstall, this attempt gets the "correct" ordering and the rpc_pipefs mountpoint is correct.

In Debian it's even more strange: after sudo apt install autofs, you get rpc_pipefs mounted twice: once by the generator, and once by that var-lib mount unit.

As a temporary workaround, instead of running "apt install autofs" and relying on that to pull in nfs-common, I'd suggest running "apt install nfs-common" first, and then installing the rest that is needed/wanted.

Andreas Hasenack (ahasenack) wrote :

Ok, got a theory:

autofs and nfs-common are both unpacked. Then the setup begins.

If autofs is set up first, it means systemctl daemon-reload will be
called at the end, and that will run all the generators. Since
nfs-common is already unpacked, its generator is on disk, and will be
run.

But /etc/nfs.conf doesn't exist yet: it's produced by ucf in
nfs-common's postinst. That means the generator will not be able to
fetch the pipefs-directory config, and will just exit silently. But
autofs is being started, and it requires rpc_pipefs.target, but at
this time, that target unit is the one from the nfs-common package
that will trigger the var-lib-nfs-rpc_pipefs.mount unit.

Then nfs-common is set up. This time, it will produce /etc/nfs.conf,
and when systemctl daemon-reload is called and the generator is run, it
will find /etc/nfs.conf and the pipefs-directory setting, see that
it's different from the default, and produce the generated target and
mount files.

From here on there is a difference between debian and ubuntu which I cannot explain yet.

In debian, we end up with /run/rpc_pipefs *also* mounted, i.e., two rpc_pipefs mount points. But in ubuntu, just the first one in /var/lib/nfs/rpc_pipefs remains. Something in debian actually enabled the run-rpc_pipefs.mount unit after the steps above.

I'll leave that troubleshooting detail for tomorrow :)
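
The theory above can be replayed as a toy model. This is only a sketch of the reasoning, not of systemd or dpkg behaviour. It assumes that configuring either package ends up starting something that pulls in rpc_pipefs.target, and that an existing rpc_pipefs mount is left alone afterwards (the Ubuntu outcome; the Debian double mount is not modelled). All names are hypothetical.

```python
RPC_PIPEFS_DEFAULT = "/var/lib/nfs/rpc_pipefs"  # path of the static mount unit
PACKAGED_PIPEFS_DIR = "/run/rpc_pipefs"         # value in the packaged /etc/nfs.conf


def simulate(configure_order):
    """Return where rpc_pipefs ends up mounted for a given postinst order."""
    nfs_conf_exists = False  # /etc/nfs.conf is only produced by nfs-common's postinst
    generated_dir = None     # mount path of the generated unit, once it exists
    mounted = None           # first rpc_pipefs mount point; later starts are no-ops
    for package in configure_order:
        if package == "nfs-common":
            nfs_conf_exists = True
        # Each postinst ends with daemon-reload, which reruns the generator.
        # Without /etc/nfs.conf the generator exits silently.
        if nfs_conf_exists and PACKAGED_PIPEFS_DIR != RPC_PIPEFS_DEFAULT:
            generated_dir = PACKAGED_PIPEFS_DIR
        # Assumption: rpc_pipefs.target is pulled in now and resolves to the
        # generated mount if present, otherwise to the static one.
        if mounted is None:
            mounted = generated_dir or RPC_PIPEFS_DEFAULT
    return mounted
```

Under these assumptions, simulate(["autofs", "nfs-common"]) ends with the mount at /var/lib/nfs/rpc_pipefs, and simulate(["nfs-common", "autofs"]) ends with it at /run/rpc_pipefs, matching the two outcomes observed on jammy.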

Andreas Hasenack (ahasenack) wrote :

This PPA has a patched nfs-utils: https://launchpad.net/~ahasenack/+archive/ubuntu/nfs-utils-generator

I'll get a build for jammy going too.

Stefan Staeglich (staeglis) wrote :

Thanks, the patch fixed the issue for me. When can we expect a stable release?

Andreas Hasenack (ahasenack) wrote :

I created a Debian PR for this, as I'm a bit wary of carrying such a delta in Ubuntu alone: https://salsa.debian.org/kernel-team/nfs-utils/-/merge_requests/18

Changed in nfs-utils (Debian):
status: Unknown → New
Changed in nfs-utils (Ubuntu Kinetic):
status: Confirmed → In Progress
Changed in nfs-utils (Ubuntu Jammy):
status: New → Triaged
importance: Undecided → Medium
Changed in nfs-utils (Ubuntu Kinetic):
importance: Undecided → Medium
Changed in nfs-utils (Ubuntu Jammy):
assignee: nobody → Andreas Hasenack (ahasenack)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nfs-utils - 1:2.6.1-2ubuntu3

---------------
nfs-utils (1:2.6.1-2ubuntu3) kinetic; urgency=medium

  * d/p/fix-format-overflow-warning.patch: fix a format-overflow warning
    from gcc which was failing the build

nfs-utils (1:2.6.1-2ubuntu2) kinetic; urgency=medium

  * Rely on the generator units for the rpc_pipefs mount
    (LP: #1971935):
    - d/p/always-run-generator.patch: run the generator even if the
      config differs from the built-in default
    - d/rules: exclude the units we will let the generator produce

 -- Andreas Hasenack <email address hidden> Thu, 28 Jul 2022 20:39:54 +0000

Changed in nfs-utils (Ubuntu Kinetic):
status: In Progress → Fix Released
tags: removed: server-todo
Changed in nfs-utils (Debian):
status: New → Confirmed
Changed in nfs-utils (Debian):
status: Confirmed → Fix Released