If NFS server reboots, client sometimes fails to remount the file system properly

Bug #1881847 reported by Robert Dinse
This bug affects 1 person
Affects: nfs-utils (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

If the NFS server reboots, client machines sometimes fail to re-establish the NFS mount properly. This is with NFS v4.2. When a mount is in this state, running df on the client hangs, and any attempt to access a file on the mount likewise hangs indefinitely. The problem occurs with the stock 5.3.0.x kernel and still occurs with 5.6; I have not tried 5.7 yet. I filed this against nfs-kernel-server, but I am not 100% sure the fault is not in nfs-common; either way, some communication between the two is not happening.
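Because df and file access block indefinitely in this state, it can help to probe a mount from a disposable child process rather than from the shell you are working in. The following is only a minimal sketch in Python: the function name mount_responds and the 5 second default are illustrative assumptions, not part of nfs-utils or any Ubuntu tooling.

```python
import subprocess
import sys


def mount_responds(path: str, timeout_s: float = 5.0) -> bool:
    """Return True if statvfs() on `path` succeeds within `timeout_s` seconds.

    The call runs in a child interpreter so that a hung NFS mount blocks the
    child, not the caller. (Hypothetical helper, for illustration only.)
    """
    probe = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.statvfs(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        # Exit status 0 means statvfs() returned without raising.
        return probe.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Deliberately do not wait() after kill(): a child stuck in
        # uninterruptible sleep on a hard mount may never exit.
        probe.kill()
        return False
```

A False result covers both a timeout and an ordinary error such as a missing path, so it only says that the mount point did not answer cleanly.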

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: nfs-kernel-server 1:1.3.4-2.5ubuntu3
Uname: Linux 5.6.0 x86_64
ApportVersion: 2.20.11-0ubuntu27.2
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: MATE
Date: Tue Jun 2 21:34:31 2020
InstallationDate: Installed on 2017-05-27 (1102 days ago)
InstallationMedia: Ubuntu-MATE 17.04 "Zesty Zapus" - Release amd64 (20170412)
SourcePackage: nfs-utils
UpgradeStatus: Upgraded to focal on 2020-04-26 (38 days ago)

Robert Dinse (nanook) wrote :
description: updated
Andreas Hasenack (ahasenack) wrote :

With NFSv4, user and group information is sent on the wire as names, instead of uids. And these names are qualified with a domain. So for example an exported directory containing files for the "ubuntu" user will have the ownership sent to NFSv4 clients as "ubuntu@DOMAIN", where "DOMAIN" is the DNS domain of the server.

What I have seen in some testing is that after a reboot, for some reason (probably service ordering), the NFS server bits do not yet know the domain of the machine, so the domain becomes "localdomain" and the user is sent to the client as "ubuntu@localdomain". The client, which is on the server's actual DNS domain, decides that "localdomain" is none of its business and maps that user to "nobody".
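To make the failure mode concrete, here is a simplified Python model of the behaviour described above. It is a sketch of the idea only, not the actual idmapd or nfsidmap logic; the function names and the example.com domain are made up for illustration.

```python
def owner_on_wire(user: str, server_domain: str) -> str:
    """Model of how the server qualifies a file owner for NFSv4 clients."""
    return f"{user}@{server_domain}"


def map_on_client(wire_name: str, client_domain: str, known_users: set,
                  nobody: str = "nobody") -> str:
    """Model of the client side: accept names from its own domain only.

    Anything from a foreign domain, or an unknown user, becomes `nobody`.
    (Simplified assumption; the real mapping has more methods and options.)
    """
    user, sep, domain = wire_name.rpartition("@")
    if not sep or domain.lower() != client_domain.lower():
        return nobody
    return user if user in known_users else nobody


# Normal case: server and client agree on the domain.
print(map_on_client(owner_on_wire("ubuntu", "example.com"), "example.com", {"ubuntu"}))
# After the problematic reboot: the server falls back to "localdomain".
print(map_on_client(owner_on_wire("ubuntu", "localdomain"), "example.com", {"ubuntu"}))
```

In this model the first call yields the real user and the second yields "nobody", which is the symptom to look for on the client.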

You said there was a "hang", but maybe the issue above is related somehow? One way to check is in the client logs, where "nfsidmap" is called. If you configure /etc/request-key.d/id_resolver.conf to call nfsidmap with an extra "-v -v -v" on the command line, it will log more verbosely to /var/log/syslog and say which user it is trying to resolve.
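As a rough illustration of that edit, the Python sketch below inserts the extra flags after the nfsidmap path in the text of id_resolver.conf. The sample line in the usage part is an assumption about what the file typically contains; check your own file before changing it. The helper name add_verbosity is hypothetical.

```python
import re


def add_verbosity(conf_text: str, flags: str = "-v -v -v") -> str:
    """Return `conf_text` with `flags` inserted right after the nfsidmap path.

    Comment lines and lines that already carry a -v flag are left alone.
    """
    out = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_rule = stripped and not stripped.startswith("#")
        if is_rule and "nfsidmap" in line and " -v" not in line:
            line = re.sub(r"(\S*nfsidmap)\b", r"\1 " + flags, line, count=1)
        out.append(line)
    return "\n".join(out)


# ASSUMPTION: a typical rule line; the one on your system may differ.
sample = "create id_resolver * * /usr/sbin/nfsidmap -t 600 %k %d"
print(add_verbosity(sample))
```

The function only transforms text; writing the result back to /etc/request-key.d/id_resolver.conf is left as a manual step, since that file needs root and a backup is a good idea.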

If you see the "localdomain" issue, then try hardcoding the domain in /etc/idmapd.conf on both the server and the client, to your actual DNS domain, and see if that helps.
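To confirm that both machines ended up with the same hardcoded value, something like the following Python sketch can read the Domain key from the [General] section of each idmapd.conf. The helper name idmap_domain and the example.com value are illustrative assumptions; the sketch assumes the file parses as plain INI.

```python
import configparser
from typing import Optional


def idmap_domain(conf_text: str) -> Optional[str]:
    """Return the Domain set in [General], or None if it is unset or commented out."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # Option names are matched case-insensitively by configparser.
    return parser.get("General", "Domain", fallback=None)


# ASSUMPTION: minimal example contents, not a copy of any real system's file.
server_conf = "[General]\nVerbosity = 0\nDomain = example.com\n"
client_conf = "[General]\nVerbosity = 0\n# Domain = localdomain\n"

print(idmap_domain(server_conf))  # explicit domain on the server
print(idmap_domain(client_conf))  # None: still relying on the default
```

If either side returns None or the two values differ, the domain is not yet pinned on both ends.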

Hope this helps

Robert Dinse (nanook) wrote : Re: [Bug 1881847] Re: If NFS server reboots, client sometimes fails to remount the file system properly

      I'm sorry, it was some time ago that I submitted this. I have since
upgraded to 5.13.19 and 5.16 on some machines, and these later kernels seem
to have resolved the issue.

-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
  Eskimo North Linux Friendly Internet Access, Shell Accounts, and Hosting.
    Knowledgeable human assistance, not telephone trees or script readers.
  See our web site: http://www.eskimo.com/ (206) 812-0051 or (800) 246-6874.

On Mon, 14 Feb 2022, Andreas Hasenack wrote:

> Date: Mon, 14 Feb 2022 20:21:46 -0000
> From: Andreas Hasenack <email address hidden>
> To: <email address hidden>
> Subject: [Bug 1881847] Re: If NFS server reboots,
> client sometimes fails to remount the file system properly

Andreas Hasenack (ahasenack) wrote :

Thanks for the feedback
