rpc.statd causes NFS clients to hang while doing hostname lookups

Bug #1089941 reported by Andrew Phillips
This bug affects 3 people
Affects: nfs-utils (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

It appears that rpc.statd from nfs-utils 1.2.5-3ubuntu3.1 causes NFS clients to hang if resolving a client's IP address to a hostname takes several seconds.

We recently deployed Ubuntu 12.04 NFS servers, and shortly after installation clients began hanging. At the time, the server was logging messages like:
Dec 11 20:06:19 nfs-server kernel: [18392.347264] statd: server rpc.statd not responding, timed out
Dec 11 20:06:19 nfs-server kernel: [18392.347298] lockd: cannot monitor client1
Dec 11 20:06:54 nfs-server kernel: [18427.364941] statd: server rpc.statd not responding, timed out
Dec 11 20:06:54 nfs-server kernel: [18427.364972] lockd: cannot monitor client2
Dec 11 20:07:29 nfs-server kernel: [18462.382624] statd: server rpc.statd not responding, timed out
Dec 11 20:07:29 nfs-server kernel: [18462.382654] lockd: cannot monitor client3
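
To see which clients are affected, the "lockd: cannot monitor" lines can be tallied per client. The following is only a rough sketch: the function name unmonitored_clients is made up for illustration, and the pattern assumes the message format shown in the log excerpt above.

```python
import re
from collections import Counter

# Matches the kernel message format seen in the excerpt above (assumed stable).
LOCKD_RE = re.compile(r"lockd: cannot monitor (\S+)")


def unmonitored_clients(log_lines):
    """Count how often lockd reported that it could not monitor each client."""
    counts = Counter()
    for line in log_lines:
        match = LOCKD_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Reads syslog-style text from stdin, e.g. piped from the kernel log.
    for client, count in unmonitored_clients(sys.stdin).most_common():
        print(f"{client}: {count}")
```

The client names it reports are the ones whose reverse lookups are worth checking first.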

I straced rpc.statd and noticed that it was talking to avahi and then hanging for several seconds. Apparently mDNS lookups through avahi have a timeout of 5 seconds. It turns out that a few client systems didn't have working reverse DNS, which caused mdns4 to kick in and attempt to resolve them. The host without reverse DNS was on a different subnet, so it's not surprising that avahi couldn't find it.
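
A quick way to check whether a given client address is slow to reverse-resolve is to time the lookup through the system resolver. This is a minimal sketch, not what rpc.statd itself does: the helper names time_reverse_lookup and slow_clients and the 1-second threshold are assumptions for illustration. Python's socket.gethostbyaddr goes through the C library resolver, so it should follow the hosts line in nsswitch.conf.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for a reverse lookup of ip."""
    start = time.monotonic()
    try:
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        hostname = None
    elapsed = time.monotonic() - start
    return hostname, elapsed


def slow_clients(ips, threshold=1.0, resolver=socket.gethostbyaddr):
    """Return (ip, hostname, elapsed) for lookups taking at least threshold seconds."""
    slow = []
    for ip in ips:
        hostname, elapsed = time_reverse_lookup(ip, resolver)
        if elapsed >= threshold:
            slow.append((ip, hostname, elapsed))
    return slow


if __name__ == "__main__":
    import sys

    # Pass client IP addresses as command-line arguments.
    for ip, hostname, elapsed in slow_clients(sys.argv[1:]):
        print(f"{ip}: {hostname or 'no reverse entry'} took {elapsed:.2f}s")
```

A lookup that fails only after several seconds, rather than immediately, would be consistent with the mdns4 timeout described above.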

At the time our nsswitch.conf had:
hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4

After removing mdns4 from the hosts line and restarting statd, the problem did not recur. mdns4_minimal does not exhibit the problem because it filters out most IP addresses. The hosts line is now:
hosts: files mdns4_minimal [NOTFOUND=return] dns
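
To audit other servers for the same configuration, the hosts line can be checked for the non-minimal mDNS sources. This is a sketch under assumptions: the function names are hypothetical, and the set of module names (mdns, mdns4, mdns6) is assumed to be the full list of non-minimal nss-mdns sources.

```python
import re

# Non-minimal nss-mdns sources (assumed list); these are not restricted
# the way the *_minimal variants are.
FULL_MDNS = {"mdns", "mdns4", "mdns6"}


def hosts_sources(nsswitch_text):
    """Return the lookup sources on the hosts line, without [action] items."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line.startswith("hosts:"):
            continue
        rest = re.sub(r"\[[^\]]*\]", " ", line[len("hosts:"):])
        return rest.split()
    return []


def full_mdns_sources(nsswitch_text):
    """Return any non-minimal mDNS sources configured for hosts lookups."""
    return [s for s in hosts_sources(nsswitch_text) if s in FULL_MDNS]


if __name__ == "__main__":
    with open("/etc/nsswitch.conf") as f:
        found = full_mdns_sources(f.read())
    print("full mDNS sources on hosts line:", found or "none")
```

Run against the two hosts lines quoted in this report, it flags mdns4 in the original configuration and nothing in the changed one.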

I suspect that rpc.statd will behave the same way if other directory services (DNS, LDAP, etc.) are slow to return results for the hosts table.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nfs-utils (Ubuntu):
status: New → Confirmed