Comment 49 for bug 525154

David Mathog (mathog) wrote : Re: mountall for /var races with rpc.statd

On my system /var is not in a separate partition, but I still have the infamous statd error messages on nfs mounts.

Ubuntu 10.04.1 LTS
mountall 2.15
upstart 0.6.5-6

I have tried several things so far, and none has reliably cured it:

1. put a "sleep 2" in /etc/init/mountall-net.conf to allow time for rpc.statd to start.
2. put two different tests in /etc/init/statd.conf to make sure statd was working
before the script exited. Neither worked. One was by Andrew Edmunds, from bug 610863. The other is
this one, which checks that both expected "status" entries are registered with the portmapper:

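post-start script
       # upstart runs this with "sh -e" (dash on Ubuntu), so stick to POSIX syntax
       while true
       do
           # "|| true": grep -c exits 1 on a zero count, which would abort the script under -e
           pcount=$(rpcinfo -p 2>/dev/null | grep -c ' status$' 2>/dev/null || true)
           # dash's [ does not accept "=="; use "="
           if [ "$pcount" = 2 ]
           then
             break
           else
             sleep 0.1
           fi
       done
end script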
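
The loop above never gives up, so a statd that fails to register would leave the job waiting forever. A minimal sketch of a bounded variant follows. The helper names count_status and wait_for_statd and the default of 50 tries are assumptions made up for illustration; they are not part of the statd.conf that Ubuntu ships. It sticks to POSIX sh, apart from the fractional sleep already used above.

```shell
# Count rpcinfo-style lines read from stdin that end in " status".
count_status() {
    # grep -c exits 1 when the count is 0; "|| true" keeps "sh -e" from aborting.
    grep -c ' status$' || true
}

# Poll until both the udp and tcp "status" registrations appear, or give up.
# $1 = maximum number of tries (hypothetical default: 50, roughly 5 seconds).
wait_for_statd() {
    tries=${1:-50}
    while [ "$tries" -gt 0 ]
    do
        pcount=$(rpcinfo -p 2>/dev/null | count_status)
        # Return success as soon as at least two registrations are visible.
        [ "$pcount" -ge 2 ] && return 0
        tries=$((tries - 1))
        sleep 0.1
    done
    # Ran out of tries without seeing both registrations.
    return 1
}
```

With the two functions pasted into the post-start script, the body could end with something like "wait_for_statd 50 || exit 1". Whether a failed wait should fail the job or merely be logged is a judgment call.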

3. added --debug to the kernel boot line in grub. With messages spewing every which way I sometimes saw the
statd warning messages, but in 7 tries the system never came up without the NFS volumes mounted. Of course it takes much longer
to boot this way, and to a naive user it looks like the system has failed, so it is not acceptable for a production system.

4. added to /etc/rc.local

service statd start
killall -SIGUSR1 mountall

I have since rebooted a couple of times with (just) the /etc/rc.local change in place. It helped, but amazingly the system sometimes came up not only with the infamous statd error messages from the failed nfs mounts, but with mountall still running! That looks to me like upstart (which perhaps should have been named "upchuck") blew up and NEVER EVEN RAN rc.local. Either that, or mountall ignored the USR1 signal. Both are really appalling possibilities in software which is supposed to be used in production environments. Heck, for all I know both failures may be present.
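
One way to tell those two possibilities apart would be to have rc.local leave a trace before it signals mountall. The following is only a sketch: the helper names log_boot and missing_nfs and the log path in the usage comment are made up for illustration, and it assumes the usual fstab and /proc/mounts column layout (device, mount point, type).

```shell
# Append one timestamped line to a log file.
# $1 = log file, remaining arguments = message.
log_boot() {
    logfile=$1
    shift
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$logfile"
}

# Print the NFS mount points listed in an fstab-format file ($1)
# that do not appear in a mounts-format file ($2, e.g. /proc/mounts).
missing_nfs() {
    awk '
        # First file on the command line: remember every mounted mount point.
        FILENAME == ARGV[1] { mounted[$2] = 1; next }
        # Second file: report nfs/nfs4 entries that are not mounted.
        $1 !~ /^#/ && $3 ~ /^nfs4?$/ && !($2 in mounted) { print $2 }
    ' "$2" "$1"
}

# Possible use in /etc/rc.local (the log path is hypothetical):
#   log_boot /var/log/rc-local-nfs.log "rc.local ran; unmounted NFS: $(missing_nfs /etc/fstab /proc/mounts | tr '\n' ' ')"
#   service statd start
#   killall -SIGUSR1 mountall
```

If the log file has no entry for a given boot, rc.local never ran. If it has an entry listing unmounted NFS volumes and they stay unmounted, the signal to mountall had no effect.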

Assuming at least one of the fixes from (2) above worked as intended, mountall is starting before the statd.conf script should have allowed it to. At least, that is the case if the post-start script must complete before statd counts as successfully started. If that isn't true, then what stanza should be used instead?