statd does not start automatically when needed, nor can it be forced to start on boot

Bug #581941 reported by Marc Schiffbauer
This bug affects 24 people
Affects               Status    Importance   Assigned to                   Milestone
nfs-utils (Ubuntu)    Triaged   Undecided    Unassigned
  Lucid               Triaged   Medium       Canonical Foundations Team
  Natty               Triaged   Undecided    Unassigned

Bug Description

I have some user-mountable nfs entries in my /etc/fstab like this:

myserver:/srv/nfs/images /mnt/images nfs defaults,user,noauto,intr,rsize=32768,wsize=32768 0 0

For this to work, statd (from nfs-common) needs to be running, but it is not started at system startup.

The init scripts claim that it will be started: according to /etc/init/statd.conf, rpc.statd should be started either when an NFS filesystem is about to be mounted or when NEED_STATD is not set to "no".
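
For reference, the start condition in question looks roughly like this (a sketch reconstructed from the comments further down in this thread, not a verbatim copy of the packaged file):

# /etc/init/statd.conf (Lucid), relevant line only
start on (started portmap or mounting TYPE=nfs)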

When a user tries to mount an NFS share, statd is not started:

user@ubuntu:~$ mount /mnt/images
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
user@ubuntu:~$

Setting NEED_STATD to "yes" in /etc/default/nfs-common does NOT start statd on system startup either...

I also tried a workaround: putting "@reboot service start statd" into a system crontab, which also does NOT work...

However, starting it manually as root works, and NFS mounts work afterwards.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: nfs-common 1:1.2.0-4ubuntu4
Uname: Linux 2.6.34rc3-51-generic i686
Architecture: i386
Date: Mon May 17 22:25:57 2010
ProcEnviron:
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
SourcePackage: nfs-utils

Revision history for this message
Steve Langasek (vorlon) wrote :

Thank you for taking the time to report this issue and help to improve Ubuntu.

Please post the /etc/fstab for this system.

Also, when you started statd manually, how did you do so?

Changed in nfs-utils (Ubuntu):
status: New → Incomplete
Revision history for this message
Marc Schiffbauer (mschiff) wrote :

Hi!

I started statd like that:
root@ubuntu:~# service statd start
statd start/running, process 2145
root@ubuntu:~#

This is the fstab:

# /etc/fstab: static file system information.
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
# /dev/sda5
UUID=e21e86b9-1f99-4c0f-857d-15c6803ca7a6 / ext3 defaults,errors=remount-ro,relatime 0 1
# /dev/sda1
UUID=0c642888-57c1-4858-8df6-e38966c4c421 /boot ext3 defaults,relatime 0 2
# /dev/sda6
UUID=baa83915-3313-46d9-8d1e-b5adad177e98 /home ext3 defaults,relatime 0 2
# /dev/sda7
UUID=a6be07ad-c1a8-478d-83b0-54ac920f0d46 none swap sw 0 0
/dev/scd0 /media/cdrom0 udf,iso9660 user,noauto 0 0

lisa:/srv/nfs/docs /mnt/docs nfs defaults,user,noauto,intr,rsize=32768,wsize=32768 0 0
lisa:/srv/nfs/images /mnt/images nfs defaults,user,noauto,intr,rsize=32768,wsize=32768 0 0

Revision history for this message
Steve Langasek (vorlon) wrote :

Hmm, it's not clear then why statd is failing to start at boot.

Please boot with '--verbose' added to the kernel command line, and attach the resulting /var/log/boot.log.

Revision history for this message
Marc Schiffbauer (mschiff) wrote :

Hi Steve,

It's failing in pre-start:

"init: statd pre-start process (712) terminated with status 1"

That's the only line about it in boot.log with --verbose.

The portmap service is running after boot, so I tried commenting out the "exec sm-notify" line.

After that, the statd service is running after a reboot.

So why is sm-notify failing here? Is it really required to exec it in this place?

AFAIK sm-notify will be run by statd automatically when it starts up, so IMO the "exec sm-notify" is superfluous anyway.
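
For reference, the job is structured roughly like this (a sketch pieced together from this thread, not a verbatim copy of the packaged /etc/init/statd.conf):

pre-start script
    # ... environment checks ...
    exec sm-notify    # the call that exits with status 1 during boot
end script

exec rpc.statd -L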

The sm-notify man page says:
"When rpc.statd is started it will typically started sm-notify but this is configurable. "

and from statd man page:

-L, --no-notify
              Prevents rpc.statd from running the sm-notify command when it starts up, preserving the existing NSM state number and monitor list.

              Note: the sm-notify command contains a check to ensure it runs only once after each system reboot. This prevents spurious reboot notification if
              rpc.statd restarts without the -L option.

-Marc

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

This error bites me too --- here's my boot log

Tweaking sm-notify to get more debug, the error appears to be:

Cannot create /var/lib/nfs/state.new: Read-only file system

Is this problem related to this bug, maybe?

https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/525154 (mountall for /var races with rpc.statd)

Revision history for this message
Steve Langasek (vorlon) wrote :

Marc, when you dropped the 'exec sm-notify' call from the pre-start script, did you also remove the -L from the rpc.statd invocation? (You point out the -L option from the manpage, but don't comment on the fact that we *pass* -L by default.)

Can you confirm Dave's observation that /var/lib is read-only when sm-notify tries to run? If so, then this is indeed a race condition that has already been pointed out in bug #525154: we have a tentative fix for this, which is to change the start condition in /etc/init/statd.conf to 'start on local-filesystems', but this needs some more thinking about other possible regressions before I'm willing to upload it to lucid.
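
In patch form, the tentative change to /etc/init/statd.conf would look roughly like this (a sketch of the idea only; the variant tested in the following comments also keeps the portmap dependency):

-start on (started portmap or mounting TYPE=nfs)
+start on local-filesystems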

Revision history for this message
Marc Schiffbauer (mschiff) wrote :

Hi Steve,

sorry, I did not notice that you pass -L to statd.

I have now removed the -L option from statd as well, and it is still running after boot.

After reverting to the original state (exec sm-notify in pre-start and statd -L), the log says:

init: statd pre-start process (714) terminated with status 1

@Dave: How did you tweak sm-notify to get more debug output? I added -d to its invocation but did not get any messages like "Cannot create ...".

After changing "start on" to
  start on (started portmap and local-filesystems)

it is working here as well. I do not have /var on a separate filesystem on that machine, but perhaps / is still mounted read-only before local-filesystems is emitted?

But why do you start sm-notify manually and then run statd with -L, instead of just starting statd without -L?

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

I couldn't find where the log messages were going, so I disabled syslog logging in the source and rebuilt sm-notify.

apt-get source nfs-utils

Then find the following lines in utils/statd/sm-notify.c and comment them out:

        log_syslog = 1;
        openlog("sm-notify", LOG_PID, LOG_DAEMON);

Then rebuild with dpkg-buildpackage -B (or whatever) and install the resulting nfs-common deb file.

For good measure, I also edited /etc/init/statd.conf to redirect the output of sm-notify to a spare tty (/dev/tty12 in my case).
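
Putting the steps together, roughly (a sketch; package versions and build options will vary):

sudo apt-get build-dep nfs-utils    # build dependencies, if not already installed
apt-get source nfs-utils
cd nfs-utils-*/
# comment out the log_syslog/openlog lines in utils/statd/sm-notify.c
dpkg-buildpackage -B -us -uc
sudo dpkg -i ../nfs-common_*.deb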

Revision history for this message
AlexAD (alex-ad) wrote :

I have noticed that a number of daemons do not start in Ubuntu 10.04; it seems that since version 6 Ubuntu has used Upstart. So I have created a temporary fix until this gets sorted out.

As root, create a file in /etc/init called temp-fix.conf with the following contents:
# Fixes failure to start a number of services in Ubuntu 10.04
#

description "10.04 fixer"

start on (local-filesystems
   and started dbus)
stop on stopping dbus

exec /usr/bin/temp-fix-startup.sh

#end of file

Now, in /usr/bin, create a file called temp-fix-startup.sh (and make it executable); it should look like this:

#!/bin/sh

# Give it a little time
sleep 15

# Start the jobs that did not come up on their own;
# ignore errors if a job is already running.
/sbin/start statd >/dev/null 2>&1 || true
/sbin/start mythtv-backend >/dev/null 2>&1 || true
/sbin/start tty1 >/dev/null 2>&1 || true
/sbin/start tty2 >/dev/null 2>&1 || true
/sbin/start tty3 >/dev/null 2>&1 || true

sleep 120

#end of file

I think you can start up any of the services you may want this way, including cron.

I know this is a bit of a hack, but it works for me.

Revision history for this message
molostoff (molostoff) wrote :

It seems this happens with my desktop too. I have two Ubuntu 10.04 servers and one 10.04 desktop, and the statd problem exists only on the desktop; the servers are fine.

A simple comparison shows a difference in the network interfaces: the servers have static interface configurations, while the desktop is configured via network-manager, so its network interfaces stay unconfigured until someone logs in, and there is no way for sm-notify to notify the NFS clients about the reboot. So sm-notify respawns while waiting for them to become ready, but gets killed for respawning too fast. This is only a supposition...

Revision history for this message
molostoff (molostoff) wrote :

I have tried this as a temporary fix:

-start on (started portmap or mounting TYPE=nfs)
+start on (started portmap and net-device-up IFACE!=lo)

and it works - after logging in on the desktop I can see the shares mounted read-write. But I think statd should start before autofs (which I am using).

Revision history for this message
Jayen (jayen) wrote :

Hi, I'm having the same problem (statd not starting on boot, after upgrading to lucid). Should I be using the fix in #5, #7, #10, or #12? Thanks

Revision history for this message
Will (will-berriss) wrote :

Me too! Grrr

Revision history for this message
Will (will-berriss) wrote :

So I just type this before mounting an NFS share:

/etc/init.d/statd restart

and then statd starts and mount will work for me.

But that's very manual :(

Revision history for this message
jason@apps4u.com.au (jason-apps4u) wrote :

I have been having this same issue. I upgraded from 8.10 to 10.04 over a three-week period. When I first upgraded, all our Mac 10.6 and 10.5 clients stopped auto-mounting NFS homes, and I used another bug report (#540637) which fixed the issue, up until 3 days ago when clients started being unable to mount their home directories. After stopping nfs-kernel-server and portmap and restarting them, the clients could mount their home folders again, but they kept getting kicked off and losing the mount. The only way I have been able to fix it is by running rpc.statd -Fd in a terminal as root, after which their mounts come back.

This is the log from this morning when we tried to log in:
Aug 31 09:48:04 server kernel: [85461.640021] statd: server rpc.statd not responding, timed out
Aug 31 09:48:34 server kernel: [85491.640026] statd: server rpc.statd not responding, timed out
Aug 31 09:49:04 server kernel: [85521.642522] statd: server rpc.statd not responding, timed out
Aug 31 09:49:34 server kernel: [85551.641267] statd: server rpc.statd not responding, timed out
Aug 31 09:50:04 server kernel: [85581.640030] statd: server rpc.statd not responding, timed out
Aug 31 09:50:34 server kernel: [85611.642517] statd: server rpc.statd not responding, timed out
Aug 31 09:51:04 server kernel: [85641.642517] statd: server rpc.statd not responding, timed out
Aug 31 09:51:34 server kernel: [85671.640017] statd: server rpc.statd not responding, timed out
Aug 31 09:52:04 server kernel: [85701.640023] statd: server rpc.statd not responding, timed out
Aug 31 09:52:34 server kernel: [85731.640017] statd: server rpc.statd not responding, timed out
Aug 31 09:53:04 server kernel: [85761.640022] statd: server rpc.statd not responding, timed out
Aug 31 09:53:34 server kernel: [85791.640016] statd: server rpc.statd not responding, timed out
Aug 31 09:54:04 server kernel: [85821.640034] statd: server rpc.statd not responding, timed out
Aug 31 09:54:34 server kernel: [85851.640027] statd: server rpc.statd not responding, timed out
 rpc.statd[21644]: segfault at 2011 ip 00007fce39966cba sp 00007fffdecdb740 error 4 in libc-2.11.1.so[7fce39853000+17a000]
Aug 31 10:11:35 server kernel: [86872.690017] statd: server rpc.statd not responding, timed out
Aug 31 10:12:45 server kernel: [86942.606646] rpc.statd[21887]: segfault at 2011 ip 00007f6a45831cba sp 00007fff75c4b9c0 error 4 in libc-2.11.1.so[7f6a4571e000+17a000]
Aug 31 10:13:15 server kernel: [86972.630014] statd: server rpc.statd not responding, timed out
Aug 31 10:13:47 server kernel: [87005.004577] rpc.statd[22381]: segfault at 2011 ip 00007f9568963cba sp 00007fff82db1e90 error 4 in libc-2.11.1.so[7f9568850000+17a000]
Aug 31 10:16:47 server kernel: [87185.006961] rpc.statd[22455]: segfault at 2011 ip 00007fdcda8f1cba sp 00007fffb7995ba0 error 4 in libc-2.11.1.so[7fdcda7de000+17a000]

After stopping nfs and portmap and then restarting portmap followed by nfs, I could log in and mount my home folder, but every 30 to 60 minutes everyone loses their home folder again until I run rpc.statd -Fd in a terminal.

Has anyone got any ideas on how to fix this?

Revision history for this message
jason@apps4u.com.au (jason-apps4u) wrote :

Just to add to my other post: statd has been running the whole time.

ps aux | grep statd

root 14801 0.0 0.0 16672 1128 pts/0 S+ 14:13 0:00 rpc.statd -Fd
root 16282 0.0 0.0 4096 592 ? S 14:28 0:00 /bin/sh -c rpc.statd -Fd
root 16283 0.0 0.0 16672 1152 ? S 14:28 0:00 rpc.statd -Fd
root 20487 0.0 0.0 18732 1084 ? Ss Aug30 0:00 rpc.statd -L
root 21539 0.0 0.0 18732 1116 ? Ss Aug30 0:01 rpc.statd
root 29919 0.0 0.0 4096 596 ? S Aug30 0:00 /bin/sh -c rpc.statd -Fd >> /root/test.txt
root 29920 0.0 0.0 16672 1124 ? S Aug30 0:00 rpc.statd -Fd
jason 30710 0.0 0.0 7432 948 pts/3 S+ 16:46 0:00 grep statd

As you can see, the statd on line 4 has been running since I rebooted on the 30th.

Changed in nfs-utils (Ubuntu):
status: Incomplete → Triaged
importance: Undecided → Medium
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Changed in nfs-utils (Ubuntu Natty):
status: Triaged → New
Changed in nfs-utils (Ubuntu Lucid):
status: New → Triaged
Changed in nfs-utils (Ubuntu Natty):
importance: Medium → Undecided
Changed in nfs-utils (Ubuntu Lucid):
importance: Undecided → Medium
Changed in nfs-utils (Ubuntu Natty):
assignee: Canonical Foundations Team (canonical-foundations) → nobody
Changed in nfs-utils (Ubuntu Lucid):
assignee: nobody → Canonical Foundations Team (canonical-foundations)
assignee: Canonical Foundations Team (canonical-foundations) → nobody
milestone: none → lucid-updates
assignee: nobody → Canonical Foundations Team (canonical-foundations)
tags: added: regression-release
Changed in nfs-utils (Ubuntu Natty):
status: New → Triaged
Revision history for this message
Brian Murray (brian-murray) wrote :

This seems like a duplicate of bug 690401.

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

Indeed, I've marked 690401 as a duplicate of this bug. I will update the changelog in the merge proposal accordingly.

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

Looks like this one is also a duplicate of an even earlier bug, bug #525154.
