statd does not start automatically when needed nor can be forced to start on boot

Reported by Marc Schiffbauer on 2010-05-17
This bug affects 23 people
Affects              Importance   Assigned to
nfs-utils (Ubuntu)   Undecided    Unassigned
Lucid                Medium       Canonical Foundations Team
Natty                Undecided    Unassigned

Bug Description

I have some user-mountable nfs entries in my /etc/fstab like this:

myserver:/srv/nfs/images /mnt/images nfs defaults,user,noauto,intr,rsize=32768,wsize=32768 0 0

For this to work, statd (from nfs-common) needs to be running, but it is not started at system startup.

The scripts claim that it will be started:

According to /etc/init/statd.conf, rpc.statd should be started either if some NFS filesystem will be mounted or if NEED_STATD is not "no".

When a user tries to mount an NFS share, statd is not started:

user@ubuntu:~$ mount /mnt/images
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
user@ubuntu:~$

And setting NEED_STATD to yes in /etc/default/nfs-common will NOT start statd on system startup...

I also tried a workaround: Putting "@reboot service start statd" into a system cronjob which also does NOT work...

However, starting it as root manually works, and nfs mounts are working afterwards

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: nfs-common 1:1.2.0-4ubuntu4
Uname: Linux 2.6.34rc3-51-generic i686
Architecture: i386
Date: Mon May 17 22:25:57 2010
ProcEnviron:
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
SourcePackage: nfs-utils

Related branches

lp:~clint-fewbar/ubuntu/natty/nfs-utils/wait-for-local-filesystems
Merged into lp:ubuntu/natty/nfs-utils at revision 34
James Hunt (community): Needs Fixing on 2010-12-22
Steve Langasek: Pending requested 2010-12-22
Ubuntu branches: Pending requested 2010-12-22
Steve Langasek (vorlon) wrote :

Thank you for taking the time to report this issue and help to improve Ubuntu.

Please post the /etc/fstab for this system.

Also, when you started statd manually, how did you do so?

Changed in nfs-utils (Ubuntu):
status: New → Incomplete
Marc Schiffbauer (mschiff) wrote :

Hi!

I started statd like this:
root@ubuntu:~# service statd start
statd start/running, process 2145
root@ubuntu:~#

This is the fstab:

# /etc/fstab: static file system information.
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
# /dev/sda5
UUID=e21e86b9-1f99-4c0f-857d-15c6803ca7a6 / ext3 defaults,errors=remount-ro,relatime 0 1
# /dev/sda1
UUID=0c642888-57c1-4858-8df6-e38966c4c421 /boot ext3 defaults,relatime 0 2
# /dev/sda6
UUID=baa83915-3313-46d9-8d1e-b5adad177e98 /home ext3 defaults,relatime 0 2
# /dev/sda7
UUID=a6be07ad-c1a8-478d-83b0-54ac920f0d46 none swap sw 0 0
/dev/scd0 /media/cdrom0 udf,iso9660 user,noauto 0 0

lisa:/srv/nfs/docs /mnt/docs nfs defaults,user,noauto,intr,rsize=32768,wsize=32768 0 0
lisa:/srv/nfs/images /mnt/images nfs defaults,user,noauto,intr,rsize=32768,wsize=32768 0 0

Steve Langasek (vorlon) wrote :

Hmm, it's not clear then why statd is failing to start at boot.

Please boot with '--verbose' added to the kernel command line, and attach the resulting /var/log/boot.log.

Marc Schiffbauer (mschiff) wrote :

Hi Steve,

It's failing in pre-start:

"init: statd pre-start process (712) terminated with status 1"

That's the only line about it in boot.log with --verbose.

The portmap service is running after boot, so I tried commenting out the "exec sm-notify" line.

After that, the statd service is running after reboot.

So why is sm-notify failing here? Is it really required to exec it in this place?

AFAIK sm-notify will be run by statd automatically when it starts up, so IMO the "exec sm-notify" is superfluous anyway.

The sm-notify man page says:
"When rpc.statd is started it will typically started sm-notify but this is configurable. "

and from statd man page:

-L, --no-notify
              Prevents rpc.statd from running the sm-notify command when it starts up, preserving the existing NSM state number and monitor list.

              Note: the sm-notify command contains a check to ensure it runs only once after each system reboot. This prevents spurious reboot notification if
              rpc.statd restarts without the -L option.

-Marc

Dave Martin (dave-martin-arm) wrote :

This error bites me too --- here's my boot log

Tweaking sm-notify to get more debug, the error appears to be:

Cannot create /var/lib/nfs/state.new: Read-only file system

Is this problem related to this bug, maybe?

https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/525154 (mountall for /var races with rpc.statd)
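
One way to test the read-only theory by hand is to try creating a file in the state directory, which is the step sm-notify fails at. This is only a minimal sketch: the function name `can_write_dir` is made up for illustration, and /var/lib/nfs is taken from the error message above. It also reports failure when the directory is missing or the caller lacks permission, so it should be run as root.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if a file can be created in the directory.
# $1 = directory to probe
can_write_dir() {
    dir=$1
    # mktemp fails if the directory is read-only, missing, or not writable.
    probe=$(mktemp "$dir/.probe.XXXXXX" 2>/dev/null) || return 1
    # Clean up the probe file so the state directory is left untouched.
    rm -f "$probe"
    return 0
}
```

For example, `can_write_dir /var/lib/nfs || echo "state dir not writable"` placed just before the sm-notify call in the pre-start script would show whether the directory is writable at that point in the boot.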

Steve Langasek (vorlon) wrote :

Marc, when you dropped the 'exec sm-notify' call from the pre-start script, did you also remove the -L from the rpc.statd invocation? (You point out the -L option from the manpage, but don't comment on the fact that we *pass* -L by default.) Can you confirm Dave's comments that /var/lib is read-only when sm-notify tries to run? If so, then this is indeed a race condition that's already been pointed out in bug #525154: we have a tentative fix for this, which is to change the start condition in /etc/init/statd.conf to 'start on local-filesystems', but this needs some more thinking about other possible regressions before I'm willing to upload to lucid.

Marc Schiffbauer (mschiff) wrote :

Hi Steve,

sorry, I did not notice that you pass -L to statd.

I now removed the -L option from statd as well and it still is running after boot.

After reverting back to original state (exec sm-notify in pre-start and statd -L) the log says:

init: statd pre-start process (714) terminated with status 1

@Dave: How did you tweak sm-notify to get more debug? I added -d to its start but did not get any messages about "Cannot create ..." or something.

After changing "start on" to
  start on (started portmap and local-filesystems)

it is working here as well. I do not have /var on a separate FS on that machine, but maybe / is still mounted ro before local-filesystems?

But why do you start sm-notify manually and then use statd -L instead of just starting statd without -L ?
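
For anyone who wants to try the changed "start on" condition without editing the job file by hand, here is a rough sketch. The function name `with_start_condition` is hypothetical, and it assumes the "start on" stanza sits on a single line, as in the lines quoted in this report. The new condition must not contain `|`, `&` or backslashes, since it is passed straight into sed.

```shell
#!/bin/sh
# Hypothetical helper: print an upstart job file with its "start on" line replaced.
# $1 = path to the job file, $2 = new start condition
# Assumes a single-line "start on" stanza; the result goes to stdout only.
with_start_condition() {
    sed "s|^start on .*|start on $2|" "$1"
}
```

Something like `with_start_condition /etc/init/statd.conf '(started portmap and local-filesystems)' > /tmp/statd.conf.new` leaves the original untouched, so the result can be compared with diff before it is copied into place.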

Dave Martin (dave-martin-arm) wrote :

I couldn't find where the log messages were going, so I disabled syslog logging in the source and rebuilt sm-notify.

apt-get source nfs-utils

Then find the following lines in utils/statd/sm-notify.c and comment them out:

        log_syslog = 1;
        openlog("sm-notify", LOG_PID, LOG_DAEMON);

Then rebuild with dpkg-buildpackage -B (or whatever) and install the resulting nfs-common deb file.

For good measure, I also edited /etc/init/statd.conf to redirect the output of sm-notify to a spare tty (/dev/tty12 in my case)

AlexAD (alex-ad) wrote :

I have noticed that a number of daemons do not start in Ubuntu 10.04 - it seems that since version 6 Ubuntu uses upstart. So I have created a temporary fix until it gets sorted.

As root, create a file called temp-fix.conf in /etc/init with the following contents:
# Fixes failure to start a number of services in Ubuntu 10.04
#

description "10.04 fixer"

start on (local-filesystems
   and started dbus)
stop on stopping dbus

exec /usr/bin/temp-fix-startup.sh

#end of file

Now in /usr/bin create a file called temp-fix-startup.sh and it should look like

#!/bin/sh

# Give it a little time
sleep 15

# Start each job in the background and discard its output.
# The redirection has to come before the "&", otherwise it applies
# to an empty command; "exec" is not needed here.
/sbin/start statd >/dev/null 2>&1 &
/sbin/start mythtv-backend >/dev/null 2>&1 &
/sbin/start tty1 >/dev/null 2>&1 &
/sbin/start tty2 >/dev/null 2>&1 &
/sbin/start tty3 >/dev/null 2>&1 &

sleep 120

#end of file

I think that you can start up any of the services you may want including cron

I know this is a bit of a hack, but works for me

molostoff (molostoff) wrote :

It seems to me that this happens with my desktop too. I have two Ubuntu 10.04 servers and one 10.04 desktop. The statd problem exists only on the desktop; the servers are fine.

As a simple comparison, I see a difference in the network interfaces: the servers have a static interface config, while the desktop uses a network-manager config. The network interfaces on the desktop are therefore unconfigured until desktop login, and there is no way for `sm-notify` to notify NFS clients about the reboot. So sm-notify keeps respawning while waiting for them to be ready, but is killed for respawning too fast. This is only a supposition...

molostoff (molostoff) wrote :

I have tried this as a temporary fix:

-start on (started portmap or mounting TYPE=nfs)
+start on (started portmap and net-device-up IFACE!=lo)

and it works - after desktop logon I can see shares that are in rw-mode. But I think that statd should start before autofs (which I am using).

Jayen (jayen) wrote :

Hi, I'm having the same problem (statd not starting on boot (after upgrading to lucid)) Should I be using the fix in #5, #7, #10, or #12? Thanks

Will (will-berriss) wrote :

Me too! Grrr

Will (will-berriss) wrote :

So I just type this before mounting an NFS share:

/etc/init.d/statd restart

and then statd starts and mount will work for me.

But that's very manual :(
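
The manual step could be wrapped in a small helper so that statd is only started when it is missing. This is just a sketch of the workaround, not a fix for the boot-time failure. The function name `ensure_statd` is made up; `service statd start` is the command shown earlier in this report, and it needs root.

```shell
#!/bin/sh
# Hypothetical wrapper: start statd only when no rpc.statd process exists.
# Needs root for the "service" call, like the manual workaround.
ensure_statd() {
    # pgrep -x matches the exact process name.
    if pgrep -x rpc.statd >/dev/null 2>&1; then
        return 0    # already running, nothing to do
    fi
    service statd start
}
```

It could then be used as `ensure_statd && mount /mnt/images` from a root shell or a sudo rule.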

I have been having this same issue. I have just upgraded from 8.10 to 10.04 over a three-week period. When I first upgraded, all our Mac 10.6 and 10.5 clients stopped automounting NFS homes, and I used another bug report (#540637), which fixed the issue up until 3 days ago, when clients started being unable to mount their home directories. After stopping nfs-kernel-server and portmap and restarting them, they could mount their home folders, but they kept getting kicked off and losing the mount. The only way I have been able to fix it is by running rpc.statd -Fd in a terminal as root, and then their mount comes back.

this is the log from this morning when we tried to log in
Aug 31 09:48:04 server kernel: [85461.640021] statd: server rpc.statd not responding, timed out
Aug 31 09:48:34 server kernel: [85491.640026] statd: server rpc.statd not responding, timed out
Aug 31 09:49:04 server kernel: [85521.642522] statd: server rpc.statd not responding, timed out
Aug 31 09:49:34 server kernel: [85551.641267] statd: server rpc.statd not responding, timed out
Aug 31 09:50:04 server kernel: [85581.640030] statd: server rpc.statd not responding, timed out
Aug 31 09:50:34 server kernel: [85611.642517] statd: server rpc.statd not responding, timed out
Aug 31 09:51:04 server kernel: [85641.642517] statd: server rpc.statd not responding, timed out
Aug 31 09:51:34 server kernel: [85671.640017] statd: server rpc.statd not responding, timed out
Aug 31 09:52:04 server kernel: [85701.640023] statd: server rpc.statd not responding, timed out
Aug 31 09:52:34 server kernel: [85731.640017] statd: server rpc.statd not responding, timed out
Aug 31 09:53:04 server kernel: [85761.640022] statd: server rpc.statd not responding, timed out
Aug 31 09:53:34 server kernel: [85791.640016] statd: server rpc.statd not responding, timed out
Aug 31 09:54:04 server kernel: [85821.640034] statd: server rpc.statd not responding, timed out
Aug 31 09:54:34 server kernel: [85851.640027] statd: server rpc.statd not responding, timed out
 rpc.statd[21644]: segfault at 2011 ip 00007fce39966cba sp 00007fffdecdb740 error 4 in libc-2.11.1.so[7fce39853000+17a000]
Aug 31 10:11:35 server kernel: [86872.690017] statd: server rpc.statd not responding, timed out
Aug 31 10:12:45 server kernel: [86942.606646] rpc.statd[21887]: segfault at 2011 ip 00007f6a45831cba sp 00007fff75c4b9c0 error 4 in libc-2.11.1.so[7f6a4571e000+17a000]
Aug 31 10:13:15 server kernel: [86972.630014] statd: server rpc.statd not responding, timed out
Aug 31 10:13:47 server kernel: [87005.004577] rpc.statd[22381]: segfault at 2011 ip 00007f9568963cba sp 00007fff82db1e90 error 4 in libc-2.11.1.so[7f9568850000+17a000]
Aug 31 10:16:47 server kernel: [87185.006961] rpc.statd[22455]: segfault at 2011 ip 00007fdcda8f1cba sp 00007fffb7995ba0 error 4 in libc-2.11.1.so[7fdcda7de000+17a000]
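
To get a quick overview of a log like the one above, the two kinds of messages can be counted separately. A minimal sketch: the function name `statd_log_summary` is hypothetical, and the log path is whatever file these kernel messages end up in on the system in question.

```shell
#!/bin/sh
# Hypothetical helper: count statd timeouts and rpc.statd segfaults in a log file.
# $1 = path to the log file (the location is an assumption; it varies by setup)
statd_log_summary() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    timeouts=$(grep -c 'statd: server rpc.statd not responding' "$1" || true)
    segfaults=$(grep -c 'rpc.statd\[[0-9]*\]: segfault' "$1" || true)
    echo "timeouts=$timeouts segfaults=$segfaults"
}
```

Running it before and after a restart of statd would show whether the timeouts stop or whether new segfaults keep appearing.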

After stopping nfs and portmap and restarting portmap then nfs, I could log in and mount the home folder, but every 30 to 60 min everyone loses their home folder until I run rpc.statd -Fd in a terminal.

Has anyone got any ideas on how to fix this?

Just to add to the other post: statd has been running the whole time.

ps aux | grep statd

root 14801 0.0 0.0 16672 1128 pts/0 S+ 14:13 0:00 rpc.statd -Fd
root 16282 0.0 0.0 4096 592 ? S 14:28 0:00 /bin/sh -c rpc.statd -Fd
root 16283 0.0 0.0 16672 1152 ? S 14:28 0:00 rpc.statd -Fd
root 20487 0.0 0.0 18732 1084 ? Ss Aug30 0:00 rpc.statd -L
root 21539 0.0 0.0 18732 1116 ? Ss Aug30 0:01 rpc.statd
root 29919 0.0 0.0 4096 596 ? S Aug30 0:00 /bin/sh -c rpc.statd -Fd >> /root/test.txt
root 29920 0.0 0.0 16672 1124 ? S Aug30 0:00 rpc.statd -Fd
jason 30710 0.0 0.0 7432 948 pts/3 S+ 16:46 0:00 grep statd

As you can see, the statd on line 4 has been running since I rebooted on the 30th.

Changed in nfs-utils (Ubuntu):
status: Incomplete → Triaged
importance: Undecided → Medium
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Changed in nfs-utils (Ubuntu Natty):
status: Triaged → New
Changed in nfs-utils (Ubuntu Lucid):
status: New → Triaged
Changed in nfs-utils (Ubuntu Natty):
importance: Medium → Undecided
Changed in nfs-utils (Ubuntu Lucid):
importance: Undecided → Medium
Changed in nfs-utils (Ubuntu Natty):
assignee: Canonical Foundations Team (canonical-foundations) → nobody
Changed in nfs-utils (Ubuntu Lucid):
assignee: nobody → Canonical Foundations Team (canonical-foundations)
assignee: Canonical Foundations Team (canonical-foundations) → nobody
milestone: none → lucid-updates
assignee: nobody → Canonical Foundations Team (canonical-foundations)
tags: added: regression-release
Changed in nfs-utils (Ubuntu Natty):
status: New → Triaged
Brian Murray (brian-murray) wrote :

This seems like a duplicate of bug 690401.

Clint Byrum (clint-fewbar) wrote :

Indeed, I've marked 690401 as a duplicate of this bug. I will update the changelog in the merge proposal accordingly.

Clint Byrum (clint-fewbar) wrote :

Looks like this one is also a duplicate of an even earlier bug, bug #525154
