NFS server: lockd: server not responding

Bug #181996 reported by Denis Sidorov on 2008-01-11
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Undecided
Unassigned
Gutsy
Undecided
Unassigned
linux-source-2.6.22 (Ubuntu)
High
Unassigned
Gutsy
High
Unassigned

Bug Description

Running NFS server on Ubuntu Server 7.10 x86 (Pentium-3, 1G RAM).
- linux-image-2.6.22-14-server (2.6.22-14.47)
- nfs-kernel-server (1:1.1.1~git-20070709-3ubuntu1)
- nfs-common (1:1.1.1~git-20070709-3ubuntu1)

NFS clients (ubuntu, gentoo, fedora core) mount home directories from the server.
Works fine for a while after reboot, but at some moment (30 minutes to several days after last reboot) client applications (firefox, thunderbird, openoffice, ...) would freeze at start and the following error message can be seen in the syslog:

Jan 11 14:08:33 jig kernel: [ 5527.793749] lockd: server tango not responding, still trying
Jan 11 14:08:34 jig kernel: [ 5529.029039] lockd: server tango not responding, still trying
Jan 11 14:08:45 jig kernel: [ 5540.246812] lockd: server tango not responding, still trying

The nfsd, rpc.statd, rpc.mountd processes keep running on server. No relevant errors can be found in server syslog.

Restarting the nfs-kernel-server (on server) and nfs-common (on both server and client) would not help - the problem persists.

Have also tried nfs-user-server instead of nfs-kernel-server - no luck.

The only way to make it work is to reboot the server.

description: updated
Denis Sidorov (sidorov-denis) wrote :

Since I have downgraded kernel to 2.6.20 (a week ago), the error does not show up anymore.
It appears to be a bug in the kernel, because I found a similar issue reported for Fedora Core 7, also running 2.6.22.

the.jxc (jonathan-spiderfan) wrote :

I can confirm the same on a brand new Gutsy install with 2.6.22-14. It usually occurs within 24 hours. When it happens, Skype, Amarok and other apps hang first, but eventually all apps hang. A client reboot does nothing. Restarting nfs-kernel-server doesn't help (it doesn't clear the broken lockd; see below). Only a daily server reboot resolves anything.

On the client side (also 2.6.22-14) I see:
syslog.0:Jan 17 23:23:21 romita kernel: [28308.368819] lockd: server kirby not responding, still trying

On the server side, if I restart nfs-kernel-server, I see:
Jan 18 08:55:37 kirby kernel: [62797.376546] lockd_down: lockd failed to exit, clearing pid

...and on the server side I will now see TWO "[lockd]" processes where before I saw one.

I don't have a 2.6.20 kernel to go back to on my new server. This is basically making my server totally unusable. I'm looking at having to drop nfs and use samba instead. GACK!

the.jxc (jonathan-spiderfan) wrote :

Feel free to contact me if I can offer help debugging this.

the.jxc (jonathan-spiderfan) wrote :

OK, I turned on debug with:

echo "65535" > /proc/sys/sunrpc/nlm_debug

There seems to be a problem when lockd enters garbage collection. Here's the last of the debug seen from lockd on the server side.

[ 2277.091005] lockd: request from 192.168.1.210, port=864
[ 2277.091018] lockd: LOCK called
[ 2277.091022] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 2277.091026] lockd: get host romita
[ 2277.091027] lockd: nsm_monitor(romita)
[ 2277.091031] lockd: nlm_file_lookup (01070001 00288001 00000000 926e57da d142d9c6 dabb48bd c2a30bcf 0028c75d)
[ 2277.091035] lockd: creating file for (01070001 00288001 00000000 926e57da d142d9c6 dabb48bd c2a30bcf 0028c75d)
[ 2277.091047] lockd: found file f7be3900 (count 0)
[ 2277.091050] lockd: nlmsvc_lock(sda1/2672477, ty=0, pi=58832, 1073741824-1073741824, bl=0)
[ 2277.091054] lockd: nlmsvc_lookup_block f=f7be3900 pd=58832 1073741824-1073741824 ty=0
[ 2277.091056] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 2277.091058] lockd: get host romita
[ 2277.091062] lockd: created block ecbfe6c0...
[ 2277.091066] lockd: vfs_lock_file returned 0
[ 2277.091068] lockd: freeing block ecbfe6c0...
[ 2277.091069] lockd: release host romita
[ 2277.091071] lockd: nlm_release_file(f7be3900, ct = 2)
[ 2277.091073] lockd: nlmsvc_lock returned 0
[ 2277.091075] lockd: LOCK status 0
[ 2277.091076] lockd: release host romita
[ 2277.091078] lockd: nlm_release_file(f7be3900, ct = 1)

[ 2277.091298] lockd: request from 192.168.1.210, port=864
[ 2277.091302] lockd: LOCK called
[ 2277.091304] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 2277.091306] lockd: get host romita
[ 2277.091307] lockd: nsm_monitor(romita)
[ 2277.091310] lockd: nlm_file_lookup (01070001 00288001 00000000 926e57da d142d9c6 dabb48bd c2a30bcf 0028c75d)
[ 2277.091316] lockd: found file f7be3900 (count 0)
[ 2277.091319] lockd: nlmsvc_lock(sda1/2672477, ty=0, pi=58832, 1073741826-1073742335, bl=0)
[ 2277.091322] lockd: nlmsvc_lookup_block f=f7be3900 pd=58832 1073741826-1073742335 ty=0
[ 2277.091325] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 2277.091327] lockd: host garbage collection
[ 2277.091328] lockd: nlmsvc_mark_resources

Nothing more is seen from the lockd after the start of the GC. Looking at earlier GC runs from the syslog, the pattern is:

[ 2037.388911] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 2037.388914] lockd: host garbage collection
[ 2037.388916] lockd: nlmsvc_mark_resources
[ 2037.388920] nlm_gc_hosts skipping romita (cnt 0 use 0 exp 455264)
[ 2037.388922] nlm_gc_hosts skipping ditko (cnt 0 use 0 exp 460016)
[ 2037.388924] lockd: get host romita

So it finds a couple of entries (skips 'em) and then breaks out to carry on immediately with "get host". I'm assuming that GC is invoked as part of lookup handling, and doesn't just get triggered asynchronously.

Anyhow, this looks like a good spot to start digging. I don't see anything running in top (does lockd even show in top?), but the process is still in the ps table. It just doesn't do anything...


the.jxc (jonathan-spiderfan) wrote :

OK, I added more debug to /usr/src/linux/fs/lockd/host.c and installed a new lockd module. Seems like it's getting lost somewhere in nlmsvc_mark_resources(). I'll keep digging.

the.jxc (jonathan-spiderfan) wrote :

It's getting lost in nlm_inspect_file ().

[ 693.679373] lockd: mutex acquired, checking 128 file hash entries
[ 693.679375] lockd: got entry in list 58
[ 693.679376] lockd: inspecting file

        dprintk("lockd: mutex acquired, checking %d file hash entries\n", FILE_NRHASH);
        for (i = 0; i < FILE_NRHASH; i++) {
                hlist_for_each_entry_safe(file, pos, next, &nlm_files[i], f_list) {
                        dprintk("lockd: got entry in list %d\n", i);
                        file->f_count++;
                        mutex_unlock(&nlm_file_mutex);

                        /* Traverse locks, blocks and shares of this file
                         * and update file->f_locks count */
                        dprintk("lockd: inspecting file\n");
                        if (nlm_inspect_file(host, file, match))
                                ret = 1;

                        dprintk("lockd: inspection complete\n");

...it never returns from nlm_inspect_file (...).

the.jxc (jonathan-spiderfan) wrote :

Right, this appears (unsurprisingly) to be a mutex contention issue, on the file-specific mutex.

See the attached trace: server-kirby-v3.dmsg

Key parts are:

[ 5845.725268] lockd: request from 192.168.1.210, port=860
[ 5845.725272] lockd: LOCK called
[ 5845.725274] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 5845.725504] lockd: get host romita
[ 5845.725506] lockd: found host in cache
[ 5845.725507] lockd: nsm_monitor(romita)
[ 5845.725509] lockd: nlm_file_lookup (01070001 00288001 00000000 926e57da d142d9c6 dabb48bd c2a30bcf 0028c75d)
[ 5845.725789] lockd: found file f7aa2840 (count 0)
[ 5845.725792] lockd: nlmsvc_lock(sda1/2672477, ty=0, pi=80357, 1073741826-1073742335, bl=0)
[ 5845.725806] lockd: nlmsvc_lookup_block f=f7aa2840 pd=80357 1073741826-1073742335 ty=0
[ 5845.725809] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 5845.726186] lockd: host garbage collection
[ 5845.726188] lockd: nlmsvc_mark_resources
[ 5845.726189] lockd: nlm_traverse_files
[ 5845.726255] lockd: mutex acquired, checking 128 file hash entries
[ 5845.726257] lockd: got entry in list 29
[ 5845.726259] lockd: inspecting file f=f7afd480
[ 5845.726260] lockd: traverse blocks
[ 5845.726262] lockd: locking file mutex
[ 5845.726483] lockd: unlocking file mutex
[ 5845.726485] lockd: traverse shares
[ 5845.726486] lockd: traverse locks
[ 5845.726488] lockd: inspection complete
[ 5845.726625] lockd: check file for release
...
(Same pattern repeated for several other files)
...
[ 5845.728644] lockd: got entry in list 58
[ 5845.728645] lockd: inspecting file f=f7aa2840
[ 5845.728646] lockd: traverse blocks
[ 5845.728648] lockd: locking file mutex
...
The final debug is from nlmsvc_traverse_blocks() in /usr/src/linux/fs/lockd/svclock.c

        dprintk("lockd: locking file mutex\n");
        mutex_lock(&file->f_mutex);
        list_for_each_entry_safe(block, next, &file->f_blocks, b_flist) {
                dprintk("lockd: trying block for host %p\n", host);
                ...
        }
        dprintk("lockd: unlocking file mutex\n");
        mutex_unlock(&file->f_mutex);

And it's clear now that we're calling mutex_lock() and never returning from it.

The important note is that all the previous file checks worked. Why is there a mutex
already held on only this file? Note that this is the file from the request that
actually triggered the GC. Presumably the mutex is taken for this file, then we run
the GC, and we attempt to take the same mutex out again. I'll trawl the code and
confirm this.

If so, the fix is probably to move the call to the GC so that it's outside the handling
for the actual RPC call. In fact, the mutex isn't strictly required for the GC because
in this case we're only counting host references. But it looks like we're doing our
reference count by piggybacking on some other code which actually does sweeps of
file locks, so we can't just remove the mutexes.

the.jxc (jonathan-spiderfan) wrote :

OK, let's follow the request and see what is performing a file mutex lock.

[ 5845.725268] lockd: request from 192.168.1.210, port=860
This is from the main lockd kernel thread function.
static void lockd (...) in svc.c. It invokes svc_process().

[ 5845.725272] lockd: LOCK called
Via some xdr magic, preprocessor, and function lookup table, our main handler function
nlmsvc_proc_lock (...) from svcproc.c is called.

[ 5845.725274] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 5845.725504] lockd: get host romita
[ 5845.725506] lockd: found host in cache
nlmsvc_proc_lock (...) invokes nlmsvc_retrieve_args (...) also in svcproc.c to get/parse
some args, including the host. In this case, the host is found in the cache.

[ 5845.725507] lockd: nsm_monitor(romita)
nlmsvc_retrieve_args (...) also monitors the host in some way that isn't clear to me yet.
It doesn't appear to be related to our problem, so that can be put aside for now.

[ 5845.725509] lockd: nlm_file_lookup (01070001 00288001 00000000 926e57da d142d9c6 dabb48bd c2a30bcf 0028c75d)
nlmsvc_retrieve_args (...) also does a file lookup by calling nlm_lookup_file(...) This debug
is from nlm_lookup_file (even though it says nlm_file_lookup). We take out the file table
mutex here, but not the file-specific mutex. We initialise the file mutex here, so from this
point onwards we need to be looking out for file specific locks.
[ 5845.725789] lockd: found file f7aa2840 (count 0)

[ 5845.725792] lockd: nlmsvc_lock(sda1/2672477, ty=0, pi=80357, 1073741826-1073742335, bl=0)
Now nlmsvc_proc_lock (...) calls nlmsvc_lock (...) from svclock.c to do the actual locking. Very
first thing, right after the debug, this takes out the mutex on the file...

        /* Lock file against concurrent access */
        mutex_lock(&file->f_mutex);

The corresponding...
        mutex_unlock(&file->f_mutex);
...is right down the bottom of nlmsvc_lock (...)...
out:
        mutex_unlock(&file->f_mutex);
        nlmsvc_release_block(block);
        dprintk("lockd: nlmsvc_lock returned %u\n", ret);

...but we don't get that far. I think we've found it then. But let's carry on...

[ 5845.725806] lockd: nlmsvc_lookup_block f=f7aa2840 pd=80357 1073741826-1073742335 ty=0
The call from nlmsvc_lock (...) to nlmsvc_lookup_block (...) is right after the file-specific
mutex lock is taken out. We don't find an existing block, so nlmsvc_lock (...) creates a
new one by calling nlmsvc_create_block (...).

[ 5845.725809] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
nlmsvc_create_block (...) calls nlmsvc_lookup_host (...) ...

[ 5845.726186] lockd: host garbage collection
...which decides it's time to take out the trash.

[ 5845.726188] lockd: nlmsvc_mark_resources
[ 5845.726189] lockd: nlm_traverse_files
[ 5845.726255] lockd: mutex acquired, checking 128 file hash entries
[ 5845.726257] lockd: got entry in list 29
[ 5845.726259] lockd: inspecting file f=f7afd480
[ 5845.726260] lockd: traverse blocks
[ 5845.726262] lockd: locking file mutex
...which goes through all the files fine, until it comes to the specific file for which we are currently
serving the request.


the.jxc (jonathan-spiderfan) wrote :

Hmm... one quick-fix approach would seem to be to pass LOCK_RECURSIVE to mutex_init.

On that note, it's not clear to me yet why we're even using mutexes here. Isn't there only a single lockd process? And in that case, all these mutexes are private, no? Or is it possible to start two lockd's for higher performance (not something I've ever done).

Alternatively, create a new function:

/*
 * Check to see if it's time to sweep the garbage out of the hosts structures.
 */
static void
nlm_gc_hosts_if_needed(void)
{
        if (time_after_eq(jiffies, next_gc))
                nlm_gc_hosts();
}

...remove the corresponding code from nlm_lookup_host (...), and invoke nlm_gc_hosts_if_needed from somewhere outside the file-specific mutex code. Maybe in the lockd main loop, after each call to svc_process (...).

I think I'll try that with my code. I'm just a bit worried about performance impact of making all file mutexes recursive. Surely a recursive mutex has to be a bit of a hit compared to the vanilla version?

the.jxc (jonathan-spiderfan) wrote :

I went with the nlm_gc_hosts_if_needed () approach. Stable so far. Debug shows completion of GC.

[ 6879.405447] lockd: request from 192.168.1.211, port=729
[ 6879.405454] lockd: LOCK called
[ 6879.405458] lockd: nlm_lookup_host(192.168.1.211, p=6, v=4, my role=server, name=ditko)
[ 6879.405460] lockd: get host ditko
[ 6879.405461] lockd: found host in cache
[ 6879.405463] lockd: nsm_monitor(ditko)
[ 6879.405466] lockd: nlm_file_lookup (01070001 00288001 00000000 926e57da d142d9c6 dabb48bd c2a30bcf 002a0bca)
[ 6879.405470] lockd: creating file for (01070001 00288001 00000000 926e57da d142d9c6 dabb48bd c2a30bcf 002a0bca)
[ 6879.405477] lockd: found file f7a3fcc0 (count 0)
[ 6879.405481] lockd: nlmsvc_lock(sda1/2755530, ty=1, pi=95, 0-9223372036854775807, bl=0)
[ 6879.405484] lockd: nlmsvc_lookup_block f=f7a3fcc0 pd=95 0-9223372036854775807 ty=1
[ 6879.405487] lockd: nlm_lookup_host(192.168.1.211, p=6, v=4, my role=server, name=ditko)
[ 6879.405488] lockd: get host ditko
[ 6879.405489] lockd: found host in cache
[ 6879.405492] lockd: created block ef70db80...
[ 6879.405495] lockd: vfs_lock_file returned 0
[ 6879.405497] lockd: freeing block ef70db80...
[ 6879.405498] lockd: release host ditko
[ 6879.405500] lockd: nlm_release_file(f7a3fcc0, ct = 2)
[ 6879.405502] lockd: nlmsvc_lock returned 0
[ 6879.405503] lockd: LOCK status 0
[ 6879.405504] lockd: release host ditko
[ 6879.405506] lockd: nlm_release_file(f7a3fcc0, ct = 1)
[ 6879.405512] lockd: host garbage collection
[ 6879.405513] lockd: nlmsvc_mark_resources
[ 6879.405515] lockd: nlm_traverse_files
[ 6879.405516] lockd: mutex acquired, checking 128 file hash entries
[ 6879.405519] lockd: got entry in list 109
[ 6879.405520] lockd: inspecting file f=f7a3fcc0
[ 6879.405521] lockd: traverse blocks
[ 6879.405525] lockd: locking file mutex
[ 6879.405526] lockd: unlocking file mutex
[ 6879.405527] lockd: traverse shares
[ 6879.405528] lockd: traverse locks
[ 6879.405530] lockd: inspection complete
[ 6879.405531] lockd: check file for release
[ 6879.405532] lockd: nlm_traverse_files finally releasing mutex
[ 6879.405533] lockd: nlm_traverse_files completed
[ 6879.405535] lockd: now removing inactive hostsnlm_gc_hosts skipping romita (cnt 0 use 0 exp 1672246)
[ 6879.405538] nlm_gc_hosts skipping ditko (cnt 0 use 1 exp 1672627)
[ 6879.405540] lockd: completed host garbage collection, next at (1642627 + 15000 = 1657627)
[ 6879.406106] lockd: request from 192.168.1.211, port=729
...

I'm missing a \n in a dprintk. Otherwise looks sweet.

the.jxc (jonathan-spiderfan) wrote :

This is fixed in the 2.6.24 kernel series.

I installed:

linux-image-2.6.24-5-generic_2.6.24-5.8_i386.deb
linux-ubuntu-modules-2.6.24-5-generic_2.6.24-5.9_i386.deb

from:

http://packages.ubuntu.com/hardy/base/

(After making the changes to yaird required to install it)
vi /usr/lib/yaird/perl/Input.pm
--- Input.pm.orig	2007-10-22 18:29:27.000000000 +0200
+++ Input.pm	2007-12-11 15:39:52.000000000 +0100
@@ -54,6 +54,11 @@
 		my $devLink = Conf::get('sysFs')
 			. "/class/input/$handler/device";
 		my $hw = readlink ($devLink);
+		if (defined ($hw) && $hw =~ s!^(\.\./)+(class/input/input\d+)$!$2!) {
+			# Linux 2.6.23 eventX -> inputX link
+			$devLink = Conf::get('sysFs') . '/' . $hw . '/device';
+			$hw = readlink ($devLink);
+		}
 		if (defined ($hw)) {
 			unless ($hw =~ s!^(\.\./)+devices/!!) {
 				# imagine localised linux (/sys/geraete ...)
...it all works fine. I'll try and track down the patchset required to fix the Gibbon kernel.

Ben Beuchler (insyte) wrote :

Any progress on a patch? I'm running into the same problem. If not, would you mind providing a bit more info describing the necessary steps to get the 2.6.24 kernel installed on a Gutsy server?

Thanks...

the.jxc (jonathan-spiderfan) wrote :

Second part is easy. Fix yaird as above, download the .deb files, and install them both with "dpkg -i". I had no hassles with that.

J. Bruce Fields suggested the following two patches, but I didn't use those.

http://git.linux-nfs.org/?p=trondmy/nfs-2.6.git;a=commitdiff;h=255129d1e9ca0ed3d69d5517fae3e03d7ab4b806
http://git.linux-nfs.org/?p=trondmy/nfs-2.6.git;a=commitdiff;h=a6d85430424d44e946e0946bfaad607115510989

...I just downloaded the ubuntu source for the kernel I had, and manually patched the lockd driver.

Russel Winder (russel) wrote :

I am running fully up to date Gutsy server and am getting what I think is the same problem as is reported here. After an indeterminate amount of time and/or activity, the [lockd] process on the server goes from S state to D state and all queries from clients result in messages such as:

Feb 28 07:59:36 balin kernel: [73693.569139] lockd: server dimen not responding, still trying

and hang forever.

I tried stopping and then starting nfs-common and nfs-kernel-server but the [lockd] process remains and in state D. Killing it explicitly has no apparent effect. A new [lockd] process appears in the process table after the restart of nfs-kernel-server but it appears not to be used.

The only remedy appears to be to reboot the server and then it seems all the clients.

It seems that the solution to the problem may now be known, so I guess the question is when will an update to the Gutsy kernel be issued? I guess it goes without saying that it would be good if the kernel issued with Hardy does not have this problem?

Thanks.

Yeah,

I know how to fix the problem, but I have no idea how to get a patch into Gutsy. Any ideas who I would contact?

J.

Russel Winder wrote:
> I am running fully up to date Gutsy server and am getting what I think
> is the same problem as is reported here. [...]

Russel Winder (russel) wrote :

I would have thought that the Ubuntu Kernel Team would have looked at this problem -- especially as there is a putative fix. However, it seems it may not yet have even been triaged by them. The problem, at least as I see it, is that there is no regularity to the failure. This must make it hard to actively work on.

the.jxc (jonathan-spiderfan) wrote :

No,

The failure is very regular. It happens whenever the garbage collection is performed as a result of a lock request.

J.

Brett Sealey (brett-sealey) wrote :

I've been seeing it for a while now, but only when I run an application on the nfs client that intensively uses file locking.

The only fix is to reboot the server.

When it occurs, the following hangs on the client (in the flock):
       time flock ~/junk echo ok; rm ~/junk

[note: flock is in the util-linux package]

A fix in Gutsy seems simple and would be very nice.

From the comments here it seems this is resolved for the Hardy kernel, so I'm marking this "Fix Released" against the Hardy 'linux' kernel source package. The kernel stable release update policy is fairly strict: https://wiki.ubuntu.com/KernelUpdates . If someone could confirm that the two patches mentioned in comment https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/181996/comments/15 resolve the issue for Gutsy, the kernel team may take this into consideration for an SRU. Until then, the task against 2.6.22 will be closed. Thanks.

Changed in linux:
status: New → Fix Released
Changed in linux-source-2.6.22:
status: New → Won't Fix
Jesper Krogh (jesper) wrote :

I can confirm that the above 2 patches solve the problem.

The problem is really grave, making the NFS server in Gutsy barely usable. The locking problem occurred about every second day here. I applied the patch over a week ago and haven't seen the problem since.

Jesper

Jesper Krogh (jesper) wrote :

Leann Ogasawara: Should we provide more to get a SRU for this bug in gutsy?

Jesper

the.jxc (jonathan-spiderfan) wrote :

What's an SRU? I'd love to know more about the process for getting fixes into Ubuntu. Please explain!

Jesper Krogh (jesper) wrote :

SRU is a StableReleaseUpdate; that's described in the links above. It's the process for getting fixes pushed to a "stable release".

Jesper Krogh (jesper) wrote :

Changing to Confirmed, as described by Leann Ogasawara, now that the patches are confirmed to work on a Gutsy system.

Changed in linux-source-2.6.22:
status: Won't Fix → Confirmed

Hi Jesper,

Thanks so much for testing and the feedback. I've reopened the Gutsy nomination and have reassigned to the kernel team.

For anyone wanting more information about the Stable Release Policy also refer to: https://wiki.ubuntu.com/StableReleaseUpdates .

Thanks again for the testing and the help. We definitely appreciate your patience and cooperation.

Changed in linux:
status: New → Invalid
Changed in linux-source-2.6.22:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → High
milestone: none → gutsy-updates
status: New → Triaged
assignee: nobody → ubuntu-kernel-team
importance: Undecided → High
milestone: none → gutsy-updates
status: Confirmed → Triaged
JT (spikyjt) wrote :

I'd just like to add that I have this problem too and thank all those who have provided debugging info. This bug has been crippling my system for some time, and confusing me greatly.

I would like to tentatively ask if there is any further progress with adding the patch into a release update? I shall test the patches myself to add another confirmed success with them (I hope) and report back.

I have to say I find it a little scary that this kernel version could go out as a "stable" release with this bug in it. Do not many people use NFS in ubuntu circles? I thought it would be considered an essential service.

Thanks again for all your help.

Russel Winder (russel) wrote :

I am now running Hardy with kernel 2.6.24-16-server and have not seen this problem for 8 days now. Is it the case that the kernel was patched and this is a patched kernel? If it is I am very happy and thankful to those who did the debugging and the patching. If not, then has the problem been circumvented?

Thanks.

Jesper Krogh (jk-novozymes) wrote :

Well, since the problem is only present on the Gutsy kernel, it is quite obvious that you cannot reproduce it on the Hardy kernel. The patch above is from the patch stream between Gutsy and Hardy.

Jesper

Changed in linux-source-2.6.22:
assignee: ubuntu-kernel-team → colin-king
Tim Gardner (timg-tpi) wrote :

There are a series of NFS patches pending on the SRU process. Any day now...

Changed in linux-source-2.6.22:
assignee: colin-king → timg-tpi
status: Triaged → Fix Committed
Tim Gardner (timg-tpi) on 2008-05-28
Changed in linux-source-2.6.22:
status: Triaged → Fix Committed
Shang Wu (shangwu) wrote :

Any update on this? Has it been released yet??

Tim Gardner (timg-tpi) wrote :

Released in 2.6.22-14.53

Changed in linux-source-2.6.22:
assignee: timg-tpi → nobody
status: Fix Committed → Fix Released
status: Fix Committed → Fix Released
Vincent A (vja) wrote :

After getting the same problem last week ("lockd: server ... not responding, timed out" on client; unkillable lockd on server) I had a look at the source of the linux-image-2.6.22-15-generic package that we're using. To my surprise, I couldn't confirm that the patches mentioned in https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/181996/comments/15 had been applied. Can anyone comment on this?

Details:
$ dpkg -s linux-image-generic |grep ^Version
Version: 2.6.22.15.22
$ apt-get source linux-image-2.6.22-15-generic
[...]
$ less linux-source-2.6.22-2.6.22/fs/lockd/svclock.c

Bart Swennen (bswennen) wrote :

I've come to the same conclusion as Vincent in https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/181996/comments/35 : the 2.6.22-15 kernel seems not to have those patches applied ... any chance it will in the near future ?

I've looked at the sources in linux-source-2.6.22_2.6.22-15.58_all.deb

Upgrading to hardy is not (yet) an option, but we really would like to use a `normal' Ubuntu-gutsy-kernel, which we cannot now because of this bug.

Eckart Haug (ubuntu-syntacs) wrote :

Upgrading to Hardy won't help; still the same.

client:
2.6.24-19-generic #1 SMP Wed Aug 20 22:56:21 UTC 2008 i686 GNU/Linux

says:
Oct 7 10:54:15 lagaffe kernel: [ 3099.897267] lockd: server tide not responding, still trying
Oct 7 10:54:16 lagaffe kernel: [ 3101.624752] lockd: server tide not responding, still trying

server:
2.6.24-19-server #1 SMP Sat Jul 12 00:40:01 UTC 2008 i686 GNU/Linux

says:
Oct 7 10:56:15 tide kernel: [3364891.912872] lockd: server lagaffe not responding, timed out
Oct 7 10:56:15 tide kernel: [3364891.912939] lockd: couldn't create RPC handle for lagaffe
Oct 7 10:56:15 tide kernel: [3364891.913118] rpcbind: server lagaffe not responding, timed out

the.jxc (jonathan-spiderfan) wrote :

Can't agree with you there, Eckart. I upgraded to Hardy and all my problems with NFS disappeared.

jcouper@kirby:~$ uname -a
Linux kirby 2.6.24-19-generic #1 SMP Wed Aug 20 22:56:21 UTC 2008 i686 GNU/Linux

...and hasn't hung once in months. Used to hang at least once a day under Gibbon.

Bart Swennen (bswennen) wrote :

Same here: do not agree with Eckart: we use the hardy kernel on an otherwise Gutsy installation and the problem stays away.

When booting the Gutsy kernel, it promptly pops up again (within a day).

Eckart Haug (ubuntu-syntacs) wrote :

I tried the generic kernel (as opposed to server). It worked for a couple of days, then the same again.
Until about 4 weeks ago the problem appeared sporadically, then almost every day - without
any change to the server (no automatic updates). It might depend on the configuration or certain
packages on the client. My home resides on the server. Within the time in question I installed
VirtualBox on the client. It adds a script which adds a tap device (but doesn't activate a bridge).
It might also depend on my slow server hardware (PIII-866/256MB).
I'm mounting nolock for the moment :-)), seems to work fine

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

I have exactly the same problem on a Hardy server (Should I open a new bug report ?):
   * linux-image-server 2.6.24.23.25
   * nfs-kernel-server 1:1.1.2-2ubuntu2.2
   * nfs-common 1:1.1.2-2ubuntu2.2

If I reboot the server, it works for only a few minutes.

Eckart Haug (ubuntu-syntacs) wrote :

(Storm)
I had been using the nolock option since then - that means working with locking disabled,
which - of course - worked.

On 10.02. I enabled locking again to give it a try. No problems since then.
Kernels are 2.6.24-23-generic on both client and server
nfs-kernel-server and nfs-common are 1:1.1.2-2ubuntu2.2

I still don't think it's a new problem - it just shows up in very special cases,
which we don't know. Over here it disappeared as randomly as it appeared before
- and you still have it. When did it appear at your site? Which changes did you make
before?

If you post, have a look at https://wiki.ubuntu.com/KernelTeamBugPolicies
Over here, we're on our own now.

Guido Nickels (gsn) wrote :

Hi!

We're experiencing the bug on hardy here, too:

- snip -
Sep 3 11:22:57 recovery1 kernel: [68409.731835] rpcbind: server s03.hallopizza.org not responding, timed out
Sep 3 11:22:57 recovery1 kernel: [68409.731876] lockd: server s03.hallopizza.org not responding, timed out
Sep 3 11:22:57 recovery1 kernel: [68409.731895] lockd: couldn't create RPC handle for s03.hallopizza.org
Sep 3 11:23:57 recovery1 kernel: [68469.578518] rpcbind: server s03.hallopizza.org not responding, timed out
Sep 3 11:23:57 recovery1 kernel: [68469.578559] lockd: server s03.hallopizza.org not responding, timed out
Sep 3 11:23:57 recovery1 kernel: [68469.578568] lockd: couldn't create RPC handle for s03.hallopizza.org
- snap -

Versions:
linux-image-2.6.24-24-generic 2.6.24-24.59
nfs-common 1:1.1.2-2ubuntu2.2
nfs-kernel-server 1:1.1.2-2ubuntu2.2

Only a reboot helps, but not for long - and we can't disable locking, as some customers depend on it.

Please tell me if I can help with debug information.

Cheers!

Guido

Arie Skliarouk (skliarie) wrote :

We use initrd.img-2.6.24-19-openvz with a bunch of Linux clients without any problems.
Recently I tried to add a Mac OS X client and immediately noticed that the nfs-kernel-server on Linux started locking up for several seconds every minute (thus stalling NFS access for every other client), with the following message printed in the logs:
Jan 10 11:25:32 ubuntu1 kernel: [15421367.859941] rpcbind: server boaz-macbook.local not responding, timed out
Jan 10 11:25:32 ubuntu1 kernel: [15421367.859965] lockd: couldn't create RPC handle for boaz-macbook.local

I had to switch the Mac OS X client to use Samba instead.
