nfs client hangs

Bug #394413 reported by Keith Jackson
This bug affects 7 people
Affects: linux (Ubuntu) | Status: Won't Fix | Importance: Undecided | Assigned to: Unassigned

Bug Description

On Mon, Jun 29, 2009 at 04:29:25PM -0700, Keith Jackson wrote:
Hi,
I've run into a strange problem, and I'm not sure where to ask about
it. It appears to be kernel related, so hopefully someone here can
shed some light on it for me. I have over a dozen Ubuntu 9 server
machines that share an NFS filesystem served up on a FreeBSD machine.
We recently upgraded the FreeBSD box to FreeBSD 7.2. The ubuntu
machines are running the standard 2.6.28-11-server kernel.

At this point, we started having problems with NFS on all of the
ubuntu machines. They would come up, mount the NFS shares and then
5-10 minutes later NFS would be hung. I ran wireshark on one of them,
while doing "ls -l ~username". When NFS was working everything looked
fine. Once it hung, I would see the normal YP MATCH RPC call looking
for the entry for "username". The NIS server would return the password
entry for "username" as I'd expect. When things were working, I'd then
see the standard NFS traffic. When NFS was hosed, the YP MATCH reply
would be the last net traffic I would see to the NFS server. Nothing
NFS related would ever happen. An strace would show that the ls was
hung in a stat64 call.
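
The symptom described above (a stat call that never returns) can be probed without hanging the checking process itself. The following is a minimal sketch, not something used in this report: the function name `path_responds` and the timeout value are assumptions chosen for illustration. It runs the stat in a child process and gives up after a timeout.

```python
import subprocess
import sys

def path_responds(path, timeout_s=5.0):
    """Return True if a stat() on path finishes within timeout_s seconds.

    The stat runs in a child process, because a stat on a hung NFS mount
    can block inside the kernel and would otherwise block the caller too.
    A stat that fails quickly (for example, path missing) still counts as
    a response: only a timeout is reported as False.
    """
    cmd = [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path]
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: do not wait for the child, since it may be
        # stuck in the kernel and slow to die.
        proc.kill()
        return False
    return True
```

Calling `path_responds("/home/username")` from a cron job or monitoring script would flag the hang described here, assuming the home directory lives on the NFS share.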

A reboot of the system would cause things to start working for a
while, but 5-10 minutes later NFS would hang. I eventually updated to
a 2.6.30 kernel from: http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.30/
 This has fixed the problem.

Any thoughts on what the problem might be, or how I might go about
debugging it better?

Revision history for this message
Keith Jackson (krjackson) wrote :

I did some further testing. 2.6.29-02062905-generic from the mainline tree also has this problem, as do vmlinuz-2.6.28-11-server and vmlinuz-2.6.27-11-server. vmlinuz-2.6.24-21-server seems to be working fine. Since the problem shows up in mainline, I will focus on more testing there to find the first version that exhibits it.
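
Finding the first affected version among an ordered list of kernels is a binary search. The sketch below is illustrative only: `first_bad` and the `is_bad` callback are hypothetical names, and in practice `is_bad` would mean "boot this kernel, mount the share, and see whether NFS hangs". It assumes the oldest version is good, the newest is bad, and there is a single good-to-bad transition in between.

```python
def first_bad(versions, is_bad):
    """Return the first version for which is_bad(version) is True.

    versions must be ordered oldest to newest, with versions[0] known
    good and versions[-1] known bad. is_bad is a caller-supplied test.
    """
    lo, hi = 0, len(versions) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]
```

With six candidate versions this needs at most three test boots instead of up to four sequential ones, and the saving grows with the length of the list.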

affects: ubuntu → nfs-utils (Ubuntu)
Revision history for this message
buzzlightyear (lp-dewet) wrote :

Have you found any more information? I'm having the exact same issue between my Mythbuntu media centre and FreeNAS 0.7RC1 (based on FreeBSD 7.2).

Can someone confirm whether the later 2.6.28 kernels solve this issue, as this is all I'm offered by the upgrade utils?

There is a post on the issue at FreeNAS forum, https://sourceforge.net/apps/phpbb/freenas/viewtopic.php?f=49&t=1704

Some additional threads here: http://markmail.org/message/3tp3f4k4bq7voiuj?q=ubuntu+nfs+fin_wait2&page=1&refer=3tp3f4k4bq7voiuj

The thread above seems to rule out FreeBSD on the server side as the cause, because the server in that case is Ubuntu.

http://markmail.org/message/jdw2pzmjstwy4ctc?q=ubuntu+nfs+fin_wait2

This thread (http://markmail.org/message/hf63q26gdqiskltu?q=ubuntu+nfs+fin_wait2) seems to indicate that it could be due to a change in the client source code.

Revision history for this message
manojav (manojav) wrote :

I have this problem too. I am using FreeNAS 0.69.2 with two Ubuntu 9.04 clients and see the same problems. I have switched over to CIFS for now and hope this gets sorted soon. Please let me know if I need to provide any debug logs.

Steve Langasek (vorlon)
affects: nfs-utils (Ubuntu) → linux (Ubuntu)
Revision history for this message
kiilo (kiilo.org) wrote :

Hello,

Yes, this bug exists, and now for me too. I'm mounting an NFS export from a FreeBSD 7.2 server
successfully onto a system running (uname -a):

 Linux ashok 2.6.26-2-686 #1 SMP Wed Aug 19 06:06:52 UTC 2009 i686 GNU/Linux

The shares are accessible for a while, but after some minutes untouched I get this (netstat -ap):

 tcp 0 0 ashok.example.com:746 lahan.example.com:nfs FIN_WAIT2 -

I googled around and found the threads linked above useful, but I don't see any solution.
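
The FIN_WAIT2 state shown in the netstat output can also be detected from the text of /proc/net/tcp on the client. This is a minimal sketch under stated assumptions: the server listens on the standard NFS port 2049 (netstat prints it as "nfs"), the connection is IPv4, and `stuck_nfs_sockets` is a name made up for this example.

```python
NFS_PORT = 2049     # assumption: server uses the standard NFS port
FIN_WAIT2 = 0x05    # TCP state code as printed in /proc/net/tcp

def stuck_nfs_sockets(proc_net_tcp_text):
    """Return (local_port, remote_port) pairs in FIN_WAIT2 towards NFS_PORT.

    proc_net_tcp_text is the full text of /proc/net/tcp, where addresses
    are printed as HEXIP:HEXPORT and the state as a hex byte.
    """
    hits = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        if int(fields[3], 16) == FIN_WAIT2 and remote_port == NFS_PORT:
            hits.append((local_port, remote_port))
    return hits
```

On a live client the input would come from `open("/proc/net/tcp").read()`; a non-empty result matches the situation shown in the netstat line above.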

It would be nice if the machine could use NFS for more than 5 minutes again, ideally for weeks.

kiilo

Revision history for this message
kiilo (kiilo.org) wrote :

I found a WORKAROUND!

Mount the exports via UDP.

In your /etc/fstab, add the option udp.

The relevant line in my /etc/fstab looks like this:

 lahan.example.com:/nas/test /nas/test nfs rw,intr,udp 0 0

The nifty ",udp" changes a lot ...
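
For anyone applying this workaround to several entries, the edit to the options field can be sketched as below. `add_mount_option` is a hypothetical helper written for this example. It only appends a missing option to the fourth fstab field and does not check for a conflicting option such as tcp.

```python
def add_mount_option(fstab_line, option):
    """Return fstab_line with option appended to the options field.

    An fstab entry is whitespace-separated: device, mount point, type,
    options, then optionally dump and pass. The line is returned with
    single spaces between fields; the option is not added twice.
    """
    fields = fstab_line.split()
    if len(fields) < 4:
        raise ValueError("expected device, mount point, type and options")
    options = fields[3].split(",")
    if option not in options:
        options.append(option)
    fields[3] = ",".join(options)
    return " ".join(fields)
```

Applied to a line ending in `nfs rw,intr 0 0`, it yields the same entry with `rw,intr,udp`, as in the line quoted above.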

kiilo

Revision history for this message
buzzlightyear (lp-dewet) wrote :

I'll try this workaround to see if it solves my problem, thanks for the tip.

Revision history for this message
kiilo (kiilo.org) wrote :

small ADDON:

I discovered that the NFS server times out and transfers are dead slow, because the default MTU size doesn't fit. I can't change the MTU size for the network card (it's not supported by the hardware), so I have to minimize the read/write size on the client side:

My changed /etc/fstab:

 lahan.digris.ch:/nas/test /nas/test nfs rw,intr,udp,rsize=1492,wsize=1492 0 0

I now get transfers of up to 8Mb/s.

greets
kiilo

Revision history for this message
manojav (manojav) wrote :

kiilo, all these settings just postpone the problem. I have gotten 20MB/s+ on transfers, but they eventually do hang. Also, unless you are doing NFS mounts over a PPPoE ADSL connection, an MTU of 1492 means nothing. I have a gigabit network between the NFS server and client, where the MTU should be higher.
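
The relation between rsize/wsize and the MTU on a UDP mount comes down to IPv4 fragmentation arithmetic, which can be sketched as follows. This is a rough illustration, not a measurement from this thread: `ipv4_fragments` is a made-up name, the IPv4 header is assumed to carry no options (20 bytes), and the RPC and NFS headers that accompany the data are deliberately left out, so real messages are somewhat larger.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes

def ipv4_fragments(udp_payload_bytes, mtu):
    """Return how many IPv4 fragments one UDP datagram needs.

    Every fragment carries at most (mtu - IPV4_HEADER) bytes, rounded
    down to a multiple of 8 because fragment offsets count 8-byte units.
    """
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    total = UDP_HEADER + udp_payload_bytes
    return math.ceil(total / per_fragment)
```

Under these assumptions, 1492 bytes of data on a 1500-byte MTU already need two fragments, and 16384 bytes need twelve; any RPC/NFS header bytes only add to that.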

That is all beside the point; I believe we are still waiting for a kernel-level fix.

Revision history for this message
manojav (manojav) wrote :

After upgrading to karmic (2.6.31-14-generic #48-Ubuntu SMP), the NFS problems are gone.

I have had good success with udp mounting using autofs.

mountname -rw,user,auto,rsize=16384,wsize=16384,timeo=60,intr 192.168.1.17:/mnt/raid1d1/media

It works great. Autofs also unmounts them when they are not in use, which is useful for not hogging nfsd processes on the server.

Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

Hi Keith,

This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 394413

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
manojav (manojav) wrote :

Yes, I still see this issue. I will post the appropriate logs when this happens again. Again, this seems to happen only on Ubuntu kernels.

Revision history for this message
Keith Jackson (krjackson) wrote :

I have not seen this problem again after upgrading to karmic.

Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Revision history for this message
manojav (manojav) wrote :

This issue is back in Lucid. It is very frustrating that Ubuntu can't get this right. Running apport-collect now.

Changed in linux (Ubuntu):
status: Fix Released → Incomplete
status: Incomplete → New
Revision history for this message
manojav (manojav) wrote :

Sadly, apport-collect also hangs. Nice.

Revision history for this message
manojav (manojav) wrote : apport information

AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: manojav 1982 F.... pulseaudio
 /dev/snd/pcmC0D0p: manojav 1982 F...m pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfbff8000 irq 22'
   Mixer name : 'Realtek ALC883'
   Components : 'HDA:10ec0883,14627238,00100002'
   Controls : 34
   Simple ctrls : 20
DistroRelease: Ubuntu 10.04
HibernationDevice: RESUME=UUID=4f0783e4-1c11-43a4-94e6-17a6aeeeb22e
IwConfig:
 lo no wireless extensions.

 eth0 no wireless extensions.

 vboxnet0 no wireless extensions.
MachineType: MICRO-STAR INTERNATIONAL CO.,LTD MS-7238
NonfreeKernelModules: nvidia
Package: linux (not installed)
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-22-generic root=UUID=df327c18-088e-43ae-aa91-d075a3dc4099 ro splash quiet vga=769 quiet splash nomodeset video=uvesafb:mode_option=1280x1024-24,mtrr=3,scroll=ywrap
ProcEnviron:
 PATH=(custom, user)
 LANG=en_CA.utf8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-22.33-generic 2.6.32.11+drm33.2
Regression: Yes
RelatedPackageVersions: linux-firmware 1.34
Reproducible: Yes
RfKill:

Tags: lucid filesystem regression-release needs-upstream-testing
Uname: Linux 2.6.32-22-generic i686
UserGroups: adm admin audio cdrom dialout dip floppy fuse lpadmin plugdev vboxusers video
WpaSupplicantLog:

dmi.bios.date: 03/06/2007
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: V1.4
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: MS-7238
dmi.board.vendor: MICRO-STAR INTERNATIONAL CO.,LTD
dmi.board.version: 1.0
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: MICRO-STAR INTERNATIONAL CO.,LTD
dmi.chassis.version: 1.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrV1.4:bd03/06/2007:svnMICRO-STARINTERNATIONALCO.,LTD:pnMS-7238:pvr1.0:rvnMICRO-STARINTERNATIONALCO.,LTD:rnMS-7238:rvr1.0:cvnMICRO-STARINTERNATIONALCO.,LTD:ct3:cvr1.0:
dmi.product.name: MS-7238
dmi.product.version: 1.0
dmi.sys.vendor: MICRO-STAR INTERNATIONAL CO.,LTD

tags: added: apport-collected
Revision history for this message
manojav (manojav) wrote : apport information

Attached files: AlsaDevices.txt, AplayDevices.txt, ArecordDevices.txt, BootDmesg.txt, Card0.Amixer.values.txt, Card0.Codecs.codec.0.txt, CurrentDmesg.txt, Lspci.txt, Lsusb.txt, PciMultimedia.txt, ProcCpuinfo.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, UdevLog.txt, WifiSyslog.txt

Revision history for this message
manojav (manojav) wrote :

That apport information was collected after I switched over to CIFS for file sharing, because NFS keeps hanging and apport-collect also hung.

Revision history for this message
manojav (manojav) wrote :

syslog

Revision history for this message
manojav (manojav) wrote :

syslog

Revision history for this message
Harald (haraldboehmecke) wrote :

This is still an issue.

In my experience, I can receive files from NFS with no problems. As soon as I SEND files to the NFS server, the Ubuntu client hangs, partially. What I mean by hang is:

- NFS File Transfer continues, at normal speed.
- The GUI is 80% inoperable: no further programs can be launched and open programs tend to "freeze". Surprisingly, panel apps keep running. System Monitor stops scrolling CPU/memory/network info.
- If I stop the NFS Transfer, normal behavior is recovered. All Applications resume functionality.

Client:
Linux client 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:58:24 UTC 2010 x86_64 GNU/Linux

Server:
Linux server 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:58:24 UTC 2010 x86_64 GNU/Linux

Server Exports:
/var/lib/samba/documentos 192.168.2.0/24(rw,async,no_subtree_check)

Client Mounts /etc/fstab:
192.168.2.5:/var/lib/samba/documentos /home/user/Documents/server-documents nfs rsize=8192,wsize=8192,timeo=14,intr

Regards,

Harald

Revision history for this message
manojav (manojav) wrote :

This is also reported here and is not yet fixed:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/394413

Revision history for this message
Brad Figg (brad-figg) wrote : Unsupported series, setting status to "Won't Fix".

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Won't Fix