NFS client hangs after some usage

Bug #1170087 reported by Joshua Kugler
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: Medium
Assigned to: Unassigned
Milestone:

Bug Description

We have an Ubuntu 10.04 machine. It mounts a directory statically (i.e. via a line in /etc/fstab) from an NFS server.

After some usage, the mount will simply hang. Any attempt to access the mount (e.g. ls /somedir) will just hang, and will have to be kill -9'ed.
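For monitoring, a hang like this can be detected without blocking the checking script itself. The following is only a minimal sketch, not something from the original report: it assumes that running `ls -d` on the mount point in a child process and giving up after a timeout is an acceptable probe. The function name `mount_responds` and the 5-second default are hypothetical.

```python
import subprocess


def mount_responds(path, timeout_s=5.0):
    """Return True if `ls -d path` exits successfully within timeout_s seconds."""
    proc = subprocess.Popen(
        ["ls", "-d", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        returncode = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort: send SIGKILL but do not wait for the child, because a
        # process blocked inside the kernel may take a long time to exit.
        proc.kill()
        return False
    return returncode == 0
```

Calling `mount_responds("/somedir")` on the affected client would be expected to return False while the mount is hung. The probe only reports the symptom; it does not say why the mount stopped responding.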

Trying to mount another NFS export on the same server, or another server, will simply fail (syscall failed). Trying to access the hung export from another machine works fine (i.e. the server side works, this is a client-side problem).

This is kernel 2.6.32-45-server.

What logs would you like? When trying to do this:

rpcdebug -m nfs -s all

All I see on an access attempt is:

Apr 17 10:47:53 brent kernel: [145406.995318] NFS: revalidating (0:15/2)
Apr 17 10:47:53 brent kernel: [145406.995326] NFS call getattr
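To tell which mount a `revalidating` line refers to, the identifier in parentheses can be matched against /proc/self/mountinfo. This is a rough sketch under an assumption not confirmed in this report: that `0:15` is the mount's device id (major:minor) and `2` is the file id. The function names are hypothetical; the mountinfo field order (mount ID, parent ID, major:minor, root, mount point) is the documented one.

```python
import re

# Matches e.g. "NFS: revalidating (0:15/2)"
REVALIDATE_RE = re.compile(r"NFS: revalidating \((\d+:\d+)/(\d+)\)")


def parse_revalidate(line):
    """Return (device_id, file_id) from a 'NFS: revalidating' log line, or None."""
    match = REVALIDATE_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def mount_points_for_device(mountinfo_text, device_id):
    """Return the mount points whose major:minor field equals device_id."""
    points = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # Fields: mount ID, parent ID, major:minor, root, mount point, ...
        if len(fields) >= 5 and fields[2] == device_id:
            points.append(fields[4])
    return points
```

On the client, the text would come from reading /proc/self/mountinfo, and the device id from `parse_revalidate()` applied to the kernel log line.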

If I do this:

rpcdebug -m rpc -s all

I get waaaaay too much info. :) This box also serves as an NFS server, but when this glitch happens, the NFS server is unaffected.

We cannot do upstream testing. We tried the latest kernel on another machine with the exact same config, and one of our network interfaces would not come up.

BIOS update available at:
http://www.supermicro.com/support/resources/results.aspx

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-2.6.32-45-server 2.6.32-45.104
Regression: No
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-45.104-server 2.6.32.60+drm33.26
Uname: Linux 2.6.32-45-server x86_64
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
Date: Wed Apr 17 12:17:37 2013
MachineType: Supermicro H8DGU
PciMultimedia:

ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.32-45-server root=/dev/mapper/vg_root-10.04 ro text console=tty0 console=ttyS0,115200n8
ProcEnviron:
 SHELL=/bin/bash
 PATH=(custom, no user)
 LANG=en_US.UTF-8
SourcePackage: linux
dmi.bios.date: 10/14/10
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1.0c
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: H8DGU
dmi.board.vendor: Supermicro
dmi.board.version: 1234567890
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 1234567890
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1.0c:bd10/14/10:svnSupermicro:pnH8DGU:pvr1234567890:rvnSupermicro:rnH8DGU:rvr1234567890:cvnSupermicro:ct3:cvr1234567890:
dmi.product.name: H8DGU
dmi.product.version: 1234567890
dmi.sys.vendor: Supermicro

Joshua Kugler (jkugler) wrote :

Googling for our problem, found this. Sounds like the issue may be similar. http://www.novell.com/support/kb/doc.php?id=7008148

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.9 kernel[0]. You will need to install both the linux-image and linux-image-extra .deb packages.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.9-rc7-raring/

Changed in linux (Ubuntu):
importance: Undecided → Medium
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joshua Kugler (jkugler) wrote :

Please see my comment in the original bug report:

"We cannot do upstream testing. We tried the latest kernel on another machine with the exact same config, and one of our network interfaces would not come up."

tags: added: kernel-unable-to-test-upstream
Joshua Kugler (jkugler) wrote :

We tried this kernel. This is what happened:

* Installed new kernel, and rebooted
* eth3 was no longer there
* discovered it was now called eth4
* Changed /etc/network/interfaces and rebooted.
* Everything came up fine.
* Noticed this during bootup: Not starting NFS kernel daemon: no support in current kernel.
* Could not mount exports from this server any more.
* At some point during the process, the serial console became unresponsive, and all network connections died.
* That pretty much ruled it out right there.
* Reverted changes to network config
* Rebooted into 3.9.
* Removed 3.9
* Rebooted into 2.6.x kernel

Additional information:

root@brent:~# /etc/init.d/nfs-kernel-server restart
 * Stopping NFS kernel daemon [ OK ]
 * Unexporting directories for NFS kernel daemon... [ OK ]
 * Not starting NFS kernel daemon: no support in current kernel.
root@brent:~# lsmod|grep nfs
nfsd 260175 2
auth_rpcgss 40877 1 nfsd
nfsv3 34774 1
nfs_acl 12883 2 nfsd,nfsv3
nfs 170267 2 nfsv3
fscache 57914 1 nfs
lockd 77317 3 nfsd,nfsv3,nfs
sunrpc 242438 16 nfsd,auth_rpcgss,nfsv3,nfs_acl,nfs,lockd
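Since `lsmod` shows `nfsd` loaded while the init script still reports no kernel support, a second, independent check may help. The sketch below looks for an `nfsd` entry in /proc/filesystems. This is an assumption about a useful cross-check, not necessarily the test the nfs-kernel-server init script performs, and the function name is hypothetical.

```python
def kernel_lists_filesystem(proc_filesystems_text, name):
    """Return True if `name` appears as a filesystem type in /proc/filesystems text."""
    for line in proc_filesystems_text.splitlines():
        fields = line.split()
        # Each line is either "nodev<TAB>name" or "<TAB>name"; the type is last.
        if fields and fields[-1] == name:
            return True
    return False
```

On the machine itself, the text would come from reading /proc/filesystems, for example `kernel_lists_filesystem(open("/proc/filesystems").read(), "nfsd")`.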

penalvch (penalvch)
tags: added: bios-outdated-r3.0b
description: updated
Joshua Kugler (jkugler) wrote :

I'm not sure how the BIOS update would affect us. The error posted from the BIOS update info has to do with the ALSA sound driver, not NFS.

penalvch (penalvch)
tags: removed: kernel-unable-to-test-upstream
penalvch (penalvch) wrote :

Joshua Kugler as per your :
[ 0.000000] AMI BIOS detected: BIOS may corrupt low RAM, working around it.
...
[ 0.000000] Your BIOS doesn't leave a aperture memory hole
[ 0.000000] Please enable the IOMMU option in the BIOS setup
...
[ 12.643794] EDAC amd64: f10_probe_valid_hardware() This machine is running with DDR3 memory. This is not currently supported. DCHR0=0x3f48090d DCHR1=0x3f48090d
[ 12.643797] EDAC amd64: Contact 'amd64_edac' module MAINTAINER to help add support.

Hence, as per http://www.supermicro.com/support/resources/results.aspx an update is available for your BIOS (r3.0b). If you update to this during your maintenance window, following https://help.ubuntu.com/community/BiosUpdate , does it change anything? If it doesn't, could you please both describe what happened and provide the output of the following terminal command:
sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date

For more on BIOS updates and linux, please see https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette .

Thank you for your understanding.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired