nfsd hangs

Bug #1315955 reported by w00key on 2014-05-04
This bug affects 29 people
Affects              Status      Importance  Assigned to  Milestone
NFS-Utils            New         Undecided   Unassigned
nfs-utils (Ubuntu)   Incomplete  Critical    Unassigned
  Trusty             Incomplete  Critical    Unassigned

Bug Description

On a relatively busy NFS server, the system hung on us with the following messages:

May 4 07:53:36 wol-nfs kernel: [487678.715589] INFO: task nfsd:2793 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.715653] Not tainted 3.13.0-24-generic #46-Ubuntu
May 4 07:53:36 wol-nfs kernel: [487678.715695] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 4 07:53:36 wol-nfs kernel: [487678.715790] nfsd D ffff88023fc14440 0 2793 2 0x00000000
May 4 07:53:36 wol-nfs kernel: [487678.715800] ffff88023317fca0 0000000000000002 ffff880233268000 ffff88023317ffd8
May 4 07:53:36 wol-nfs kernel: [487678.715807] 0000000000014440 0000000000014440 ffff880233268000 ffffffffa03520a0
May 4 07:53:36 wol-nfs kernel: [487678.715811] ffffffffa03520a4 ffff880233268000 00000000ffffffff ffffffffa03520a8
May 4 07:53:36 wol-nfs kernel: [487678.715818] Call Trace:
May 4 07:53:36 wol-nfs kernel: [487678.715860] [<ffffffff8171a3a9>] schedule_preempt_disabled+0x29/0x70
May 4 07:53:36 wol-nfs kernel: [487678.715865] [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
May 4 07:53:36 wol-nfs kernel: [487678.715870] [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
May 4 07:53:36 wol-nfs kernel: [487678.715905] [<ffffffffa033be55>] nfs4_lock_state+0x15/0x20 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.715917] [<ffffffffa032e858>] nfsd4_open+0xd8/0x8f0 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.715928] [<ffffffffa032f5da>] nfsd4_proc_compound+0x56a/0x7b0 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.715937] [<ffffffffa031bd2b>] nfsd_dispatch+0xbb/0x200 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.715961] [<ffffffffa026a63d>] svc_process_common+0x46d/0x6d0 [sunrpc]
May 4 07:53:36 wol-nfs kernel: [487678.715977] [<ffffffffa026a9a7>] svc_process+0x107/0x170 [sunrpc]
May 4 07:53:36 wol-nfs kernel: [487678.715986] [<ffffffffa031b71f>] nfsd+0xbf/0x130 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.715995] [<ffffffffa031b660>] ? nfsd_destroy+0x80/0x80 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.716004] [<ffffffff8108b312>] kthread+0xd2/0xf0
May 4 07:53:36 wol-nfs kernel: [487678.716009] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
May 4 07:53:36 wol-nfs kernel: [487678.716016] [<ffffffff8172637c>] ret_from_fork+0x7c/0xb0
May 4 07:53:36 wol-nfs kernel: [487678.716020] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0

And many more with the exact same stack trace:

May 4 07:53:36 wol-nfs kernel: [487678.716025] INFO: task nfsd:2794 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.716500] INFO: task nfsd:2795 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.717166] INFO: task nfsd:2796 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.717657] INFO: task nfsd:2797 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.718150] INFO: task nfsd:2798 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.718743] INFO: task nfsd:2799 blocked for more than 120 seconds.

Except this one:

May 4 07:53:36 wol-nfs kernel: [487678.719229] INFO: task nfsd:2800 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.719347] Not tainted 3.13.0-24-generic #46-Ubuntu
May 4 07:53:36 wol-nfs kernel: [487678.719605] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 4 07:53:36 wol-nfs kernel: [487678.719741] nfsd D ffff88023fd94440 0 2800 2 0x00000000
May 4 07:53:36 wol-nfs kernel: [487678.719746] ffff8800b81f1b40 0000000000000002 ffff88022f96c7d0 ffff8800b81f1fd8
May 4 07:53:36 wol-nfs kernel: [487678.719751] 0000000000014440 0000000000014440 ffff88022f96c7d0 ffff8800b81f1ca8
May 4 07:53:36 wol-nfs kernel: [487678.719755] ffff8800b81f1cb0 7fffffffffffffff ffff88022f96c7d0 ffff8800b81f1c90
May 4 07:53:36 wol-nfs kernel: [487678.719760] Call Trace:
May 4 07:53:36 wol-nfs kernel: [487678.719766] [<ffffffff81719e89>] schedule+0x29/0x70
May 4 07:53:36 wol-nfs kernel: [487678.719770] [<ffffffff817190d9>] schedule_timeout+0x239/0x2d0
May 4 07:53:36 wol-nfs kernel: [487678.719775] [<ffffffff81719a11>] ? __schedule+0x381/0x7d0
May 4 07:53:36 wol-nfs kernel: [487678.719781] [<ffffffff8101b763>] ? native_sched_clock+0x13/0x80
May 4 07:53:36 wol-nfs kernel: [487678.719786] [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
May 4 07:53:36 wol-nfs kernel: [487678.719791] [<ffffffff8171a9a6>] wait_for_completion+0xa6/0x160
May 4 07:53:36 wol-nfs kernel: [487678.719798] [<ffffffff8109a790>] ? wake_up_state+0x20/0x20
May 4 07:53:36 wol-nfs kernel: [487678.719804] [<ffffffff810824ca>] flush_workqueue+0x11a/0x5a0
May 4 07:53:36 wol-nfs kernel: [487678.719818] [<ffffffffa0346683>] nfsd4_shutdown_callback+0x73/0x80 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719829] [<ffffffffa033d37d>] destroy_client+0x18d/0x430 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719840] [<ffffffffa033e9d6>] nfsd4_setclientid_confirm+0x1e6/0x210 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719849] [<ffffffffa032f5da>] nfsd4_proc_compound+0x56a/0x7b0 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719857] [<ffffffffa031bd2b>] nfsd_dispatch+0xbb/0x200 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719872] [<ffffffffa026a63d>] svc_process_common+0x46d/0x6d0 [sunrpc]
May 4 07:53:36 wol-nfs kernel: [487678.719885] [<ffffffffa026a9a7>] svc_process+0x107/0x170 [sunrpc]
May 4 07:53:36 wol-nfs kernel: [487678.719893] [<ffffffffa031b71f>] nfsd+0xbf/0x130 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719901] [<ffffffffa031b660>] ? nfsd_destroy+0x80/0x80 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719905] [<ffffffff8108b312>] kthread+0xd2/0xf0
May 4 07:53:36 wol-nfs kernel: [487678.719909] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
May 4 07:53:36 wol-nfs kernel: [487678.719914] [<ffffffff8172637c>] ret_from_fork+0x7c/0xb0
May 4 07:53:36 wol-nfs kernel: [487678.719918] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0

It looks like the last thread just hung, keeping a lock and blocking out every single other thread/process of nfsd.
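To test that reading against a full kern.log, a small script can split the hung-task reports into threads that are waiting in mutex_lock and threads that are blocked somewhere else. This is only a sketch: the function names parse_hung_tasks and split_waiters are made up for illustration, and the regular expressions assume the log line layout shown in this report.

```python
import re

# Matches the hung-task header, e.g. "INFO: task nfsd:2793 blocked for more than 120 seconds."
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than")
# Matches a call-trace frame, e.g. "[<ffffffff8171c2af>] mutex_lock+0x1f/0x2f"
# Group 1 is set for unreliable frames, which the kernel marks with "? ".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x")


def parse_hung_tasks(lines):
    """Return {(task_name, pid): [function names]} for each hung-task report.

    Frames marked with "?" are skipped, since they are only stack leftovers.
    """
    reports = {}
    current = None
    for line in lines:
        header = TASK_RE.search(line)
        if header:
            current = (header.group(1), int(header.group(2)))
            reports[current] = []
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None and not frame.group(1):
            reports[current].append(frame.group(2))
    return reports


def split_waiters(reports):
    """Split reports into tasks waiting on a mutex and all other blocked tasks."""
    waiters, others = [], []
    for key, frames in reports.items():
        if not frames:
            continue  # header without a captured trace: nothing to classify
        if "mutex_lock" in frames:
            waiters.append(key)
        else:
            others.append(key)
    return waiters, others
```

Feeding it the lines of kern.log (for example via open("/var/log/kern.log")) should list thread 2800 under "others" and the remaining nfsd threads under "waiters", if the traces match the ones above. A task in "others" is only a candidate lock holder; the script cannot prove which thread owns the mutex.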

Before the hang, there were a few suspicious messages about a CPU soft lockup, with the following stack trace. This may or may not be related. It happened several days earlier, though, so it's probably nothing.

Apr 30 12:45:41 wol-nfs kernel: [159283.910727] BUG: soft lockup - CPU#2 stuck for 22s! [chown:6108]
Apr 30 12:45:41 wol-nfs kernel: [159283.910928] Call Trace:
Apr 30 12:45:41 wol-nfs kernel: [159283.910934] [<ffffffff812085e0>] ? locks_delete_block+0x70/0x80
Apr 30 12:45:41 wol-nfs kernel: [159283.910937] [<ffffffff81209f40>] __break_lease+0x350/0x3d0
Apr 30 12:45:41 wol-nfs kernel: [159283.910940] [<ffffffff811d5b48>] ? notify_change+0x1a8/0x390
Apr 30 12:45:41 wol-nfs kernel: [159283.910943] [<ffffffff811b6767>] chown_common+0x117/0x180
Apr 30 12:45:41 wol-nfs kernel: [159283.910945] [<ffffffff811b826f>] SyS_fchownat+0xaf/0x110
Apr 30 12:45:41 wol-nfs kernel: [159283.910948] [<ffffffff8172663f>] tracesys+0xe1/0xe6
Apr 30 12:45:41 wol-nfs kernel: [159283.910949] Code: 39 d0 75 ea b8 01 00 00 00 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 e9 06 00 00 00 66 83 07 02 c3 90 8b 37 f0 66 83 07 02 <f6> 47 02 01 74 f1 55 48 89 e5 e8 31 1b ff ff 5d c3 0f 1f 84 00

The relevant sections of kern.log are in a separate attachment.

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-generic 3.13.0.24.29
ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
Uname: Linux 3.13.0-24-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 May 4 23:41 seq
 crw-rw---- 1 root audio 116, 33 May 4 23:41 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
CurrentDmesg:
 [ 5.274819] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
 [ 5.279871] NFSD: starting 90-second grace period (net ffffffff81cd9b00)
 [ 5.518836] init: plymouth-upstart-bridge main process ended, respawning
 [ 12.233348] [UFW BLOCK] IN=eth0 OUT= MAC=00:50:56:91:fc:20:00:00:00:00:00:00:08:00 SRC=10.0.0.0 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0x00 TTL=1 ID=0 PROTO=2
Date: Mon May 5 00:29:12 2014
HibernationDevice: RESUME=/dev/mapper/wolnfs--vg-swap_1
InstallationDate: Installed on 2014-04-20 (14 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
IwConfig:
 eth0 no wireless extensions.

 lo no wireless extensions.
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
MachineType: VMware, Inc. VMware Virtual Platform
PciMultimedia:

ProcFB: 0 svgadrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=/dev/mapper/wolnfs--vg-root ro
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-24-generic N/A
 linux-backports-modules-3.13.0-24-generic N/A
 linux-firmware 1.127
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/30/2013
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: 6.00
dmi.board.name: 440BX Desktop Reference Platform
dmi.board.vendor: Intel Corporation
dmi.board.version: None
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 1
dmi.chassis.vendor: No Enclosure
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd07/30/2013:svnVMware,Inc.:pnVMwareVirtualPlatform:pvrNone:rvnIntelCorporation:rn440BXDesktopReferencePlatform:rvrNone:cvnNoEnclosure:ct1:cvrN/A:
dmi.product.name: VMware Virtual Platform
dmi.product.version: None
dmi.sys.vendor: VMware, Inc.

w00key (booink) wrote :
affects: linux (Ubuntu) → nfs-utils (Ubuntu)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nfs-utils (Ubuntu):
status: New → Confirmed
Moritz Augustin (pub-1) wrote :

I can confirm this bug with Ubuntu 14.04 LTS and would appreciate any workaround, since this is hurting me a lot in my production environment (8 servers), which I updated to the current LTS assuming that core packages like the NFS-related ones would be stable.
If you need more details please let me know.

w00key (booink) wrote :

Our workaround so far is to force all clients to connect with nfs3 instead of nfs4, and to avoid major operations like file tree syncs over nfs by running them on the nfsd host itself.

Either nfs3 doesn't get stuck and crash (case 1) or we're just lucky it hasn't returned yet, with minor mitigation by dropping load (case 2).
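When forcing nfs3 on many clients (typically with the vers=3 mount option), it is easy to miss a mount that is still on nfs4. The sketch below checks the text of /proc/mounts on a client and lists the NFS mounts that are not on version 3. The helper names and the sample host names are hypothetical, and it assumes the kernel reports the negotiated version as a vers= option in /proc/mounts.

```python
def nfs_mount_versions(mounts_text):
    """Return [(mount_point, fstype, vers)] for NFS entries in /proc/mounts-style text.

    vers is the value of the "vers=" mount option, or None if it is absent.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        # "rw,vers=3,proto=tcp" -> {"rw": "", "vers": "3", "proto": "tcp"}
        opts = dict(opt.partition("=")[::2] for opt in fields[3].split(","))
        result.append((fields[1], fields[2], opts.get("vers")))
    return result


def mounts_not_on_v3(mounts_text):
    """Return the mount points of NFS mounts whose reported version is not 3."""
    return [mp for mp, _fstype, vers in nfs_mount_versions(mounts_text) if vers != "3"]
```

On a client this could be called as mounts_not_on_v3(open("/proc/mounts").read()); an empty list would suggest that every NFS mount is on version 3.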

Changed in nfs-utils (Ubuntu):
importance: Undecided → Critical
Luis (luisflucas) wrote :

This bug also affects me. I'm using Ubuntu Server 14.04 on the server side and Kubuntu 14.04 on the clients.
My workaround is to force nfs3.

SK (h-web) wrote :

Same here, in production, very critical!

Same issue for us too on a production environment using Ubuntu Server 14.04.

Is the nfs3 workaround confirmed to work? What are we losing by not using nfs4?

Moritz Augustin (pub-1) wrote :

In our case the nfs3 workaround works nicely, and although we also do heavy-duty file system operations on the clients, we have not suffered from any problems.

Please:
- Report to <https://bugzilla.kernel.org/>.
- Paste the new report URL here.
- Set this bug status back to "confirmed".

Changed in nfs-utils (Ubuntu):
status: Confirmed → Incomplete
tags: added: asked-to-upstream

Additional stack traces, not exactly the same as the OP's.

Jan 21 06:05:20 filer2 kernel: [398213.798328] INFO: task nfsd:1440 blocked for more than 120 seconds.
Jan 21 06:05:20 filer2 kernel: [398213.798371] Not tainted 3.13.0-24-generic #47-Ubuntu
Jan 21 06:05:20 filer2 kernel: [398213.798435] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 21 06:05:20 filer2 kernel: [398213.798627] nfsd D ffff88013fd14440 0 1440 2 0x00000000
Jan 21 06:05:20 filer2 kernel: [398213.798630] ffff8800b595fce0 0000000000000002 ffff8800b921afe0 ffff8800b595ffd8
Jan 21 06:05:20 filer2 kernel: [398213.798633] 0000000000014440 0000000000014440 ffff8800b921afe0 ffffffffa01a90a0
Jan 21 06:05:20 filer2 kernel: [398213.798635] ffffffffa01a90a4 ffff8800b921afe0 00000000ffffffff ffffffffa01a90a8
Jan 21 06:05:20 filer2 kernel: [398213.798638] Call Trace:
Jan 21 06:05:20 filer2 kernel: [398213.798642] [<ffffffff8171a409>] schedule_preempt_disabled+0x29/0x70
Jan 21 06:05:20 filer2 kernel: [398213.798645] [<ffffffff8171c275>] __mutex_lock_slowpath+0x135/0x1b0
Jan 21 06:05:20 filer2 kernel: [398213.798654] [<ffffffffa00d09b0>] ? svcauth_unix_domain_release+0x30/0x30 [sunrpc]
Jan 21 06:05:20 filer2 kernel: [398213.798658] [<ffffffff8171c30f>] mutex_lock+0x1f/0x2f
Jan 21 06:05:20 filer2 kernel: [398213.798665] [<ffffffffa019714f>] nfsd4_renew+0x5f/0xf0 [nfsd]
Jan 21 06:05:20 filer2 kernel: [398213.798672] [<ffffffffa01865da>] nfsd4_proc_compound+0x56a/0x7b0 [nfsd]
Jan 21 06:05:20 filer2 kernel: [398213.798677] [<ffffffffa0172d2b>] nfsd_dispatch+0xbb/0x200 [nfsd]
Jan 21 06:05:20 filer2 kernel: [398213.798686] [<ffffffffa00cd63d>] svc_process_common+0x46d/0x6d0 [sunrpc]
Jan 21 06:05:20 filer2 kernel: [398213.798696] [<ffffffffa00cd9a7>] svc_process+0x107/0x170 [sunrpc]
Jan 21 06:05:20 filer2 kernel: [398213.798701] [<ffffffffa017271f>] nfsd+0xbf/0x130 [nfsd]
Jan 21 06:05:20 filer2 kernel: [398213.798706] [<ffffffffa0172660>] ? nfsd_destroy+0x80/0x80 [nfsd]
Jan 21 06:05:20 filer2 kernel: [398213.798708] [<ffffffff8108b312>] kthread+0xd2/0xf0
Jan 21 06:05:20 filer2 kernel: [398213.798711] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
Jan 21 06:05:20 filer2 kernel: [398213.798714] [<ffffffff817263fc>] ret_from_fork+0x7c/0xb0
Jan 21 06:05:20 filer2 kernel: [398213.798717] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
Jan 21 06:05:20 filer2 kernel: [398213.798721] INFO: task kworker/u4:2:9594 blocked for more than 120 seconds.
Jan 21 06:05:20 filer2 kernel: [398213.798882] Not tainted 3.13.0-24-generic #47-Ubuntu
Jan 21 06:05:20 filer2 kernel: [398213.799052] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 21 06:05:20 filer2 kernel: [398213.799197] kworker/u4:2 D ffff88013fd14440 0 9594 2 0x00000000
Jan 21 06:05:20 filer2 kernel: [398213.799207] Workqueue: nfsd4 laundromat_main [nfsd]
Jan 21 06:05:20 filer2 kernel: [398213.799209] ffff8800365cbd40 0000000000000002 ffff8800b582c7d0 ffff8800365cbfd8
Jan 21 06:05:20 filer2 kernel: [398213.799212] 0000000000014440 0000000000014440 ffff8800b582c7d0 ffffffffa01a90a0
Jan 21 06:05:20 filer2 ...

Changed in nfs-utils (Ubuntu Trusty):
status: New → Incomplete
importance: Undecided → Critical
F0x06 (kevin-velickovic) wrote :

Same problem for me
Ubuntu version: Ubuntu 14.04.2 LTS
Kernel version: 3.13.0-53-generic #89-Ubuntu SMP Wed May 20 10:34:39 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Jun 8 17:28:41 server1 kernel: [ 1561.791349] INFO: task nfsd:1986 blocked for more than 120 seconds.
Jun 8 17:28:41 server1 kernel: [ 1561.791366] Tainted: P OX 3.13.0-53-generic #89-Ubuntu
Jun 8 17:28:41 server1 kernel: [ 1561.791384] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 8 17:28:41 server1 kernel: [ 1561.791405] nfsd D ffff88041fb93180 0 1986 2 0x00000000
Jun 8 17:28:41 server1 kernel: [ 1561.791407] ffff8803f94d18a0 0000000000000046 ffff8800d4588000 ffff8803f94d1fd8
Jun 8 17:28:41 server1 kernel: [ 1561.791409] 0000000000013180 0000000000013180 ffff8800d4588000 ffff880256c1e8f8
Jun 8 17:28:41 server1 kernel: [ 1561.791410] ffff880256c1e8a8 ffff880256c1e900 ffff880256c1e8d0 0000000000000000
Jun 8 17:28:41 server1 kernel: [ 1561.791412] Call Trace:
Jun 8 17:28:41 server1 kernel: [ 1561.791414] [<ffffffff81727229>] schedule+0x29/0x70
Jun 8 17:28:41 server1 kernel: [ 1561.791419] [<ffffffffa007eaf5>] cv_wait_common+0xe5/0x120 [spl]
Jun 8 17:28:41 server1 kernel: [ 1561.791421] [<ffffffff810ab220>] ? prepare_to_wait_event+0x100/0x100
Jun 8 17:28:41 server1 kernel: [ 1561.791426] [<ffffffffa007eb45>] __cv_wait+0x15/0x20 [spl]
Jun 8 17:28:41 server1 kernel: [ 1561.791437] [<ffffffffa0139da3>] dmu_buf_hold_array_by_dnode+0x233/0x570 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791449] [<ffffffffa013a1bd>] dmu_buf_hold_array+0x5d/0x80 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791461] [<ffffffffa013ba01>] dmu_read_uio+0x41/0xe0 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791480] [<ffffffffa01bafbc>] zfs_read+0x14c/0x450 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791492] [<ffffffffa01385fe>] ? dmu_object_size_from_db+0x5e/0x80 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791511] [<ffffffffa01d757a>] zpl_aio_read+0xda/0x130 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791513] [<ffffffff811bd9cc>] do_sync_readv_writev+0x4c/0x80
Jun 8 17:28:41 server1 kernel: [ 1561.791515] [<ffffffff811bee90>] do_readv_writev+0xb0/0x220
Jun 8 17:28:41 server1 kernel: [ 1561.791534] [<ffffffffa01bac57>] ? zfs_open+0x87/0x120 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791536] [<ffffffff813213d3>] ? ima_get_action+0x23/0x30
Jun 8 17:28:41 server1 kernel: [ 1561.791538] [<ffffffff813206b2>] ? process_measurement+0x82/0x2c0
Jun 8 17:28:41 server1 kernel: [ 1561.791539] [<ffffffff811bf02d>] vfs_readv+0x2d/0x50
Jun 8 17:28:41 server1 kernel: [ 1561.791543] [<ffffffffa0401aae>] nfsd_vfs_read.isra.12+0x6e/0x160 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791547] [<ffffffffa0402da9>] ? nfsd_open+0xb9/0x190 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791551] [<ffffffffa0403076>] nfsd_read+0x1e6/0x2c0 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791557] [<ffffffffa040ce4c>] nfsd3_proc_read+0xcc/0x170 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791561] [<ffffffffa03fdd3b>] nfsd_dispatch+0xbb/0x200 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791568] [<ffffffffa036262d>] svc_process_com...

Help, I have the same problem here on Ubuntu 15.04! What the hell is it?

I've set the NFS version to 3. It has been working since then, but that's not what I want.

> On 20.08.2015 at 09:30, Michael Heuberger <email address hidden> wrote:
>
> Heeeeeeelp, i have the same problem here on Ubuntu 15.04! What the hell
> it is??

Matt (l-matts) wrote :

We've just run into this after about 600 days of continual NFS use. It does appear to be load related. I can say anecdotally, though, that when we moved from Ubuntu 14.04 to 14.04.3 it appeared more often, which led me to this bug report. I've just disabled NFSv4 as was suggested.
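The thread does not say how NFSv4 was disabled on the server. One way to confirm the result, whatever method was used, is to read /proc/fs/nfsd/versions, where the kernel lists each protocol version with a + (enabled) or - (disabled) prefix. The sketch below parses that format; the function names are hypothetical and the sample strings in the usage note are illustrative, not output from any machine in this report.

```python
def parse_nfsd_versions(text):
    """Parse text such as "-2 +3 -4 -4.1" into {"2": False, "3": True, ...}."""
    versions = {}
    for token in text.split():
        if token[0] in "+-":
            versions[token[1:]] = token[0] == "+"
    return versions


def nfsv4_disabled(text):
    """Return True if no NFSv4 version (4, 4.1, ...) is marked as enabled."""
    versions = parse_nfsd_versions(text)
    return not any(
        enabled for version, enabled in versions.items()
        if version.split(".")[0] == "4"
    )
```

For example, nfsv4_disabled("-2 +3 -4 -4.1") returns True, while nfsv4_disabled("-2 +3 +4 +4.1") returns False. On a live server the input would come from open("/proc/fs/nfsd/versions").read(), which requires the nfsd filesystem to be mounted.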

Omar Alvarez (osurfer3) wrote :

I think I'm having this issue on 15.10: NFS shares are extremely slow and hang when copying any file. My NFS environment is not really busy, and Samba is working fine. I will try to force NFSv3.
