Intensive NFS operations could lock whole system

Bug #1874266 reported by Paweł Hikiert
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Confirmed | Undecided | Unassigned |

Bug Description

After the last update, from kernel version 4.4.0-176-generic to 4.4.0-177-generic, my machine, which maintains repository mirrors on an NFS filesystem, hung.
After the hang the system still responds to ping, and all processes run normally until they need to perform an I/O operation; they then end up in the 'D' state.
If dmesg -w -H is started before the hang, it shows the following report:

[Apr22 16:06] INFO: task find:986 blocked for more than 120 seconds.
[ +0.000034] Not tainted 4.4.0-177-generic #207-Ubuntu
[ +0.000013] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ +0.000024] find D f4bbdd2c 0 986 721 0x00000000
[ +0.000010] f4bbdd1c 00000086 41ed78a0 f4bbdd2c eb5b7844 c1b36ec0 95907572 00000025
[ +0.000003] f4eb7e00 c1b1da80 f3ab6e00 f4bbe000 f3ab6e00 f53ae3a8 f4bbdd28 c17ec5cd
[ +0.000002] 000001ff f4bbdd78 c11a8b57 f4bbdd3c f87bb75b f76ae800 f4bbdd50 f87b2bdd
[ +0.000002] Call Trace:
[ +0.000006] [<c17ec5cd>] schedule+0x2d/0x80
[ +0.000012] [<c11a8b57>] kmap_high+0x117/0x290
[ +0.000013] [<f87bb75b>] ? nfs_revalidate_inode_rcu+0x1b/0x40 [nfs]
[ +0.000005] [<f87b2bdd>] ? nfs_check_verifier+0x6d/0x80 [nfs]
[ +0.000004] [<c11d2c95>] ? kmem_cache_alloc_trace+0x185/0x1e0
[ +0.000012] [<c109ee50>] ? wake_up_q+0x70/0x70
[ +0.000003] [<c106bec0>] kmap+0x40/0x50
[ +0.000006] [<f87b4e0d>] nfs_readdir_xdr_to_array+0xdd/0x370 [nfs]
[ +0.000005] [<f87b6835>] ? nfs4_lookup_revalidate+0x25/0x140 [nfs]
[ +0.000002] [<c11e4d62>] ? mem_cgroup_commit_charge+0x62/0xe0
[ +0.000011] [<f87b50bb>] nfs_readdir_filler+0x1b/0x80 [nfs]
[ +0.000002] [<c117e3b2>] do_read_cache_page+0x102/0x190
[ +0.000019] [<f87b50a0>] ? nfs_readdir_xdr_to_array+0x370/0x370 [nfs]
[ +0.000002] [<c117e464>] read_cache_page+0x24/0x30
[ +0.000004] [<f87b5261>] nfs_readdir+0x141/0x740 [nfs]
[ +0.000003] [<c13d21b6>] ? _copy_to_user+0x26/0x30
[ +0.000002] [<c135fba5>] ? common_file_perm+0x55/0x1b0
[ +0.000016] [<fbaac1c0>] ? nfs4_xdr_dec_fsinfo+0x80/0x80 [nfsv4]
[ +0.000003] [<c1201c2e>] iterate_dir+0x8e/0x130
[ +0.000002] [<c120bde2>] ? set_close_on_exec+0x62/0x70
[ +0.000002] [<c120224d>] SyS_getdents64+0x6d/0xf0
[ +0.000001] [<c1201e10>] ? filldir+0x140/0x140
[ +0.000002] [<c100397f>] do_fast_syscall_32+0x9f/0x190
[ +0.000011] [<c17f08b0>] sysenter_past_esp+0x3d/0x61
[ +0.000002] INFO: task bash:1065 blocked for more than 120 seconds.
[ +0.000015] Not tainted 4.4.0-177-generic #207-Ubuntu
[ +0.000020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ +0.000018] bash D 00000017 0 1065 760 0x00000000
[ +0.000002] f1b87e44 00000086 c11a799b 00000017 00000000 f1f1eff8 b37c5095 00000025
[ +0.000003] f4eb7e00 c1b1da80 f3ab3c00 f1b88000 f3ab3c00 f53b0298 f1b87e50 c17ec5cd
[ +0.000002] 000001ff f1b87ea0 c11a8b57 f3ab3c00 f3131000 f3ab3c00 00000001 00000000
[ +0.000002] Call Trace:
[ +0.000003] [<c11a799b>] ? follow_page_mask+0x16b/0x290
[ +0.000001] [<c17ec5cd>] schedule+0x2d/0x80
[ +0.000001] [<c11a8b57>] kmap_high+0x117/0x290
[ +0.000002] [<c109ee50>] ? wake_up_q+0x70/0x70
[ +0.000001] [<c106bec0>] kmap+0x40/0x50
[ +0.000002] [<c11f4ae8>] copy_strings+0x1f8/0x2d0
[ +0.000002] [<c11f4be6>] copy_strings_kernel+0x26/0x30
[ +0.000001] [<c11f5b29>] do_execveat_common+0x479/0x6f0
[ +0.000002] [<c11fd62a>] ? getname_flags+0x3a/0x1a0
[ +0.000001] [<c11f5fb4>] SyS_execve+0x34/0x40
[ +0.000002] [<c100397f>] do_fast_syscall_32+0x9f/0x190
[ +0.000002] [<c17f08b0>] sysenter_past_esp+0x3d/0x61
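Reports like the one above can be summarised mechanically to see which tasks are stuck and where. The following is a minimal sketch, not an existing tool: the function name parse_hung_tasks and the shortened SAMPLE text are illustrative, and it assumes the dmesg -H line layout shown in this report. Frames marked with "?" are skipped because the kernel prints them as unreliable.

```python
import re

# Header line written by the kernel's hung-task detector, e.g.
# "INFO: task find:986 blocked for more than 120 seconds."
HEADER = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# Stack frame such as "[<c11a8b57>] kmap_high+0x117/0x290"; an optional "? "
# before the symbol marks a frame the kernel considers unreliable.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_hung_tasks(text):
    """Return one dict per hung-task report with its reliable frames in order."""
    reports = []
    current = None
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            current = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "secs": int(header["secs"]),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME.search(line)
        if frame and current is not None and not frame["unreliable"]:
            current["frames"].append(frame["sym"])
    return reports


# Shortened excerpt of the log in this report, used only for illustration.
SAMPLE = """\
[Apr22 16:06] INFO: task find:986 blocked for more than 120 seconds.
[ +0.000006] [<c17ec5cd>] schedule+0x2d/0x80
[ +0.000012] [<c11a8b57>] kmap_high+0x117/0x290
[ +0.000013] [<f87bb75b>] ? nfs_revalidate_inode_rcu+0x1b/0x40 [nfs]
[ +0.000003] [<c106bec0>] kmap+0x40/0x50
[ +0.000006] [<f87b4e0d>] nfs_readdir_xdr_to_array+0xdd/0x370 [nfs]
[ +0.000002] INFO: task bash:1065 blocked for more than 120 seconds.
[ +0.000003] [<c11a799b>] ? follow_page_mask+0x16b/0x290
[ +0.000001] [<c17ec5cd>] schedule+0x2d/0x80
[ +0.000001] [<c11a8b57>] kmap_high+0x117/0x290
[ +0.000001] [<c106bec0>] kmap+0x40/0x50
[ +0.000002] [<c11f4ae8>] copy_strings+0x1f8/0x2d0
"""

if __name__ == "__main__":
    for report in parse_hung_tasks(SAMPLE):
        print(report["comm"], report["pid"], " <- ".join(report["frames"][:4]))
```

In both reports the first reliable frames are schedule, kmap_high and kmap, so the NFS readdir in find and the execve in bash appear to be waiting at the same point even though only one of them involves NFS.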

I have NOT noticed this behaviour on the x86_64 architecture, only on i386.
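Both traces block in kmap_high, which is the kernel's mapping path for high-memory pages and exists only on kernels built with CONFIG_HIGHMEM (typically 32-bit). That would be consistent with the problem showing up on i386 only, although this is an interpretation and not a confirmed diagnosis. As a rough sketch, the helpers below (parse_meminfo and highmem_summary are hypothetical names) check whether a kernel reports high memory at all by reading the HighTotal and HighFree fields that /proc/meminfo exposes on such kernels.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Values are returned as printed by the kernel (kB for the memory fields).
    Lines that do not look like "Key:   123 kB" are ignored.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def highmem_summary(fields):
    """Return (HighTotal, HighFree) in kB, or None if no highmem is reported.

    Kernels without CONFIG_HIGHMEM (for example x86_64) do not print the
    HighTotal/HighFree lines at all, so None is the expected result there.
    """
    total = fields.get("HighTotal")
    if not total:
        return None
    return total, fields.get("HighFree", 0)
```

On a Linux host the input would be the contents of /proc/meminfo, for example highmem_summary(parse_meminfo(open("/proc/meminfo").read())). A None result means the kmap_high path seen in the traces is not in use on that kernel.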
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version k4.4.0-177-generic.
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.23
Architecture: i386
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/pcmC0D1c', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/controlC0', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Card0.Amixer.info: Error: [Errno 2] No such file or directory
Card0.Amixer.values: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 16.04
IwConfig: Error: [Errno 2] No such file or directory
Lsusb:
 Bus 001 Device 002: ID 80ee:0021 VirtualBox USB Tablet
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: innotek GmbH VirtualBox
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 vboxdrmfb
ProcKernelCmdLine: root=/dev/sda2 quiet splash
ProcVersionSignature: Ubuntu 4.4.0-177.207-generic 4.4.214
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-177-generic N/A
 linux-backports-modules-4.4.0-177-generic N/A
 linux-firmware 1.157.22
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial xenial
Uname: Linux 4.4.0-177-generic i686
UnreportableReason: The report belongs to a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: False
dmi.bios.date: 12/01/2006
dmi.bios.vendor: innotek GmbH
dmi.bios.version: VirtualBox
dmi.board.name: VirtualBox
dmi.board.vendor: Oracle Corporation
dmi.board.version: 1.2
dmi.chassis.type: 1
dmi.chassis.vendor: Oracle Corporation
dmi.modalias: dmi:bvninnotekGmbH:bvrVirtualBox:bd12/01/2006:svninnotekGmbH:pnVirtualBox:pvr1.2:rvnOracleCorporation:rnVirtualBox:rvr1.2:cvnOracleCorporation:ct1:cvr:
dmi.product.name: VirtualBox
dmi.product.version: 1.2
dmi.sys.vendor: innotek GmbH

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1874266

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: xenial
Paweł Hikiert (nsilent22) wrote : AlsaDevices.txt

apport information

tags: added: apport-collected
description: updated
Paweł Hikiert (nsilent22) wrote : CRDA.txt

apport information

Paweł Hikiert (nsilent22) wrote : Card0.Codecs.codec97.0.ac97.0-0.txt

apport information

Paweł Hikiert (nsilent22) wrote : Card0.Codecs.codec97.0.ac97.0-0.regs.txt

apport information

Paweł Hikiert (nsilent22) wrote : CurrentDmesg.txt

apport information

Paweł Hikiert (nsilent22) wrote : HookError_generic.txt

apport information

Paweł Hikiert (nsilent22) wrote : Lspci.txt

apport information

Paweł Hikiert (nsilent22) wrote : PciMultimedia.txt

apport information

Paweł Hikiert (nsilent22) wrote : ProcCpuinfoMinimal.txt

apport information

Paweł Hikiert (nsilent22) wrote : ProcInterrupts.txt

apport information

Paweł Hikiert (nsilent22) wrote : ProcModules.txt

apport information

Paweł Hikiert (nsilent22) wrote : UdevDb.txt

apport information

Paweł Hikiert (nsilent22) wrote : WifiSyslog.txt

apport information

Paweł Hikiert (nsilent22) wrote :

Of course, all the apport information was collected prior to the hang. After the hang, starting a new process is almost impossible if it needs to perform any I/O, and it almost always does.
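Because new processes can rarely be started once the hang has set in, any monitoring would have to be running beforehand. The sketch below lists tasks in the 'D' (uninterruptible sleep) state by reading /proc/<pid>/stat. The names parse_stat and blocked_tasks are illustrative, only thread-group leaders are inspected, and whether such a script keeps working during this particular hang is an assumption that has not been tested here.

```python
import os


def parse_stat(stat_line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so the last ')' is used as the delimiter.
    """
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen])
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for processes currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            # The process exited while we were reading, or the line was malformed.
            continue
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

Calling blocked_tasks() in a loop with a short sleep, and printing any non-empty result, would give a running view of which processes become stuck first.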

Changed in linux (Ubuntu):
status: Incomplete → Confirmed