CIFS hangs if server goes down, even if 'soft' is enabled

Bug #1073648 reported by Mike Stipicevic
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

Hi, I have a NAS mounted as follows:

//mybook/Public /mnt/mybook cifs nodev,soft,_netdev,noexec,nosuid,uid=1000 0 2

Unfortunately, the 'soft' option doesn't seem to be honored. When the server goes down, every program that tries to access the mount locks up and cannot be killed (kill -9 is ineffective). umount fails with an error unless -l is given, in which case it hangs.
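
As a quick sanity check on the configuration side, the fstab entry above can be split into its fields to confirm that 'soft' really is among the options. This is only a minimal sketch: the function name parse_fstab_line is made up for illustration, and it checks the entry as written, not whether the kernel honors the option once the share is mounted.

```python
def parse_fstab_line(line):
    """Split one fstab entry into device, mount point, type and option list."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least device, mount point, type, options")
    device, mount_point, fs_type, options = fields[:4]
    return {
        "device": device,
        "mount_point": mount_point,
        "type": fs_type,
        "options": options.split(","),
    }


# The entry quoted in the bug description.
entry = parse_fstab_line(
    "//mybook/Public /mnt/mybook cifs nodev,soft,_netdev,noexec,nosuid,uid=1000 0 2"
)

# True only if this is a cifs entry that requests the 'soft' option.
has_soft = entry["type"] == "cifs" and "soft" in entry["options"]
print(has_soft)
```

The same function could be pointed at the lines of /proc/mounts, which uses the same field layout, to see the options that are actually in effect.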

dmesg has the following messages:

    [ 2160.228973] INFO: task rsync:20808 blocked for more than 120 seconds.
    [ 2160.228975] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [ 2160.228978] rsync D 0000000000000002 0 20808 1 0x00000004
    [ 2160.228982] ffff8801c1305878 0000000000000082 ffff8801c1305fd8 0000000000000286
    [ 2160.228987] ffff8801c1305fd8 ffff8801c1305fd8 ffff8801c1305fd8 00000000000137c0
    [ 2160.228992] ffff88010c5cc500 ffff88032b4edc00 ffff8801c1305868 ffff8802bb624220
    [ 2160.228996] Call Trace:
    [ 2160.229001] [<ffffffff81658f2f>] schedule+0x3f/0x60
    [ 2160.229004] [<ffffffff81659d37>] __mutex_lock_slowpath+0xd7/0x150
    [ 2160.229008] [<ffffffff8165ae3e>] ? _raw_spin_lock+0xe/0x20
    [ 2160.229012] [<ffffffff8165994a>] mutex_lock+0x2a/0x50
    [ 2160.229018] [<ffffffffa0eb393c>] cifs_reconnect_tcon+0x18c/0x310 [cifs]
    [ 2160.229022] [<ffffffff8108ab80>] ? add_wait_queue+0x60/0x60
    [ 2160.229029] [<ffffffffa0eb49e7>] small_smb_init+0x37/0x80 [cifs]
    [ 2160.229036] [<ffffffffa0eb6921>] cifs_async_readv+0x71/0x180 [cifs]
    [ 2160.229045] [<ffffffffa0ec9a28>] cifs_readpages+0x288/0x430 [cifs]
    [ 2160.229049] [<ffffffff81122a78>] read_pages+0x48/0x100
    [ 2160.229053] [<ffffffff81122c93>] __do_page_cache_readahead+0x163/0x180
    [ 2160.229057] [<ffffffff81123001>] ra_submit+0x21/0x30
    [ 2160.229061] [<ffffffff81123125>] ondemand_readahead+0x115/0x230
    [ 2160.229065] [<ffffffff811232c8>] page_cache_async_readahead+0x88/0xb0
    [ 2160.229071] [<ffffffff813108fe>] ? radix_tree_lookup_slot+0xe/0x10
    [ 2160.229076] [<ffffffff81117b6e>] ? find_get_page+0x1e/0x90
    [ 2160.229081] [<ffffffff811184a9>] do_generic_file_read.constprop.33+0x269/0x440
    [ 2160.229086] [<ffffffff8111941f>] generic_file_aio_read+0xef/0x280
    [ 2160.229091] [<ffffffff8117792a>] do_sync_read+0xda/0x120
    [ 2160.229096] [<ffffffff8129d5f3>] ? security_file_permission+0x93/0xb0
    [ 2160.229100] [<ffffffff81177db1>] ? rw_verify_area+0x61/0xf0
    [ 2160.229103] [<ffffffff81178290>] vfs_read+0xb0/0x180
    [ 2160.229106] [<ffffffff811783aa>] sys_read+0x4a/0x90
    [ 2160.229110] [<ffffffff81663442>] system_call_fastpath+0x16/0x1b

So far, the only remedy to clear the programs (and bring the mount back online) is to reboot.
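
To see which tasks are stuck without reading the whole trace, the hung-task lines can be pulled out of saved dmesg output. The sketch below assumes the message format shown in the log above ("INFO: task NAME:PID blocked for more than N seconds."); the names HUNG_TASK and find_hung_tasks are illustrative only.

```python
import re

# Matches the kernel hung-task message as it appears in the log above.
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (command, pid, seconds) for every hung-task message in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            hits.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits


# One line taken from the dmesg output in this report.
sample = "[ 2160.228973] INFO: task rsync:20808 blocked for more than 120 seconds."
print(find_hung_tasks(sample))
```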

Thanks!

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-32-generic 3.2.0-32.51
ProcVersionSignature: Ubuntu 3.2.0-32.51-generic 3.2.30
Uname: Linux 3.2.0-32-generic x86_64
NonfreeKernelModules: nvidia
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu14
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: chicken 3776 F.... pulseaudio
 /dev/snd/controlC0: chicken 3776 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf7df8000 irq 79'
   Mixer name : 'Realtek ALC889'
   Components : 'HDA:10ec0889,104383c0,00100004'
   Controls : 47
   Simple ctrls : 23
Card1.Amixer.info:
 Card hw:1 'NVidia'/'HDA NVidia at 0xfbcfc000 irq 34'
   Mixer name : 'Nvidia GPU 11 HDMI/DP'
   Components : 'HDA:10de0011,10de0101,00100100'
   Controls : 24
   Simple ctrls : 4
Date: Wed Oct 31 10:03:45 2012
HibernationDevice: RESUME=UUID=34cbfc6f-a406-4efe-aacc-f563db40fa2f
IwConfig: Error: [Errno 2] No such file or directory
MachineType: System manufacturer System Product Name
ProcEnviron:
 LANGUAGE=en_US:en
 TERM=rxvt-unicode
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-32-generic root=/dev/mapper/hostname-root ro splash quiet video=vesa:off vga=normal vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-32-generic N/A
 linux-backports-modules-3.2.0-32-generic N/A
 linux-firmware 1.79.1
RfKill: Error: [Errno 2] No such file or directory
SourcePackage: linux
UdevDb: Error: [Errno 2] No such file or directory
UpgradeStatus: Upgraded to precise on 2012-09-24 (37 days ago)
UserAsoundrc:
 # Select PulseAudio as the default sound device
 pcm.!default {
     type pulse
 }
dmi.bios.date: 11/16/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0502
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: P6X58D-E
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0502:bd11/16/2010:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKComputerINC.:rnP6X58D-E:rvrRev1.xx:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Mike Stipicevic (stipredirect) wrote :

It appears that after unmounting with -l (umount -l) and waiting for ~15 minutes, the drive comes back up.

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.7 kernel[0] (Not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. Please only remove that one tag and leave the other tags. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.7-rc4-raring/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: needs-upstream-testing
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
lcampagn (luke-campagnola) wrote :

I am having the same problem on Ubuntu 16.04.2 (kernel 4.4.0-75-generic) and on a Mint 18 system (kernel 4.4.0-21-generic). I have several CIFS mounts, and if any one server becomes unresponsive, any process attempting to access the mounts hangs indefinitely. If I unmount using -l and wait several minutes, the hung processes eventually resume (with an error). I am mounting with the "soft" option.
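
Because any process that touches a dead mount may hang, a monitoring script should not stat the mount point directly. One possible approach, sketched here as an assumption rather than a tested workaround for this bug, is to let a throwaway child process do the stat and give up on it after a timeout. The function name path_responds and the 5-second default are illustrative.

```python
import subprocess
import sys


def path_responds(path, timeout=5.0):
    """Return True if a child process manages to stat `path` within `timeout`.

    Returns False both when the stat fails and when it does not finish in time.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return child.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Deliberately no kill-and-wait here: a child stuck in uninterruptible
        # sleep ignores signals, so waiting for it would hang this script too.
        return False


print(path_responds("/"))
```

Note that subprocess.run with a timeout is avoided on purpose, since it kills the child and then waits for it to exit, which could block indefinitely if the child is in the state described in this report. A timed-out child is simply left behind.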

The hung process stack looks like:
$ sudo cat /proc/7771/stack
[<ffffffffc030970d>] cifs_reconnect_tcon+0x9d/0x340 [cifs]
[<ffffffffc0309a5a>] smb_init+0x2a/0x50 [cifs]
[<ffffffffc0310c53>] CIFSSMBQPathInfo+0x63/0x2e0 [cifs]
[<ffffffffc033c20f>] cifs_query_path_info+0x6f/0x1a0 [cifs]
[<ffffffffc032ae80>] cifs_get_inode_info+0x390/0x8f0 [cifs]
[<ffffffffc032d425>] cifs_revalidate_dentry_attr+0x1d5/0x250 [cifs]
[<ffffffffc032d551>] cifs_getattr+0x51/0x110 [cifs]
[<ffffffff81213a6c>] vfs_getattr_nosec+0x2c/0x40
[<ffffffff81213c86>] vfs_getattr+0x26/0x30
[<ffffffff81213d68>] vfs_fstatat+0x78/0xc0
[<ffffffff81214311>] SYSC_newlstat+0x31/0x60
[<ffffffff8121444e>] SyS_newlstat+0xe/0x10
[<ffffffff8183b972>] entry_SYSCALL_64_fastpath+0x16/0x71
[<ffffffffffffffff>] 0xffffffffffffffff
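
Both traces in this report pass through cifs_reconnect_tcon. To check whether another hung process is stuck in the same place, the text of /proc/PID/stack can be parsed into (symbol, module) pairs. This is a minimal sketch that assumes the frame format shown above; FRAME, parse_stack and blocked_in_cifs_reconnect are hypothetical names.

```python
import re

# One stack frame: "[<address>] symbol+0xoff/0xsize [module]", module optional.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?"
)


def parse_stack(text):
    """Return a list of (symbol, module_or_None) for each recognizable frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME.search(line)
        if match:
            frames.append((match.group("sym"), match.group("mod")))
    return frames


def blocked_in_cifs_reconnect(text):
    """True if any frame is cifs_reconnect_tcon from the cifs module."""
    return ("cifs_reconnect_tcon", "cifs") in parse_stack(text)


# A few frames copied from the stack above.
sample = """\
[<ffffffffc030970d>] cifs_reconnect_tcon+0x9d/0x340 [cifs]
[<ffffffffc0309a5a>] smb_init+0x2a/0x50 [cifs]
[<ffffffff81213a6c>] vfs_getattr_nosec+0x2c/0x40
[<ffffffffffffffff>] 0xffffffffffffffff
"""
print(blocked_in_cifs_reconnect(sample))
```

Reading /proc/PID/stack normally requires root, as in the sudo cat command above.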

Changed in linux (Ubuntu):
status: Expired → New
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed