CIFS hangs at cifs_oplock_break

Bug #1428045 reported by Tobias Junghans
This bug affects 3 people
Affects        | Status    | Importance | Assigned to | Milestone
linux (Ubuntu) | Confirmed | Medium     | Unassigned  |

Bug Description

We are having constant issues with CIFS mounts on our Linux terminal servers (Ubuntu 12.04 and 14.04 with updated kernels). Usually after 1 or 2 days we see the following in dmesg:

[102000.440031] INFO: task cifsiod:2184 blocked for more than 120 seconds.
[102000.441342] Not tainted 3.19.0-7-generic #7-Ubuntu
[102000.441454] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[102000.441617] cifsiod D ffff8800363dfc98 0 2184 2 0x00000000
[102000.441796] Workqueue: cifsiod cifs_oplock_break [cifs]
[102000.442048] ffff8800363dfc98 ffff8800368b09d0 0000000000014200 ffff8800363dffd8
[102000.442215] 0000000000014200 ffff88013ab9bae0 ffff8800368b09d0 0000000000000246
[102000.442382] ffff8800363dfd30 ffff88013ffd7c88 0000000000000002 ffffffff817cb120
[102000.442628] Call Trace:
[102000.442681] [<ffffffff817cb120>] ? bit_wait_io+0x50/0x50
[102000.442828] [<ffffffff817ca4c9>] schedule+0x29/0x70
[102000.442926] [<ffffffff817cb14b>] bit_wait+0x2b/0x50
[102000.443023] [<ffffffff817caca7>] __wait_on_bit+0x67/0x90
[102000.443128] [<ffffffff810a2857>] ? wake_up_process+0x27/0x50
[102000.443240] [<ffffffff817cb120>] ? bit_wait_io+0x50/0x50
[102000.443345] [<ffffffff817cad42>] out_of_line_wait_on_bit+0x72/0x80
[102000.443466] [<ffffffff810b6f70>] ? autoremove_wake_function+0x40/0x40
[102000.443736] [<ffffffffc037689b>] cifs_oplock_break+0x6b/0x330 [cifs]
[102000.443994] [<ffffffff8108f218>] process_one_work+0x158/0x460
[102000.444250] [<ffffffff8108f6ac>] rescuer_thread+0x18c/0x460
[102000.444495] [<ffffffff8108f520>] ? process_one_work+0x460/0x460
[102000.444745] [<ffffffff810950e9>] kthread+0xc9/0xe0
[102000.444975] [<ffffffff81095020>] ? kthread_create_on_node+0x1c0/0x1c0
[102000.445234] [<ffffffff817cf2bc>] ret_from_fork+0x7c/0xb0
[102000.445472] [<ffffffff81095020>] ? kthread_create_on_node+0x1c0/0x1c0

and later

[102000.445739] INFO: task kworker/3:2:20985 blocked for more than 120 seconds.
[102000.446004] Not tainted 3.19.0-7-generic #7-Ubuntu
[102000.446239] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[102000.446657] kworker/3:2 D ffff8800bae87cb8 0 20985 2 0x00000000
[102000.446949] Workqueue: cifsiod cifs_oplock_break [cifs]
[102000.447187] ffff8800bae87cb8 ffff8800baed9d70 0000000000014200 ffff8800bae87fd8
[102000.447613] 0000000000014200 ffff88013abf44b0 ffff8800baed9d70 0000000000000246
[102000.448047] ffff8800bae87d50 ffff88013ffd7c88 0000000000000002 ffffffff817cb120
[102000.448477] Call Trace:
[102000.448659] [<ffffffff817cb120>] ? bit_wait_io+0x50/0x50
[102000.448896] [<ffffffff817ca4c9>] schedule+0x29/0x70
[102000.449127] [<ffffffff817cb14b>] bit_wait+0x2b/0x50
[102000.449360] [<ffffffff817caca7>] __wait_on_bit+0x67/0x90
[102000.449595] [<ffffffff817cb120>] ? bit_wait_io+0x50/0x50
[102000.449834] [<ffffffff817cad42>] out_of_line_wait_on_bit+0x72/0x80
[102000.450085] [<ffffffff810b6f70>] ? autoremove_wake_function+0x40/0x40
[102000.450361] [<ffffffffc037689b>] cifs_oplock_break+0x6b/0x330 [cifs]
[102000.450631] [<ffffffff8108f218>] process_one_work+0x158/0x460
[102000.450907] [<ffffffff8108fef3>] worker_thread+0x53/0x5a0
[102000.451163] [<ffffffff8108fea0>] ? idle_worker_timeout+0x110/0x110
[102000.451834] [<ffffffff810950e9>] kthread+0xc9/0xe0
[102000.452093] [<ffffffff81095020>] ? kthread_create_on_node+0x1c0/0x1c0
[102000.452357] [<ffffffff817cf2bc>] ret_from_fork+0x7c/0xb0
[102000.452601] [<ffffffff81095020>] ? kthread_create_on_node+0x1c0/0x1c0

The same repeats for various other processes, such as kworker threads and the hanging userspace process. This happens with all recent kernel versions (3.13-3.19) and forces us to reboot the server regularly, because the requesting processes always hang in process state "D" and the (I/O) load increases from day to day. Further information:

Mount options: sec=krb5,multiuser,mfsymlinks,dir_mode=0700,file_mode=0700,nomapposix,noserverino,nobrl

CIFS server: Samba 4.1.17 with default configuration
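
To see which tasks are stuck in state "D" at a given moment, and where in the kernel they are waiting, something like the following sketch may help. It is not part of the original report. It assumes a Linux /proc filesystem and reads /proc/<pid>/stat and /proc/<pid>/wchan; the function names `parse_stat` and `blocked_tasks` are made up for illustration. On some kernels wchan is only meaningful when read as root.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat record into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the LAST ')' instead of on whitespace.
    """
    head, _, tail = text.rpartition(")")
    pid, _, comm = head.partition(" (")
    state = tail.split()[0]
    return int(pid), comm, state


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm, wchan) for every process currently in state D.

    Only top-level /proc/<pid> entries are scanned; threads of userspace
    processes (under /proc/<pid>/task) are not covered by this sketch.
    """
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while being read, or unexpected format
        if state != "D":
            continue
        try:
            with open(os.path.join(proc, entry, "wchan")) as f:
                wchan = f.read().strip()
        except OSError:
            wchan = "?"
        yield pid, comm, wchan


# Example use on an affected server:
#   for pid, comm, wchan in blocked_tasks():
#       print(pid, comm, wchan)
```

Running this periodically (for example from cron) and keeping the output would show whether the number of blocked tasks grows over time, which is what the rising load suggests.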

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1428045

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Tobias Junghans (tobydox) wrote :

Apport can't be run, and all necessary log information is included in the bug report text itself.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.0 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.0-rc2-vivid/

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Jack Spinov (k-ze-4) wrote :

We have exactly the same issue with the latest upstream kernel from Ubuntu, i.e. 4.2.0-18-generic.

This issue is really annoying and does not seem to depend on load: it can hang with only a few processes accessing the CIFS share. It looks like some kind of race condition involving exclusive lock requests on the CIFS share. Our main process uses those and constantly hangs. Sometimes it takes days, sometimes it hangs within 15 minutes.

#bug-exists-upstream

Jack Spinov (k-ze-4)
Changed in linux (Ubuntu):
status: Expired → Incomplete
Jack Spinov (k-ze-4)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Ryan Gerstenkorn (rgerstenkorn) wrote :

I believe we are running into the same bug on 4.4.0-34-generic. Here is the output of dmesg when the issue starts. Let me know if I can provide any more info.

INFO: task kworker/1:1:20481 blocked for more than 120 seconds.
      Not tainted 4.4.0-34-generic #53~14.04.1-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/1:1 D ffff88007a0b7ce8 0 20481 2 0x00000080
Workqueue: cifsiod cifs_oplock_break [cifs]
 ffff88007a0b7ce8 ffff8800791b3700 ffff880075408000 ffff88007a0b8000
 0000000000000002 ffff88007a0b7d80 ffff88007ffcf960 ffffffff817f4210
 ffff88007a0b7d00 ffffffff817f3b25 0000000000000002 ffff88007a0b7d18
Call Trace:
 [<ffffffff817f4210>] ? out_of_line_wait_on_atomic_t+0xd0/0xd0
 [<ffffffff817f3b25>] schedule+0x35/0x80
 [<ffffffff817f4221>] bit_wait+0x11/0x50
 [<ffffffff817f3ec0>] __wait_on_bit+0x60/0x90
 [<ffffffff817f4210>] ? out_of_line_wait_on_atomic_t+0xd0/0xd0
 [<ffffffff817f3f62>] out_of_line_wait_on_bit+0x72/0x80
 [<ffffffff810bdca0>] ? autoremove_wake_function+0x40/0x40
 [<ffffffffc02bf5eb>] cifs_oplock_break+0x6b/0x3a0 [cifs]
 [<ffffffff81095970>] process_one_work+0x150/0x3f0
 [<ffffffff810960ea>] worker_thread+0x11a/0x470
 [<ffffffff817f34c9>] ? __schedule+0x359/0x980
 [<ffffffff81095fd0>] ? rescuer_thread+0x310/0x310
 [<ffffffff8109b849>] kthread+0xc9/0xe0
 [<ffffffff8109b780>] ? kthread_park+0x60/0x60
 [<ffffffff817f770f>] ret_from_fork+0x3f/0x70
 [<ffffffff8109b780>] ? kthread_park+0x60/0x60
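
Since the same hung-task report shows up on several kernels in this thread, a small script to pull the key fields out of dmesg output can make reports easier to compare. The following is a rough sketch rather than an established tool. The function name `hung_tasks` and the dictionary keys are invented for illustration, and it only understands the line formats quoted in this bug (in particular the "Not tainted" form of the kernel line).

```python
import re

# "INFO: task <comm>:<pid> blocked for more than <n> seconds."
# comm can itself contain ':' (e.g. "kworker/3:2"); the greedy ".+"
# leaves only the final numeric field for the pid.
HUNG = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)
# Only the "Not tainted <release>" form seen in this report is handled.
KERNEL = re.compile(r"Not tainted (?P<release>\S+)")
WORKQUEUE = re.compile(r"Workqueue: (?P<name>\S+) (?P<func>\S+)")


def hung_tasks(lines):
    """Return one dict per hung-task report found in an iterable of lines."""
    reports = []
    for line in lines:
        m = HUNG.search(line)
        if m:
            reports.append({
                "comm": m["comm"],
                "pid": int(m["pid"]),
                "secs": int(m["secs"]),
                "kernel": None,
                "workqueue": None,
                "func": None,
            })
            continue
        if not reports:
            continue  # nothing to attach follow-up lines to yet
        current = reports[-1]
        m = KERNEL.search(line)
        if m and current["kernel"] is None:
            current["kernel"] = m["release"]
            continue
        m = WORKQUEUE.search(line)
        if m and current["workqueue"] is None:
            current["workqueue"] = m["name"]
            current["func"] = m["func"]
    return reports
```

One way to use it would be to pass `sys.stdin` to `hung_tasks` and pipe `dmesg` into the script, then print the task name, kernel release and work function for each report.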
