Lage samba file transfers block btrfs to hung task

Bug #1711148 reported by S. W.
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
samba
New
Undecided
Unassigned
linux (Ubuntu)
Incomplete
Medium
Unassigned

Bug Description

I have this Bug since 15.10. It was confirmed but never fixed.....

This is a KVM guest with 17.04 (upgraded from 15.04->15.10->16.04->16.10).
Its just am samba file server attached to a AD.
When lage files get transfered (e.g. AD profile) samba gets stuck and prioduces a kernel hang.

I install a Debain Stech (Kernel 4.9 and samba 4.2.14) for dual boot for testing, copied the smb.conf and attached it to the AD.
I never get any errors like this for month, so this IS a Ubuntu related bug!
I guess this is a samba related problem after samba 4.3.x

[87236.886634] INFO: task btrfs-cleaner:245 blocked for more than 120 seconds.
[87236.891185] Not tainted 4.10.0-32-generic #36-Ubuntu
[87236.891781] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87236.892270] btrfs-cleaner D 0 245 2 0x00000000
[87236.892274] Call Trace:
[87236.892320] __schedule+0x233/0x6f0
[87236.892342] ? __queue_work+0x171/0x470
[87236.892343] schedule+0x36/0x80
[87236.892525] wait_current_trans+0xc2/0x100 [btrfs]
[87236.892532] ? wake_atomic_t_function+0x60/0x60
[87236.892546] start_transaction+0x2d4/0x460 [btrfs]
[87236.892560] btrfs_start_transaction+0x18/0x20 [btrfs]
[87236.892571] btrfs_drop_snapshot+0x454/0x8f0 [btrfs]
[87236.892585] btrfs_clean_one_deleted_snapshot+0xb7/0x100 [btrfs]
[87236.892597] cleaner_kthread+0x133/0x180 [btrfs]
[87236.892600] kthread+0x109/0x140
[87236.892613] ? __btree_submit_bio_start+0x20/0x20 [btrfs]
[87236.892614] ? kthread_create_on_node+0x60/0x60
[87236.892619] ret_from_fork+0x2c/0x40
[87357.711653] INFO: task btrfs-cleaner:245 blocked for more than 120 seconds.
[87357.712727] Not tainted 4.10.0-32-generic #36-Ubuntu
[87357.713035] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87357.713348] btrfs-cleaner D 0 245 2 0x00000000
[87357.713356] Call Trace:
[87357.713366] __schedule+0x233/0x6f0
[87357.713369] ? __queue_work+0x171/0x470
[87357.713371] schedule+0x36/0x80
[87357.713429] wait_current_trans+0xc2/0x100 [btrfs]
[87357.713432] ? wake_atomic_t_function+0x60/0x60
[87357.713446] start_transaction+0x2d4/0x460 [btrfs]
[87357.713460] btrfs_start_transaction+0x18/0x20 [btrfs]
[87357.713473] btrfs_drop_snapshot+0x454/0x8f0 [btrfs]
[87357.713487] btrfs_clean_one_deleted_snapshot+0xb7/0x100 [btrfs]
[87357.713501] cleaner_kthread+0x133/0x180 [btrfs]
[87357.713503] kthread+0x109/0x140
[87357.713516] ? __btree_submit_bio_start+0x20/0x20 [btrfs]
[87357.713517] ? kthread_create_on_node+0x60/0x60
[87357.713523] ret_from_fork+0x2c/0x40
[87357.713548] INFO: task nmbd:1434 blocked for more than 120 seconds.
[87357.713846] Not tainted 4.10.0-32-generic #36-Ubuntu
[87357.714103] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87357.714350] nmbd D 0 1434 1 0x00000000
[87357.714351] Call Trace:
[87357.714353] __schedule+0x233/0x6f0
[87357.714354] schedule+0x36/0x80
[87357.714372] lock_extent_bits+0x190/0x1e0 [btrfs]
[87357.714374] ? wake_atomic_t_function+0x60/0x60
[87357.714389] lock_and_cleanup_extent_if_need+0xff/0x280 [btrfs]
[87357.714403] __btrfs_buffered_write+0x1e4/0x740 [btrfs]
[87357.714448] btrfs_file_write_iter+0x182/0x4e0 [btrfs]
[87357.714464] ? _copy_to_user+0x54/0x60
[87357.714470] new_sync_write+0xd5/0x130
[87357.714471] __vfs_write+0x26/0x40
[87357.714472] vfs_write+0xb5/0x1a0
[87357.714473] SyS_write+0x55/0xc0
[87357.714475] entry_SYSCALL_64_fastpath+0x1e/0xad
[87357.714477] RIP: 0033:0x7f223143e670
[87357.714478] RSP: 002b:00007ffce24e0398 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[87357.714479] RAX: ffffffffffffffda RBX: 0000559d2a0eef60 RCX: 00007f223143e670
[87357.714480] RDX: 000000000000004d RSI: 00007ffce24e0900 RDI: 0000000000000009
[87357.714480] RBP: 00007f222dc38b00 R08: 000000000000c1c0 R09: 7362696c2f336563
[87357.714481] R10: 696c626d6e2f626d R11: 0000000000000246 R12: 0000000000000020
[87357.714482] R13: 0000000000000008 R14: 00007ffce24e0c48 R15: 0000000000000001
[87357.714485] INFO: task miniserv.pl:1535 blocked for more than 120 seconds.
[87357.714743] Not tainted 4.10.0-32-generic #36-Ubuntu
[87357.714992] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87357.715232] miniserv.pl D 0 1535 1 0x00000000
[87357.715234] Call Trace:
[87357.715236] __schedule+0x233/0x6f0
[87357.715237] schedule+0x36/0x80
[87357.715253] wait_current_trans+0xc2/0x100 [btrfs]
[87357.715254] ? wake_atomic_t_function+0x60/0x60
[87357.715268] start_transaction+0x24e/0x460 [btrfs]
[87357.715281] btrfs_join_transaction+0x17/0x20 [btrfs]
[87357.715295] btrfs_dirty_inode+0x49/0xd0 [btrfs]
[87357.715309] btrfs_update_time+0x7f/0xc0 [btrfs]
[87357.715316] touch_atime+0xa9/0xd0
[87357.715323] generic_file_read_iter+0x655/0x8f0
[87357.715331] ? handle_mm_fault+0x882/0x1330
[87357.715333] new_sync_read+0xd2/0x120
[87357.715334] __vfs_read+0x26/0x40
[87357.715335] vfs_read+0x96/0x130
[87357.715336] SyS_read+0x55/0xc0
[87357.715348] ? trace_do_page_fault+0x37/0xd0
[87357.715350] entry_SYSCALL_64_fastpath+0x1e/0xad
[87357.715351] RIP: 0033:0x7fc0469e56d0
[87357.715351] RSP: 002b:00007fffdb6c64e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[87357.715353] RAX: ffffffffffffffda RBX: 00007fc0469cfb00 RCX: 00007fc0469e56d0
[87357.715353] RDX: 0000000000002000 RSI: 0000557c7cc86830 RDI: 0000000000000009
[87357.715354] RBP: 0000557c7cc86830 R08: 0000557c7bee5f50 R09: 0000000000000000
[87357.715355] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000002000
[87357.715356] R13: 000000000001b7e0 R14: 0000557c7cc86820 R15: 0000557c7b5eb010
[87357.715358] INFO: task smbd:1624 blocked for more than 120 seconds.
[87357.715647] Not tainted 4.10.0-32-generic #36-Ubuntu
[87357.715915] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87357.716178] smbd D 0 1624 1374 0x00000000
[87357.716180] Call Trace:
[87357.716183] __schedule+0x233/0x6f0
[87357.716184] schedule+0x36/0x80
[87357.716186] rwsem_down_write_failed+0x21d/0x3a0
[87357.716188] call_rwsem_down_write_failed+0x17/0x30
[87357.716189] down_write+0x2d/0x40
[87357.716208] btrfs_file_write_iter+0x75/0x4e0 [btrfs]
[87357.716209] ? _copy_to_user+0x54/0x60
[87357.716210] new_sync_write+0xd5/0x130
[87357.716211] __vfs_write+0x26/0x40
[87357.716212] vfs_write+0xb5/0x1a0
[87357.716213] SyS_write+0x55/0xc0
[87357.716214] entry_SYSCALL_64_fastpath+0x1e/0xad
[87357.716215] RIP: 0033:0x7fd7a2e4f670
[87357.716216] RSP: 002b:00007ffc436546e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[87357.716217] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fd7a2e4f670
[87357.716218] RDX: 0000000000000051 RSI: 00007ffc43654c50 RDI: 0000000000000019
[87357.716219] RBP: 0000000000000019 R08: 000000000000fefe R09: 737365636f72702f
[87357.716220] R10: 28353938323a632e R11: 0000000000000246 R12: 0000000000000000
[87357.716220] R13: 00007fd79e0d0ca0 R14: 0000000000000000 R15: 0000000000000029
[87357.716223] INFO: task smbd:8765 blocked for more than 120 seconds.
[87357.716470] Not tainted 4.10.0-32-generic #36-Ubuntu
[87357.716728] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87357.716981] smbd D 0 8765 1374 0x00000000
[87357.716982] Call Trace:
[87357.716985] __schedule+0x233/0x6f0
[87357.716986] schedule+0x36/0x80
[87357.716992] rwsem_down_write_failed+0x21d/0x3a0
[87357.716993] call_rwsem_down_write_failed+0x17/0x30
[87357.716994] down_write+0x2d/0x40
[87357.717013] btrfs_file_write_iter+0x75/0x4e0 [btrfs]
[87357.717014] ? _copy_to_user+0x54/0x60
[87357.717015] new_sync_write+0xd5/0x130
[87357.717016] __vfs_write+0x26/0x40
[87357.717017] vfs_write+0xb5/0x1a0
[87357.717018] SyS_write+0x55/0xc0
[87357.717019] entry_SYSCALL_64_fastpath+0x1e/0xad
[87357.717020] RIP: 0033:0x7fd7a2e4f670
[87357.717021] RSP: 002b:00007ffc436546e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[87357.717022] RAX: ffffffffffffffda RBX: 00007ffc43654800 RCX: 00007fd7a2e4f670
[87357.717023] RDX: 0000000000000051 RSI: 00007ffc43654c50 RDI: 000000000000002a
[87357.717024] RBP: 00007ffc436547f0 R08: 000000000000fefe R09: 737365636f72702f
[87357.717024] R10: 28353938323a632e R11: 0000000000000246 R12: 00007fd79ed72c35
[87357.717025] R13: 00007ffc43654978 R14: 0000000000000000 R15: 00007fd79ed72c48
[87357.717052] INFO: task smbd:11499 blocked for more than 120 seconds.
[87357.717318] Not tainted 4.10.0-32-generic #36-Ubuntu
[87357.717563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87357.717804] smbd D 0 11499 1374 0x00000000
[87357.717806] Call Trace:
[87357.717808] __schedule+0x233/0x6f0
[87357.717809] schedule+0x36/0x80
[87357.717810] rwsem_down_write_failed+0x21d/0x3a0
[87357.717812] call_rwsem_down_write_failed+0x17/0x30
[87357.717813] down_write+0x2d/0x40
[87357.717829] btrfs_file_write_iter+0x75/0x4e0 [btrfs]
[87357.717830] ? _copy_to_user+0x54/0x60
[87357.717831] new_sync_write+0xd5/0x130
[87357.717832] __vfs_write+0x26/0x40
[87357.717833] vfs_write+0xb5/0x1a0
[87357.717834] SyS_write+0x55/0xc0
[87357.717835] entry_SYSCALL_64_fastpath+0x1e/0xad
[87357.717836] RIP: 0033:0x7fd7a2e4f670
[87357.717837] RSP: 002b:00007ffc436546e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[87357.717838] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fd7a2e4f670
[87357.717839] RDX: 0000000000000051 RSI: 00007ffc43654c50 RDI: 0000000000000019
[87357.717840] RBP: 0000000000000019 R08: 000000000000fefe R09: 737365636f72702f
[87357.717841] R10: 28353938323a632e R11: 0000000000000246 R12: 0000000000000000
[87357.717841] R13: 00007fd79e0d0ca0 R14: 0000000000000000 R15: 0000000000000029
[87357.717844] INFO: task smbd:12608 blocked for more than 120 seconds.
[87357.718108] Not tainted 4.10.0-32-generic #36-Ubuntu
[87357.718350] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87357.718590] smbd D 0 12608 1374 0x00000000
[87357.718591] Call Trace:
[87357.718598] __schedule+0x233/0x6f0
[87357.718599] schedule+0x36/0x80
[87357.718600] rwsem_down_write_failed+0x21d/0x3a0
[87357.718602] call_rwsem_down_write_failed+0x17/0x30
[87357.718603] down_write+0x2d/0x40
[87357.718620] btrfs_file_write_iter+0x75/0x4e0 [btrfs]
[87357.718621] ? _copy_to_user+0x54/0x60
[87357.718622] new_sync_write+0xd5/0x130
[87357.718623] __vfs_write+0x26/0x40
[87357.718624] vfs_write+0xb5/0x1a0
[87357.718625] SyS_write+0x55/0xc0
[87357.718626] entry_SYSCALL_64_fastpath+0x1e/0xad
[87357.718627] RIP: 0033:0x7fd7a2e4f670
[87357.718627] RSP: 002b:00007ffc436546e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[87357.718629] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fd7a2e4f670
[87357.718629] RDX: 0000000000000051 RSI: 00007ffc43654c50 RDI: 0000000000000019
[87357.718630] RBP: 0000000000000019 R08: 000000000000c0fe R09: 737365636f72702f
[87357.718634] R10: 28353938323a632e R11: 0000000000000246 R12: 0000000000000000
[87357.718635] R13: 00007fd79e0d0ca0 R14: 0000000000000000 R15: 0000000000000029
[87357.718637] INFO: task smbd:12609 blocked for more than 120 seconds.
[87357.718901] Not tainted 4.10.0-32-generic #36-Ubuntu
[87357.719133] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87357.719369] smbd D 0 12609 1374 0x00000000
[87357.719370] Call Trace:
[87357.719373] __schedule+0x233/0x6f0
[87357.719374] schedule+0x36/0x80
[87357.719375] rwsem_down_write_failed+0x21d/0x3a0
[87357.719377] call_rwsem_down_write_failed+0x17/0x30
[87357.719378] down_write+0x2d/0x40
[87357.719394] btrfs_file_write_iter+0x75/0x4e0 [btrfs]
[87357.719395] ? _copy_to_user+0x54/0x60
[87357.719396] new_sync_write+0xd5/0x130
[87357.719397] __vfs_write+0x26/0x40
[87357.719398] vfs_write+0xb5/0x1a0
[87357.719399] SyS_write+0x55/0xc0
[87357.719400] entry_SYSCALL_64_fastpath+0x1e/0xad
[87357.719401] RIP: 0033:0x7fd7a2e4f670
[87357.719402] RSP: 002b:00007ffc436546e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[87357.719403] RAX: ffffffffffffffda RBX: 000055ab0e2e4ae0 RCX: 00007fd7a2e4f670
[87357.719404] RDX: 0000000000000051 RSI: 00007ffc43654c50 RDI: 0000000000000009
[87357.719404] RBP: 00007ffc43654df0 R08: 000000000000c0fe R09: 737365636f72702f
[87357.719405] R10: 28353938323a632e R11: 0000000000000246 R12: 0000000059931f02
[87357.719406] R13: 000055ab0e2e43c0 R14: 0000000000000000 R15: 0000000000000029
[87357.719407] INFO: task smbd:12851 blocked for more than 120 seconds.
[87357.719672] Not tainted 4.10.0-32-generic #36-Ubuntu
[87357.719944] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87357.720203] smbd D 0 12851 1374 0x00000000
[87357.720204] Call Trace:
[87357.720207] __schedule+0x233/0x6f0
[87357.720208] schedule+0x36/0x80
[87357.720210] rwsem_down_write_failed+0x21d/0x3a0
[87357.720211] call_rwsem_down_write_failed+0x17/0x30
[87357.720212] down_write+0x2d/0x40
[87357.720231] btrfs_file_write_iter+0x75/0x4e0 [btrfs]
[87357.720232] ? _copy_to_user+0x54/0x60
[87357.720234] new_sync_write+0xd5/0x130
[87357.720235] __vfs_write+0x26/0x40
[87357.720236] vfs_write+0xb5/0x1a0
[87357.720237] SyS_write+0x55/0xc0
[87357.720238] entry_SYSCALL_64_fastpath+0x1e/0xad
[87357.720239] RIP: 0033:0x7fd7a2e4f670
[87357.720240] RSP: 002b:00007ffc436546e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[87357.720241] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fd7a2e4f670
[87357.720242] RDX: 0000000000000051 RSI: 00007ffc43654c50 RDI: 0000000000000024
[87357.720242] RBP: 0000000000000024 R08: 000000000000c0fe R09: 737365636f72702f
[87357.720243] R10: 28353938323a632e R11: 0000000000000246 R12: 0000000000000000
[87357.720244] R13: 00007fd79e0d0ca0 R14: 0000000000000000 R15: 0000000000000009
wvk@WVK-Storage:~$

S. W. (cebit2006)
tags: added: btrfs
tags: added: samba zesty
tags: added: 4.10 smb
S. W. (cebit2006)
description: updated
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1711148

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.13 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc4

Changed in linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
S. W. (cebit2006) wrote :
Download full text (30.2 KiB)

Why should this have changed. This anoying bug is existing since 15.10 and even if there are several tickets nobody funkin cares!

https://bugs.launchpad.net/ubuntu/+source/linux-lts-wily/+bug/1583191
https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1579803

And yes i'm not dump. I tried the latest kernel...even 4.13...oh wonder...nothing changed....

AND NO this ic NO UPSTREAM BUG! This is Ubuntu related ONLY!!!!!! I'm using DEBIAN 9 with exactly the same Samba settings, just an other samba version and kernel 4.9.
MY guess is this is a Samba related problem!
It happens ONLY when there is heavy use on Samba!

[ 37.424709] audit: type=1400 audit(1503500221.451:3):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=683
comm="apparmor_parser"
[ 37.424959] audit: type=1400 audit(1503500221.452:4):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/lib/NetworkManager/nm-dhcp-helper" pid=683 comm="apparmor_parser"
[ 37.425188] audit: type=1400 audit(1503500221.452:5):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/lib/connman/scripts/dhclient-script" pid=683
comm="apparmor_parser"
[ 37.704058] audit: type=1400 audit(1503500221.731:6):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/lib/snapd/snap-confine" pid=688 comm="apparmor_parser"
[ 37.704408] audit: type=1400 audit(1503500221.731:7):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper"
pid=688 comm="apparmor_parser"
[ 37.714792] audit: type=1400 audit(1503500221.742:8):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/lib/lightdm/lightdm-guest-session" pid=684 comm="apparmor_parser"
[ 37.715154] audit: type=1400 audit(1503500221.742:9):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/lib/lightdm/lightdm-guest-session//chromium" pid=684
comm="apparmor_parser"
[ 37.814957] audit: type=1400 audit(1503500221.842:10):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/sbin/cups-browsed" pid=690 comm="apparmor_parser"
[ 37.854086] audit: type=1400 audit(1503500221.881:11):
apparmor="STATUS" operation="profile_load" profile="unconfined"
name="/usr/sbin/ippusbxd" pid=694 comm="apparmor_parser"
[ 67.352219] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state
recovery directory
[ 67.352252] NFSD: starting 90-second grace period (net ffffffffaf1ec8c0)
[ 82.774749] kauditd_printk_skb: 5 callbacks suppressed
[ 82.774751] audit: type=1400 audit(1503500267.300:17):
apparmor="DENIED" operation="capable" profile="/usr/sbin/ntpd" pid=1467
comm="ntpd" capability=12 capname="net_admin"
[ 114.385440] random: crng init done
[89653.176401] INFO: task btrfs-cleaner:226 blocked for more than 120
seconds.
[89653.176425] Not tainted 4.13.0-041300rc6-generic #201708201831
[89653.176426] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[89653.176428] btrfs-cleaner D 0 226 2 0x00000000
[89653.176432] Call Trace:
[89653.176473] __sched...

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Rolf Leggewie (r0lf) wrote :

Nothing you say in your rant rules out an upstream bug. But maybe you are not so interested in getting this analysed and fixed?

Anyhow, most likely time has moved on. Are you still affected by this problem?

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Brad Figg (brad-figg)
tags: added: cscc
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.