kernel soft lockup race condition on filesystem read operations in generic_file_splice_read function
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Undecided | Unassigned |
Hardy | Fix Released | Medium | Leann Ogasawara |
Bug Description
SRU Justification:
Impact: Without the fix, users can experience "sporadic kernel lockups on a Ubuntu Hardy LTS fileserver which produces serious downtimes."
Fix: upstream commit 8191ecd1d14c691
Test case: Without a patched kernel you'll see soft lockup error messages in your dmesg output and experience sporadic kernel lockups. With a patched kernel you won't experience the lockups or see the error messages.
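To make the test case easier to repeat, the check for soft lockup messages could be scripted. The following is only a minimal sketch: the function name `find_soft_lockups` and the regular expression are illustrative assumptions, and the pattern is based solely on the message format shown in the log excerpt in this report ("BUG: soft lockup - CPU#1 stuck for 11s! [nfsd:17397]"). Other kernel versions may word the message differently.

```python
import re

# Pattern derived from the message format quoted in this bug report, e.g.
#   BUG: soft lockup - CPU#1 stuck for 11s! [nfsd:17397]
# This is an assumption about the format; other kernels may print it differently.
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<seconds>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per soft lockup message found in an iterable of log lines."""
    hits = []
    for line in lines:
        match = SOFT_LOCKUP_RE.search(line)
        if match:
            hits.append(
                {
                    "cpu": int(match.group("cpu")),
                    "seconds": int(match.group("seconds")),
                    "comm": match.group("comm"),
                    "pid": int(match.group("pid")),
                }
            )
    return hits
```

The function accepts any iterable of strings, so it can be fed the lines of a saved dmesg dump or an open kern.log file object. An empty result on a patched kernel after a comparable period of load would be consistent with the fix, although it does not prove the lockup can no longer occur.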
Hello,
we are experiencing sporadic kernel lockups on a Ubuntu Hardy LTS fileserver which produces serious downtimes. The following message can be found in our kern.log and dmesg:
May 30 13:55:20 sanhead01 kernel: [699831.819099] BUG: soft lockup - CPU#1 stuck for 11s! [nfsd:17397]
May 30 13:55:20 sanhead01 kernel: [699831.891913] CPU 1:
May 30 13:55:20 sanhead01 kernel: [699831.891914] Modules linked in: nfs nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs bonding usbkbd qla2xxx raid1 raid10 raid456 async_xor async_memcpy async_tx xor raid0 multipath linear md_mod dm_mirror dm_snapshot dm_mod fbcon tileblit font bitblit softcursor fan thermal processor forcedeth tg3 ehci_hcd e1000 ohci_hcd scsi_transport_fc scsi_tgt pata_amd sata_nv pata_acpi ata_generic libata usbhid hid sd_mod sg scsi_mod ext3 jbd mbcache shpchp pci_hotplug evdev pcspkr serio_raw button psmouse i2c_nforce2 i2c_core joydev uhci_hcd usbcore ac video output sbs sbshc container battery dock bridge 8021q af_packet drbd cn
May 30 13:55:20 sanhead01 kernel: [699831.891952] Pid: 17397, comm: nfsd Not tainted 2.6.24-26-server #1
May 30 13:55:20 sanhead01 kernel: [699831.891954] RIP: 0010:[find_
May 30 13:55:20 sanhead01 kernel: [699831.891959] RSP: 0018:ffff8100cb
May 30 13:55:20 sanhead01 kernel: [699831.891961] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8100cba31c10
May 30 13:55:20 sanhead01 kernel: [699831.891963] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff81011da43200
May 30 13:55:20 sanhead01 kernel: [699831.891965] RBP: ffff81001c6914d8 R08: 0000000000000001 R09: 0000000000000000
May 30 13:55:20 sanhead01 kernel: [699831.891967] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000014
May 30 13:55:20 sanhead01 kernel: [699831.891970] R13: 0000000000000001 R14: 0000000000000000 R15: ffff81011da43200
May 30 13:55:20 sanhead01 kernel: [699831.891972] FS: 00007f0e957c66e
May 30 13:55:20 sanhead01 kernel: [699831.891974] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 30 13:55:20 sanhead01 kernel: [699831.891977] CR2: 00007fbdb04a2000 CR3: 0000000118845000 CR4: 00000000000006e0
May 30 13:55:20 sanhead01 kernel: [699831.891979] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 30 13:55:20 sanhead01 kernel: [699831.891981] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 30 13:55:20 sanhead01 kernel: [699831.891983]
May 30 13:55:20 sanhead01 kernel: [699831.891983] Call Trace:
May 30 13:55:20 sanhead01 kernel: [699831.891989] [ext3:generic_
May 30 13:55:20 sanhead01 kernel: [699831.891998] [ifind_
May 30 13:55:20 sanhead01 kernel: [699831.892002] [ext3:iget_
May 30 13:55:20 sanhead01 kernel: [699831.892007] [<ffffffff883c8
May 30 13:55:20 sanhead01 kernel: [699831.892012] [<ffffffff883c8
May 30 13:55:20 sanhead01 kernel: [699831.892020] [<ffffffff88432
May 30 13:55:20 sanhead01 kernel: [699831.892032] [<ffffffff883df
May 30 13:55:20 sanhead01 kernel: [699831.892040] [set_current_
May 30 13:55:20 sanhead01 kernel: [699831.892050] [splice_
May 30 13:55:20 sanhead01 kernel: [699831.892058] [<ffffffff88433
May 30 13:55:20 sanhead01 kernel: [699831.892070] [<ffffffff88433
May 30 13:55:20 sanhead01 kernel: [699831.892083] [<ffffffff88434
May 30 13:55:20 sanhead01 kernel: [699831.892095] [<ffffffff883d8
May 30 13:55:20 sanhead01 kernel: [699831.892106] [<ffffffff8843c
May 30 13:55:20 sanhead01 kernel: [699831.892116] [<ffffffff8842f
May 30 13:55:20 sanhead01 kernel: [699831.892130] [<ffffffff883d7
May 30 13:55:20 sanhead01 kernel: [699831.892133] [<ffffffff80236
May 30 13:55:20 sanhead01 kernel: [699831.892138] [__down_
May 30 13:55:20 sanhead01 kernel: [699831.892147] [<ffffffff8842f
May 30 13:55:20 sanhead01 kernel: [699831.892154] [<ffffffff8842f
May 30 13:55:20 sanhead01 kernel: [699831.892160] [child_
May 30 13:55:20 sanhead01 kernel: [699831.892167] [<ffffffff8842f
May 30 13:55:20 sanhead01 kernel: [699831.892179] [<ffffffff8842f
May 30 13:55:20 sanhead01 kernel: [699831.892182] [child_
It seems to me that the following patch is related to and most probably fixes the problem:
http://
rc8/2.6.
Can you provide this patch as a security/stability update for the Ubuntu Hardy LTS Kernel please?
System Information:
sanhead01:~# lsb_release -rd
Description: Ubuntu 8.04.4 LTS
Release: 8.04
Linux sanhead01 2.6.24-26-server #1 SMP Tue Dec 1 18:26:43 UTC 2009 x86_64 GNU/Linux
Ubuntu 2.6.24-26.64-server
affects: linux-ubuntu-modules-2.6.24 (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu Hardy):
status: In Progress → Fix Committed
tags: added: verification-done-hardy; removed: verification-needed-hardy
This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:
apport-collect 790557
and then change the status of the bug back to 'New'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.