Comment 0 for bug 790557

Daniel (pada) wrote :

Hello,

We are experiencing sporadic kernel lockups on an Ubuntu Hardy LTS file server, which cause serious downtime. The following message appears in both kern.log and dmesg:

May 30 13:55:20 sanhead01 kernel: [699831.819099] BUG: soft lockup - CPU#1 stuck for 11s! [nfsd:17397]
May 30 13:55:20 sanhead01 kernel: [699831.891913] CPU 1:
May 30 13:55:20 sanhead01 kernel: [699831.891914] Modules linked in: nfs nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs bonding usbkbd qla2xxx raid1 raid10 raid456 async_xor async_memcpy async_tx xor raid0 multipath linear md_mod dm_mirror dm_snapshot dm_mod fbcon tileblit font bitblit softcursor fan thermal processor forcedeth tg3 ehci_hcd e1000 ohci_hcd scsi_transport_fc scsi_tgt pata_amd sata_nv pata_acpi ata_generic libata usbhid hid sd_mod sg scsi_mod ext3 jbd mbcache shpchp pci_hotplug evdev pcspkr serio_raw button psmouse i2c_nforce2 i2c_core joydev uhci_hcd usbcore ac video output sbs sbshc container battery dock bridge 8021q af_packet drbd cn
May 30 13:55:20 sanhead01 kernel: [699831.891952] Pid: 17397, comm: nfsd Not tainted 2.6.24-26-server #1
May 30 13:55:20 sanhead01 kernel: [699831.891954] RIP: 0010:[find_get_pages_contig+0x95/0xb0] [find_get_pages_contig+0x95/0xb0] find_get_pages_contig+0x95/0xb0
May 30 13:55:20 sanhead01 kernel: [699831.891959] RSP: 0018:ffff8100cba31a88 EFLAGS: 00000286
May 30 13:55:20 sanhead01 kernel: [699831.891961] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8100cba31c10
May 30 13:55:20 sanhead01 kernel: [699831.891963] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff81011da43200
May 30 13:55:20 sanhead01 kernel: [699831.891965] RBP: ffff81001c6914d8 R08: 0000000000000001 R09: 0000000000000000
May 30 13:55:20 sanhead01 kernel: [699831.891967] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000014
May 30 13:55:20 sanhead01 kernel: [699831.891970] R13: 0000000000000001 R14: 0000000000000000 R15: ffff81011da43200
May 30 13:55:20 sanhead01 kernel: [699831.891972] FS: 00007f0e957c66e0(0000) GS:ffff81011bc01800(0000) knlGS:0000000000000000
May 30 13:55:20 sanhead01 kernel: [699831.891974] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 30 13:55:20 sanhead01 kernel: [699831.891977] CR2: 00007fbdb04a2000 CR3: 0000000118845000 CR4: 00000000000006e0
May 30 13:55:20 sanhead01 kernel: [699831.891979] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 30 13:55:20 sanhead01 kernel: [699831.891981] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 30 13:55:20 sanhead01 kernel: [699831.891983]
May 30 13:55:20 sanhead01 kernel: [699831.891983] Call Trace:
May 30 13:55:20 sanhead01 kernel: [699831.891989] [ext3:generic_file_splice_read+0x10b/0x1e10] generic_file_splice_read+0x10b/0x4c0
May 30 13:55:20 sanhead01 kernel: [699831.891998] [ifind_fast+0x45/0xa0] ifind_fast+0x45/0xa0
May 30 13:55:20 sanhead01 kernel: [699831.892002] [ext3:iget_locked+0x44/0x800] iget_locked+0x44/0x180
May 30 13:55:20 sanhead01 kernel: [699831.892007] [<ffffffff883c85aa>] :exportfs:find_acceptable_alias+0x1a/0xe0
May 30 13:55:20 sanhead01 kernel: [699831.892012] [<ffffffff883c8703>] :exportfs:exportfs_decode_fh+0x93/0x270
May 30 13:55:20 sanhead01 kernel: [699831.892020] [<ffffffff88432490>] :nfsd:nfsd_acceptable+0x0/0xf0
May 30 13:55:20 sanhead01 kernel: [699831.892032] [<ffffffff883dfa69>] :sunrpc:cache_check+0x49/0x490
May 30 13:55:20 sanhead01 kernel: [699831.892040] [set_current_groups+0x23b/0x240] set_current_groups+0x23b/0x240
May 30 13:55:20 sanhead01 kernel: [699831.892050] [splice_direct_to_actor+0xbc/0x190] splice_direct_to_actor+0xbc/0x190
May 30 13:55:20 sanhead01 kernel: [699831.892058] [<ffffffff88433e50>] :nfsd:nfsd_direct_splice_actor+0x0/0x20
May 30 13:55:20 sanhead01 kernel: [699831.892070] [<ffffffff88433e27>] :nfsd:nfsd_vfs_read+0x3c7/0x3f0
May 30 13:55:20 sanhead01 kernel: [699831.892083] [<ffffffff88434402>] :nfsd:nfsd_read+0xe2/0x100
May 30 13:55:20 sanhead01 kernel: [699831.892095] [<ffffffff883d8a90>] :sunrpc:svc_sock_enqueue+0x80/0x360
May 30 13:55:20 sanhead01 kernel: [699831.892106] [<ffffffff8843c6fd>] :nfsd:nfsd3_proc_read+0xfd/0x1a0
May 30 13:55:20 sanhead01 kernel: [699831.892116] [<ffffffff8842f271>] :nfsd:nfsd_dispatch+0xb1/0x240
May 30 13:55:20 sanhead01 kernel: [699831.892130] [<ffffffff883d7dad>] :sunrpc:svc_process+0x47d/0x7e0
May 30 13:55:20 sanhead01 kernel: [699831.892133] [<ffffffff80236540>] default_wake_function+0x0/0x10
May 30 13:55:20 sanhead01 kernel: [699831.892138] [__down_read+0x12/0xb1] __down_read+0x12/0xb1
May 30 13:55:20 sanhead01 kernel: [699831.892147] [<ffffffff8842f810>] :nfsd:nfsd+0x0/0x2e0
May 30 13:55:20 sanhead01 kernel: [699831.892154] [<ffffffff8842f99f>] :nfsd:nfsd+0x18f/0x2e0
May 30 13:55:20 sanhead01 kernel: [699831.892160] [child_rip+0xa/0x12] child_rip+0xa/0x12
May 30 13:55:20 sanhead01 kernel: [699831.892167] [<ffffffff8842f810>] :nfsd:nfsd+0x0/0x2e0
May 30 13:55:20 sanhead01 kernel: [699831.892179] [<ffffffff8842f810>] :nfsd:nfsd+0x0/0x2e0
May 30 13:55:20 sanhead01 kernel: [699831.892182] [child_rip+0x0/0x12] child_rip+0x0/0x12
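
To gauge how often the lockup recurs, the "BUG: soft lockup" header lines can be pulled out of kern.log. The following is only a rough sketch: the regular expression is modelled on the single report quoted above, and the function name find_soft_lockups is hypothetical, not part of any existing tool.

```python
import re

# Pattern modelled on the one kern.log line quoted in this report;
# the exact field layout on other kernels is an assumption.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per soft-lockup header found in an iterable of log lines."""
    hits = []
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            hits.append({
                "cpu": int(match.group("cpu")),
                "seconds": int(match.group("secs")),
                "comm": match.group("comm"),
                "pid": int(match.group("pid")),
            })
    return hits
```

It could be fed an open file handle for kern.log (on Ubuntu usually /var/log/kern.log). Grouping the results by "comm" would show whether nfsd is always the stuck task.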

The following patch appears to be related and most likely fixes the problem:

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc8/2.6.25-rc8-mm2/broken-out/generic_file_splice_read-fix-lockups.patch

Could you please provide this patch as a security/stability update for the Ubuntu Hardy LTS kernel?

System Information:

sanhead01:~# lsb_release -rd
Description: Ubuntu 8.04.4 LTS
Release: 8.04

sanhead01:~# uname -a
Linux sanhead01 2.6.24-26-server #1 SMP Tue Dec 1 18:26:43 UTC 2009 x86_64 GNU/Linux

sanhead01:~# cat /proc/version_signature
Ubuntu 2.6.24-26.64-server