DMESG - kernel BUG at /build/buildd/linux-2.6.24/fs/sysfs/file.c:126!

Bug #244377 reported by jorno on 2008-06-30
This bug affects 1 person
Affects: linux (Ubuntu); Importance: Undecided; Assigned to: Unassigned
Affects: Hardy; Importance: Medium; Assigned to: Tim Gardner

Bug Description

Binary package hint: linux-image-2.6.24-19-generic

I have already filed this bug at bugzilla.kernel.org (http://bugzilla.kernel.org/show_bug.cgi?id=11013), and was told it had already been fixed in 2.6.24.5. (This is a RAID error.)

So I am filing this bug report just to make you aware of the problem, and to suggest that you include the patch or the fixed version in the next release of the kernel.

Thanks.

Kind regards from Norway.

jorno (jorn-odberg) wrote :

Oh, I forgot to include the relevant part of the dmesg output, if anyone is interested... ;-)

[73760.319674] ------------[ cut here ]------------
[73760.319742] kernel BUG at /build/buildd/linux-2.6.24/fs/sysfs/file.c:126!
[73760.319803] invalid opcode: 0000 [#4] SMP
[73760.319950] Modules linked in: nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs cpufreq_powersave cpufreq_userspace cpufreq_ondemand cpufreq_conservative cpufreq_stats freq_table iptable_filter ip_tables x_tables reiserfs aes_i586 dm_crypt dm_mod asb100 hwmon_vid parport_pc lp af_packet parport loop ipv6 psmouse serio_raw button i2c_nforce2 nvidia_agp shpchp pci_hotplug i2c_core agpgart evdev pcspkr ext3 jbd mbcache sr_mod cdrom ata_generic sg sd_mod usbhid hid pata_amd sata_sil sata_promise pata_acpi skge libata scsi_mod ehci_hcd ohci_hcd forcedeth usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
[73760.323741]
[73760.323796] Pid: 15189, comm: cat Tainted: G D (2.6.24-19-generic #1)
[73760.323857] EIP: 0060:[<c01d6ba8>] EFLAGS: 00010212 CPU: 0
[73760.323922] EIP is at sysfs_read_file+0xd8/0xe0
[73760.323980] EAX: 00000001 EBX: dc5fb780 ECX: 00000000 EDX: df9e2c90
[73760.324039] ESI: 00001000 EDI: df9e2cf0 EBP: dc5fb794 ESP: dc4fdf54
[73760.324099] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[73760.324158] Process cat (pid: 15189, ti=dc4fc000 task=f762e000 task.ti=dc4fc000)
[73760.324219] Stack: b7f76ce0 00000000 00001000 08057000 f8892810 f7c9c1c4 f18063c0 08057000
[73760.324693] c01d6ad0 00001000 c0192327 dc4fdfa0 00001000 f18063c0 fffffff7 08057000
[73760.325167] dc4fc000 c0192881 dc4fdfa0 00000000 00000000 00000000 00000003 00001000
[73760.325641] Call Trace:
[73760.325751] [<c01d6ad0>] sysfs_read_file+0x0/0xe0
[73760.325856] [<c0192327>] vfs_read+0xb7/0x170
[73760.325965] [<c0192881>] sys_read+0x41/0x70
[73760.326071] [<c01043c2>] sysenter_past_esp+0x6b/0xa9
[73760.326185] =======================
[73760.326242] Code: 83 c4 18 5b 5e 5f 5d c3 be ed ff ff ff eb e8 b8 d0 00 00 00 be f4 ff ff ff e8 c5 c8 f9 ff 85 c0 89 43 0c 0f 85 6d ff ff ff eb cc <0f> 0b eb fe 8d 74 26 00 83 ec 14 89 74 24 08 8b 74 24 18 89 7c
[73760.329262] EIP: [<c01d6ba8>] sysfs_read_file+0xd8/0xe0 SS:ESP 0068:dc4fdf54
[73760.329436] ---[ end trace 8a9fac7c77d2cf54 ]---

Hi Jorno,

Assuming the fix was in 2.6.24.5, the upcoming Intrepid Ibex 8.10 kernel should already have this fix incorporated. I'll go ahead and open the Hardy nomination so it can be backported. I'm also including the git commit ID, which I believe is the patch they are referencing from 2.6.24.5. Thanks.

ogasawara@yoji:~/linux-2.6$ git log -p bd2ab67030e9116f1e4aae1289220255412b37fd
commit bd2ab67030e9116f1e4aae1289220255412b37fd
Author: Dan Williams <email address hidden>
Date: Thu Apr 10 21:29:27 2008 -0700

    md: close a livelock window in handle_parity_checks5

    If a failure is detected after a parity check operation has been initiated,
    but before it completes handle_parity_checks5 will never quiesce operations on
    the stripe.

    Explicitly handle this case by "canceling" the parity check, i.e. clear the
    STRIPE_OP_CHECK flags and queue the stripe on the handle list again to refresh
    any non-uptodate blocks.

    Kernel versions >= 2.6.23 are susceptible.

    Cc: <email address hidden>
    Cc: NeilBrown <email address hidden>
    Signed-off-by: Dan Williams <email address hidden>
    Signed-off-by: Andrew Morton <email address hidden>
    Signed-off-by: Linus Torvalds <email address hidden>

Changed in linux:
status: New → Fix Released
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
status: New → Triaged
Tim Gardner (timg-tpi) wrote :
Changed in linux:
assignee: ubuntu-kernel-team → timg-tpi
milestone: none → ubuntu-8.04.2
status: Triaged → Fix Committed
Tim Gardner (timg-tpi) wrote :
Martin Pitt (pitti) wrote :

Accepted into -proposed, please test and give feedback here. Please see https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Martin Pitt (pitti) wrote :

linux 2.6.24-21 copied to hardy-updates.

Changed in linux:
status: Fix Committed → Fix Released
A1an (alan-b) wrote :

The bug still seems to be there. It is crashing the LSI RAID monitoring utility MegaCLI; I don't know whether other functions are affected.

Distributor ID: Ubuntu
Description: Ubuntu 8.04.4 LTS
Release: 8.04
Codename: hardy

Linux hostname 2.6.24-27-server #1 SMP Fri Mar 12 01:23:09 UTC 2010 x86_64 GNU/Linux

[446660.122162] ------------[ cut here ]------------
[446660.122167] kernel BUG at /build/buildd/linux-2.6.24/fs/sysfs/file.c:126!
[446660.122169] invalid opcode: 0000 [7] SMP
[446660.122171] CPU 5
[446660.122172] Modules linked in: ipv6 af_packet bonding iptable_filter ip_tables x_tables xfs ac mptctl ipmi_devintf ipmi_si ipmi_msghandler parport_pc lp parport loop serio_raw psmouse dcdbas evdev pcspkr i5000_edac button iTCO_wdt iTCO_vendor_support edac_core shpchp pci_hotplug ext3 jbd mbcache sr_mod cdrom pata_acpi ata_generic usbhid hid sd_mod sg ehci_hcd ata_piix uhci_hcd libata mptsas mptscsih bnx2 usbcore mptbase megaraid_sas scsi_transport_sas scsi_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
[446660.122203] Pid: 28625, comm: MegaCli64 Tainted: G D 2.6.24-27-server #1
[446660.122205] RIP: 0010:[<ffffffff80305d62>] [<ffffffff80305d62>] sysfs_read_file+0x142/0x150
[446660.122212] RSP: 0018:ffff810101583ec8 EFLAGS: 00010212
[446660.122214] RAX: 0000000000000001 RBX: ffff8106c169a420 RCX: 0000000000000000
[446660.122215] RDX: 0000000000000000 RSI: ffff810101081000 RDI: ffff81085a194f00
[446660.122217] RBP: 0000000000001000 R08: ffff810505f55000 R09: 00000000000000ae
[446660.122219] R10: 0000000000000000 R11: ffffffff881129b0 R12: ffff81085a1944b0
[446660.122220] R13: ffff8106c169a440 R14: ffff81085a112bb0 R15: ffffffff805b6a30
[446660.122222] FS: 00000000007c7820(0063) GS:ffff81085d13c680(0000) knlGS:0000000000000000
[446660.122224] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[446660.122226] CR2: 00000000007d6018 CR3: 000000079a8b9000 CR4: 00000000000006e0
[446660.122227] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[446660.122229] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[446660.122231] Process MegaCli64 (pid: 28625, threadinfo ffff810101582000, task ffff81085d9457d0)
[446660.122232] Stack: ffff810101583f50 0000000000001000 00000000007d5030 ffff810360cd5cc0
[446660.122236] 0000000000001000 ffff810101583f50 00000000007d5030 00000000007d5030
[446660.122239] 00000000007d2492 ffffffff802b5c6d fffffffffffffff2 ffff810360cd5cc0
[446660.122242] Call Trace:
[446660.122248] [<ffffffff802b5c6d>] vfs_read+0xed/0x190
[446660.122251] [<ffffffff802b6153>] sys_read+0x53/0x90
[446660.122256] [<ffffffff8020c38e>] system_call+0x7e/0x83
[446660.122260]
[446660.122261]
[446660.122261] Code: 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec 28 48 85
[446660.122268] RIP [<ffffffff80305d62>] sysfs_read_file+0x142/0x150
[446660.122271] RSP <ffff810101583ec8>
[446660.122275] ---[ end trace 31493c06905cdd94 ]---

Regards
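
To check whether processes other than MegaCLI are hitting the same BUG, the dmesg output can be scanned for each "kernel BUG at" line together with the "Pid: ..., comm: ..." line that follows it. The following is only a rough sketch: the function name summarize_oopses and the regular expressions are made up for illustration and are matched against the two trace formats pasted in this report, so other kernels may need adjustments.

```python
import re

# "kernel BUG at <file>:<line>!" marks the start of a report.
BUG_RE = re.compile(r"kernel BUG at (?P<path>\S+):(?P<line>\d+)!")
# "Pid: N, comm: NAME" names the process that triggered it.
PID_RE = re.compile(r"Pid: (?P<pid>\d+), comm: (?P<comm>\S+)")


def summarize_oopses(dmesg_text):
    """Return one dict per 'kernel BUG at' report found in dmesg_text."""
    reports = []
    current = None
    for raw in dmesg_text.splitlines():
        bug = BUG_RE.search(raw)
        if bug:
            current = {
                "path": bug.group("path"),
                "line": int(bug.group("line")),
                "pid": None,
                "comm": None,
            }
            reports.append(current)
            continue
        proc = PID_RE.search(raw)
        # Attach only the first Pid line seen after each BUG line.
        if proc and current is not None and current["comm"] is None:
            current["pid"] = int(proc.group("pid"))
            current["comm"] = proc.group("comm")
    return reports


# Two lines copied from the trace in this comment.
sample = """\
[446660.122167] kernel BUG at /build/buildd/linux-2.6.24/fs/sysfs/file.c:126!
[446660.122203] Pid: 28625, comm: MegaCli64 Tainted: G D 2.6.24-27-server #1
"""

if __name__ == "__main__":
    for report in summarize_oopses(sample):
        print(report["comm"], report["pid"], report["path"], report["line"])
```

Feeding it the full dmesg text instead of the two sample lines would list every process name that tripped the check, which is a quick way to see whether something besides MegaCli64 is affected.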

Stefan Bader (smb) wrote :

This did not get much attention because of other things, sorry about that. Basically, the crash results from a sysfs entry that tries to make more than 4095 bytes of data available. Unfortunately, the handling of this is not very helpful, as the crash does not tell which call this was. Upstream has made some changes that would help to debug this. If this is still of interest, I have prepared some debug kernels and placed them at http://people.canonical.com/~smb/lp244377/
With the information those provide, we would have a chance to find the real issue. But we should then probably open a new bug report. Unfortunately, this is a symptom that can be triggered by various drivers.
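
The difference between the current check and a more debuggable one can be illustrated with a small user-space model. This is only a sketch in Python, not the kernel code: the 4095-byte limit is taken from the comment above, the 4096-byte page size is an assumption, and the names read_strict, read_diagnostic, small_attr and oversize_attr are hypothetical. The "diagnostic" variant shows one way a more helpful check could behave; the actual upstream change and the debug kernels may differ.

```python
PAGE_SIZE = 4096                # assumed page size for this model
MAX_SHOW_BYTES = PAGE_SIZE - 1  # limit as described in the comment above


class SysfsBug(Exception):
    """Stands in for the kernel BUG: fatal, and it names no callback."""


def read_strict(show):
    """Model of the behaviour in this report: oversize output is fatal."""
    data = show()
    if len(data) > MAX_SHOW_BYTES:
        raise SysfsBug("kernel BUG at fs/sysfs/file.c:126!")
    return data


def read_diagnostic(show, log):
    """Model of a debug-friendly check: name the callback, then truncate."""
    data = show()
    if len(data) > MAX_SHOW_BYTES:
        log.append(f"{show.__name__} returned bad count {len(data)}")
        data = data[:MAX_SHOW_BYTES]
    return data


def small_attr():
    """Hypothetical well-behaved attribute callback."""
    return b"ok\n"


def oversize_attr():
    """Hypothetical driver callback that produces too much data."""
    return b"x" * 5000


if __name__ == "__main__":
    log = []
    read_diagnostic(oversize_attr, log)
    print(log)
```

In the strict model the only information left behind is the file and line of the check itself, which is the same for every driver. In the diagnostic model the log entry names the offending callback, which is the piece of information needed to track down the driver responsible.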
