[hardy][xen] Oops in free_hot_cold_cache, probably drbd8-related
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Invalid | Undecided | Unassigned |
Bug Description
I run an Ubuntu Hardy i386 + Xen + drbd8 + ocfs2 cluster that frequently reports an Oops (every few hours, even when completely idle). OCFS2 is already disabled, so the problem seems to be linked to drbd8 being started. The drbd8 resource is not in use, though; it is not even mounted on any box.
[ 5545.027304] invalid opcode: 0000 [#1] SMP
[ 5545.027319] Modules linked in: drbd cn bridge sbs container battery sbshc video output ac dock iptable_filter ip_tables x_tables parpe
[ 5545.027410]
[ 5545.027414] Pid: 14947, comm: sshd Not tainted (2.6.24-17-xen #1)
[ 5545.027419] EIP: 0061:[<c194e0e9>] EFLAGS: 00210216 CPU: 0
[ 5545.027426] EIP is at 0xc194e0e9
[ 5545.027429] EAX: c18d3c00 EBX: c18d7800 ECX: 00000004 EDX: 00000000
[ 5545.027434] ESI: 00000002 EDI: 40040000 EBP: 00000000 ESP: daecbe94
[ 5545.027438] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 5545.027443] Process sshd (pid: 14947, ti=daeca000 task=db61c7d0 task.ti=daeca000)
[ 5545.027447] Stack: c01623a5 00000000 c03fd800 daecbef0 00000002 00000008 daecbed0 c0162456
[ 5545.027463] c196f960 c1553938 c03fd800 00000008 c0165997 00000008 00000000 00000008
[ 5545.027479] 00000000 c192ade0 c18f40e0 c1822400 c18eddc0 c1971780 c1932060 c18d3c00
[ 5545.027521] Call Trace:
[ 5545.027525] [<c01623a5>] free_hot_
[ 5545.027541] [<c0162456>] __pagevec_
[ 5545.027551] [<c0165997>] release_
[ 5545.027563] [<c017a5f4>] free_pages_
[ 5545.027573] [<c01737b7>] exit_mmap+
[ 5545.027584] [<c0124303>] mmput+0x23/0x80
[ 5545.027592] [<c0129d95>] do_exit+0x165/0x8b0
[ 5545.027602] [<c0185dff>] vfs_read+
[ 5545.027610] [<c019b9c3>] mntput_
[ 5545.027621] [<c012a50a>] do_group_
[ 5545.027631] [<c0105832>] syscall_
[ 5545.027641] [<c0320000>] vcc_create+
[ 5545.027651] =======
[ 5545.027654] Code: 20 00 00 00 00 40 01 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 01 10 00 00 02 20 00 00 20 00 40 0
[ 5545.027736] EIP: [<c194e0e9>] 0xc194e0e9 SS:ESP 0069:daecbe94
[ 5545.028899] ---[ end trace 880e03764f078517 ]---
[ 5545.028906] Fixing recursive fault but reboot is needed!
The process is different each time, but free_hot_cold_cache is always at the top.
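To check that observation across many reports, a saved kernel log can be scanned for the process name and the first call-trace frame of each Oops. The following is only a minimal sketch: the function name `summarize_oopses` and the regular expressions are illustrative assumptions based on the log format shown in this report, not part of any existing tool.

```python
import re

# Matches the "Pid: 14947, comm: sshd ..." line of an oops report.
PID_RE = re.compile(r"Pid: (\d+), comm: (\S+)")
# Matches a call-trace frame such as "[<c01623a5>] free_hot_".
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\S+)")


def summarize_oopses(log_text):
    """Return a list of (comm, first_frame_symbol), one per call trace."""
    results = []
    comm = None          # process name from the most recent "Pid:" line
    want_frame = False   # True right after a "Call Trace:" line
    for line in log_text.splitlines():
        pid_match = PID_RE.search(line)
        if pid_match:
            comm = pid_match.group(2)
            continue
        if "Call Trace:" in line:
            want_frame = True
            continue
        if want_frame:
            frame = FRAME_RE.search(line)
            if frame:
                # Drop the "+0x23/0x80" offset part if it is present.
                results.append((comm, frame.group(2).split("+")[0]))
            want_frame = False
    return results
```

Feeding it the text of a saved `dmesg` dump gives one pair per Oops, which makes it easy to see whether the top frame stays the same while the process changes.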
Here is another series of Oopses that I just got when running bonnie++ on a local partition (not even drbd8):
[ 91.876570] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008
[ 91.876587] printing eip: de12d234
[ 91.876594] 1c1c3000 -> *pde = 00000000:09ae4001
[ 91.876598] 15ee4000 -> *pme = 00000000:00000000
[ 91.876604] Oops: 0000 [#1] SMP
[ 91.876611] Modules linked in: drbd cn bridge sbs container battery sbshc video output ac dock iptable_filter ip_tables x_tables parpe
[ 91.876699]
[ 91.876704] Pid: 161, comm: kswapd0 Not tainted (2.6.24-17-xen #1)
[ 91.876708] EIP: 0061:[<de12d234>] EFLAGS: 00010202 CPU: 0
[ 91.876721] EIP is at __journal_remove_checkpoint+0x14/0xb0 [jbd]
[ 91.876725] EAX: 000001c0 EBX: 00000008 ECX: dc803640 EDX: 00924925
[ 91.876730] ESI: dc803640 EDI: dc803640 EBP: c192da94 ESP: db71dda8
[ 91.876734] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 91.876738] Process kswapd0 (pid: 161, ti=db71c000 task=db67ceb0 task.ti=db71c000)
[ 91.876743] Stack: c19749e0 c192da94 dc803640 de12ba89 dafaa8b8 c192c340 dafaa800 de2102d0
[ 91.876789] 000000d0 db71df7c db71df7c c015f70c c192c340 db71df0c c0167c15 00000000
[ 91.876804] 00000000 dc76c908 db71de7c db71de54 00000000 00f52873 00000000 00000015
[ 91.876819] Call Trace:
[ 91.876826] [<de12ba89>] journal_try_to_free_buffers+0xe9/0x140 [jbd]
[ 91.876842] [<de2102d0>] ext3_releasepage+0x0/0xa0 [ext3]
[ 91.876860] [<c015f70c>] try_to_release_page+0x2c/0x40
[ 91.876874] [<c0167c15>] shrink_page_list+0x4c5/0x600
[ 91.876888] [<c0166daf>] isolate_lru_pages+0x5f/0x1c0
[ 91.876899] [<c0167e6f>] shrink_inactive_list+0x11f/0x3b0
[ 91.876914] [<c016819c>] shrink_zone+0x9c/0x100
[ 91.876923] [<c016883c>] kswapd+0x44c/0x490
[ 91.876938] [<c013bb90>] autoremove_wake_function+0x0/0x40
[ 91.876949] [<c011e260>] complete+0x40/0x60
[ 91.876958] [<c01683f0>] kswapd+0x0/0x490
[ 91.876966] [<c013b8d2>] kthread+0x42/0x70
[ 91.876972] [<c013b890>] kthread+0x0/0x70
[ 91.876980] [<c0105bb7>] kernel_thread_helper+0x7/0x10
[ 91.876991] =======================
[ 91.876994] Code: 0b eb fe 8d 74 26 00 0f 0b eb fe 0f 0b eb fe 90 8d b4 26 00 00 00 00 56 89 c1 53 83 ec 04 8b 58 28 85 db 74 29 8b 4
[ 91.877080] EIP: [<de12d234>] __journal_remove_checkpoint+0x14/0xb0 [jbd] SS:ESP 0069:db71dda8
[ 91.877668] ---[ end trace 276bea9ce4a4d4b9 ]---
[ 94.370990] BUG: unable to handle kernel paging request at virtual address b3578bd4
[ 94.371007] printing eip: c020ff0d
[ 94.371015] 015c5000 -> *pde = 00000000:1d5c8001
[ 94.371021] 015c8000 -> *pme = 00000000:00000000
[ 94.371028] Oops: 0000 [#2] SMP
[ 94.371036] Modules linked in: drbd cn bridge sbs container battery sbshc video output ac dock iptable_filter ip_tables x_tables parpe
[ 94.371147]
[ 94.371152] Pid: 4686, comm: getty Tainted: G D (2.6.24-17-xen #1)
[ 94.371159] EIP: 0061:[<c020ff0d>] EFLAGS: 00010446 CPU: 1
[ 94.371171] EIP is at memmove+0x1d/0x40
[ 94.371189] EAX: db578bd5 EBX: db578bd5 ECX: d8000000 EDX: db578bd5
[ 94.371195] ESI: b3578bd4 EDI: b3578bd4 EBP: dbed6000 ESP: dbed7f4c
[ 94.371201] DS: 007b ES: ...