Comment 7 for bug 999755

Gavin Heavyside (mydrive) wrote : Re: Kernel crash on EC2 m1.large instances

I've reproduced this by running the `ohai` command from the Opscode Chef ohai gem (0.6.12) in a loop, although it took nearly 2 days before the crash triggered. I ran `gem install ohai; while true; do ohai; done` in a screen session.
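For anyone else trying to reproduce this, a variant of the loop that records how many iterations ran may make it easier to compare results. This is only a sketch: the `run_loop` function, its arguments, and the log file name are illustrative and were not part of the original run, which used the plain one-liner above.

```shell
#!/bin/sh
# Hypothetical helper, not what was originally run: repeats a command and
# appends a timestamped iteration count to a log file after each run.
#   $1 = command to run (e.g. ohai)
#   $2 = maximum iterations, 0 = loop forever like the original one-liner
#   $3 = log file path
run_loop() {
    cmd=$1
    max=$2
    log=$3
    i=0
    while [ "$max" -eq 0 ] || [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        # The command's own output is discarded; only the count is kept.
        "$cmd" >/dev/null 2>&1
        printf '%s iteration %d\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$i" >> "$log"
        # Flush to disk so the count is more likely to survive a kernel crash.
        sync
    done
}
```

To mirror the original loop, source this in a screen session and call `run_loop ohai 0 ohai-loop.log`. The last entry in the log then gives a rough idea of how many runs completed before the crash, although the final line or two may still be lost if the instance goes down mid-write.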

The stack trace is:

[18362917.357055] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[18362917.357079] IP: [<ffffffff8130d7f1>] rb_next+0x1/0x50
[18362917.357098] PGD 1d098d067 PUD 1d045b067 PMD 0
[18362917.357110] Oops: 0000 [#1] SMP
[18362917.357122] CPU 0
[18362917.357126] Modules linked in: ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables isofs acpiphp
[18362917.357152]
[18362917.357157] Pid: 21217, comm: ohai Not tainted 3.2.0-23-virtual #36-Ubuntu
[18362917.357166] RIP: e030:[<ffffffff8130d7f1>] [<ffffffff8130d7f1>] rb_next+0x1/0x50
[18362917.357176] RSP: e02b:ffff8801d22f1808 EFLAGS: 00010046
[18362917.357181] RAX: 0000000000000000 RBX: ffff8801d0842600 RCX: 0000000000000000
[18362917.357187] RDX: fffffffffffffff0 RSI: 0000000000000000 RDI: 0000000000000010
[18362917.357193] RBP: ffff8801d22f1838 R08: 0000000000000000 R09: 0000000000000000
[18362917.357199] R10: ffff8801dffa26c0 R11: 0000000000000001 R12: 0000000000000000
[18362917.357207] R13: 0000000000000000 R14: 0000000000000008 R15: ffff8801d0dc2d00
[18362917.357218] FS: 00007fcdb3810700(0000) GS:ffff8801dff73000(0000) knlGS:0000000000000000
[18362917.357225] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[18362917.357232] CR2: 0000000000000010 CR3: 00000001d2641000 CR4: 0000000000002660
[18362917.357240] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[18362917.357246] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[18362917.357253] Process ohai (pid: 21217, threadinfo ffff8801d22f0000, task ffff8801d0ad44a0)
[18362917.357261] Stack:
[18362917.357265] ffff8801d22f1838 ffffffff8104ece9 ffff8801d0842600 ffff8801dff866c0
[18362917.357277] ffff8801d0842e00 0000000000000008 ffff8801d22f1868 ffffffff810544b8
[18362917.357289] ffff8801d22f1868 ffff8801dff866c0 0000000000000000 ffff8801d0ad4848
[18362917.357301] Call Trace:
[18362917.357314] [<ffffffff8104ece9>] ? pick_next_entity+0xb9/0xe0
[18362917.357322] [<ffffffff810544b8>] pick_next_task_fair+0x38/0x70
[18362917.357331] [<ffffffff81652ddc>] __schedule+0x14c/0x6f0
[18362917.357341] [<ffffffff8100a25d>] ? xen_force_evtchn_callback+0xd/0x10
[18362917.357348] [<ffffffff8165344f>] schedule+0x3f/0x60
[18362917.357355] [<ffffffff8165456d>] schedule_hrtimeout_range_clock+0x14d/0x170
[18362917.357365] [<ffffffff8100aa1f>] ? xen_restore_fl_direct_reloc+0x4/0x4
[18362917.357373] [<ffffffff816554ee>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
[18362917.357383] [<ffffffff8108932d>] ? add_wait_queue+0x4d/0x60
[18362917.357389] [<ffffffff816545a3>] schedule_hrtimeout_range+0x13/0x20
[18362917.357400] [<ffffffff811877a9>] poll_schedule_timeout+0x49/0x70
[18362917.357408] [<ffffffff81188326>] do_select+0x4d6/0x600
[18362917.357414] [<ffffffff811878b0>] ? poll_freewait+0xe0/0xe0
[18362917.357422] [<ffffffff811879a0>] ? __pollwait+0xf0/0xf0
[18362917.357431] [<ffffffff81306cd6>] ? cpumask_next_and+0x36/0x50
[18362917.357438] [<ffffffff81052124>] ? select_idle_sibling+0x174/0x220
[18362917.357445] [<ffffffff8130bbdb>] ? radix_tree_lookup+0xb/0x10
[18362917.357453] [<ffffffff810d61c7>] ? irq_to_desc+0x17/0x20
[18362917.357461] [<ffffffff810d902e>] ? irq_get_irq_data+0xe/0x10
[18362917.357472] [<ffffffff813a404e>] ? info_for_irq+0xe/0x30
[18362917.357478] [<ffffffff81306cd6>] ? cpumask_next_and+0x36/0x50
[18362917.357487] [<ffffffff810592d1>] ? find_busiest_group+0x171/0xbb0
[18362917.357494] [<ffffffff81188611>] core_sys_select+0x1c1/0x330
[18362917.357501] [<ffffffff816554ee>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
[18362917.357511] [<ffffffff8163d029>] ? idle_balance+0xf0/0x11b
[18362917.357517] [<ffffffff8100a25d>] ? xen_force_evtchn_callback+0xd/0x10
[18362917.357524] [<ffffffff8100aa32>] ? check_events+0x12/0x20
[18362917.357531] [<ffffffff8100aa1f>] ? xen_restore_fl_direct_reloc+0x4/0x4
[18362917.357538] [<ffffffff81004c62>] ? xen_mc_flush+0xb2/0x1c0
[18362917.357545] [<ffffffff8100aa1f>] ? xen_restore_fl_direct_reloc+0x4/0x4
[18362917.357552] [<ffffffff811889bb>] sys_select+0xbb/0x100
[18362917.357559] [<ffffffff8105edc7>] ? schedule_tail+0x27/0xb0
[18362917.357568] [<ffffffff8165d8c2>] system_call_fastpath+0x16/0x1b
[18362917.357573] Code: 89 06 48 8b 47 08 48 89 46 08 48 8b 47 10 48 89 46 10 c3 0f 1f 80 00 00 00 00 48 89 32 eb b2 0f 1f 00 48 89 70 10 eb a9 66 90 55 <48> 8b 17 48 89 e5 48 89 d0 48 83 e0 fc 48 39 c7 74 34 48 8b 47
[18362917.357653] RIP [<ffffffff8130d7f1>] rb_next+0x1/0x50
[18362917.357660] RSP <ffff8801d22f1808>
[18362917.357664] CR2: 0000000000000010
[18362917.357673] ---[ end trace de16620c8d9e9c7c ]---