Ubuntu
linux package

Activity log for bug #1168350

Date	Who	What changed	Old value	New value	Message
2013-04-12 10:24:01	Munehisa Kamata	bug			added bug
2013-04-12 10:24:01	Munehisa Kamata	attachment added		apport.linux-image-3.2.0-40-virtual.9wl95P.apport https://bugs.launchpad.net/bugs/1168350/+attachment/3642038/+files/apport.linux-image-3.2.0-40-virtual.9wl95P.apport
2013-04-12 10:30:24	Brad Figg	linux (Ubuntu): status	New	Incomplete
2013-04-12 11:46:50	Munehisa Kamata	linux (Ubuntu): status	Incomplete	Confirmed
2013-04-12 16:52:11	Joseph Salisbury	linux (Ubuntu): importance	Undecided	Medium
2013-04-12 16:58:09	Joseph Salisbury	tags	precise	bot-stop-nagging kernel-da-key precise
2013-04-12 18:24:38	Joseph Salisbury	nominated for series		Ubuntu Precise
2013-04-12 18:24:38	Joseph Salisbury	bug task added		linux (Ubuntu Precise)
2013-04-12 18:24:45	Joseph Salisbury	linux (Ubuntu Precise): status	New	Confirmed
2013-04-12 18:24:48	Joseph Salisbury	linux (Ubuntu Precise): importance	Undecided	Medium
2013-04-18 14:11:05	Ben Howard	bug			added subscriber Antonio Rosales
2013-04-18 14:11:24	Ben Howard	bug			added subscriber Ben Howard
2013-04-23 08:29:17	Stefan Bader	linux (Ubuntu): status	Confirmed	Fix Released
2013-04-23 08:29:23	Stefan Bader	linux (Ubuntu Precise): assignee		Stefan Bader (stefan-bader-canonical)
2013-04-23 08:29:34	Stefan Bader	linux (Ubuntu Precise): status	Confirmed	In Progress
2013-05-01 19:56:20	Cristian Gafton	bug			added subscriber Cristian Gafton
2013-05-08 14:03:24	Stefan Bader	description	The arch_trigger_all_cpu_backtrace() tries to send NMI to all CPUs via IPI for getting stacktraces from them. But NMI vector is not implemented on virtualized environment(Xen PV) and the function results in Oops. [4746854.099062] INFO: rcu_sched detected stall on CPU 3 (t=15001 jiffies) [4746854.099091] BUG: unable to handle kernel paging request at ffffffffff5fb310 [4746854.099100] IP: [<ffffffff81037cf8>] flat_send_IPI_all+0x98/0xd0 [4746854.099116] PGD 1c07067 PUD 1c08067 PMD 1dd4067 PTE 0 [4746854.099126] Oops: 0002 [#1] SMP [4746854.099134] CPU 3 [4746854.099137] Modules linked in: stallmod(O+) isofs acpiphp [4746854.099150] [4746854.099157] Pid: 4752, comm: insmod Tainted: G O 3.2.0-40-virtual #64-Ubuntu [4746854.099174] RIP: e030:[<ffffffff81037cf8>] [<ffffffff81037cf8>] flat_send_IPI_all+0x98/0xd0 [4746854.099189] RSP: e02b:ffff8803bfd83c68 EFLAGS: 00010046 [4746854.099198] RAX: 0000000000000000 RBX: ffffffff81cd0060 RCX: 000000000003ffff [4746854.099208] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002 [4746854.099219] RBP: ffff8803bfd83c88 R08: 000000000003ffff R09: 0000000000000000 [4746854.099229] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000800 [4746854.099240] R13: 000000000f000000 R14: ffff8803bfd8e700 R15: 0000000000000000 [4746854.099256] FS: 00007f456d441700(0000) GS:ffff8803bfd80000(0000) knlGS:0000000000000000 [4746854.099270] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [4746854.099279] CR2: ffffffffff5fb310 CR3: 00000003a4180000 CR4: 0000000000002660 [4746854.099290] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [4746854.099301] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [4746854.099312] Process insmod (pid: 4752, threadinfo ffff8803a48d4000, task ffff8803a6b5c4a0) [4746854.099323] Stack: [4746854.099328] 0000000000000000 0000000000002710 ffffffff81c31000 ffffffff81c31100 [4746854.099346] ffff8803bfd83ca8 
ffffffff8103333a ffff8803a6e17b00 ffffffff81c31000 [4746854.099363] ffff8803bfd83cc8 ffffffff810df347 ffff8803bfd8e250 ffff8803bfd8eb80 [4746854.099382] Call Trace: [4746854.099387] <IRQ> [4746854.099401] [<ffffffff8103333a>] arch_trigger_all_cpu_backtrace+0x5a/0x90 [4746854.099416] [<ffffffff810df347>] check_cpu_stall.isra.35+0x97/0xf0 [4746854.099429] [<ffffffff810df3d8>] __rcu_pending+0x38/0x1d0 [4746854.099439] [<ffffffff810df869>] rcu_check_callbacks+0x79/0x1e0 [4746854.099453] [<ffffffff81078098>] update_process_times+0x48/0x90 [4746854.099466] [<ffffffff8109b864>] tick_sched_timer+0x64/0xc0 [4746854.099480] [<ffffffff8108dfe8>] __run_hrtimer+0x78/0x1f0 [4746854.099491] [<ffffffff8109b800>] ? tick_nohz_handler+0x100/0x100 [4746854.099506] [<ffffffff8105e748>] ? load_balance+0x78/0x370 [4746854.099520] [<ffffffff8108e917>] hrtimer_interrupt+0xf7/0x230 [4746854.099535] [<ffffffff8100a817>] xen_timer_interrupt+0x27/0x40 [4746854.099547] [<ffffffff810d7bb5>] handle_irq_event_percpu+0x55/0x210 [4746854.099561] [<ffffffff813a6f7e>] ? info_for_irq+0xe/0x30 [4746854.099572] [<ffffffff810dae67>] handle_percpu_irq+0x47/0x60 [4746854.099583] [<ffffffff813a6de9>] __xen_evtchn_do_upcall+0x199/0x250 [4746854.099596] [<ffffffff813a8ecf>] xen_evtchn_do_upcall+0x2f/0x50 [4746854.099610] [<ffffffff81661b7e>] xen_do_hypervisor_callback+0x1e/0x30 [4746854.099619] <EOI> [4746854.099632] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000 [4746854.099645] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000 [4746854.099659] [<ffffffff813a757e>] ? xen_poll_irq_timeout+0x3e/0x50 [4746854.099671] [<ffffffff813a9060>] ? xen_poll_irq+0x10/0x20 [4746854.099683] [<ffffffff8163c200>] ? xen_spin_lock_slow+0x97/0xf2 [4746854.099695] [<ffffffffa000c000>] ? 0xffffffffa000bfff [4746854.099709] [<ffffffff810121da>] ? xen_spin_lock+0x4a/0x50 [4746854.099722] [<ffffffff816572ce>] ? _raw_spin_lock+0xe/0x20 [4746854.099734] [<ffffffffa000702b>] ? 
stall+0x2b/0x44 [stallmod] [4746854.099746] [<ffffffffa000c009>] ? init_module+0x9/0x1000 [stallmod] [4746854.099758] [<ffffffff81002040>] ? do_one_initcall+0x40/0x180 [4746854.099771] [<ffffffff810a7abe>] ? sys_init_module+0xbe/0x230 [4746854.099783] [<ffffffff8165f8c2>] ? system_call_fastpath+0x16/0x1b In this case, the function is invoked by the RCU-based stall detector when it detects a stalled CPU (i.e. a lockup) in an interrupt context. An Oops in an interrupt context always causes a kernel panic, so this bug sometimes makes debugging a kernel lockup issue difficult. The function is also invoked from sysrq_handle_showallcpus(), which is for getting traces from all active CPUs anytime we want. # echo l > /proc/sysrq-trigger This is the easiest way to reproduce this. [How to fix] As far as I can see, one possible solution is to backport the following patch. This patch is already included in Quantal's kernel. http://lists.xen.org/archives/html/xen-devel/2012-04/msg01023.html Another solution is to disable arch_trigger_all_cpu_backtrace() at compile time, but I'm still investigating which config option controls that. If you need any other information, please feel free to ask me.	SRU Justification: Impact: The arch_trigger_all_cpu_backtrace tries to notify all other cpus via ipi. For that it looks up an ipi hook from the apic structure without verifying whether that pointer is NULL or not. Fix: Upstream fixed this by implementing the apic IPI hooks interface. Although some pieces seem to be unclear, this is not changed in upstream kernels since then. So either it does not matter or those pieces are not used. So for now backport the patch introducing the apic interface from upstream (only dropping one unnecessary declaration). This only affects PVM as HVM emulates flat apic completely. Testcase: Cause a call to arch_trigger_all_cpu_backtrace (Munehisa, can you provide a simple trigger?). 
--- The arch_trigger_all_cpu_backtrace() tries to send NMI to all CPUs via IPI for getting stacktraces from them. But NMI vector is not implemented on virtualized environment(Xen PV) and the function results in Oops. [4746854.099062] INFO: rcu_sched detected stall on CPU 3 (t=15001 jiffies) [4746854.099091] BUG: unable to handle kernel paging request at ffffffffff5fb310 [4746854.099100] IP: [<ffffffff81037cf8>] flat_send_IPI_all+0x98/0xd0 [4746854.099116] PGD 1c07067 PUD 1c08067 PMD 1dd4067 PTE 0 [4746854.099126] Oops: 0002 [#1] SMP [4746854.099134] CPU 3 [4746854.099137] Modules linked in: stallmod(O+) isofs acpiphp [4746854.099150] [4746854.099157] Pid: 4752, comm: insmod Tainted: G O 3.2.0-40-virtual #64-Ubuntu [4746854.099174] RIP: e030:[<ffffffff81037cf8>] [<ffffffff81037cf8>] flat_send_IPI_all+0x98/0xd0 [4746854.099189] RSP: e02b:ffff8803bfd83c68 EFLAGS: 00010046 [4746854.099198] RAX: 0000000000000000 RBX: ffffffff81cd0060 RCX: 000000000003ffff [4746854.099208] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002 [4746854.099219] RBP: ffff8803bfd83c88 R08: 000000000003ffff R09: 0000000000000000 [4746854.099229] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000800 [4746854.099240] R13: 000000000f000000 R14: ffff8803bfd8e700 R15: 0000000000000000 [4746854.099256] FS: 00007f456d441700(0000) GS:ffff8803bfd80000(0000) knlGS:0000000000000000 [4746854.099270] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [4746854.099279] CR2: ffffffffff5fb310 CR3: 00000003a4180000 CR4: 0000000000002660 [4746854.099290] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [4746854.099301] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [4746854.099312] Process insmod (pid: 4752, threadinfo ffff8803a48d4000, task ffff8803a6b5c4a0) [4746854.099323] Stack: [4746854.099328] 0000000000000000 0000000000002710 ffffffff81c31000 ffffffff81c31100 [4746854.099346] ffff8803bfd83ca8 ffffffff8103333a ffff8803a6e17b00 ffffffff81c31000 
[4746854.099363] ffff8803bfd83cc8 ffffffff810df347 ffff8803bfd8e250 ffff8803bfd8eb80 [4746854.099382] Call Trace: [4746854.099387] <IRQ> [4746854.099401] [<ffffffff8103333a>] arch_trigger_all_cpu_backtrace+0x5a/0x90 [4746854.099416] [<ffffffff810df347>] check_cpu_stall.isra.35+0x97/0xf0 [4746854.099429] [<ffffffff810df3d8>] __rcu_pending+0x38/0x1d0 [4746854.099439] [<ffffffff810df869>] rcu_check_callbacks+0x79/0x1e0 [4746854.099453] [<ffffffff81078098>] update_process_times+0x48/0x90 [4746854.099466] [<ffffffff8109b864>] tick_sched_timer+0x64/0xc0 [4746854.099480] [<ffffffff8108dfe8>] __run_hrtimer+0x78/0x1f0 [4746854.099491] [<ffffffff8109b800>] ? tick_nohz_handler+0x100/0x100 [4746854.099506] [<ffffffff8105e748>] ? load_balance+0x78/0x370 [4746854.099520] [<ffffffff8108e917>] hrtimer_interrupt+0xf7/0x230 [4746854.099535] [<ffffffff8100a817>] xen_timer_interrupt+0x27/0x40 [4746854.099547] [<ffffffff810d7bb5>] handle_irq_event_percpu+0x55/0x210 [4746854.099561] [<ffffffff813a6f7e>] ? info_for_irq+0xe/0x30 [4746854.099572] [<ffffffff810dae67>] handle_percpu_irq+0x47/0x60 [4746854.099583] [<ffffffff813a6de9>] __xen_evtchn_do_upcall+0x199/0x250 [4746854.099596] [<ffffffff813a8ecf>] xen_evtchn_do_upcall+0x2f/0x50 [4746854.099610] [<ffffffff81661b7e>] xen_do_hypervisor_callback+0x1e/0x30 [4746854.099619] <EOI> [4746854.099632] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000 [4746854.099645] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000 [4746854.099659] [<ffffffff813a757e>] ? xen_poll_irq_timeout+0x3e/0x50 [4746854.099671] [<ffffffff813a9060>] ? xen_poll_irq+0x10/0x20 [4746854.099683] [<ffffffff8163c200>] ? xen_spin_lock_slow+0x97/0xf2 [4746854.099695] [<ffffffffa000c000>] ? 0xffffffffa000bfff [4746854.099709] [<ffffffff810121da>] ? xen_spin_lock+0x4a/0x50 [4746854.099722] [<ffffffff816572ce>] ? _raw_spin_lock+0xe/0x20 [4746854.099734] [<ffffffffa000702b>] ? stall+0x2b/0x44 [stallmod] [4746854.099746] [<ffffffffa000c009>] ? 
init_module+0x9/0x1000 [stallmod] [4746854.099758] [<ffffffff81002040>] ? do_one_initcall+0x40/0x180 [4746854.099771] [<ffffffff810a7abe>] ? sys_init_module+0xbe/0x230 [4746854.099783] [<ffffffff8165f8c2>] ? system_call_fastpath+0x16/0x1b In this case, the function is invoked by the RCU-based stall detector when it detects a stalled CPU (i.e. a lockup) in an interrupt context. An Oops in an interrupt context always causes a kernel panic, so this bug sometimes makes debugging a kernel lockup issue difficult. The function is also invoked from sysrq_handle_showallcpus(), which is for getting traces from all active CPUs anytime we want. # echo l > /proc/sysrq-trigger This is the easiest way to reproduce this. [How to fix] As far as I can see, one possible solution is to backport the following patch. This patch is already included in Quantal's kernel. http://lists.xen.org/archives/html/xen-devel/2012-04/msg01023.html Another solution is to disable arch_trigger_all_cpu_backtrace() at compile time, but I'm still investigating which config option controls that. If you need any other information, please feel free to ask me.
2013-05-08 16:31:59	Munehisa Kamata	description	SRU Justification: Impact: The arch_trigger_all_cpu_backtrace tries to notify all other cpus via ipi. For that it looks up an ipi hook from the apic structure without verifying whether that pointer is NULL or not. Fix: Upstream fixed this by implementing the apic IPI hooks interface. Although some pieces seem to be unclear, this is not changed in upstream kernels since then. So either it does not matter or those pieces are not used. So for now backport the patch introducing the apic interface from upstream (only dropping one unnecessary declaration). This only affects PVM as HVM emulates flat apic completely. Testcase: Cause a call to arch_trigger_all_cpu_backtrace (Munehisa, can you provide a simple trigger?). --- The arch_trigger_all_cpu_backtrace() tries to send NMI to all CPUs via IPI for getting stacktraces from them. But NMI vector is not implemented on virtualized environment(Xen PV) and the function results in Oops. [4746854.099062] INFO: rcu_sched detected stall on CPU 3 (t=15001 jiffies) [4746854.099091] BUG: unable to handle kernel paging request at ffffffffff5fb310 [4746854.099100] IP: [<ffffffff81037cf8>] flat_send_IPI_all+0x98/0xd0 [4746854.099116] PGD 1c07067 PUD 1c08067 PMD 1dd4067 PTE 0 [4746854.099126] Oops: 0002 [#1] SMP [4746854.099134] CPU 3 [4746854.099137] Modules linked in: stallmod(O+) isofs acpiphp [4746854.099150] [4746854.099157] Pid: 4752, comm: insmod Tainted: G O 3.2.0-40-virtual #64-Ubuntu [4746854.099174] RIP: e030:[<ffffffff81037cf8>] [<ffffffff81037cf8>] flat_send_IPI_all+0x98/0xd0 [4746854.099189] RSP: e02b:ffff8803bfd83c68 EFLAGS: 00010046 [4746854.099198] RAX: 0000000000000000 RBX: ffffffff81cd0060 RCX: 000000000003ffff [4746854.099208] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002 [4746854.099219] RBP: ffff8803bfd83c88 R08: 000000000003ffff R09: 0000000000000000 [4746854.099229] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000800 [4746854.099240] 
R13: 000000000f000000 R14: ffff8803bfd8e700 R15: 0000000000000000 [4746854.099256] FS: 00007f456d441700(0000) GS:ffff8803bfd80000(0000) knlGS:0000000000000000 [4746854.099270] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [4746854.099279] CR2: ffffffffff5fb310 CR3: 00000003a4180000 CR4: 0000000000002660 [4746854.099290] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [4746854.099301] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [4746854.099312] Process insmod (pid: 4752, threadinfo ffff8803a48d4000, task ffff8803a6b5c4a0) [4746854.099323] Stack: [4746854.099328] 0000000000000000 0000000000002710 ffffffff81c31000 ffffffff81c31100 [4746854.099346] ffff8803bfd83ca8 ffffffff8103333a ffff8803a6e17b00 ffffffff81c31000 [4746854.099363] ffff8803bfd83cc8 ffffffff810df347 ffff8803bfd8e250 ffff8803bfd8eb80 [4746854.099382] Call Trace: [4746854.099387] <IRQ> [4746854.099401] [<ffffffff8103333a>] arch_trigger_all_cpu_backtrace+0x5a/0x90 [4746854.099416] [<ffffffff810df347>] check_cpu_stall.isra.35+0x97/0xf0 [4746854.099429] [<ffffffff810df3d8>] __rcu_pending+0x38/0x1d0 [4746854.099439] [<ffffffff810df869>] rcu_check_callbacks+0x79/0x1e0 [4746854.099453] [<ffffffff81078098>] update_process_times+0x48/0x90 [4746854.099466] [<ffffffff8109b864>] tick_sched_timer+0x64/0xc0 [4746854.099480] [<ffffffff8108dfe8>] __run_hrtimer+0x78/0x1f0 [4746854.099491] [<ffffffff8109b800>] ? tick_nohz_handler+0x100/0x100 [4746854.099506] [<ffffffff8105e748>] ? load_balance+0x78/0x370 [4746854.099520] [<ffffffff8108e917>] hrtimer_interrupt+0xf7/0x230 [4746854.099535] [<ffffffff8100a817>] xen_timer_interrupt+0x27/0x40 [4746854.099547] [<ffffffff810d7bb5>] handle_irq_event_percpu+0x55/0x210 [4746854.099561] [<ffffffff813a6f7e>] ? 
info_for_irq+0xe/0x30 [4746854.099572] [<ffffffff810dae67>] handle_percpu_irq+0x47/0x60 [4746854.099583] [<ffffffff813a6de9>] __xen_evtchn_do_upcall+0x199/0x250 [4746854.099596] [<ffffffff813a8ecf>] xen_evtchn_do_upcall+0x2f/0x50 [4746854.099610] [<ffffffff81661b7e>] xen_do_hypervisor_callback+0x1e/0x30 [4746854.099619] <EOI> [4746854.099632] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000 [4746854.099645] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000 [4746854.099659] [<ffffffff813a757e>] ? xen_poll_irq_timeout+0x3e/0x50 [4746854.099671] [<ffffffff813a9060>] ? xen_poll_irq+0x10/0x20 [4746854.099683] [<ffffffff8163c200>] ? xen_spin_lock_slow+0x97/0xf2 [4746854.099695] [<ffffffffa000c000>] ? 0xffffffffa000bfff [4746854.099709] [<ffffffff810121da>] ? xen_spin_lock+0x4a/0x50 [4746854.099722] [<ffffffff816572ce>] ? _raw_spin_lock+0xe/0x20 [4746854.099734] [<ffffffffa000702b>] ? stall+0x2b/0x44 [stallmod] [4746854.099746] [<ffffffffa000c009>] ? init_module+0x9/0x1000 [stallmod] [4746854.099758] [<ffffffff81002040>] ? do_one_initcall+0x40/0x180 [4746854.099771] [<ffffffff810a7abe>] ? sys_init_module+0xbe/0x230 [4746854.099783] [<ffffffff8165f8c2>] ? system_call_fastpath+0x16/0x1b In this case, the function is invoked by the RCU-based stall detector when it detects a stalled CPU (i.e. a lockup) in an interrupt context. An Oops in an interrupt context always causes a kernel panic, so this bug sometimes makes debugging a kernel lockup issue difficult. The function is also invoked from sysrq_handle_showallcpus(), which is for getting traces from all active CPUs anytime we want. # echo l > /proc/sysrq-trigger This is the easiest way to reproduce this. [How to fix] As far as I can see, one possible solution is to backport the following patch. This patch is already included in Quantal's kernel. 
http://lists.xen.org/archives/html/xen-devel/2012-04/msg01023.html Another solution is to disable arch_trigger_all_cpu_backtrace() at compile time, but I'm still investigating which config option controls that. If you need any other information, please feel free to ask me.	SRU Justification: Impact: The arch_trigger_all_cpu_backtrace tries to notify all other cpus via ipi. For that it looks up an ipi hook from the apic structure without verifying whether that pointer is NULL or not. Fix: Upstream fixed this by implementing the apic IPI hooks interface. Although some pieces seem to be unclear, this is not changed in upstream kernels since then. So either it does not matter or those pieces are not used. So for now backport the patch introducing the apic interface from upstream (only dropping one unnecessary declaration). This only affects PVM as HVM emulates flat apic completely. Testcase: Cause a call to arch_trigger_all_cpu_backtrace by running: # echo l > /proc/sysrq-trigger --- The arch_trigger_all_cpu_backtrace() tries to send NMI to all CPUs via IPI for getting stacktraces from them. But NMI vector is not implemented on virtualized environment(Xen PV) and the function results in Oops. 
[4746854.099062] INFO: rcu_sched detected stall on CPU 3 (t=15001 jiffies) [4746854.099091] BUG: unable to handle kernel paging request at ffffffffff5fb310 [4746854.099100] IP: [<ffffffff81037cf8>] flat_send_IPI_all+0x98/0xd0 [4746854.099116] PGD 1c07067 PUD 1c08067 PMD 1dd4067 PTE 0 [4746854.099126] Oops: 0002 [#1] SMP [4746854.099134] CPU 3 [4746854.099137] Modules linked in: stallmod(O+) isofs acpiphp [4746854.099150] [4746854.099157] Pid: 4752, comm: insmod Tainted: G O 3.2.0-40-virtual #64-Ubuntu [4746854.099174] RIP: e030:[<ffffffff81037cf8>] [<ffffffff81037cf8>] flat_send_IPI_all+0x98/0xd0 [4746854.099189] RSP: e02b:ffff8803bfd83c68 EFLAGS: 00010046 [4746854.099198] RAX: 0000000000000000 RBX: ffffffff81cd0060 RCX: 000000000003ffff [4746854.099208] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002 [4746854.099219] RBP: ffff8803bfd83c88 R08: 000000000003ffff R09: 0000000000000000 [4746854.099229] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000800 [4746854.099240] R13: 000000000f000000 R14: ffff8803bfd8e700 R15: 0000000000000000 [4746854.099256] FS: 00007f456d441700(0000) GS:ffff8803bfd80000(0000) knlGS:0000000000000000 [4746854.099270] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [4746854.099279] CR2: ffffffffff5fb310 CR3: 00000003a4180000 CR4: 0000000000002660 [4746854.099290] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [4746854.099301] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [4746854.099312] Process insmod (pid: 4752, threadinfo ffff8803a48d4000, task ffff8803a6b5c4a0) [4746854.099323] Stack: [4746854.099328] 0000000000000000 0000000000002710 ffffffff81c31000 ffffffff81c31100 [4746854.099346] ffff8803bfd83ca8 ffffffff8103333a ffff8803a6e17b00 ffffffff81c31000 [4746854.099363] ffff8803bfd83cc8 ffffffff810df347 ffff8803bfd8e250 ffff8803bfd8eb80 [4746854.099382] Call Trace: [4746854.099387] <IRQ> [4746854.099401] [<ffffffff8103333a>] arch_trigger_all_cpu_backtrace+0x5a/0x90 
[4746854.099416] [<ffffffff810df347>] check_cpu_stall.isra.35+0x97/0xf0 [4746854.099429] [<ffffffff810df3d8>] __rcu_pending+0x38/0x1d0 [4746854.099439] [<ffffffff810df869>] rcu_check_callbacks+0x79/0x1e0 [4746854.099453] [<ffffffff81078098>] update_process_times+0x48/0x90 [4746854.099466] [<ffffffff8109b864>] tick_sched_timer+0x64/0xc0 [4746854.099480] [<ffffffff8108dfe8>] __run_hrtimer+0x78/0x1f0 [4746854.099491] [<ffffffff8109b800>] ? tick_nohz_handler+0x100/0x100 [4746854.099506] [<ffffffff8105e748>] ? load_balance+0x78/0x370 [4746854.099520] [<ffffffff8108e917>] hrtimer_interrupt+0xf7/0x230 [4746854.099535] [<ffffffff8100a817>] xen_timer_interrupt+0x27/0x40 [4746854.099547] [<ffffffff810d7bb5>] handle_irq_event_percpu+0x55/0x210 [4746854.099561] [<ffffffff813a6f7e>] ? info_for_irq+0xe/0x30 [4746854.099572] [<ffffffff810dae67>] handle_percpu_irq+0x47/0x60 [4746854.099583] [<ffffffff813a6de9>] __xen_evtchn_do_upcall+0x199/0x250 [4746854.099596] [<ffffffff813a8ecf>] xen_evtchn_do_upcall+0x2f/0x50 [4746854.099610] [<ffffffff81661b7e>] xen_do_hypervisor_callback+0x1e/0x30 [4746854.099619] <EOI> [4746854.099632] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000 [4746854.099645] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000 [4746854.099659] [<ffffffff813a757e>] ? xen_poll_irq_timeout+0x3e/0x50 [4746854.099671] [<ffffffff813a9060>] ? xen_poll_irq+0x10/0x20 [4746854.099683] [<ffffffff8163c200>] ? xen_spin_lock_slow+0x97/0xf2 [4746854.099695] [<ffffffffa000c000>] ? 0xffffffffa000bfff [4746854.099709] [<ffffffff810121da>] ? xen_spin_lock+0x4a/0x50 [4746854.099722] [<ffffffff816572ce>] ? _raw_spin_lock+0xe/0x20 [4746854.099734] [<ffffffffa000702b>] ? stall+0x2b/0x44 [stallmod] [4746854.099746] [<ffffffffa000c009>] ? init_module+0x9/0x1000 [stallmod] [4746854.099758] [<ffffffff81002040>] ? do_one_initcall+0x40/0x180 [4746854.099771] [<ffffffff810a7abe>] ? sys_init_module+0xbe/0x230 [4746854.099783] [<ffffffff8165f8c2>] ? 
system_call_fastpath+0x16/0x1b In this case, the function is invoked by the RCU-based stall detector when it detects a stalled CPU (i.e. a lockup) in an interrupt context. An Oops in an interrupt context always causes a kernel panic, so this bug sometimes makes debugging a kernel lockup issue difficult. The function is also invoked from sysrq_handle_showallcpus(), which is for getting traces from all active CPUs anytime we want. # echo l > /proc/sysrq-trigger This is the easiest way to reproduce this. [How to fix] As far as I can see, one possible solution is to backport the following patch. This patch is already included in Quantal's kernel. http://lists.xen.org/archives/html/xen-devel/2012-04/msg01023.html Another solution is to disable arch_trigger_all_cpu_backtrace() at compile time, but I'm still investigating which config option controls that. If you need any other information, please feel free to ask me.
2013-05-09 14:34:26	Tim Gardner	linux (Ubuntu Precise): status	In Progress	Fix Committed
2013-06-04 15:22:42	Brad Figg	tags	bot-stop-nagging kernel-da-key precise	bot-stop-nagging kernel-da-key precise verification-needed-precise
2013-06-05 16:04:05	Steve Conklin	tags	bot-stop-nagging kernel-da-key precise verification-needed-precise	bot-stop-nagging kernel-da-key precise verification-done-precise
2013-06-13 18:22:10	Launchpad Janitor	linux (Ubuntu Precise): status	Fix Committed	Fix Released
2013-06-13 18:22:10	Launchpad Janitor	cve linked		2013-3076
2013-06-13 18:22:10	Launchpad Janitor	cve linked		2013-3222
2013-06-13 18:22:10	Launchpad Janitor	cve linked		2013-3223
2013-06-13 18:22:10	Launchpad Janitor	cve linked		2013-3224
2013-06-13 18:22:10	Launchpad Janitor	cve linked		2013-3225
2013-06-13 18:22:10	Launchpad Janitor	cve linked		2013-3234
2013-06-13 18:22:10	Launchpad Janitor	cve linked		2013-3235
