Steps to recreate the problem:
1. Install Ubuntu 15.04 as a PowerVM guest.
2. Install the perf tool.
3. Run the following scripts to test the 24x7 POWER8 hardware counter events with the perf tool.
=== Script 1
#!/bin/bash
# Read the same hv_24x7 event 100 times.
count=0
offset=0x128
PERF_ARGS="-r 10 -C 0"
while [ $count -lt 100 ]; do
    EVENT="hv_24x7/domain=0x2,offset=$offset,starting_index=10/"
    perf stat $PERF_ARGS -x ' ' perf stat $PERF_ARGS -x ' ' -e $EVENT ls
    count=$((count + 1))
done
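=== Script 2
#!/bin/bash
# Sweep the hv_24x7 counter offset from 0 up to 8192.
offset=0
PERF_ARGS="-r 10 -C 0"
while [ $offset -lt 8192 ]; do
    EVENT="hv_24x7/domain=0x2,offset=$offset,starting_index=10/"
    perf stat $PERF_ARGS -x ' ' perf stat $PERF_ARGS -x ' ' -e $EVENT ls
    # NOTE: the increment was lost from the original report; 8 is an assumed step.
    offset=$((offset + 8))
done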
After a few iterations I hit the following BUG.
tt2.sh tt.sh
tt2.sh tt.sh
tt2.sh tt.sh
275679187521558 hv_24x7/domain=0x2,offset=6848,starting_index=10/ 0.00%
tt2.sh tt.sh
[ 4657.314709] softirq: huh, entered softirq 7 SCHED c00000000010abc0 with preempt_count 00000100, exited with bfff0000?
[ 4657.314727] kernel BUG at /build/buildd/linux-3.16.0/kernel/irq_work.c:157!
[ 4657.314732] Oops: Exception in kernel mode, sig: 5 [#1]
[ 4657.314740] Modules linked in: rtc_generic pseries_rng
[ 4657.314749] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-25-generic #33-U
[ 4657.314755] task: c000000001375e00 ti: c0000000013d0000 task.ti: c0000000013d0000
[ 4657.314759] NIP: c0000000001e8ffc LR: c00000000001fe70 CTR: c000000000002800ic)
[ 4657.314770] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 28042024 XER: 0000000a
[ 4657.314782] CFAR: c00000000001fe6c SOFTE: 0
GPR04: 0000000000000010 00000000009c0000 c000000001424a98 0000000000000002
GPR12: 8000000000009033 c00000000e9a0000 0000000006a3fcd0 0000000000000060
GPR16: 0000000000200000 0000000000000000 c000000000e57c00 0000000000000000
GPR20: c000000001595dca c000000001595478 0000000000000001 000000000000ffff
GPR28: c000000000e40380 c000000000e40300 c0000000013d3590 c000000000e56f08
[ 4657.314832] NIP [c0000000001e8ffc] irq_work_run+0x1c/0x30
[ 4657.314841] Call Trace:
4000 (unreliable)
[ 4657.314861] [c0000000013d34f0] [c00000000001ff90] timer_interrupt+0xa0/0xe0
[ 4657.314871] [c0000000013d3520] [c000000000002914] decrementer_common+0x114/0x180
[ 4657.314884] --- Exception: 901 at arch_local_irq_restore+0x14/0x90
[ 4657.314896] [c0000000013d3810] [c00000000012ed08] vprintk_emit+0x3b8/0x660 (u
[ 4657.314908] [c0000000013d38e0] [c000000000a02650] printk+0x84/0x98
[ 4657.314918] [c0000000013d3910] [c0000000000b51b4] __do_softirq+0x1e4/0x410
[ 4657.314927] [c0000000013d3a00] [c0000000000b57b8] irq_exit+0xf8/0x1400
[ 4657.314948] [c0000000013d3a60] [c000000000002c14] doorbell_super_common+0x114/0x180
[ 4657.314963] --- Exception: a01 at plpar_hcall_norets+0x8c/0xdc
[ 4657.314963] LR = check_and_cede_processor+0x34/0x5020/0x50 (unreliable)
[ 4657.314997] [c0000000013d3df0] [c00000000084077c] cpuidle_enter_state+0x6c/0x140c0
[ 4657.315030] [c0000000013d3f00] [c000000000d63ea8] start_kernel+0x500/0x51c
[ 4657.315047] Instruction dump:
[ 4657.315052] eba1ffe8 7c0803a6 ebc1fff0 ebe1fff8 4e800020 3c4c011f 3842c110 78290464
[ 4657.315068] 81290014 752a000f 7d380026 55291ffe <0b090000> 4bfffec8 60000000 60000000
[ 4657.315090] ---[ end trace ee202cccd2211e5d ]---
[ 4657.320224]
[ 4657.362675] Unable to handle kernel paging request for data at address 0xc000000b35515048
[ 4657.362680] Faulting instruction address: 0xc00000000006a37c
[ 4657.362684] Oops: Kernel access of bad area, sig: 11 [#2]
[ 4657.362686] SMP NR_CPUS=2048 NUMA pSeries
[ 4657.362695] CPU: 12 PID: 7 Comm: rcu_sched Tainted: G D 3.16.0-25-
[ 4657.362699] task: c0000000eb581540 ti: c0000000eb604000 task.ti: c0000000eb60
[ 4657.362703] NIP: c00000000006a37c LR: c0000000000865a8 CTR: c00000000006a340
[ 4657.362706] REGS: c0000000eb607800 TRAP: 0300 Tainted: G D (3.16.0-25-generic)
00000000
[ 4657.362718] CFAR: c0000000000865a4 DAR: c000000b35515048 DSISR: 40000000 SOFTE: 0
GPR00: c0000000000865a8 c0000000eb607a80 c0000000013d50f0 00000000013d30d0
GPR08: 0000000000cc0000 c000000b35515000 c00000000e9a0000 0000000000000000
GPR12: c00000000006a340 c00000000e9a6c00 0000000000000000 0000000000000001
GPR20: 0000000000000000 c000000001389700 0000000000000000 0000000000000001
GPR28: c000000001420a68 0000000000000000 00000000013d30d0 0000000000000001
[ 4657.362758] NIP [c00000000006a37c] icp_hv_cause_ipi+0x3c/0xc0
[ 4657.362762] LR [c0000000000865a8] pSeries_cause_ipi_mux+0x88/0xc0
[ 4657.362765] Call Trace:
0 (unreliable)
[ 4657.362774] [c0000000eb607af0] [c0000000000865a8] pSeries_cause_ipi_mux+0x88/0xc0
[ 4657.362778] [c0000000eb607b20] [c0000000000426f0] smp_muxed_ipi_message_pass+0x70/0x90
[ 4657.362783] [c0000000eb607b60] [c0000000000f3a58] resched_task+0x118/0x140
[ 4657.362786] [c0000000eb607b90] [c0000000000f3da0] resched_cpu+0xc0/0x110
[ 4657.362791] [c0000000eb607be0] [c00000000013f170] rcu_implicit_dynticks_qs+0x200/0x230
[ 4657.362795] [c0000000eb607c10] [c00000000013de1c] force_qs_rnp+0x14c/0x250
[ 4657.362799] [c0000000eb607c90] [c0000000001407f0] rcu_gp_kthread+0x430/0x8e0
[ 4657.362803] [c0000000eb607d80] [c0000000000e0820] kthread+0x110/0x130
[ 4657.362807] [c0000000eb607e30] [c00000000000a468] ret_from_kernel_thread+0x5c/0x74
[ 4657.362810] Instruction dump:
[ 4657.362812] fbc1fff0 fbe1fff8 f8010010 f821ff91 7c7e1b78 60000000 60000000 3d220008
[ 4657.362818] 39493f00 1d3e0900 e94a0000 7d2a4a14 <abe90048> 7c0004ac 3860006c 7fe4fb78
[ 4657.362825] ---[ end trace ee202cccd2211e5e ]---
[ 4657.365085]
[ 4659.320264] Kernel panic - not syncing: Attempted to kill the idle task!
[ 4659.325500] ---[ end Kernel panic - not syncing: Attempted to kill the idle task!
I backported the following 4 commits/patches from upstream[1] to the vivid kernel[2]:
1. commit d658972
   Author: Himangi Saraogi <email address hidden>
   Date: Tue Jul 22 23:40:19 2014 +0530
   powerpc/perf/hv-24x7: Use kmem_cache_free
2. commit 48bee8a
   Author: Cody P Schafer <email address hidden>
   Date: Tue Sep 30 23:03:17 2014 -0700
   powerpc/perf/hv-24x7: use kmem_cache instead of aligned stack allocations
3. https://lkml.org/lkml/2014/12/10/613
4. https://lkml.org/lkml/2014/12/10/36
With these applied, the problem no longer reproduces.
Will Canonical cherry-pick these commits, or should we backport them? (They apply without conflicts.)
[1] Patches 3 and 4 above were posted recently; the powerpc maintainer plans to merge them.
[2] git://kernel.ubuntu.com/ubuntu/ubuntu-vivid.git
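For anyone re-running the reproduction, the two scripts above can be folded into one helper. This is only a minimal sketch, not the reporter's script: the function names build_event and sweep_offsets, the DRY_RUN switch and the step argument are all hypothetical, and it issues a single perf stat per iteration. Only the event string format and the perf options are taken from the report.

```shell
#!/bin/bash
# Hypothetical helper (not from the original report): build the hv_24x7
# event string that both reproduction scripts pass to perf.
build_event() {
    local offset=$1
    printf 'hv_24x7/domain=0x2,offset=%s,starting_index=10/' "$offset"
}

# Hypothetical helper: sweep decimal offsets from START (inclusive) to
# END (exclusive) in increments of STEP. With DRY_RUN=1 the perf command
# is printed instead of executed, so the loop can be checked on any machine.
sweep_offsets() {
    local start=$1 end=$2 step=$3
    local offset=$start
    local event
    while [ "$offset" -lt "$end" ]; do
        event=$(build_event "$offset")
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "perf stat -r 10 -C 0 -x ' ' -e $event ls"
        else
            perf stat -r 10 -C 0 -x ' ' -e "$event" ls
        fi
        offset=$((offset + step))
    done
}
```

Running "DRY_RUN=1 sweep_offsets 0 8192 8" lists the commands a Script 2 style sweep would issue (the step of 8 is an assumption). Dropping DRY_RUN runs them, which on an unpatched kernel may trigger the panic shown above.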