Repeatedly hotplugging CPUs crashes the kernel

Bug #1097213 reported by Naresh Kamboju on 2013-01-08
52
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Linaro big.LITTLE
In Progress
Medium
Morten Rasmussen

Bug Description

I have developed a new test case, which does below steps
test case location:
https://linaro-private.git.linaro.org/gitweb?p=bL_tests/core-tests.git;a=blob;f=basic/switcher-tests/switcher_off_hotplug_cpu.sh

Test Description:
------------------------
# ==== switcher_off_hotplug_cpu.sh description ====
# when switcher is disabled in run time, we can get cpu2, cpu3 and cpu4 also online
# total we can see 5 cpus online. this test will hot-plug cpus 1,2,3 and 4.
# hot-plug cpus randomly one after other for 100 loops on each four different ways.
# ensure there would not be any kernel crash
# enable back the switcher after test

echo 0 > /sys/kernel/bL_switcher/active
i=0
while [ $i -lt 100 ];
do
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu3/online
echo 0 > /sys/devices/system/cpu/cpu4/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu3/online
echo 1 > /sys/devices/system/cpu/cpu4/online
i=$(($i + 1))
done
echo 1 > /sys/kernel/bL_switcher/active

echo 0 > /sys/kernel/bL_switcher/active
i=0
while [ $i -lt 100 ];
do
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu3/online
echo 1 > /sys/devices/system/cpu/cpu3/online
echo 0 > /sys/devices/system/cpu/cpu4/online
echo 1 > /sys/devices/system/cpu/cpu4/online
i=$(($i + 1))
done
echo 1 > /sys/kernel/bL_switcher/active

echo 0 > /sys/kernel/bL_switcher/active
i=0
while [ $i -lt 100 ];
do
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu3/online
echo 0 > /sys/devices/system/cpu/cpu4/online
echo 1 > /sys/devices/system/cpu/cpu4/online
echo 1 > /sys/devices/system/cpu/cpu3/online
i=$(($i + 1))
done
echo 1 > /sys/kernel/bL_switcher/active

echo 0 > /sys/kernel/bL_switcher/active
i=0
while [ $i -lt 100 ];
do
echo 0 > /sys/devices/system/cpu/cpu3/online
echo 0 > /sys/devices/system/cpu/cpu4/online
echo 1 > /sys/devices/system/cpu/cpu4/online
echo 1 > /sys/devices/system/cpu/cpu3/online
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu1/online
i=$(($i + 1))
done
echo 1 > /sys/kernel/bL_switcher/active
echo "SUCCESS!!"
exit 0

Test output:
-----------------
test log has been attached to this bug.

Test summary:
--------------------
system stopped responding while test execution is in progress.

Linux kernel:
-----------------
Linux version 3.7.0-00655-g8cc8c22 (jenkins-build@ip-10-40-5-37)

Naresh Kamboju (naresh-kamboju) wrote :
Changed in linaro-big-little-system:
assignee: nobody → Nicolas Pitre (npitre)
description: updated
Amit Kucheria (amitk) wrote :

May I suggest a slightly shorter and more random script to do this kind of testing. This one hotplugs cores 2, 3, 4:

echo 0 > /sys/kernel/bL_switcher/active
i=1
while read -r foo; do
  array[i]="$foo";
 (( i++ ))
done < <(shuf -i 2-4 -n 3)

FIRST=${array[1]}
SECOND=${array[2]}
THIRD=${array[3]}

echo $FIRST
echo $SECOND
echo $THIRD

i=0
while [ $i -lt 100 ];
do
echo 0 > /sys/devices/system/cpu/$FIRST/online
echo 0 > /sys/devices/system/cpu/$SECOND/online
echo 1 > /sys/devices/system/cpu/$SECOND/online
echo 0 > /sys/devices/system/cpu/$THIRD/online
echo 1 > /sys/devices/system/cpu/$FIRST/online
echo 1 > /sys/devices/system/cpu/$THIRD/online
i=$(($i + 1))
done
echo 1 > /sys/kernel/bL_switcher/active

Alexander Sack (asac) wrote :

arrays feel like bashism... I would prefer to restrict ourselves to stuff that works in dash or even posh; at least validate if this works well on mksh (android shell)

Alexander Sack (asac) wrote :

oh ... on top of mksh our script should work in "busybox sh"

Alexander Sack (asac) wrote :

Instead of the while with array business i propose this:

asac@thinki:/tmp/linaro-android-build-tools$ list=`shuf -i 2-4 -n 3`
asac@thinki:/tmp/linaro-android-build-tools$ FIRST=`echo $list | awk '{ print $1 }'`
asac@thinki:/tmp/linaro-android-build-tools$ SECOND=`echo $list | awk '{ print $2 }'`
asac@thinki:/tmp/linaro-android-build-tools$ THIRD=`echo $list | awk '{ print $3 }'`

The problem is reproducible just by switching on and off any CPU between 5 and 10 times. Doing so various kernel traces can be observed:

https://pastebin.linaro.org/1383/
https://pastebin.linaro.org/1388/
https://pastebin.linaro.org/1385/
https://pastebin.linaro.org/1387/

All traces exhibit problems in functions found in "kernel/timer.c", always when the "timer_list" is used. The list is implemented using a "list_head" and either "->next" or "->previous" of that list reference a bogus address, hence the trace.

I was ready to throw the problem out on the base that it is a core timer/hotplug issue but the fault can't be reproduced on snowball using the same 3.7 kernel. Moreover, the problem isn't seen on "integration-linux-vexpress" nor "integration-android-vexpress", but those are 3.8-rc3 kernels.

Investitation is still underway.

I can't reproduce this on a 3.8-rc3 kernel - marking as won't fix for now.

Changed in linaro-big-little-system:
status: New → Invalid
Changed in linaro-big-little-system:
status: Invalid → In Progress

Reopening this after a chat with npitre.

When takeing the 3.7 baseline, no iks code, a kernel generated using vexpress_defconfig will crash while a kernel generated using "./scripts/kconfig/merge_scritps.sh linaro/configs/big-LITTLE-MP.conf linaro/configs/vexperss.conf linaro/configs/linaro-base.conf" won't. The diff between the two configs is 1062 line long, which amount to probably 300 config options.

Investigation continues.

Naresh Kamboju (naresh-kamboju) wrote :

Today i have tested core test suite with nico next branch kernel and notice the same problem still exists on Linux version 3.8.0-rc3-00651-g791ff5a.
kernel commit id:
linux$ git log -1 --oneline
791ff5a cpufreq: arm_big_little: Make 1.2 GHz max freq for IKS

test log:
https://pastebin.linaro.org/1418/

Bug reproduced on Android build=86
"https://android-build.linaro.org/jenkins/job/linaro-android-restricted_vexpress-tc2-isw-jb-gcc47-armlt-stable-open/86/"

Linux:
Linux version 3.7.0-00658-gb52fb3c (jenkins-build@ip-10-60-10-217) (gcc version 4.7.3 20130102 (prerelease) (Linaro GCC 4.7-2013.01-1~dev) ) #1 SMP Sun Jan 20 16:55:37 UTC 2013

Test log can be found at:
https://pastebin.linaro.org/1430/
https://validation.linaro.org/lava-server/scheduler/job/45535/log_file

Bug reproduced on Android build=90

Linux:
Linux version 3.8.0-rc3-00651-g791ff5a

Test log:
https://pastebin.linaro.org/1485/

David Zinman (dzinman) on 2013-01-29
Changed in linaro-big-little-system:
importance: Undecided → Critical
tags: added: bl-iks
David Zinman (dzinman) on 2013-01-29
Changed in linaro-big-little-system:
assignee: Nicolas Pitre (npitre) → Mathieu Poirier (mathieu.poirier)
David Zinman (dzinman) on 2013-01-29
Changed in linaro-big-little-system:
milestone: none → 13.02

More investigation on this problem yields that enabling CONFIG_RCU_FAST_NO_HZ seems to be at the source of the problem. When the config isn't set the problem can't be observed.

Moreover, the problem can be reproduced on snowball using a stock 3.8-rc4 kernel but it takes much, much more time.

Bug reproduced on Android build=99
https://android-build.linaro.org/builds/~linaro-android-restricted/vexpress-linaro-iks/#build=99

Linux:
Linux localhost 3.8.0-rc4-00666-g9286af1 #1 SMP Thu Jan 31 01:47:40 UTC 2013 armv7l GNU/Linux

Test log can be found at:
https://pastebin.linaro.org/1534/

For the last little while I've been working on this special branch [1] (test.2013.01.30a), prepared specifically for this problem by Paul M. Instead of dealing with 5 different trace (see comment #6), this special branch yields only one trace [2] and once again, the behavior goes away if RCU_FAST_NO_HZ is switched off. Fixing the problem on this branch will very likely fix the problem on mainline as well.

From the trace in [2] and quick glace at the code, one would be lead to beleive the running thread has been scheduled on the wrong CPU. But that would be a false deduction as reading the mpidr register yields the same processor as expected by td->cpu.

The problem is with 'smp_processor_id()', which really is a pointer to the current 'thread_info'.

'smpboot_thread_fn' is called by 'kthread' where it is disguised under 'threadfn()'. Just before calling 'threadfn' the current 'thread_info' holds the correct '->cpu' value but for some reason that value has changed when dereferenced from 'smp_processor_id()'.

A closer inspection of the situation reveals that the address of the 'thread_info' in 'kthread' is different than the one in 'smpboot_thread_fn'.

Since 'kthread' simply calls 'threadfn()' one would expect the address of the current 'thread_info' to be the same from both 'kthread()' and 'smpboot_thread_fn()'.

The investitation continues.

[1]. git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
[2]. https://pastebin.linaro.org/1648/

switcher_off_hotplug_cpu.sh PASSED on IKS kernel

Change:
RCU_FAST_NO_HZ has been disabled in Switcher Linux kernel. (nico/master kernel)
RCU_FAST_NO_HZ=n

Linux kernel version:
root@android:/ # uname -a
Linux localhost 3.7.6-00689-g79d1c74 #28 SMP Thu Feb 14 01:56:09 IST 2013 armv7l GNU/Linux

Test execution log:
===== switcher_off_hotplug_cpu.sh =====
SUCCESS!!
switcher-tests : PASS

https://pastebin.linaro.org/1718/

Changed in linaro-big-little-system:
status: In Progress → Fix Committed

CONFIG_RCU_FAST_NO_HZ now disabled.

switcher_off_hotplug_cpu.sh PASSED on IKS kernel

workaround for this issue is
RCU_FAST_NO_HZ has been disabled in Switcher Linux kernel. (nico/master kernel)
RCU_FAST_NO_HZ=n

which is not a fix, it is a just a work around. so i have changed the status to fix-committed to in-progress.
-Naresh Kamboju

Changed in linaro-big-little-system:
status: Fix Committed → In Progress
vishal (vishalbhoj) wrote :

With "RCU_FAST_NO_HZ=n" the bug becomes less frequent. The system has again hung in LAVA after few reboots. It happened with build 121 . Here are the logs:
https://validation.linaro.org/lava-server/scheduler/job/48514/log_file

RC-1 buid reproduced this problem.
With RC-1 build, cpu_hotplug_test failed.

Linux kernel and gcc versions.
[ 0.000000] Linux version 3.7.6-00695-gca4da8e (jenkins-build@ip-10-40-207-241) (gcc version 4.7.3 20130205 (prerelease) (Linaro GCC 4.7-2013.02-01-2~dev) ) #1 SMP

David Zinman (dzinman) on 2013-02-22
Changed in linaro-big-little-system:
assignee: Mathieu Poirier (mathieu.poirier) → Dave Martin (dave-martin-arm)
David Zinman (dzinman) on 2013-02-22
Changed in linaro-big-little-system:
importance: Critical → Medium
assignee: Dave Martin (dave-martin-arm) → Mathieu Poirier (mathieu.poirier)

bug has reproduced on https://android-build.linaro.org/builds/~linaro-android-restricted/vexpress-linaro-iks/#build=131

Linux kernel:
Linux localhost 3.7.6-00698-gb0bdf24 #1 SMP Tue Feb 26 05:46:17 UTC 2013 armv7

i can see kernel crash dump this time.

[ 188.408249] CPU3: shutdown
[ 188.480902] CPU4: shutdown
[ 188.491961] CPU1: Booted secondary processor
[ 188.505343] arm_big_little: bl_cpufreq_init: Initialized, cpu: 1, cluster 0
[ 188.528721] CPU2: Booted secondary processor
[ 188.545184] arm_big_little: bl_cpufreq_init: Initialized, cpu: 2, cluster 1
[ 188.575739] CPU3: Booted secondary processor
[ 188.592186] arm_big_little: bl_cpufreq_init: Initialized, cpu: 3, cluster 1
[ 188.685424] CPU4: Booted secondary processor
[ 188.698672] arm_big_little: bl_cpufreq_init: Initialized, cpu: 4, cluster 1
[ 188.772464] CPU1: shutdown
[ 188.877050] arm_big_little: bl_cpufreq_init: Initialized, cpu: 3, cluster 1
[ 248.957718] INFO: rcu_sched detected stalls on CPUs/tasks: { 4} (detected by)
[ 248.983812] Backtrace for cpu 0 (current):
[ 248.996080] [<c001469c>] (unwind_backtrace+0x0/0xf8) from [<c00135b8>] (smp_)
[ 249.025786] [<c00135b8>] (smp_send_all_cpu_backtrace+0x60/0xcc) from [<c0087)
[ 249.057055] [<c0087064>] (rcu_check_callbacks+0x584/0x6b0) from [<c0032cfc>])
[ 249.086759] [<c0032cfc>] (update_process_times+0x38/0x68) from [<c006abb0>] )
[ 249.115162] [<c006abb0>] (tick_sched_timer+0x80/0xec) from [<c0047470>] (__r)
[ 249.142001] [<c0047470>] (__run_hrtimer+0x84/0x1b4) from [<c0048050>] (hrtim)
[ 249.169620] [<c0048050>] (hrtimer_interrupt+0x110/0x2d8) from [<c001401c>] ()
[ 249.199584] [<c001401c>] (arch_timer_handler_virt+0x30/0x38) from [<c0082504)
[ 249.230852] [<c0082504>] (handle_percpu_devid_irq+0x90/0x130) from [<c007ed1)
[ 249.260818] [<c007ed10>] (generic_handle_irq+0x20/0x30) from [<c000ea68>] (h)
[ 249.287135] [<c000ea68>] (handle_IRQ+0x4c/0xb0) from [<c0008570>] (gic_handl)
[ 249.312410] [<c0008570>] (gic_handle_irq+0x28/0x5c) from [<c000ddc0>] (__irq)
[ 249.337421] Exception stack(0xc056fef8 to 0xc056ff40)

full log can be found at:
https://pastebin.linaro.org/1826/

the above link is local kernel build by vishal.
here I am attached test log with android build kernel https://pastebin.linaro.org/1829/

Download full text (3.5 KiB)

complete test log can be found at: https://pastebin.linaro.org/2024/

Stackdump:
  897.050119] CPU3: Booted secondary processor
[ 897.053685] Unable to handle kernel NULL pointer dereference at virtual address 00000004 [ 897.090697] pgd = c0004000
[ 897.098781] [00000004] *pgd=00000000
[ 897.109480] Internal error: Oops: 805 [#1] SMP ARM
[ 897.123815] Modules linked in:
[ 897.133001] CPU: 2 Not tainted (3.9.0-rc3-00723-g91d15d3 #11)
[ 897.151282] PC is at run_timer_softirq+0x10c/0x288
[ 897.165622] LR is at run_timer_softirq+0x3c/0x288
[ 897.179698] pc : [<c0030ba4>] lr : [<c0030ad4>] psr: 600001d3
...
[ 899.757348] [<c0030ba4>] (run_timer_softirq+0x10c/0x288) from [<c0029544>] (__do_softirq+0xf4/0x250)
[ 899.784715] [<c0029544>] (__do_softirq+0xf4/0x250) from [<c00299ac>] (irq_exit+0x8c/0xc4)
[ 899.809219] [<c00299ac>] (irq_exit+0x8c/0xc4) from [<c000e978>] (handle_IRQ+0x3c/0x94)
[ 899.832938] [<c000e978>] (handle_IRQ+0x3c/0x94) from [<c00084f0>] (gic_handle_irq+0x28/0x5c)
[ 899.858221] [<c00084f0>] (gic_handle_irq+0x28/0x5c) from [<c000dd00>] (__irq_svc+0x40/0x50)

Reading symbols from /home/naresh/nico-linux/linux/kernel/timer.o...done.
(gdb) list *(run_timer_softirq+0x10c)
0x279c is in run_timer_softirq (include/linux/list.h:88).
83 * This is only for internal list manipulation where we know
84 * the prev/next entries already!
85 */
86 static inline void __list_del(struct list_head * prev, struct list_head * next)
87 {
88 next->prev = prev;
89 prev->next = next;
90 }
91
92 /**

Stackdump:
[ 900.344354] [<c000dd00>] (__irq_svc+0x40/0x50) from [<c02c8da8>] (bl_enter_powerdown+0x6c/0xd0)
[ 900.370414] [<c02c8da8>] (bl_enter_powerdown+0x6c/0xd0) from [<c02c6f84>] (cpuidle_enter+0x14/0x18)
[ 900.397515] [<c02c6f84>] (cpuidle_enter+0x14/0x18) from [<c02c73b4>] (cpuidle_enter_state+0x14/0x68)
[ 900.424877] [<c02c73b4>] (cpuidle_enter_state+0x14/0x68) from [<c02c74ac>] (cpuidle_idle_call+0xa4/0x198)
[ 900.453541] [<c02c74ac>] (cpuidle_idle_call+0xa4/0x198) from [<c000ee24>] (cpu_idle+0x9c/0x114)
[ 900.479598] [<c000ee24>] (cpu_idle+0x9c/0x114) from [<803d01a4>] (0x803d01a4)[ 900.500966] CPU1: stopping
[ 900.509068] [<c00142e0>] (unwind_backtrace+0x0/0xf8) from [<c00134e0>] (handle_IPI+0x198/0x1d8)

GDB disassembling code:
Reading symbols...

Read more...

the frequency of occurrence of this bug is less on current 3.9.0-rc7-00909-ge740431 kernel

This problem can't be reproduced on 3.10 if CONFIG_HMP_SCHED is not set. Nothing was received from the power-aware scheduler group hence putting on hold.

Amit Kucheria (amitk) wrote :

Does this problem only happen when hotplugging in IKS mode or in MP mode too?

summary: - switcher_off_hotplug_cpu.sh failed on IKS kernel
+ switcher_off_hotplug_cpu.sh fails on MP kernel

This bug is reported on 3.7 and 3.8 kernels.
I have seen this bug reproduced on IKS and MP builds with 3.7 and 3.8 kernel.

-Naresh

summary: - switcher_off_hotplug_cpu.sh fails on MP kernel
+ Repeatedly hotplugging CPUs crashes the kernel

Patch submitted - waiting to see the patch showing up in the tree before closing.

Changed in linaro-big-little-system:
status: In Progress → Fix Committed
Changed in linaro-big-little-system:
status: Fix Committed → In Progress
Changed in linaro-big-little-system:
assignee: Mathieu Poirier (mathieu.poirier) → nobody
Changed in linaro-big-little-system:
assignee: nobody → Morten Rasmussen (morten-rasmussen)
tags: added: bl-mp
removed: bl-iks
Amit Kucheria (amitk) wrote :

A patch is available, waiting for ARM to verify it and include it in the Beta patchset

This bug reproduced while running cpu_hotplug_latency_with_load in scheduler test suite on TC2.

LAVA job:
http://validation.linaro.org/scheduler/job/87385/log_file#L_43_1331
If you do not have access to this job can refer attached test log file.

Kernel panic log:
------------------------
[ 1260.266037] CPU1: Booted secondary processor
 [ 1262.266710] CPU1: shutdown
 [ 1262.274830] Unable to handle kernel NULL pointer dereference at virtual address 00000004
 [ 1262.298693] pgd = edcac000
 [ 1262.306661] [00000004] *pgd=00000000
 [ 1262.317209] Internal error: Oops: 805 [#1] SMP THUMB2
 [ 1262.332095] Modules linked in: gator
 [ 1262.342655] CPU: 0 PID: 5624 Comm: run.sh Not tainted 3.10.19-00262-gf3401c5 #1
 [ 1262.364208] task: ee9db0c0 ti: ecfd6000 task.ti: ecfd6000
 [ 1262.380137] PC is at migrate_timer_list+0x40/0x70
 [ 1262.394006] LR is at timer_cpu_notify+0x14b/0x1f8
 [ 1262.407875] pc : [<c03cf624>] lr : [<c03cdbbf>] psr: 200001f3
 [ 1262.407875] sp : ecfd7e70 ip : 00000000 fp : 00000000
 [ 1262.441733] r10: c16b6050 r9 : c062e220 r8 : 00200200
 [ 1262.457135] r7 : c0642400 r6 : ef0d63fc r5 : c16b7d44 r4 : 0000007e
 [ 1262.476381] r3 : ef0d63fc r2 : 00000000 r1 : c16b7d44 r0 : c0642400
 [ 1262.495628] Flags: nzCv IRQs off FIQs off Mode SVC_32 ISA Thumb Segment user
 [ 1262.517695] Control: 50c5387d Table: adcac06a DAC: 00000015
 [ 1262.534632]
 [ 1262.534632] PC: 0xc03cf5a4:

Full Kernel panic log attached to this bug.

Linux kernel version:
----------------------------
Linux version 3.10.19-00262-gf3401c5 (jenkins-build@ip-10-184-27-243) (gcc version 4.7.4 20131111 (prerelease) (Linaro GCC 4.7-2013.11) ) #1 SMP Thu Nov 21 01:38:10 UTC 2013

Android build:
-------------------
https://android-build.linaro.org/builds/~linaro-android-restricted/vexpress-linaro-lsk/#build=160
"parameters":
                "boot": "https://snapshots.linaro.org/android/~linaro-android-restricted/vexpress-linaro-lsk/160/boot.tar.bz2",
                "data": "https://snapshots.linaro.org/android/~linaro-android-restricted/vexpress-linaro-lsk/160/userdata.tar.bz2",
                "system": "https://snapshots.linaro.org/android/~linaro-android-restricted/vexpress-linaro-lsk/160/system.tar.bz2"

Kernel panic log:
------------------------
[ 1260.266037] CPU1: Booted secondary processor
 [ 1262.266710] CPU1: shutdown
 [ 1262.274830] Unable to handle kernel NULL pointer dereference at virtual address 00000004
 [ 1262.298693] pgd = edcac000
 [ 1262.306661] [00000004] *pgd=00000000
 [ 1262.317209] Internal error: Oops: 805 [#1] SMP THUMB2
 [ 1262.332095] Modules linked in: gator
 [ 1262.342655] CPU: 0 PID: 5624 Comm: run.sh Not tainted 3.10.19-00262-gf3401c5 #1
 [ 1262.364208] task: ee9db0c0 ti: ecfd6000 task.ti: ecfd6000
 [ 1262.380137] PC is at migrate_timer_list+0x40/0x70
 [ 1262.394006] LR is at timer_cpu_notify+0x14b/0x1f8
 [ 1262.407875] pc : [<c03cf624>] lr : [<c03cdbbf>] psr: 200001f3
 [ 1262.407875] sp : ecfd7e70 ip : 00000000 fp : 00000000
 [ 1262.441733] r10: c16b6050 r9 : c062e220 r8 : 00200200
 [ 1262.457135] r7 : c0642400 r6 : ef0d63fc r5 : c16b7d44 r4 : 0000007e
 [ 1262.476381] r3 : ef0d63fc r2 : 00000000 r1 : c16b7d44 r0 : c0642400
 [ 1262.495628] Flags: nzCv IRQs off FIQs off Mode SVC_32 ISA Thumb Segment user
 [ 1262.517695] Control: 50c5387d Table: adcac06a DAC: 00000015
 [ 1262.534632]
 [ 1262.534632] PC: 0xc03cf5a4:
...

[ 1265.172725] [<c03cf624>] (migrate_timer_list+0x40/0x70) from [<c03cdbbf>] (timer_cpu_notify+0x14b/0x1f8)
 [ 1265.200710] [<c03cdbbf>] (timer_cpu_notify+0x14b/0x1f8) from [<c0036f85>] (notifier_call_chain+0x45/0x54)
 [ 1265.228948] [<c0036f85>] (notifier_call_chain+0x45/0x54) from [<c001ebbd>] (__cpu_notify+0x1d/0x34)
 [ 1265.255644] [<c001ebbd>] (__cpu_notify+0x1d/0x34) from [<c001eccd>] (cpu_notify_nofail+0xd/0x14)
 [ 1265.281572] [<c001eccd>] (cpu_notify_nofail+0xd/0x14) from [<c03cb7ef>] (_cpu_down+0xb7/0x1b4)
 [ 1265.306986] [<c03cb7ef>] (_cpu_down+0xb7/0x1b4) from [<c03cb907>] (cpu_down+0x1b/0x30)
 [ 1265.330348] [<c03cb907>] (cpu_down+0x1b/0x30) from [<c03cc04b>] (store_online+0x27/0x50)
 [ 1265.354230] [<c03cc04b>] (store_online+0x27/0x50) from [<c027e361>] (dev_attr_store+0x11/0x18)
 [ 1265.379650] [<c027e361>] (dev_attr_store+0x11/0x18) from [<c00f0713>] (sysfs_write_file+0xeb/0x114)
 [ 1265.406349] [<c00f0713>] (sysfs_write_file+0xeb/0x114) from [<c00acbd7>] (vfs_write+0x73/0x10c)
 [ 1265.432024] [<c00acbd7>] (vfs_write+0x73/0x10c) from [<c00ace77>] (SyS_write+0x2f/0x50)
 [ 1265.455650] [<c00ace77>] (SyS_write+0x2f/0x50) from [<c000ce01>] (ret_fast_syscall+0x1/0x46)
 [ 1265.480544] Code: 686b 4629 682a 4638 (6053) 601a
 [ 1265.494668] ---[ end trace ee27e96cfb85187d ]---
 [ 1265.508273] Kernel panic - not syncing: Fatal exception
 [ 1265.523933] CPU2: stopping

Basil Eljuse (basil-eljuse) wrote :

Hi Naresh,

When i tested 13.11RC1 I was not able to reproduce this issue at all in an MP kernel.

I was running both
cpu_hotplug_latency_with_load & cpu_hotplug_latency_without_load and over multiple runs and they seem to pass consistently.

Since i was not able to access the LAVA job above in your previous post, can you please confirm if this crash was observed on an MP kernel?

If so what other tests were run in this cycle.

In my test cycle I had my full set of functional tests ran where this cpuhotplug tests were the last in the list. Trying to understand if reproducbility might increase if the tests were run in any particular order or not.

Hi Basil,

This bug occurs very rarely.
This bug is not specific to MP or IKS. I have seen this bug on the both the kernels when we were build IKS and MP separately.

The probability of bug occurrence is CPU hotplug/ and unplug in a loop. You can ref the bug description for more info. Where i was doing hotplug and unplug available CPUs in random order.

The reason we found this found while running cpu_hotplug_latency_with_load & cpu_hotplug_latency_without_load is that test case performs CPU hotplug/unplug continuously in a loop to measure the latency of CPU hotplug.

This is the one of the evidence to confirm it happended when "cpu_down" called in Kernel.

[ 1265.306986] [<c03cb7ef>] (_cpu_down+0xb7/0x1b4) from [<c03cb907>] (cpu_down+0x1b/0x30)
[ 1265.330348] [<c03cb907>] (cpu_down+0x1b/0x30) from [<c03cc04b>] (store_online+0x27/0x50)

-Naresh

Basil Eljuse (basil-eljuse) wrote :
Download full text (3.3 KiB)

Hi Naresh,

Thanks for the clarification. This bug is now reproducible to me also on the non-psci firmware.

I ran the 2 tests from your cpuhotplug suite in a 50 iteration sequence as below

./testrunner --run releasetest --n 50 --testcase cpu_hotplug_latency_with_load --suite cpuhotplug_suite --verbose
./testrunner --run releasetest --n 50 --testcase cpu_hotplug_latency_without_load --suite cpuhotplug_suite --verbose

And the test crashed with the following backtrace info. I wonder if this is something to do with the gic irq handler!

[26142.733262] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.10.19-55771-g20a9594 #1
 [26142.754821] [<c0011c8d>] (unwind_backtrace+0x1/0x9c) from [<c000fc01>] (show_stack+0x11/0x14)
 [26142.779962] [<c000fc01>] (show_stack+0x11/0x14) from [<c0011047>] (handle_IPI+0xe3/0x1b8)
 [26142.804078] [<c0011047>] (handle_IPI+0xe3/0x1b8) from [<c00084cf>] (gic_handle_irq+0x4b/0x50)
 [26142.829220] [<c00084cf>] (gic_handle_irq+0x4b/0x50) from [<c000c95b>] (__irq_svc+0x3b/0x5c)
 [26142.853841] Exception stack(0xef0cbf18 to 0xef0cbf60)
 [26142.868725] bf00: 00000000 00000018
 [26142.892837] bf20: 278edb1a 00000000 c16ba460 c16ba464 00000001 c0606f00 c06761e8 c0614ac8
 [26142.916949] bf40: 00000004 ef0ca000 5290e099 ef0cbf60 c02e7491 c02d88fa 60000173 ffffffff
 [26142.941071] [<c000c95b>] (__irq_svc+0x3b/0x5c) from [<c02d88fa>] (bl_enter_powerdown+0x4e/0x94)
 [26142.966731] [<c02d88fa>] (bl_enter_powerdown+0x4e/0x94) from [<c02d75c3>] (cpuidle_enter_state+0x2b/0xa8)
 [26142.994953] [<c02d75c3>] (cpuidle_enter_state+0x2b/0xa8) from [<c02d76b9>] (cpuidle_idle_call+0x79/0x140)
 [26143.023173] [<c02d76b9>] (cpuidle_idle_call+0x79/0x140) from [<c000d9d1>] (arch_cpu_idle+0xd/0x28)
 [26143.049605] [<c000d9d1>] (arch_cpu_idle+0xd/0x28) from [<c00502c5>] (cpu_startup_entry+0x5d/0x164)
 [26143.076031] [<c00502c5>] (cpu_startup_entry+0x5d/0x164) from [<800081b5>] (0x800081b5)
 [26143.099372] IPI backtrace for cpu 0
 [26143.109637]
 [26143.114008] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.19-55771-g20a9594 #1
 [26143.135547] task: c05e0388 ti: c05d4000 task.ti: c05d4000
 [26143.151455] PC is at bl_enter_powerdown+0x4e/0x94
 [26143.165308] LR is at arch_counter_read+0x15/0x18
 [26143.178904] pc : [<c02d88fa>] lr : [<c02e7491>] psr: 60000173
 [26143.178904] sp : c05d5f30 ip : 5290e099 fp : c05d4000
 [26143.212742] r10: 00000000 r9 : c0614ac8 r8 : c06761e8
 [26143.228129] r7 : c0606f00 r6 : 00000001 r5 : c1696464 r4 : c1696460
 [26143.247360] r3 : 00000000 r2 : 278eecae r1 : 00000018 r0 : 00000000
 [26143.266591] Flags: nZCv IRQs on FIQs off Mode SVC_32 ISA Thumb Segment kernel
 [26143.288897] Control: 50c5387d Table: ad80006a DAC: 00000015

However now i have given more runs with both psci and non-psci firmware to see how reproducible is this across the firmware variants.

Please note that I normally run pm-qa tests also which has cpu_hotplug_08 script which does random hotplug of cpus over 100 iterations. Also the non-functional tests do have cpuhotplug tests which unplugs known cpus from either of the domains. One difference i noticed with your cpuhotplug suite is that you have got traces als...

Read more...

Basil Eljuse (basil-eljuse) wrote :

Hi Naresh,

I can confirm that the problem is reproducible with psci firmware also.

In that run it seems to give some additional crash info about null pointer derefernce:

[ 1307.348570] Unable to handle kernel NULL pointer dereference at virtual address 0000000c
 [ 1307.372435] pgd = c0004000
 [ 1307.380400] [0000000c] *pgd=00000000
 [ 1307.390942] Internal error: Oops: 5 [#1] SMP THUMB2
 [ 1307.405315] Modules linked in:
 [ 1307.414322] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.19-55771-g20a9594 #1
 [ 1307.435869] task: ef0a8a40 ti: ef0c4000 task.ti: ef0c4000
 [ 1307.451787] PC is at get_next_timer_interrupt+0x7e/0x208
 [ 1307.467440] LR is at get_next_timer_interrupt+0x55/0x208
 [ 1307.483095] pc : [<c0026d56>] lr : [<c0026d2d>] psr: 000f00b3
 [ 1307.483095] sp : ef0c5f20 ip : c169d808 fp : 00000001
 [ 1307.516943] r10: ef0d6000 r9 : c062c148 r8 : 000188ec
 [ 1307.532339] r7 : 400188eb r6 : c16a2a40 r5 : 000000f5 r4 : 80000000
 [ 1307.551578] r3 : 00000000 r2 : 000188ea r1 : ef0d67bc r0 : 000000ea
 [ 1307.570818] Flags: nzcv IRQs off FIQs on Mode SVC_32 ISA Thumb Segment kernel
 [ 1307.593133] Control: 50c5387d Table: acb8406a DAC: 00000015

Basil Eljuse (basil-eljuse) wrote :

Hi Naresh,

As part of isolating the problem, based on the advice from developers at my end i disabled the trace commands from the test code and tried rerunning the same tests as 50 iteration before.

The tests declared a failure at the end but crucially did not crash.

So clearly this issue can be isolated to a problem with the tracing component rather than cpu hotplug logic. This kind of concurs with the observation that cpuhotplug tests from both the pm-qa and non-functional test suite does not show this failure.

So i would suggest to refocus on this issue with the tracing logic in mind.

Tixy (Jon Medhurst) (tixy) wrote :

When trying to reproduce this on the 13.11 release image I notice that the functions the testcase is trying to trace _do_ exist. I.e.we're not affected by bug #1204116 whereby the tracepoints don't get inserted because the functions have been optimised out by the compiler. So this means the function trace framework will be being called from cpu_down, cpu_up and other places.

When I ran the tests with the following:

   ./testrunner --run releasetest --n 50 --testcase cpu_hotplug_latency_with_load --suite cpuhotplug_suite --verbose

I got a crash after 100 minutes with a nullptr derefence in get_next_timer_interrupt called from tick_nohz_idle_enter. (See attached 'serial.log' file).

Looking at the console output I see this happened one second after the 2028th cpu enable which if my maths is right would be in the 11th test run in the 2nd outer loop iteration and 8th inner loop iteration, i.e. not at any special point in particular. And the fact it happened one second after the enable means it should have been just after the final 'sleep 1' in the inner loop, which is at a point where its not doing anything with trace enabling or cpu hotplugging, just iterating around the loop and echoing some output before another 'sleep 1'.

Here is one more scenario, kernel crashed just after the boot.

This test was performed with 13.11 release build and LAVA vexpress-tc2-02. This means that LAVA TC2 devices are configured to enable IKS switcher by default on tc2 by using kernel cmd argument "no_bL_switcher=0". one can see the boot log which is initializing. ( big.LITTLE switcher initializing)

1. boot successful
2. dmesg shows at 58 Second time stamp
  - CPU PMU: unable to set irq affinity (irq=100, cpu=0)
  - CPU PMU: unable to set irq affinity (irq=101, cpu=1)
I have seen this kernel messages while running perf test cases on LSK kernel. But I am not sure why these messages are popped up without running perf test cases.

3. Kernel crashed at 63 Second time stamp.
Please find attached kernel-crash-log-after-boot

[ 0.000000] Booting Linux on physical CPU 0x100
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.10.20-00297-g6144a4a (jenkins-build@ip-10-164-52-202) (gcc version 4.8.3 20131111 (prerelease) (Linaro GCC 4.8-2013.11) ) #1 SMP Sun Nov 24 06:47:43 UTC 2013

...
[ 63.153049] Unable to handle kernel paging request at virtual address 1e000002
[ 63.174371] pgd = c0004000
[ 63.182337] [1e000002] *pgd=00000000
[ 63.192876] Internal error: Oops: 5 [#1] SMP THUMB2
[ 63.207238] Modules linked in: gator
[ 63.217778] CPU: 1 PID: 2360 Comm: iptables Not tainted 3.10.20-00297-g6144a4a #1
[ 63.239829] task: eeb01080 ti: ee55a000 task.ti: ee55a000
[ 63.255733] PC is at check_tty_count+0x28/0x94
[ 63.268816] LR is at check_tty_count+0x1d/0x94
[ 63.281898] pc : [<c0273d6c>] lr : [<c0273d61>] psr: 00070033
[ 63.281898] sp : ee55be70 ip : 010cc000 fp : eebbfd88
[ 63.315736] r10: 00000000 r9 : 00000008 r8 : ef029690
[ 63.331123] r7 : eef4f2b0 r6 : c0435604 r5 : ef218400 r4 : 00000002
[ 63.350354] r3 : ef21857c r2 : 1e000002 r1 : 00000000 r0 : c06a7958
[ 63.369587] Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment user
[ 63.391123] Control: 50c5387d Table: ae73806a DAC: 00000015
[ 63.408048]
[ 63.408048] PC: 0xc0273cec:
...
[ 65.836271] [<c0273d6c>] (check_tty_count+0x28/0x94) from [<c0275b15>] (tty_release+0x39/0x420)
[ 65.861915] [<c0275b15>] (tty_release+0x39/0x420) from [<c00b41d1>] (__fput+0x65/0x164)
[ 65.885509] [<c00b41d1>] (__fput+0x65/0x164) from [<c00327bd>] (task_work_run+0x89/0xac)
[ 65.909358] [<c00327bd>] (task_work_run+0x89/0xac) from [<c00203df>] (do_exit+0x243/0x72c)
[ 65.933718] [<c00203df>] (do_exit+0x243/0x72c) from [<c002137f>] (do_group_exit+0x2f/0x80)
[ 65.958077] [<c002137f>] (do_group_exit+0x2f/0x80) from [<c00213e3>] (SyS_exit_group+0x13/0x14)
[ 65.983720] [<c00213e3>] (SyS_exit_group+0x13/0x14) from [<c000cc01>] (ret_fast_syscall+0x1/0x46)
[ 66.009872] Code: f505 73be 429a d003 (6812) 3401
[ 66.023997] ---[ end trace da84db0ff0b4fd92 ]---
[ 66.037600] Kernel panic - not syncing: Fatal exception
[ 67.077989] SMP: failed to stop secondary CPUs

Tixy (Jon Medhurst) (tixy) wrote :

After applying the power down cache flushing fixes in http://lists.linaro.org/pipermail/linaro-kernel/2013-December/009236.html
A test run with cpu_hotplug_latency test took 10 hours before crashing. It had completed a full 50 itterations of cpu_hotplug_latency_with_load and was half way through the _without_load version.

Crash didn't look like previous ones, quite possibly another indication of memory corruption...

[35686.048470] Internal error: Oops - bad syscall: ef0c6000 [#1] SMP THUMB2
[35686.048796] Unable to handle kernel paging request at virtual address 00200200
[35686.048804] pgd = c0004000
[35686.048816] [00200200] *pgd=00000000
[35686.107999] Modules linked in:
[35686.117010] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.20-00004-gc12d235 #1
[35686.138563] task: ef0a7540 ti: ef0c6000 task.ti: ef0c6000
[35686.154476] PC is at 0xef0d6670
[35686.163724] LR is at 0xffd6dfe5
[35686.172980] pc : [<ef0d6670>] lr : [<ffd6dfe5>] psr: 60000153

Alex Shi (alex-shi) wrote :

Answer #41
I can't see connection between cpu_idle null ptr and your patchset. If it is a cache/memory coherence issue. it would happened any where. and most of time, there is no any panic pop up. just kernel hang without any output.

Tixy (Jon Medhurst) (tixy) wrote :

Responding to comment #42

The crashes reported don't seem to have reproducible symptoms and are triggered by test code which is hotplugging cpus, so it seemed reasonable to hope cache coherency bugs in the cpu power down code might be the cause.

As regard to "most of time, there is no any panic pop up. just kernel hang without any output", I wasn't aware that this was what people are seeing, the only problems I'd seen and what Naresh and Basil's latest comments seem to be about, are crashes after repeatedly running the cpu_hotplug_latency_with{out}_load test.

Tixy (Jon Medhurst) (tixy) wrote :

As the cpu_hotplug_latency tests only do 1 hotplug every 3 seconds and
take hours to reproduce the crash I thought I would try hammering the
system a bit harder. Using the scripts in the original bug report
description (without the toggling of bL_switcher as that seems to cause
problems) this has hotplugged cpus 100,000 times in about a hour without
crashes or lockups. I then added function trace commands to do the same
sort of things as cpu_hotplug_latency tests do and that didn't cause
problems either.

I'm testing today's LSK build [1] (which has the cache maintenance fixes
in) and have executed disablesuspend.sh beforehand to that the Linaro
animated wallpaper is creating bursts of CPU activity.

[1] https://android-build.linaro.org/builds/~linaro-android/vexpress-lsk/#build=100

Tixy (Jon Medhurst) (tixy) wrote :

I can't reproduce the problems with toggling of bL_switcher I alluded to in the previous comment. From memory, the problem manifest as a stack bactrace and warning about not being able to bring a cpu online or offline. The warnings were non fatal and the test script and system carried on functioning.

Tixy (Jon Medhurst) (tixy) wrote :

test case:
http://people.linaro.org/~naresh.kamboju/cpu_hotplug_test.sh

LAVA job running above test case:
http://validation.linaro.org/scheduler/job/90497/log_file#L_37_430

Above test could take several hours to finish.

Repeatedly hotplugging CPU test executed for 41 hours and did not see kernel crash on LSK build [1]. Which confirms the kernel crash reported in above comments are not because of repeated cpu hotplug.

Test case:
http://people.linaro.org/~naresh.kamboju/cpu_hotplug_test.sh

LAVA job completed above test case:
http://validation.linaro.org/scheduler/job/90497

LSK Android build:
[1] https://android-build.linaro.org/builds/~linaro-android/vexpress-lsk/#build=100

Tixy (Jon Medhurst) (tixy) wrote :

Just reproduced this on the 13.12 release image...

- I created a clean 13.12 LSK Android image and copied script for comment #44 to the 'sdcard' partition.
- I booted image from USB on TC2 and waited a minute or so for most CPU activity to stop.
- Ran disablesuspend.sh so that the Linaro animated wallpaper woul run continuously.
- Connected to vexpress from my PC using ADB.
- Ran 'adb shell /sdcard/hotplug.sh'
- After realising I wasn't logging output from that command I stopped it with Ctrl-C
- Ran 'adb shell /sdcard/hotplug.sh >~/hotplug.log'
- Shortly after the kernel carashed with a NULL pointer exception in timer_softirq (see attached log)

Tixy (Jon Medhurst) (tixy) wrote :

Trying again with exactly the same steps as in my previous comment didn't result in any crash after 4 hours and 100,000 or so hotplugs.

So this problem is back...

I've attached two patches: one reverts a feature from Chris R. that has nothing to do with the crash itself. It only permits the other patch written by Morten to apply without too much trouble. Note that both patches had to be massaged.

This solution is based on observations I made back in June of 2013. At the time removing the feature that mandates processes to wait a specific amount of time before migrating to another HMP domain fixed the problem. Unfortunately that didn't make it to the official tree we are working with.

I suggest the patches be applied and the hotplug test restarted multiple times. If we can't reproduce the problem then I'll investigate what in the original feature (process migration delay) causes "prev->next" to be in the weeds.

Chris Redpath (chris-redpath) wrote :

Hi Mathieu,

For me, the symptoms don't really support the scheduler being the cause of this issue, although I can completely accept that since we do more work on wakeup/reschedule we will change the relative probabilities of hitting any pre-existing race conditions. The sched clock accesses are restricted in the current patch set so that we don't use offline CPU clocks, so that old bug should not be present either.

Do you have a handle on how many of the crashes are in the timer softirq handler versus other bad-looking pointers? I see a few different ones over the course of the investigation here. Also, the previous successful reproduction on snowball without HMP present makes me think the issue might be a very tiny race window somewhere in generic hotplug code, possibly only on v7 platforms or the x86 people would have seen it.

I know you have looked in detail at this issue when it's been present in previous releases - did you do a review of the timer migration code for hotplug? It almost feels like some timer has not completed migrating before the softirq is delivered for it.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers