Kernel panic when trying to offline CPU1

Bug #707003 reported by Philipp Zabel
This bug affects 1 person
Affects                  Status        Importance  Assigned to   Milestone
linux-ti-omap4 (Ubuntu)  Fix Released  Medium      Paolo Pisati
Maverick                 Won't Fix     Undecided   Paolo Pisati
Natty                    Fix Released  Medium      Paolo Pisati

Bug Description

Running Maverick with the kernel from the TI OMAP trunk PPA (linux-ti-omap4 2.6.35-980.1release9), there is a kernel panic shortly after trying to offline CPU1:

$ echo 0 | sudo tee /sys/devices/system/cpu/cpu1/online

-->

[ 109.583618] Unable to handle kernel NULL pointer dereference at virtual address 00000008
[ 109.592071] pgd = c0004000
[ 109.594879] [00000008] *pgd=00000000
[ 109.598632] Internal error: Oops: 17 [#1] PREEMPT SMP
[ 109.603881] last sysfs file: /sys/devices/system/cpu/cpu1/online
[ 109.610137] Modules linked in: fuse omap_gpu dm_crypt bt_drv(C) st_drv(C) rfcomm tiwlan_drv l2cap sco bluetooth rfkill twl4030_pwrbutton sdio dm_mirror dm_region_hash dm_log btrfs
[ 109.626892] CPU: 0 Tainted: G C (2.6.35-980-omap4 #1release9-Ubuntu)
[ 109.634735] PC is at snd_ctl_elem_list+0x180/0x24c
[ 109.639739] LR is at cpu_idle+0x4c/0xd0
[ 109.643737] pc : [<c0445288>] lr : [<c0045270>] psr: 60000113
[ 109.643737] sp : c079bfd0 ip : 00773612 fp : 00000000
[ 109.655700] r10: 00000000 r9 : 411fc092 r8 : 80034680
[ 109.661163] r7 : c07ab0cc r6 : c003702c r5 : 00000000 r4 : c079a000
[ 109.667938] r3 : 00000000 r2 : 00000000 r1 : 00000000 r0 : c0037030
[ 109.674743] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 109.682373] Control: 10c53c7d Table: a6fc804a DAC: 00000015
[ 109.688354] Process swapper (pid: 0, stack limit = 0xc079a2f8)
[ 109.694427] Stack: (0xc079bfd0 to 0xc079c000)
[ 109.698974] bfc0: c0853844 c0008c98 c0008f00 00000000
[ 109.707489] bfe0: 271beb3f c0037030 00000000 10c53c7d c07f5ae0 80008080 00000000 00000000
[ 109.716033] [<c0445288>] (snd_ctl_elem_list+0x180/0x24c) from [<c009b740>] (run_timer_softirq+0x0/0x224)
[ 109.725921] [<c009b740>] (run_timer_softirq+0x0/0x224) from [<00000000>] (0x0)
[ 109.733459] Code: e3c13d7f e3c3303f e59d0014 e1a02302 (e5933008)
[ 109.740478] ---[ end trace 3ab1d51f9e4c3baa ]---
[ 109.745300] Kernel panic - not syncing: Attempted to kill the idle task!
[ 109.752319] [<c004a7ac>] (unwind_backtrace+0x0/0xe4) from [<c05829e4>] (panic+0x58/0xe4)
[ 109.760803] [<c05829e4>] (panic+0x58/0xe4) from [<c0090e30>] (do_exit+0x70/0x344)
[ 109.768615] [<c0090e30>] (do_exit+0x70/0x344) from [<c0047b94>] (die+0xe8/0xfc)
[ 109.776245] [<c0047b94>] (die+0xe8/0xfc) from [<c004dd2c>] (__do_kernel_fault+0x64/0x84)
[ 109.784729] [<c004dd2c>] (__do_kernel_fault+0x64/0x84) from [<c0587d50>] (do_page_fault+0x27c/0x2a4)
[ 109.794250] [<c0587d50>] (do_page_fault+0x27c/0x2a4) from [<c00433e8>] (do_DataAbort+0x34/0x98)
[ 109.803344] [<c00433e8>] (do_DataAbort+0x34/0x98) from [<c0585aec>] (__dabt_svc+0x4c/0x60)
[ 109.811950] Exception stack(0xc079bf88 to 0xc079bfd0)
[ 109.817230] bf80: c0037030 00000000 00000000 00000000 c079a000 00000000
[ 109.825775] bfa0: c003702c c07ab0cc 80034680 411fc092 00000000 00000000 00773612 c079bfd0
[ 109.834655] bfc0: c0045270 c0445288 60000113 ffffffff
[ 109.839904] [<c0585aec>] (__dabt_svc+0x4c/0x60) from [<c0445288>] (snd_ctl_elem_list+0x180/0x24c)
[ 109.849182] [<c0445288>] (snd_ctl_elem_list+0x180/0x24c) from [<c009b740>] (run_timer_softirq+0x0/0x224)
[ 109.859069] [<c009b740>] (run_timer_softirq+0x0/0x224) from [<00000000>] (0x0)
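
For triage, the key lines of an oops like the one above can be pulled out of a saved dmesg with a small filter. This is only a sketch: `oops_summary` is a made-up helper name, and the patterns are taken from the lines that appear in this particular log, not from any exhaustive list of kernel messages.

```shell
#!/bin/sh
# Sketch: summarise an ARM oops from dmesg-style text on stdin.
# oops_summary is a hypothetical helper, not an existing tool.
oops_summary() {
    # Drop the leading "[ timestamp]" prefix, then keep only the
    # lines that identify the fault and where it happened.
    sed -e 's/^\[ *[0-9.]*] *//' |
        grep -E '^(Unable to handle kernel|Internal error:|PC is at|LR is at|Kernel panic)'
}
```

It could be used as `dmesg | oops_summary`, or fed a log file saved over ssh before the board goes down.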

Tags: patch armel
Ricardo Salveti (rsalveti) wrote :

Just tested, and this bug doesn't happen with the current Natty kernel.

linux-image-2.6.35-1101-omap4 2.6.35-1101.4

root@panda-natty:~# echo 0 | sudo tee /sys/devices/system/cpu/cpu1/online
0
root@panda-natty:~# dmesg
[ 1926.179595] CPU1: shutdown
root@panda-natty:~# echo 1 | sudo tee /sys/devices/system/cpu/cpu1/online
1
root@panda-natty:~# dmesg
[ 1926.179595] CPU1: shutdown
[ 1931.928710] CPU1: Booted secondary processor
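
The offline/online sequence above can be wrapped in a script for repeated testing. What follows is a minimal sketch, not a tested tool: the function names and the `CPU_ROOT` variable are invented here for illustration, and the pause between the two writes is only there because the original report says the panic came shortly after offlining, not immediately. Writing to the real sysfs file needs root, as the `sudo tee` in the commands above implies.

```shell
#!/bin/sh
# Sketch of a CPU hotplug cycle. CPU_ROOT is a hypothetical override
# so the functions can be pointed at a scratch directory for dry runs.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# cpu_set_online <cpu-number> <0|1>: write the requested state.
cpu_set_online() {
    printf '%s\n' "$2" > "$CPU_ROOT/cpu$1/online"
}

# cpu_state <cpu-number>: print the current content of the online file.
cpu_state() {
    cat "$CPU_ROOT/cpu$1/online"
}

# hotplug_cycle <cpu-number> [delay-seconds]: offline, wait, online.
# Returns non-zero if a write fails or the state does not read back.
hotplug_cycle() {
    cpu=$1
    delay=${2:-10}
    cpu_set_online "$cpu" 0 || return 1
    [ "$(cpu_state "$cpu")" = 0 ] || return 1
    sleep "$delay"
    cpu_set_online "$cpu" 1 || return 1
    [ "$(cpu_state "$cpu")" = 1 ]
}
```

Run as root, `hotplug_cycle 1 30` would take CPU1 down, wait 30 seconds, and bring it back; checking dmesg afterwards is still a manual step.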

tags: added: armel
Bryan Wu (cooloney) wrote :

Oh, weird. I can reproduce it on the 2.6.38-rc2 kernel:

http://pastebin.ubuntu.com/558885/

Bryan Wu (cooloney) wrote :

Philipp:

Did you see this message in your dmesg:
[ 6634.133392] CPU1: Unknown IPI message 0x1

-Bryan

Philipp Zabel (ph5) wrote :

I do not see the "Unknown IPI message" line. The panic occurs a while after shutting down CPU1, not when trying to boot it again.
I had enough time to get the dmesg via ssh. The full log is attached.

Tobin Davis (gruemaster) wrote :

I cannot reproduce this on the current Natty kernel (2.6.35-1101.4).

On Maverick with the latest updated kernel (2.6.35-903.21), running this command shuts down the CPU with no dmesg error, but the command never returns to the shell prompt. Opening a new terminal and writing 1 instead re-enables the CPU, but that command also fails to return to the shell prompt.

Paolo Pisati (p-pisati) wrote :

@Tobin: the attached patch fixes the terminal hangs in maverick/ti-omap4, but unfortunately when I try to put the CPU back online I get:

1) a solid hang with no output on the console
2) a garbled console interleaved with a panic message

From what I could gather from 2), it seems to be the same issue Philipp is experiencing, but I can't get a "clean" panic/stack trace with this kernel.

@Philipp: I can't reproduce your panic:

git://dev.omapzoom.org/pub/scm/integration/kernel-ubuntu.git kernel-ubuntu

I tried different tags/SHAs:

Ubuntu-2.6.35-980.1release9
Ubuntu-2.6.35-980.1release1
Ubuntu-2.6.35-ti903.13+release3

Is that correct?

Paolo Pisati (p-pisati) wrote :

Reverting 587ba4e34d53149ca9bf4c53265ab6fe7e203b46 fixes the panic (see attached patch).

http://people.canonical.com/~ppisati/lp707003/linux-image-2.6.35-903-omap4_2.6.35-903.22_armel.deb

This is a maverick/ti-omap4 kernel, with the two patches mentioned in this thread applied, that fixes this problem.

Changed in linux-ti-omap4 (Ubuntu):
status: New → In Progress
assignee: nobody → Paolo Pisati (p-pisati)
Bryan Wu (cooloney) wrote :

Taking the CPU offline and bringing it back online both work on my Panda board with the latest 1208.12 Natty kernel.

http://pastebin.ubuntu.com/599657/

Changed in linux-ti-omap4 (Ubuntu):
milestone: none → maverick-updates
importance: Undecided → Medium
Paolo Pisati (p-pisati)
Changed in linux-ti-omap4 (Ubuntu Maverick):
assignee: nobody → Paolo Pisati (p-pisati)
status: New → In Progress
Paolo Pisati (p-pisati)
Changed in linux-ti-omap4 (Ubuntu Natty):
status: In Progress → Fix Released
Tobin Davis (gruemaster) wrote :

Is this bug going to get fixed for Maverick? I tested the kernel at http://people.canonical.com/~ppisati/lp707003/linux-image-2.6.35-903-omap4_2.6.35-903.22_armel.deb and it works fine.

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "0001-oprofile-Fix-the-hang-while-taking-the-cpu-offline.patch" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch, you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of the ubuntu-reviewers team, please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Tobin Davis (gruemaster) wrote :

Since we only have one more SRU update for Maverick scheduled at this point, I am marking this as "won't fix".

Changed in linux-ti-omap4 (Ubuntu Maverick):
status: In Progress → Won't Fix
Changed in linux-ti-omap4 (Ubuntu):
status: In Progress → Fix Released
milestone: maverick-updates → natty-updates
Changed in linux-ti-omap4 (Ubuntu Natty):
milestone: maverick-updates → natty-updates
Changed in linux-ti-omap4 (Ubuntu Maverick):
milestone: none → maverick-updates