Kernel panic when trying to offline CPU1

Bug #707003 reported by Philipp Zabel on 2011-01-24
This bug affects 1 person
Affects                  Status        Importance  Assigned to   Milestone
linux-ti-omap4 (Ubuntu)  Fix Released  Medium      Paolo Pisati  natty-updates
Maverick                 Won't Fix     Undecided   Paolo Pisati  maverick-updates
Natty                    Fix Released  Medium      Paolo Pisati  natty-updates

Bug Description

Running Maverick with the kernel from the TI OMAP trunk PPA (linux-ti-omap4 2.6.35-980.1release9), there is a kernel panic shortly after trying to offline CPU1:

$ echo 0 | sudo tee /sys/devices/system/cpu/cpu1/online

-->

[ 109.583618] Unable to handle kernel NULL pointer dereference at virtual address 00000008
[ 109.592071] pgd = c0004000
[ 109.594879] [00000008] *pgd=00000000
[ 109.598632] Internal error: Oops: 17 [#1] PREEMPT SMP
[ 109.603881] last sysfs file: /sys/devices/system/cpu/cpu1/online
[ 109.610137] Modules linked in: fuse omap_gpu dm_crypt bt_drv(C) st_drv(C) rfcomm tiwlan_drv l2cap sco bluetooth rfkill twl4030_pwrbutton sdio dm_mirror dm_region_hash dm_log btrfs
[ 109.626892] CPU: 0 Tainted: G C (2.6.35-980-omap4 #1release9-Ubuntu)
[ 109.634735] PC is at snd_ctl_elem_list+0x180/0x24c
[ 109.639739] LR is at cpu_idle+0x4c/0xd0
[ 109.643737] pc : [<c0445288>] lr : [<c0045270>] psr: 60000113
[ 109.643737] sp : c079bfd0 ip : 00773612 fp : 00000000
[ 109.655700] r10: 00000000 r9 : 411fc092 r8 : 80034680
[ 109.661163] r7 : c07ab0cc r6 : c003702c r5 : 00000000 r4 : c079a000
[ 109.667938] r3 : 00000000 r2 : 00000000 r1 : 00000000 r0 : c0037030
[ 109.674743] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 109.682373] Control: 10c53c7d Table: a6fc804a DAC: 00000015
[ 109.688354] Process swapper (pid: 0, stack limit = 0xc079a2f8)
[ 109.694427] Stack: (0xc079bfd0 to 0xc079c000)
[ 109.698974] bfc0: c0853844 c0008c98 c0008f00 00000000
[ 109.707489] bfe0: 271beb3f c0037030 00000000 10c53c7d c07f5ae0 80008080 00000000 00000000
[ 109.716033] [<c0445288>] (snd_ctl_elem_list+0x180/0x24c) from [<c009b740>] (run_timer_softirq+0x0/0x224)
[ 109.725921] [<c009b740>] (run_timer_softirq+0x0/0x224) from [<00000000>] (0x0)
[ 109.733459] Code: e3c13d7f e3c3303f e59d0014 e1a02302 (e5933008)
[ 109.740478] ---[ end trace 3ab1d51f9e4c3baa ]---
[ 109.745300] Kernel panic - not syncing: Attempted to kill the idle task!
[ 109.752319] [<c004a7ac>] (unwind_backtrace+0x0/0xe4) from [<c05829e4>] (panic+0x58/0xe4)
[ 109.760803] [<c05829e4>] (panic+0x58/0xe4) from [<c0090e30>] (do_exit+0x70/0x344)
[ 109.768615] [<c0090e30>] (do_exit+0x70/0x344) from [<c0047b94>] (die+0xe8/0xfc)
[ 109.776245] [<c0047b94>] (die+0xe8/0xfc) from [<c004dd2c>] (__do_kernel_fault+0x64/0x84)
[ 109.784729] [<c004dd2c>] (__do_kernel_fault+0x64/0x84) from [<c0587d50>] (do_page_fault+0x27c/0x2a4)
[ 109.794250] [<c0587d50>] (do_page_fault+0x27c/0x2a4) from [<c00433e8>] (do_DataAbort+0x34/0x98)
[ 109.803344] [<c00433e8>] (do_DataAbort+0x34/0x98) from [<c0585aec>] (__dabt_svc+0x4c/0x60)
[ 109.811950] Exception stack(0xc079bf88 to 0xc079bfd0)
[ 109.817230] bf80: c0037030 00000000 00000000 00000000 c079a000 00000000
[ 109.825775] bfa0: c003702c c07ab0cc 80034680 411fc092 00000000 00000000 00773612 c079bfd0
[ 109.834655] bfc0: c0045270 c0445288 60000113 ffffffff
[ 109.839904] [<c0585aec>] (__dabt_svc+0x4c/0x60) from [<c0445288>] (snd_ctl_elem_list+0x180/0x24c)
[ 109.849182] [<c0445288>] (snd_ctl_elem_list+0x180/0x24c) from [<c009b740>] (run_timer_softirq+0x0/0x224)
[ 109.859069] [<c009b740>] (run_timer_softirq+0x0/0x224) from [<00000000>] (0x0)
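
For readers following the trace: the word in parentheses on the "Code:" line is the instruction at the faulting PC. The following is a minimal sketch, assuming the standard ARM (A32) encoding for LDR/STR with a 12-bit immediate offset, that decodes it with plain shell arithmetic. The function name decode_arm_ldr_str is made up for this illustration, and it deliberately rejects every other encoding.

```shell
#!/bin/sh
# Hypothetical helper: decode an A32 LDR/STR word that uses offset addressing
# with a 12-bit immediate ([Rn, #imm]). Anything else is reported as unsupported.
decode_arm_ldr_str() {
    word=$(( $1 ))
    cond=$(( (word >> 28) & 15 ))   # condition field, 14 = AL (always)
    kind=$(( (word >> 25) & 7 ))    # 2 = single data transfer, immediate offset
    p=$(( (word >> 24) & 1 ))       # P bit: 1 = pre-indexed / offset addressing
    w=$(( (word >> 21) & 1 ))       # W bit: 1 = write-back
    if [ "$cond" -ne 14 ] || [ "$kind" -ne 2 ] || [ "$p" -ne 1 ] || [ "$w" -ne 0 ]; then
        echo "unsupported encoding" >&2
        return 1
    fi
    u=$(( (word >> 23) & 1 ))       # U bit: 1 = add offset, 0 = subtract
    b=$(( (word >> 22) & 1 ))       # B bit: 1 = byte access
    l=$(( (word >> 20) & 1 ))       # L bit: 1 = load, 0 = store
    rn=$(( (word >> 16) & 15 ))     # base register
    rd=$(( (word >> 12) & 15 ))     # destination/source register
    imm=$(( word & 4095 ))          # 12-bit immediate offset

    if [ "$l" -eq 1 ]; then op=ldr; else op=str; fi
    if [ "$b" -eq 1 ]; then op="${op}b"; fi
    if [ "$u" -eq 1 ]; then sign=""; else sign="-"; fi
    printf '%s r%d, [r%d, #%s%d]\n' "$op" "$rd" "$rn" "$sign" "$imm"
}

# The faulting word from the "Code:" line of the oops above.
decode_arm_ldr_str 0xe5933008

# Base register value from the register dump (r3 : 00000000) plus the offset.
printf 'fault address: 0x%08x\n' $(( 0x00000000 + 8 ))
```

Under that reading, e5933008 is a load from r3 + 8, and the register dump shows r3 : 00000000, which is consistent with the reported fault at virtual address 00000008. It does not by itself say why the pointer was NULL.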

Ricardo Salveti (rsalveti) wrote :

Just tested, and this bug doesn't happen with the current Natty kernel.

linux-image-2.6.35-1101-omap4 2.6.35-1101.4

root@panda-natty:~# echo 0 | sudo tee /sys/devices/system/cpu/cpu1/online
0
root@panda-natty:~# dmesg
[ 1926.179595] CPU1: shutdown
root@panda-natty:~# echo 1 | sudo tee /sys/devices/system/cpu/cpu1/online
1
root@panda-natty:~# dmesg
[ 1926.179595] CPU1: shutdown
[ 1931.928710] CPU1: Booted secondary processor
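
The offline/online cycle shown above could be wrapped in a small helper, sketched here under some assumptions. The function names set_cpu_online and cycle_cpu and the SYSFS_CPU_ROOT override are hypothetical and exist only for this illustration; the sysfs path is the one used in the commands in this report. The override lets the helper be tried against a scratch directory before touching a real board.

```shell
#!/bin/sh
# Hypothetical override so the helper can be pointed at a scratch directory.
# Defaults to the sysfs location used in this report.
SYSFS_CPU_ROOT=${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}

# set_cpu_online CPU STATE
# Write 0 (offline) or 1 (online) to the CPU's "online" file and confirm
# that reading the file back returns the same value.
set_cpu_online() {
    cpu=$1
    state=$2
    file="$SYSFS_CPU_ROOT/cpu$cpu/online"
    case $state in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 2 ;;
    esac
    if [ ! -w "$file" ]; then
        echo "cannot write $file (missing, or not running as root?)" >&2
        return 1
    fi
    echo "$state" > "$file" || return 1
    [ "$(cat "$file")" = "$state" ]
}

# cycle_cpu CPU
# Take the CPU offline, then bring it back online; stops at the first failure.
cycle_cpu() {
    set_cpu_online "$1" 0 && set_cpu_online "$1" 1
}

# Example (run as root on the board):
#   cycle_cpu 1 && dmesg | tail -n 2
```

On an affected kernel the write itself may hang or the panic may follow later, so a clean return from this helper is not proof that the bug is absent.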

tags: added: armel
Bryan Wu (cooloney) wrote :

Oh, weird. I can reproduce it on the 2.6.38-rc2 kernel:

http://pastebin.ubuntu.com/558885/

Bryan Wu (cooloney) wrote :

Philipp:

Did you see this message in your dmesg:
[ 6634.133392] CPU1: Unknown IPI message 0x1

-Bryan

Philipp Zabel (philipp-zabel) wrote :

I do not see the "Unknown IPI message" line. The panic occurs a while after shutting down CPU1, not when trying to boot it again.
I had enough time to get the dmesg via ssh. The full log is attached.
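
Because the panic shows up some time after the CPU goes offline, it can help to filter a captured log for the markers discussed in this thread. A minimal sketch, assuming the log text arrives on standard input; the function name find_oops is hypothetical, and the patterns are taken only from the messages quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a kernel log (read from stdin) that
# match the oops, panic, or IPI messages quoted in this bug report.
# Exits non-zero when nothing matches, as grep does.
find_oops() {
    grep -E 'Unable to handle kernel|Internal error: Oops|Kernel panic - not syncing|Unknown IPI message'
}

# Examples:
#   dmesg | find_oops
#   find_oops < saved-dmesg.txt     # saved-dmesg.txt is a placeholder file name
```

An empty result only means none of these specific strings appeared; other failure messages would need their own patterns.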

Tobin Davis (gruemaster) wrote :

I cannot reproduce this on the current natty kernel (2.6.35-1101.4).

On maverick with the latest updated kernel (2.6.35-903.21), running this command shuts down the CPU with no dmesg error; however, the system doesn't return to a shell prompt. Opening a new terminal and running the command with 1 re-enables the CPU, but that also fails to return to the shell prompt.

Paolo Pisati (p-pisati) wrote :

@Tobin: the attached patch fixes the terminal hangs in maverick/ti-omap4, but unfortunately when i try to put the cpu back online i get:

1) a solid hang with no output on the console
2) garbled console output interleaved with a panic message

from what i could gather from 2), it seems to be the same issue Philipp is experiencing, but i can't get a "clean" panic/stack trace with this kernel.

@Philipp: i can't reproduce your panic:

git://dev.omapzoom.org/pub/scm/integration/kernel-ubuntu.git kernel-ubuntu

i tried different tags/sha:

Ubuntu-2.6.35-980.1release9
Ubuntu-2.6.35-980.1release1
Ubuntu-2.6.35-ti903.13+release3

is it correct?

Paolo Pisati (p-pisati) wrote :

reverting 587ba4e34d53149ca9bf4c53265ab6fe7e203b46 fixes the panic (see attached patch).

http://people.canonical.com/~ppisati/lp707003/linux-image-2.6.35-903-omap4_2.6.35-903.22_armel.deb

this is a maverick/ti-omap4 kernel (with the two patches mentioned in this thread applied) that fixes this problem.

Changed in linux-ti-omap4 (Ubuntu):
status: New → In Progress
assignee: nobody → Paolo Pisati (p-pisati)
Bryan Wu (cooloney) wrote :

Both taking the CPU offline and bringing it back online work on my Panda board with the latest 1208.12 natty kernel.

http://pastebin.ubuntu.com/599657/

Changed in linux-ti-omap4 (Ubuntu):
milestone: none → maverick-updates
importance: Undecided → Medium
Paolo Pisati (p-pisati) on 2011-04-29
Changed in linux-ti-omap4 (Ubuntu Maverick):
assignee: nobody → Paolo Pisati (p-pisati)
status: New → In Progress
Paolo Pisati (p-pisati) on 2011-05-03
Changed in linux-ti-omap4 (Ubuntu Natty):
status: In Progress → Fix Released
Tobin Davis (gruemaster) wrote :

Is this bug going to get fixed for Maverick? I tested the kernel at http://people.canonical.com/~ppisati/lp707003/linux-image-2.6.35-903-omap4_2.6.35-903.22_armel.deb and it works fine.

The attachment "0001-oprofile-Fix-the-hang-while-taking-the-cpu-offline.patch" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch, you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of the ubuntu-reviewers team, please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Tobin Davis (gruemaster) wrote :

Since we only have one more SRU update for Maverick scheduled at this point, I am marking this as "won't fix".

Changed in linux-ti-omap4 (Ubuntu Maverick):
status: In Progress → Won't Fix
Changed in linux-ti-omap4 (Ubuntu):
status: In Progress → Fix Released
milestone: maverick-updates → natty-updates
Changed in linux-ti-omap4 (Ubuntu Natty):
milestone: maverick-updates → natty-updates
Changed in linux-ti-omap4 (Ubuntu Maverick):
milestone: none → maverick-updates