Pandaboard fails to boot with LT 3.3 based kernel with SCHED_MC on

Bug #989349 reported by John Rigby
This bug affects 3 people
Affects                  Status         Importance  Assigned to    Milestone
Linaro Ubuntu            Fix Released   High        Fathi Boudra
linaro-landing-team-ti   Fix Released   Critical    Unassigned

Bug Description

The packaged LT kernel (Linux version 3.3.1-38-linaro-lt-omap) fails to boot on 4460-based Panda boards.

Revision history for this message
John Rigby (jcrigby) wrote :
visibility: private → public
Revision history for this message
Ricardo Salveti (rsalveti) wrote : Re: Pandaboard fails to boot with LT 3.3 based kernel

Actually this is happening with the 4430 as well, but not at every boot. With the 4460, I have never been able to boot it completely.

summary: - 4460 panda fails to boot
+ Pandaboard fails to boot with LT 3.3 based kernel
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Using 12.04 image.

Hwpack: http://snapshots.linaro.org/precise/hwpacks/lt-panda-x11-base/10/hwpack_linaro-lt-panda-x11-base_20120426-10_armhf_supported.tar.gz
Rootfs: http://snapshots.linaro.org/precise/images/ubuntu-desktop/119/linaro-precise-ubuntu-desktop-20120426-119.tar.gz

Kernel:

linux-linaro-lt-omap (3.3.1-38.38~lt~ci~00000000000001+1335324366~4f97fc42) oneiric; urgency=low

  [ John Rigby ]
  * Initial 3.3 Linux Linaro aka linux-linaro-q-3.3
  * Rebase on linux-linaro-3.3-rc3-2012.02.1
  * Workaround a config enforce problem
  * Hacked lt-omaponly tree as sauce template for tilt-3.3
  *

  [ Linaro CI ]
  * LINARO: LT: CI: Autogenerated Packaged Kernel
  * Add linaro and ubuntu packaging and sauce
  * linux-linaro-lt-omap 3.3.1-38.38~lt~ci~00000000000001+1335324366
        created from:
        LT omap tree:
        tree:git://git.linaro.org/landing-teams/working/ti/kernel.git
        branch:tilt-3.3
        commit:b59fd629de1fc019c7dc05a03716142ef002f7b4
        CI linux-linaro tree:
        tree:git://git.linaro.org/ubuntu/linux-linaro-q.git
        branch:linaro-shared-only-3.3-se
        commit:6daa5241957db041a62d251ca25813201f734ebe
        Flavour specific package files from:
        tree:git://git.linaro.org/people/jcrigby/linux-lt-ci-pack-info.git
        branch:lt-omap-3.3-package-info
        commit:1a20777e0b1889903778329d1918b816cdf21b7f

 -- Linaro CI <email address hidden> Wed, 25 Apr 2012 03:32:02 +0000

Changed in linaro-ubuntu:
status: New → Confirmed
importance: Undecided → Critical
Revision history for this message
warmcat (andy-warmcat) wrote :

Ugh.

I asked Jassi to check it out.

Revision history for this message
Abhishek Paliwal (abhishek-paliwal) wrote :

4460 - Never able to bootup
4430 - Booted up after 4 attempts.

Bootup Logs: http://paste.ubuntu.com/949110/

Revision history for this message
Jassi Brar (jassisinghbrar) wrote :

Ricardo,

# linux-linaro-lt-omap 3.3.1-38.38~lt~ci~00000000000001+1335324366
        created from:
        LT omap tree:
        tree:git://git.linaro.org/landing-teams/working/ti/kernel.git
        branch:tilt-3.3
        commit:b59fd629de1fc019c7dc05a03716142ef002f7b4
I just tested this with its omap4plus_defconfig, and am unable to reproduce the problem.

# CI linux-linaro tree:
        tree:git://git.linaro.org/ubuntu/linux-linaro-q.git
        branch:linaro-shared-only-3.3-se
        commit:6daa5241957db041a62d251ca25813201f734ebe
 This commit does lag behind the needed omap2plus_cpufreq.c rename patches
at the tip of tilt-3.3, but still the boot log doesn't look like it came from this either.

Surely there is a problem, but I am unable to reproduce it.
Could you please suggest how I can get the exact kernel to build locally?

Thanks.

Paul Larson (pwlars)
tags: added: boots linaro-ubuntu panda
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

The resulting tree (LT + Sauce + Configs) can be found at:
https://github.com/jcrigby/packaged-linux-linaro-3.3-ci/tree/lt-omap

The config we used on the packaging side is also tracked at https://github.com/jcrigby/packaged-linux-linaro-3.3-ci/blob/lt-omap/debian.linaro/config/config.common.ubuntu

It could be that the new config covers some other areas and drivers/features that are causing such issues. I will try a few local rebuilds with just omap4_defconfig to see if it works better.

Revision history for this message
Dmitry Dudkin (ddv) wrote :

Jassi,

I guess that the official lt-omap kernel was built with the wrong config file (config.common.ubuntu), and this config file is missing some critical omap4 drivers (like OMAP_SCM_DEV, OMAP_TEMP_SENSOR, etc.).
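
One way to check a suspicion like this is to compare the two config files and list the options that are set in the reference config but not in the packaged one. The following is a minimal sketch, not a tool used in this thread: the function names are hypothetical, and the config excerpts in the example are made up for illustration rather than copied from config.common.ubuntu or omap4plus_defconfig.

```python
import re

# Matches "CONFIG_FOO=value" lines in a kernel .config file.
_SET_RE = re.compile(r"^CONFIG_([A-Za-z0-9_]+)=(.*)$")


def enabled_symbols(config_text):
    """Return {symbol: value} for every option that is set in the config text."""
    symbols = {}
    for line in config_text.splitlines():
        match = _SET_RE.match(line.strip())
        if match:
            symbols[match.group(1)] = match.group(2)
    return symbols


def missing_from(reference_text, candidate_text):
    """List symbols set in the reference config but not set in the candidate."""
    reference = enabled_symbols(reference_text)
    candidate = enabled_symbols(candidate_text)
    return sorted(name for name in reference if name not in candidate)


if __name__ == "__main__":
    # Hypothetical excerpts, for illustration only; not the real config files.
    defconfig = "CONFIG_OMAP_SCM_DEV=y\nCONFIG_OMAP_TEMP_SENSOR=y\nCONFIG_SMP=y\n"
    packaged = "CONFIG_SMP=y\n# CONFIG_OMAP_SCM_DEV is not set\n"
    print(missing_from(defconfig, packaged))  # ['OMAP_SCM_DEV', 'OMAP_TEMP_SENSOR']
```

In practice the two strings would be read from the real files. Swapping the arguments lists options that only the packaged config enables.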

Revision history for this message
David Long (dave-long) wrote :

I built linux-linaro-lt-ti-3.3-2012.04 from the kernel source tarball using omap4plus_defconfig, and it built and booted fine. I tried building it with the config from rootfs/boot in the 12.04 rootfs tarball (which is very different) and it will not even build. It fails with an undeclared "omap_dvfs_lock" when compiling arch/arm/mach-omap2/smartreflex.c.

David Zinman (dzinman)
Changed in linaro-ubuntu:
milestone: none → 12.05
Revision history for this message
Anmar Oueja (anmar) wrote :

Andy: I know our 3.4 is broken, so as not to leave Ricardo hanging, can we please fix this in our 3.3 branch?

Changed in linaro-landing-team-ti:
importance: Undecided → Critical
milestone: none → 2012.05
Revision history for this message
Anmar Oueja (anmar) wrote :

Ricardo: Can you please add your steps for Andy to reproduce?

Revision history for this message
warmcat (andy-warmcat) wrote :

From what Dave and Jassi found, it seems this is not a problem with tilt-3.3 so much as a wrong config in the Ubuntu build. There isn't a problem with the omap4plus_defconfig the kernel comes with.

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Checking again with the config used by the kernel provided by TI (at the ti-omapdev PPA), the only extra config we enabled is SCHED_MC, which is probably causing the boot issue here.

After disabling SCHED_MC in a local test I was able to boot it again on my 4460.

I'm building a newer kernel with the extra configs from omap4plus_defconfig and with SCHED_MC disabled, and will post the results once the build is done.

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Confirming the issue: these are the only changes I needed to make it boot on my 4460 again:
-CONFIG_SCHED_MC=y
-CONFIG_SCHED_SMT=y
+# CONFIG_SCHED_MC is not set
+# CONFIG_SCHED_SMT is not set

So can you try to enable SCHED_MC to see if you're able to reproduce the issue locally?
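
To confirm which state these two options ended up in for a given build, the generated .config can be inspected directly. This is a minimal sketch under the assumption that the file uses the usual "CONFIG_X=y" and "# CONFIG_X is not set" line forms shown in the diff above; the helper name option_state is hypothetical.

```python
def option_state(config_text, symbol):
    """Return the option's value ('y', 'm', ...), 'not set', or 'absent'."""
    set_prefix = "CONFIG_%s=" % symbol
    unset_line = "# CONFIG_%s is not set" % symbol
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(set_prefix):
            # Option is set; return whatever follows the "=".
            return line[len(set_prefix):]
        if line == unset_line:
            return "not set"
    # The symbol is not mentioned at all, e.g. its dependencies were unmet.
    return "absent"


if __name__ == "__main__":
    # The two variants from the diff above, as in-memory examples.
    failing = "CONFIG_SCHED_MC=y\nCONFIG_SCHED_SMT=y\n"
    booting = "# CONFIG_SCHED_MC is not set\n# CONFIG_SCHED_SMT is not set\n"
    for name in ("SCHED_MC", "SCHED_SMT"):
        print(name, option_state(failing, name), option_state(booting, name))
```

Checking the final .config, rather than the config fragment fed into the build, also catches the case where an option was silently dropped during configuration.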

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Any update from the LT on this bug?

Changed in linaro-ubuntu:
importance: Critical → High
summary: - Pandaboard fails to boot with LT 3.3 based kernel
+ Pandaboard fails to boot with LT 3.3 based kernel with SCHED_MC on
Changed in linaro-ubuntu:
milestone: 12.05 → none
Fathi Boudra (fboudra)
Changed in linaro-ubuntu:
milestone: none → 12.06
Revision history for this message
Anmar Oueja (anmar) wrote :

We should test with 3.4 kernel and update the bug accordingly.

Revision history for this message
David Long (dave-long) wrote :

I reproduced this in Linaro 3.3 just by changing SCHED_MC. It hangs right after freeing the initrd memory, and eventually panics after detecting the hang.

I put in a few debug printk's where the problem is happening in init_post() and now it seems to boot successfully most (but not all) of the time. So it looks like we're dealing with a race. There is some synchronization code right before the problem area so I will start by looking there.

Revision history for this message
warmcat (andy-warmcat) wrote :

Dave thanks for looking at it. Does the problem exist on tilt-tracking with SCHED_MC? If the problem is also in tracking we should fix it there and either backport to 3.3 or leave it.

Revision history for this message
David Long (dave-long) wrote :

I've booted tilt-tracking with SCHED_MC a couple dozen times now with no evidence of the problem. So it seems likely it's fixed in this more recent kernel.

Revision history for this message
Dmitry Dudkin (ddv) wrote :

>So it seems likely it's fixed in this more recent kernel.
It is not fixed, because there have been no changes since 5/17. I guess that it works because they just disabled many other things. And are you sure that you actually compiled with SCHED_MC? It may be automatically disabled at compile time if some dependencies are not found.

Revision history for this message
David Long (dave-long) wrote :

By "later kernel" I meant the 3.4 tilt-tracking kernel. Do you require 3.3?

I am certain I have SCHED_MC in the kernel, as further verified by the existence of:

/sys/devices/system/cpu/sched_mc_power_savings

on my 4460 test board.
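
The same check can be scripted on the target board. This is a minimal sketch: the path is the one quoted above, the function name is hypothetical, and it assumes (as this comment does) that the file is only exposed by kernels of this era built with SCHED_MC.

```python
import os

# sysfs file quoted in the comment above.
SCHED_MC_KNOB = "/sys/devices/system/cpu/sched_mc_power_savings"


def sched_mc_present(path=SCHED_MC_KNOB):
    """Return True if the SCHED_MC power-savings file exists on this system."""
    return os.path.exists(path)


if __name__ == "__main__":
    print("SCHED_MC knob present:", sched_mc_present())
```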

Revision history for this message
Dmitry Dudkin (ddv) wrote :

A new kernel again??? It is crazy... I need a STABLE kernel, not a LATER kernel. Every new kernel brings new bugs. Last time I heard the same about the audio problems on 3.1. 3.1 was a good kernel, but the developers decided to drop it and move to 3.3... And now 3.3 does not work at all on Pandaboard. Do you think that is a good solution to fix audio??? No kernel = no audio = no problem :-). Is that your creed?

P.S. 12/05 still uses the 3.3 kernel, and I guess that they require 3.3.

Revision history for this message
warmcat (andy-warmcat) wrote :

Hm well, "does not work at all on PandaBoard" is over-egging the pudding a bit; there was some conflict in the config around SCHED_MC. Deconfiguring that or using a later kernel gets good results.

I am not sure why you are raging about audio but I hope the exercise cleared your arteries and improved your health.

Over time, more and more of the content we previously had to carry ourselves is appearing upstream for OMAP4, and what we are doing is continuously retargeting the rest of our patch load on that. This gives us the best chance to keep providing kernels that converge well with the best-quality upstream content.

What happened here was that a delta appeared between the Ubuntu packaged config and the config we provide and test with.

Revision history for this message
Scott Bambrough (scottb) wrote :

SCHED_MC is now dead AFAIK, is there any point in investigating this further?

Revision history for this message
warmcat (andy-warmcat) wrote :

Since it's workable in tilt-tracking, which will very shortly become tilt-3.4, I closed it as "fix committed".

Changed in linaro-landing-team-ti:
status: New → Fix Committed
Revision history for this message
Fathi Boudra (fboudra) wrote :

Assign to me for testing once the tilt-3.4 kernel package is available.

Changed in linaro-ubuntu:
assignee: nobody → Fathi Boudra (fboudra)
Revision history for this message
Fathi Boudra (fboudra) wrote :

Verified on package: linux-image-3.4.0-1-linaro-lt-omap 3.4.0-1.1~120625232503
From hwpack: http://snapshots.linaro.org/precise/hwpacks/lt-panda-x11-base/139/

Set to Fix Committed.

Changed in linaro-ubuntu:
status: Confirmed → Fix Committed
warmcat (andy-warmcat)
Changed in linaro-landing-team-ti:
status: Fix Committed → Fix Released
Changed in linaro-ubuntu:
status: Fix Committed → Fix Released