perf failure on panda (omap4)

Bug #843628 reported by Avik Sil
This bug affects 5 people
Affects                  Status         Importance  Assigned to  Milestone
Linaro Ubuntu            Fix Released   High        Avik Sil
linaro-landing-team-ti   Fix Released   Undecided   David Long

Bug Description

perf lists the PMU events but fails to gather performance counter statistics on panda:

# perf list

List of pre-defined events (to be used in -e):

  cpu-cycles OR cycles [Hardware event]
  instructions [Hardware event]
  cache-references [Hardware event]
  cache-misses [Hardware event]
  branch-instructions OR branches [Hardware event]
  branch-misses [Hardware event]
  bus-cycles [Hardware event]

  cpu-clock [Software event]
  task-clock [Software event]
  page-faults OR faults [Software event]
[...]

# perf stat sleep 1
  Error: cache-misses event is not supported.
  Fatal: Not all events could be opened.
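
To see at a glance which events were rejected, the error text above can be scanned for event names. This is a minimal sketch in Python, assuming the message format quoted in this report (`Error: <event> event is not supported.`). The function name `unsupported_events` is hypothetical, and other perf versions may word the error differently.

```python
import re
from typing import List

# Matches lines such as "Error: cache-misses event is not supported."
# The pattern is an assumption based on the output quoted in this report.
_UNSUPPORTED = re.compile(r"Error:\s+(\S+)\s+event is not supported\.")


def unsupported_events(perf_output: str) -> List[str]:
    """Return the event names that perf reported as unsupported, in order."""
    return _UNSUPPORTED.findall(perf_output)


if __name__ == "__main__":
    sample = (
        "  Error: cache-misses event is not supported.\n"
        "  Fatal: Not all events could be opened.\n"
    )
    print(unsupported_events(sample))
```

Feeding it the captured stderr of a failing `perf stat` run gives the list of events to retry one at a time.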

Ricardo Salveti (rsalveti) wrote :

Which kernel version are you using? Can you also reproduce this with the TI LT kernel (ones available with the lt-panda-hwpacks)?

Changed in linaro-ubuntu:
milestone: none → 11.09
Fathi Boudra (fboudra) wrote :
Changed in linaro-ubuntu:
assignee: nobody → Avik Sil (aviksil)
Changed in linaro-ubuntu:
status: New → Confirmed
importance: Undecided → High
Changed in linux-linaro-omap (Ubuntu):
status: New → Confirmed
importance: Undecided → High
Fathi Boudra (fboudra) wrote :

From Dave L. (http://lists.linaro.org/pipermail/linaro-dev/2011-September/007560.html) :

I have been unable to reproduce the boot hang problem after unreverting the interrupt patch. Can those experiencing it please verify they are using the latest released lmc tools to create their SD card? And can you please send me your config file and Pandaboard revision?

It has been suggested that the limited sample count in the given example is consistent with the fact that the ARM oprofile code uses HZ as the sampling frequency (even though it uses a separate timer). I'm not yet familiar with the x86 code for setting the profiling interval on the fly, but Frederic Turgis's suggestion of doing the same thing on ARM makes a whole lot of sense. We should make this a new requirement.

Once the interrupt issue is resolved I might suggest sampling cpu-cycles as a workaround to real-time sampling granularity, except that there apparently is an issue with reliably getting interrupts from the PMU. Does anyone know if this is still a problem in the A9 (I've only seen it discussed regarding the A8)? If it's still an issue I think it simply kills using PMU event counters with oprofile.

We need to do a little work to make configuring hardware event counters into the kernel easier. A recent change means that you need to set at least a couple of independent config options for this. This should be simple to fix.
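
As a rough illustration of the sampling-granularity point in the message above: if a profiler takes at most one sample per timer tick, the number of samples per CPU is bounded by HZ multiplied by the time the workload actually runs. The Python sketch below only does that arithmetic. The HZ values and the function name `max_timer_samples` are illustrative assumptions, not settings taken from the Panda kernel config.

```python
def max_timer_samples(hz: int, busy_seconds: float) -> int:
    """Upper bound on samples per CPU when sampling once per timer tick."""
    return int(hz * busy_seconds)


if __name__ == "__main__":
    # Hypothetical HZ values; check the kernel's CONFIG_HZ for the real one.
    for hz in (100, 250, 1000):
        print(hz, max_timer_samples(hz, 0.5))
```

A short-running workload therefore yields very few samples at a low HZ, which is why a profiling interval that can be set on the fly would help.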

Vinod Krishnamoni (vinod-krishnamoni) wrote :

Does the fact that this is not attached to any milestone for linux-linaro-omap (Ubuntu) mean the fix will not be delivered into a linux-linaro-omap (Ubuntu) release?

Changed in linaro-ubuntu:
milestone: 11.09 → 11.10
Fathi Boudra (fboudra)
Changed in linaro-ubuntu:
milestone: 11.10 → 11.11
hehemouse (hehemouse) wrote :

I can't get any PMU interrupt on the pandaboard, although I am not using perf or oprofile. I am just writing my own code to operate the PMU. The overflow register shows that an overflow happens, but the PMU interrupt is not triggered. Is this a hardware bug or a BSP bug?

I am using the source code at http://git.linaro.org/gitweb?p=kernel/linux-linaro-2.6.39.git;a=summary, but with the following patch:
http://lists.linaro.org/pipermail/linaro-dev/2011-April/004019.html

Fathi Boudra (fboudra)
Changed in linaro-ubuntu:
milestone: 11.11 → none
Al Grant (al-grant) wrote :

Why was the milestone removed?

Fathi Boudra (fboudra) wrote : Re: [Bug 843628] Re: perf failure on panda (omap4)

On 17 November 2011 19:25, Al Grant <email address hidden> wrote:
> Why was the milestone removed?

It has been removed for the Linaro Ubuntu project. Developer Platform
isn't working actively to resolve this bug. We rely on TI LT to get it
fixed. Dave Long was looking initially at the issue but we haven't
received any ETA for a long time.

David Long (dave-long) wrote :

The plan is to revert the revert that broke this very shortly. That change will require soak time in case the boot failure problem that caused the revert in the first place resurfaces. Therefore we are targeting December for release of this change.

Changed in linaro-landing-team-ti:
assignee: nobody → David Long (dave-long)
Anmar Oueja (anmar)
Changed in linaro-landing-team-ti:
status: New → In Progress
warmcat (andy-warmcat) wrote :

Guys, on tilt-tracking we have a patch from Dave Long

http://git.linaro.org/gitweb?p=landing-teams/working/ti/kernel.git;a=patch;h=32a92107a7137ea32a2209f83dd4bae42ad4ec7c

that re-enables these interrupts for 4430 Panda. We don't get any boot crashes so far.

It doesn't hurt 4460 Panda, but it doesn't succeed in getting any results on that yet.

tilt-tracking -->

http://git.linaro.org/gitweb?p=landing-teams/working/ti/kernel.git;a=shortlog;h=refs/heads/tilt-tracking

Changed in linaro-ubuntu:
milestone: none → 11.12
Ricardo Salveti (rsalveti) wrote :

Seems perf is completely broken with tilt-linux-linaro-3.1 this time.

This is what I'm getting with kernel 3.1.0-1402-linaro-lt-omap #5rsalveti1, based on today's tilt-linux-linaro-3.1 head: http://paste.ubuntu.com/770821/

Paweł Moll (pawel-moll) wrote :
Will Deacon (will-deacon) wrote :

Hi guys,

This looks like the same problem the imx guys were seeing in:

https://bugs.launchpad.net/linaro-landing-team-freescale/+bug/893653

I'll mail Nico on linaro-dev about including the fix from mainline. Note that OMAP isn't providing a PMU platform_device, so perf isn't going to work even with these fixes; you'll need to pick some more out of linux-next and off the list for that.

Will

warmcat (andy-warmcat) wrote :

About the platform_device: in tracking, for 4430 it's enabled and believed workable. We can't get the unit to start up on 4460 yet.

warmcat (andy-warmcat) wrote :

Will, I just updated tilt-linux-linaro-3.1 with the latest linux-linaro-3.1 from Nicolas tonight.

warmcat (andy-warmcat) wrote :

... and tilt-linux-linaro-3.1 now has the platform device patch in via Dave.

Ricardo Salveti (rsalveti) wrote :

Avik, can you verify this bug with the 11.12 RC image and post back the results? Thanks.

Avik Sil (aviksil) wrote :

With the 11.12 RC image, perf seems to work fine:

# perf stat sleep 1

 Performance counter stats for 'sleep 1':

         11.810303 task-clock # 0.011 CPUs utilized
                 1 context-switches # 0.000 M/sec
                 0 CPU-migrations # 0.000 M/sec
               148 page-faults # 0.013 M/sec
           3514014 cycles # 0.298 GHz
                 0 stalled-cycles-frontend # 0.00% frontend cycles idle
                 0 stalled-cycles-backend # 0.00% backend cycles idle
           1777780 instructions # 0.51 insns per cycle
            197443 branches # 16.718 M/sec
             71889 branch-misses # 36.41% of all branches

       1.037902836 seconds time elapsed
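
The annotations after each `#` in the output above are ratios of the raw counters. The Python sketch below recomputes them from the numbers in this run as a sanity check. The function name `derived_metrics` and the dictionary keys are hypothetical, and the formulas are inferred from the column labels rather than taken from the perf source.

```python
def derived_metrics(task_clock_ms, elapsed_s, cycles, instructions,
                    branches, branch_misses):
    """Recompute the ratios perf stat prints next to the raw counters."""
    busy_s = task_clock_ms / 1000.0  # time the task actually ran on a CPU
    return {
        "cpus_utilized": busy_s / elapsed_s,
        "ghz": cycles / busy_s / 1e9,
        "insns_per_cycle": instructions / cycles,
        "branches_m_per_sec": branches / busy_s / 1e6,
        "branch_miss_pct": 100.0 * branch_misses / branches,
    }


if __name__ == "__main__":
    # Raw values copied from the 'sleep 1' run above.
    metrics = derived_metrics(
        task_clock_ms=11.810303,
        elapsed_s=1.037902836,
        cycles=3514014,
        instructions=1777780,
        branches=197443,
        branch_misses=71889,
    )
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

The rates are relative to task-clock (about 11.8 ms of CPU time), not the 1.04 s of elapsed time, which is why a mostly sleeping process still reports about 0.3 GHz.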

Changed in linaro-ubuntu:
status: Confirmed → Fix Committed
Changed in linaro-ubuntu:
status: Fix Committed → Fix Released
warmcat (andy-warmcat)
Changed in linaro-landing-team-ti:
status: In Progress → Fix Released
warmcat (andy-warmcat) wrote :

Dave has also found the remaining 4460-specific PMU init issue, and his fix

http://git.linaro.org/gitweb?p=landing-teams/working/ti/kernel.git;a=patch;h=fab7a4a5a0cb4c8fc5692970be6e62aece99599c

is on tilt-3.1, tilt-linux-linaro-3.1 (linaro ubuntu basis) and tilt-tracking.

Fathi Boudra (fboudra)
no longer affects: linux-linaro-omap (Ubuntu)
no longer affects: linux-linaro