Perf test fails on Pandaboard (3.4 TILT)

Bug #1018092 reported by Ricardo Salveti
This bug affects 3 people
Affects                  Status     Importance  Assigned to  Milestone
Linaro Ubuntu            Won't Fix  Medium      Unassigned
linaro-landing-team-ti   Won't Fix  Undecided   David Long
linux-ti-omap4 (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

The test case is available in lava-tests and can also be used from https://code.launchpad.net/~linaro-maintainers/lava-test/lava-test-perf.

./run-perf-test.sh
+ whoami
+ [ root != root ]
+ uname -r
+ cut -f 1 -d-
+ KERNELVER=3.4.0
+ apt-cache search linux-linaro-tools-3.4.0
+ head -1
+ cut -f 1 -d
+ PKGNAME=linux-linaro-tools-3.4.0-1-linaro-llt-origen
+ PERFBIN_PREFIX=/usr/bin/perf_
+ uname -r
+ awk -F - {print $1"-"$2}
+ PERFBIN_VER=3.4.0-1
+ [ ! -e /usr/bin/perf_3.4.0-1 ]
+ echo Performing perf record test...
Performing perf record test...
+ TCID=perf record test
+ perf record -e cycles -o perf-lava-test.data stress -c 4 -t 10
+ tee perf-record.log
stress: info: [5325] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd
stress: info: [5325] successful run completed in 10s
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.766 MB perf-lava-test.data (~33480 samples) ]
+ grep -ao [0-9]\+[ ]\+samples perf-record.log
+ cut -f 1 -d
+ samples=33480
+ [ 33480 -gt 1 ]
+ echo perf record test : PASS
perf record test : PASS
+ rm perf-record.log
+ echo Performing perf report test...
Performing perf report test...
+ TCID=perf report test
+ perf report -i perf-lava-test.data
+ tee perf-report.log
# ========
# captured on: Tue Jun 26 20:03:22 2012
# hostname : linaro-ubuntu-desktop
# os release : 3.4.0-1-linaro-lt-omap
# perf version : 3.4.0
# arch : armv7l
# nrcpus online : 2
# nrcpus avail : 2
# cpudesc : ARMv7 Processor rev 10 (v7l)
# total memory : 974156 kB
# cmdline : /usr/bin/perf_3.4.0-1 record -e cycles -o perf-lava-test.data stress -c 4 -t 10
# event : name = cycles, type = 1, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 15, 16 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# ========
#
# Events: 19K cpu-clock
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ...............................
#
    38.10% stress libc-2.15.so [.] random_r
    36.33% stress libc-2.15.so [.] random
    20.79% stress stress [.] atoll_b
     3.69% stress libc-2.15.so [.] rand
     0.85% stress stress [.] main
     0.16% stress [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
     0.04% stress [kernel.kallsyms] [k] _raw_spin_unlock_irq
     0.03% stress [kernel.kallsyms] [k] __do_softirq
     0.01% stress [kernel.kallsyms] [k] vfp_support_entry
     0.01% stress [kernel.kallsyms] [k] vfp_save_state
     0.01% stress [kernel.kallsyms] [k] lock_acquire
     0.01% stress [kernel.kallsyms] [k] rcu_process_gp_end.isra.23
     0.01% stress [kernel.kallsyms] [k] kfree_skbmem
     0.00% stress [kernel.kallsyms] [k] debug_check_no_locks_freed
     0.00% stress [kernel.kallsyms] [k] filemap_fault
     0.00% stress [kernel.kallsyms] [k] proc_flush_task_mnt

#
# (For a higher level overview, try: perf report --sort comm,dso)
#
+ grep -c -e ^[ ]\+[0-9]\+.[0-9]\+% perf-report.log
+ pcnt_samples=16
+ [ 16 -gt 1 ]
+ echo perf report test : PASS
perf report test : PASS
+ rm perf-report.log perf-lava-test.data
+ echo Performing perf stat test...
Performing perf stat test...
+ TCID=perf stat test
+ perf stat -e cycles stress -c 4 -t 10
+ tee perf-stat.log
  Error: open_counter returned with 19 (No such device). /bin/dmesg may provide additional information.

  Fatal: Not all events could be opened.

+ grep -o [0-9,]\+[ ]\+cycles perf-stat.log
+ sed s/,//g
+ cut -f 1 -d
+ cycles=
+ [ -gt 1 ]
./run-perf-test.sh: 71: [: -gt: unexpected operator
+ echo perf stat test : FAIL
perf stat test : FAIL
+ rm perf-stat.log
+ echo Performing 'perf test'...
Performing 'perf test'...
+ TCID=perf test
+ perf test
+ sed -e s/FAILED!/FAIL/g -e s/Ok/PASS/g -e s/ [0-9]\+:/perf test -/g -e s/:/ :/g
perf test - vmlinux symtab matches kallsyms : FAIL

perf test - detect open syscall event : FAIL

perf test - detect open syscall event on all cpus : FAIL

perf test - read samples using the mmap interface : FAIL

perf test - parse events tests :invalid or unsupported event : 'syscalls :sys_enter_open'
Run 'perf list' for a list of valid events
 FAIL

perf test - Validate PERF_RECORD_* events & perf_sample fields : FAIL

perf test - Test perf pmu format parsing : PASS

Using:
Hwpack: hwpack_linaro-lt-panda-x11-base_20120626-139_armhf_supported.tar.gz
Rootfs: linaro-precise-ubuntu-desktop-20120626-247.tar.gz
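
Note on the `[: -gt: unexpected operator` message in the perf stat step: the trace shows `cycles=` followed by `[ -gt 1 ]`, so the variable was empty and expanded unquoted, leaving the test with no left operand. The following is a minimal sketch of a guarded check, not the actual lava-test script. The function names `extract_cycles` and `check_cycles` are hypothetical; the grep/sed/cut pipeline mirrors the one in the trace.

```shell
#!/bin/sh
# Hypothetical helper: pull the cycle count out of saved `perf stat` output.
# Prints nothing when no "<number> cycles" line is present.
extract_cycles() {
    grep -o '[0-9,]\+[ ]\+cycles' "$1" | sed 's/,//g' | cut -f 1 -d ' ' | head -n 1
}

# Hypothetical helper: report PASS/FAIL without tripping over an empty value.
check_cycles() {
    cycles=$(extract_cycles "$1") || cycles=""
    # Quoting plus the ${var:-0} default keeps the test well-formed
    # even when perf failed and no count was found.
    if [ "${cycles:-0}" -gt 1 ]; then
        echo "perf stat test : PASS"
    else
        echo "perf stat test : FAIL"
    fi
}
```

With this guard the step still reports FAIL when perf cannot open the counter, but the shell error itself no longer appears in the log.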

Ricardo Salveti (rsalveti) wrote :

Kernel:
linux-linaro-lt-omap-3.4 (3.4.0-1.1~120625232503) precise; urgency=low

  * Packaged version of lt-omap kernel created from:
    Kernel:
    Repo: git://git.linaro.org/landing-teams/working/ti/kernel.git
    Branch: tilt-3.4
    Head:
        commit 61abde21bf6272908466e2d13acd479743b4f0b3
        Author: Jaswinder Singh <jaswinder.singh.org>
        Date: Fri Jun 22 11:26:22 2012 +0800

            omapdss temp hack ignore resume runtime_pm fail

            Signed-off-by: Jaswinder Singh <jaswinder.singh.org>

    Board config fragment:
    Repo: git://git.linaro.org/kernel/configs.git
    Branch: config-boards-3.4
    Head:
        commit b7274f7e11c782039d94a6662c823e339c21dd50
        Author: Ricardo Salveti de Araujo <ricardo.salveti.org>
        Date: Mon Jun 25 15:30:04 2012 -0300

            configs: omap4: disabling CPU_IDLE due bug 971091

            Signed-off-by: Ricardo Salveti de Araujo <ricardo.salveti.org>

    Ubuntu and Linaro Base config fragments:
    Repo: git://git.linaro.org/kernel/configs.git
    Branch: config-core-3.4
    Head:
        commit 072bd6cb29aa25a3fe0f6be08af48f5a58eae849
        Author: Ricardo Salveti de Araujo <ricardo.salveti.org>
        Date: Mon Jun 25 18:06:06 2012 -0300

            configs: ubuntu: disabling CGROUPS as default

            Not yet stable enough to be used as default for all boards we currently
            support at Linaro.

            Signed-off-by: Ricardo Salveti de Araujo <ricardo.salveti.org>

    Packaging template:
    Repo: git://git.linaro.org/ubuntu/linux-linaro-quantal.git
    Branch: linaro-ubuntu-packaging-3.4
    Head:
        commit 3da2b9971c9fbd9d039bf902ee8d31fdc8cc16f2
        Author: John Rigby <john.rigby.org>
        Date: Mon Jun 25 15:04:23 2012 -0600

            LINARO: template debian.linaro based on omap only version

            Signed-off-by: John Rigby <john.rigby.org>

 -- John Rigby <email address hidden> Mon, 25 Jun 2012 15:04:22 -0600

Changed in linaro-ubuntu:
status: New → Confirmed
importance: Undecided → Medium
warmcat (andy-warmcat) wrote :

Dave, please see if you can reproduce this on tilt-3.4 / omap4plus_defconfig and/or figure out what, if anything, we're missing.

Changed in linaro-landing-team-ti:
assignee: nobody → David Long (dave-long)
Paolo Pisati (p-pisati) wrote :

This affects Ubuntu ti-omap4 too.

Changed in linux-ti-omap4 (Ubuntu):
status: New → Confirmed
warmcat (andy-warmcat) wrote :

Dave said that because we disable hw counters due to errata in the hw, perf -e will fail.

Paolo Pisati (p-pisati) wrote :

Is there any way to work around it?

Or shall we just say "it's broken" and mark it as "won't fix"?

warmcat (andy-warmcat) wrote :

Well, perf itself should work OK using the fallback timer-based method; it's specifically this -e mode that fails.

The hardware counters are in there and do stuff. The problem is that interrupts (overflow interrupts, IIRC) are prone to getting lost, which distorts the numbers to the point of uselessness. That problem is in the ARM core and lay undiscovered for a long while, so it is in all shipping chips (AFAIK). So we disabled the counters; as I say, normal perf should be workable.

I think we have to, as you suggest, do a WONTFIX on it.

Changed in linaro-landing-team-ti:
status: New → Incomplete
status: Incomplete → Won't Fix
David Long (dave-long) wrote :

What Andy said. We chose timer mode in lieu of enabling hardware counters, since the hardware is broken.
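
As an illustration of the timer mode described above (the perf report header in the bug description shows `Events: 19K cpu-clock` even though `-e cycles` was requested), here is a rough sketch of how a test script might choose its event explicitly. The function name `pick_perf_event` is hypothetical, and the sketch assumes that `perf stat` exits non-zero when it cannot open the requested event, as it does in the failing run in this report; other perf versions may behave differently.

```shell
#!/bin/sh
# Hypothetical helper: print the event name a test should request.
# Probes the hardware cycle counter with a trivial command and falls
# back to the timer-based cpu-clock software event if the probe fails.
pick_perf_event() {
    if perf stat -e cycles true >/dev/null 2>&1; then
        echo cycles
    else
        echo cpu-clock
    fi
}
```

A script could then run something like `perf stat -e "$(pick_perf_event)" stress -c 4 -t 10`, keeping in mind that cpu-clock reports elapsed CPU time rather than cycle counts, so any parsing of the output would need to match the chosen event.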

Joseph Pusdesris (joemp) wrote :

I understand why the performance counters are disabled, but I would like to re-enable them for use with DS-5 Streamline. How can I enable them?

warmcat (andy-warmcat) wrote :

I'm afraid we don't use DS-5 on our side... if you know a Lukas Snetler in ARM it might be worth having a chat though.

Joseph Pusdesris (joemp) wrote :

I mean, I can't find how they were disabled. Looking at the kernel config, profiling and hw perf monitors seem to be enabled. Was some bit of code commented out somewhere in the kernel/drivers?

Chris Kenna (cjkenna) wrote :

I am wondering the same thing as Joseph (joemp) in comment #10. On my Pandaboard, I see that my kernel is configured correctly, and hw perf events come up in dmesg, like so...

$ uname -a
Linux pandaboard01 3.2.0-1419-omap4 #26-Ubuntu SMP PREEMPT Wed Sep 12 14:32:40 UTC 2012 armv7l armv7l armv7l GNU/Linux

$ grep PERF_EVENTS /boot/config-3.2.0-1419-omap4
CONFIG_HAVE_PERF_EVENTS=y
CONFIG_HW_PERF_EVENTS=y
CONFIG_PERF_EVENTS=y

$ dmesg|grep -i perf
[ 0.072387] Initializing cgroup subsys perf_event
[ 0.164306] hw perfevents: enabled with ARMv7 Cortex-A9 PMU driver, 7 counters available

... and yet when I run a `perf stat` command I receive an error:

$ sudo perf stat -e cycles ls /
  Error: open_counter returned with 19 (No such device). /bin/dmesg may provide additional information.

  Fatal: Not all events could be opened.

There is nothing in dmesg.

David Long (dave-long) wrote :

I found the magic sauce for getting the PMU event counters working in recent upstream code. On the 4460 the EMU clock has to be forced on. A patch for this, and for the missing PMU/CTI interrupt mods, is being applied to our 3.4 branch now. With this it will be possible to build a kernel with hardware event counters enabled and use perf to collect counter data. Bear in mind there is still nothing we can do to really work around the problem of occasional lost PMU interrupts, as that is a hardware issue.

Fathi Boudra (fboudra)
Changed in linaro-ubuntu:
status: Confirmed → Won't Fix