Perf-stat tool does not support ipc and ipc_rate monitoring on NVIDIA Grace system

Bug #2063461 reported by Brad Figg
This bug affects 1 person
Affects: linux-nvidia-6.5 (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

PROBLEM:

perf list reports the ipc and ipc_rate metric groups as available, but monitoring either of them with the perf-stat tool fails with the following errors:
$ sudo perf list | grep ipc
  ipc
  ipc_rate
  retired_ipc
  spec_ipc

$ sudo -S perf stat -a -M ipc -- sudo -S stress-ng --cpu 0 -t 10s
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (instructions).
/bin/dmesg | grep -i perf may provide additional information.
$ sudo -S perf stat -a -M ipc_rate -- sudo -S stress-ng --cpu 0 -t 10s
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (instructions).
/bin/dmesg | grep -i perf may provide additional information.
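
For triage, the error line above can be picked apart mechanically. The following is a minimal sketch, not part of perf: `parse_perf_open_error` and its regular expression are hypothetical names written for this report, and the pattern assumes the exact message wording shown above. It maps the numeric code to its symbolic errno name; on Linux, 22 is EINVAL, which for perf_event_open generally means the kernel rejected the requested event.

```python
import errno
import re

# Pattern for the error line that perf printed in the report above.
# Assumption: the wording of the message is exactly as quoted there.
PERF_OPEN_ERROR = re.compile(
    r"sys_perf_event_open\(\) syscall returned with (\d+) \((.*?)\) for event \((.*?)\)"
)


def parse_perf_open_error(text):
    """Return (errno_number, errno_name, event_name), or None if no error line is found."""
    match = PERF_OPEN_ERROR.search(text)
    if match is None:
        return None
    number = int(match.group(1))
    # errno.errorcode maps numeric codes to symbolic names for the host platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    return number, name, match.group(3)


sample = (
    "Error:\n"
    "The sys_perf_event_open() syscall returned with 22 (Invalid argument) "
    "for event (instructions).\n"
)
print(parse_perf_open_error(sample))
```

The failing event name (`instructions` here) is the useful part: it is the event that the ipc and ipc_rate metric definitions ask for, as opposed to the `INST_RETIRED` and `INST_SPEC` events used by the groups that work.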

However, I can get the groups retired_ipc and spec_ipc to work:
$ sudo -S perf stat -a -M retired_ipc -- sudo -S stress-ng --cpu 0 -t 10s
Value 0 contains non-numeric: ' '

 Performance counter stats for 'system wide':

        96,818,964 INST_RETIRED # 0.58 retired_ipc
       166,601,455 CPU_CYCLES

       0.013516186 seconds time elapsed

$ sudo -S perf stat -a -M spec_ipc -- sudo -S stress-ng --cpu 0 -t 10s
Value 0 contains non-numeric: ' '

 Performance counter stats for 'system wide':

        91,053,297 INST_SPEC # 0.58 spec_ipc
       156,558,810 CPU_CYCLES

       0.009877355 seconds time elapsed
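
As a sanity check on the two working groups, the reported metric values can be recomputed from the raw counters. This is a minimal sketch that assumes each metric is a plain ratio of the instruction counter to CPU_CYCLES, which the printed 0.58 is consistent with; the `ipc` helper is a hypothetical name, not something perf provides.

```python
def ipc(instructions, cycles):
    """Instructions per cycle; returns None when no cycles were counted."""
    if cycles == 0:
        return None
    return instructions / cycles


# Counter values copied from the two perf stat outputs above.
retired = ipc(96_818_964, 166_601_455)  # INST_RETIRED / CPU_CYCLES
spec = ipc(91_053_297, 156_558_810)     # INST_SPEC / CPU_CYCLES

print(f"retired_ipc = {retired:.2f}")
print(f"spec_ipc    = {spec:.2f}")
```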

SOLUTION:
Please accept the pull request which cherry-picks the following two upstream commits:

d43f5491210197196458c1454f2be0eb66d3e4d1 perf vendor events arm64: Update stall_slot workaround for N2 r0p3
4473949074c35072f598bd525ae51d5455f05745 perf vendor events arm64: Update N2 and V2 metrics and events using Arm telemetry repo

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-nvidia-6.5/6.5.0-1021.22 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-nvidia-6.5' to 'verification-done-jammy-linux-nvidia-6.5'. If the problem still exists, change the tag 'verification-needed-jammy-linux-nvidia-6.5' to 'verification-failed-jammy-linux-nvidia-6.5'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-nvidia-6.5-v2 verification-needed-jammy-linux-nvidia-6.5
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-nvidia-6.5 - 6.5.0-1021.22

---------------
linux-nvidia-6.5 (6.5.0-1021.22) jammy; urgency=medium

  * jammy/linux-nvidia-6.5: 6.5.0-1021.21 -proposed tracker (LP: #2065892)

  * Packaging resync (LP: #1786013)
    - [Packaging] debian.nvidia-6.5/dkms-versions -- update from kernel-versions
      (main/2024.04.29)

  * Address out-of-bounds issue when using TPM SPI interface (LP: #2067429)
    - tpm_tis_spi: Account for SPI header when allocating TPM SPI xfer buffer

  * Perf-stat tool does not support ipc and ipc_rate monitoring on NVIDIA Grace
    system (LP: #2063461)
    - perf vendor events arm64: Update stall_slot workaround for N2 r0p3
    - perf vendor events arm64: Update N2 and V2 metrics and events using Arm
      telemetry repo
    - perf jevents: Add a new expression builtin strcmp_cpuid_str()
    - perf jevents metric: Fix type of strcmp_cpuid_str

  * PCI/MSI: Prevent MSI hardware interrupt number truncation (LP: #2065721)
    - PCI/MSI: Prevent MSI hardware interrupt number truncation

  * Apply patch to set CONFIG_EFI_CAPSULE_LOADER=y for arm64 (LP: #2067111)
    - NVIDIA: [Config] EFI: set CAPSULE_LOADER=y for arm64

  * pull-request: Fixes: b2b56a163230 ("gpio: tegra186: Check GPIO pin
    permission before access.") (LP: #2064549)
    - Revert "NVIDIA: SAUCE: Revert "gpio: tegra186: Check GPIO pin permission
      before access.""
    - gpio: tegra186: Fix tegra186_gpio_is_accessible() check

  [ Ubuntu: 6.5.0-41.41 ]

  * mantic/linux: 6.5.0-41.41 -proposed tracker (LP: #2065893)
  * CVE-2024-21823
    - VFIO: Add the SPR_DSA and SPR_IAX devices to the denylist
    - dmaengine: idxd: add a new security check to deal with a hardware erratum
    - dmaengine: idxd: add a write() method for applications to submit work

  [ Ubuntu: 6.5.0-40.40 ]

  * mantic/linux: 6.5.0-40.40 -proposed tracker (LP: #2063709)
  * [Mantic] Compile broken on armhf (cc1 out of memory) (LP: #2060446)
    - Revert "minmax: relax check to allow comparison between unsigned arguments
      and signed constants"
    - Revert "minmax: allow comparisons of 'int' against 'unsigned char/short'"
    - Revert "minmax: allow min()/max()/clamp() if the arguments have the same
      signedness."
    - Revert "minmax: add umin(a, b) and umax(a, b)"
  * Drop fips-checks script from trees (LP: #2055083)
    - [Packaging] Remove fips-checks script
  * alsa/realtek: adjust max output volume for headphone on 2 LG machines
    (LP: #2058573)
    - ALSA: hda/realtek: fix the hp playback volume issue for LG machines
  * Mantic update: upstream stable patchset 2024-03-27 (LP: #2059284)
    - asm-generic: make sparse happy with odd-sized put_unaligned_*()
    - powerpc/mm: Fix null-pointer dereference in pgtable_cache_add
    - arm64: irq: set the correct node for VMAP stack
    - drivers/perf: pmuv3: don't expose SW_INCR event in sysfs
    - powerpc: Fix build error due to is_valid_bugaddr()
    - powerpc/mm: Fix build failures due to arch_reserved_kernel_pages()
    - powerpc/64s: Fix CONFIG_NUMA=n build due to create_section_mapping()
    - x86/boot: Ignore NMIs during very early boot
    - powerpc: pmd_move_must_withdraw() is on...

Changed in linux-nvidia-6.5 (Ubuntu):
status: New → Fix Released