[LTCTest][OPAL][OP920] cpupower idle-info is not listing stop4 and stop5 idle states when all CORES are guarded

Bug #1771780 reported by bugproxy on 2018-05-17
Affects | Importance | Assigned to
The Ubuntu-power-systems project | Critical | Canonical Kernel Team
linux (Ubuntu) | Critical | Joseph Salisbury
Bionic | Critical | Joseph Salisbury
Cosmic | Critical | Joseph Salisbury

Bug Description

== SRU Justification ==
During testing, IBM found that cpupower idle-info does not list the stop4
and stop5 idle states when all CORES are guarded. IBM has submitted a patch
upstream, but it has not yet landed in linux-next or mainline. For that
reason, this SRU request is being sent as a SAUCE patch request.

== Fix ==
UBUNTU: SAUCE: cpuidle/powernv : init all present cpus for deep states

== Regression Potential ==
Low. Limited to powerpc.

== Test Case ==
A test kernel was built with this patch and tested by the original bug reporter.
The bug reporter states the test kernel resolved the bug.

== Comment: #0 - PAVAMAN SUBRAMANIYAM - 2018-05-16 04:07:59 ==
---Problem Description---
cpupower idle-info is not listing stop4 and stop5 idle states when all CORES are guarded

---uname output---
Linux ltc-wspoon11 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = P9

---Debugger---
A debugger is not configured

---Steps to Reproduce---
Install Ubuntu 18.04 on a P9 OpenPOWER system.

root@ltc-wspoon11:~# uname -a
Linux ltc-wspoon11 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

root@ltc-wspoon11:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Then guard one entire processor, and guard all the CORES in processor 0 except for a single core.

root@ltc-wspoon11:~# opal-gard list
 ID | Error | Type | Path
-----------------------------------------------------------------------
 00000001 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX0/Core0
 00000002 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX0/Core1
 00000003 | 00000000 | Manual | /Sys0/Node0/Proc1
 00000004 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX0/Core1
 00000005 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX0/Core0
 00000006 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX1/Core0
 00000007 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX1/Core1
 00000008 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX0/Core0
 00000009 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX0/Core1
 0000000a | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX1/Core0
 0000000b | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX1/Core1
 0000000c | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX0/Core0
 0000000d | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX0/Core1
 0000000e | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX1/Core0
 0000000f | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX1/Core1
 00000010 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX0/Core0
 00000011 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX0/Core1
 00000012 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX1/Core0
 00000013 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX1/Core1
 00000014 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX1/Core0
 00000015 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX1/Core1
 00000016 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ0/EX1/Core1
 00000017 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ0/EX1/Core0
=======================================================================
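
To confirm that the guard configuration matches what the steps describe, the opal-gard list output can be summarized per processor. The following is a minimal sketch, assuming the four-column `ID | Error | Type | Path` layout shown above; the function names `guarded_paths` and `summarize` are hypothetical helpers for illustration and are not part of opal-gard.

```python
from collections import Counter


def guarded_paths(gard_list_text):
    """Extract the Path column from text in the `opal-gard list` layout."""
    paths = []
    for line in gard_list_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Data rows have four fields and a path that starts with "/".
        # The header row and the separator rows are skipped.
        if len(fields) == 4 and fields[3].startswith("/"):
            paths.append(fields[3])
    return paths


def summarize(paths):
    """Return (set of fully guarded processors, guarded-core count per processor)."""
    whole_procs = set()
    cores = Counter()
    for path in paths:
        parts = path.strip("/").split("/")
        proc = next((p for p in parts if p.startswith("Proc")), None)
        if proc is None:
            continue
        if parts[-1] == proc:
            whole_procs.add(proc)   # e.g. /Sys0/Node0/Proc1
        elif parts[-1].startswith("Core"):
            cores[proc] += 1        # e.g. /Sys0/Node0/Proc0/EQ1/EX0/Core0
    return whole_procs, cores


# Three rows copied from the listing above.
sample = """\
 ID | Error | Type | Path
-----------------------------------------------------------------------
 00000001 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX0/Core0
 00000002 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX0/Core1
 00000003 | 00000000 | Manual | /Sys0/Node0/Proc1
=======================================================================
"""

whole, per_proc = summarize(guarded_paths(sample))
print(sorted(whole), dict(per_proc))
```

Feeding the full listing instead of the three-row sample gives the number of guarded cores on each processor, which can then be compared against the intended configuration.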

Then execute the cpupower idle-info command to check the idle states being shown in the OS.

root@ltc-wspoon11:~# cpupower idle-info
CPUidle driver: powernv_idle
CPUidle governor: menu
analyzing CPU 0:

Number of idle states: 7
Available idle states: snooze stop0_lite stop0 stop1_lite stop1 stop2_lite stop2
snooze:
Flags/Description: snooze
Latency: 0
Usage: 774653
Duration: 7698954
stop0_lite:
Flags/Description: stop0_lite
Latency: 1
Usage: 2751
Duration: 11363825
stop0:
Flags/Description: stop0
Latency: 2
Usage: 2343
Duration: 915084
stop1_lite:
Flags/Description: stop1_lite
Latency: 5
Usage: 20
Duration: 1533
stop1:
Flags/Description: stop1
Latency: 5
Usage: 1103
Duration: 1016794
stop2_lite:
Flags/Description: stop2_lite
Latency: 10
Usage: 5
Duration: 765
stop2:
Flags/Description: stop2
Latency: 10
Usage: 113729
Duration: 2850877810

Userspace tool common name: /usr/bin/cpupower

The userspace tool has the following bit modes: 64-bit

Userspace rpm: linux-tools-common

Userspace tool obtained from project website: na
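
The check described in the steps above (whether stop4 and stop5 appear in the cpupower idle-info output) can be automated. This is a rough sketch, assuming the output contains a single "Available idle states:" line as in the transcript above; the helper names and the `EXPECTED_DEEP_STATES` set are hypothetical and chosen only for this illustration.

```python
# Idle states that this report says are missing from the listing.
EXPECTED_DEEP_STATES = {"stop4", "stop5"}


def available_idle_states(idle_info_text):
    """Return the state names from the 'Available idle states:' line."""
    for line in idle_info_text.splitlines():
        if line.startswith("Available idle states:"):
            return line.split(":", 1)[1].split()
    return []


def missing_deep_states(idle_info_text, expected=EXPECTED_DEEP_STATES):
    """Return the expected states that are absent, in sorted order."""
    return sorted(set(expected) - set(available_idle_states(idle_info_text)))


# Header lines copied from the transcript above.
sample = """\
CPUidle driver: powernv_idle
CPUidle governor: menu
analyzing CPU 0:

Number of idle states: 7
Available idle states: snooze stop0_lite stop0 stop1_lite stop1 stop2_lite stop2
"""

print(missing_deep_states(sample))  # ['stop4', 'stop5']
```

On a live system the text could instead be captured with `subprocess.run(["cpupower", "idle-info"], capture_output=True, text=True).stdout` and passed to `missing_deep_states`.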

== Comment: #8 - Akshay Adiga <email address hidden> - 2018-05-16 13:29:55 ==
The patch has been posted to the Linux mailing list:
https://patchwork.ozlabs.org/patch/914575/

Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-167913 severity-critical targetmilestone-inin1804
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → Critical
tags: added: triage-g
Manoj Iyer (manjo) on 2018-05-21
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → Critical
tags: added: p9
Changed in ubuntu-power-systems:
status: New → Triaged
Changed in linux (Ubuntu):
status: New → Triaged
Changed in linux (Ubuntu Bionic):
status: New → Triaged
importance: Undecided → Critical
Changed in linux (Ubuntu Bionic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu):
assignee: Canonical Kernel Team (canonical-kernel-team) → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Bionic):
status: Triaged → In Progress
Changed in linux (Ubuntu):
status: Triaged → In Progress
Joseph Salisbury (jsalisbury) wrote :

I built a Bionic test kernel with the patch posted in the description. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1771780

Can you test this kernel and see if it resolves this bug?

Note about installing test kernels:
• If the test kernel is prior to 4.15 (Bionic), you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15 (Bionic) or newer, you need to install the linux-image-unsigned, linux-modules and linux-modules-extra .deb packages.

Thanks in advance!

Changed in linux (Ubuntu Artful):
importance: Undecided → Critical
Changed in linux (Ubuntu Xenial):
importance: Undecided → Critical
no longer affects: linux (Ubuntu Artful)
no longer affects: linux (Ubuntu Xenial)

------- Comment From <email address hidden> 2018-05-22 01:57 EDT-------
I have installed the Bionic test kernel with the patches provided on the machine.

root@ltc-wspoon11:~# uname -a
Linux ltc-wspoon11 4.15.0-20-generic #22~lp1771780 SMP Mon May 21 17:43:29 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Then guard one entire processor, and guard all the CORES in processor 0 except for a single core.

root@ltc-wspoon11:~# ./probe_cpus.sh -L
CHIP ID: 0 CORE ID: 1 THREADS: 4 CPUs: 0 1 2 3
CHIP ID: 0 CORE ID: 2 THREADS: 4 CPUs: 4 5 6 7
CHIP ID: 0 CORE ID: 3 THREADS: 4 CPUs: 8 9 10 11

-----------------------------
p[0]
eq[0]
ex[0,1]
c[1,2,3]
-----------------------------

----------Processor Layout-------------------
p[0]
+---EQ00----+ +---EQ02----+ +---EQ04----+
| | | | | |
+ - - - - - + + - - - - - + + - - - - - +
|EX-0 C1 | | | | |
+ - - - - - + + - - - - - + + - - - - - +
|EX-1 C2 | | | | |
+ - - - - - + + - - - - - + + - - - - - +
|EX-1 C3 | | | | |
+-----------+ +-----------+ +-----------+

+---EQ01----+ +---EQ03----+ +---EQ05----+
| | | | | |
+ - - - - - + + - - - - - + + - - - - - +
| | | | | |
+ - - - - - + + - - - - - + + - - - - - +
| | | | | |
+ - - - - - + + - - - - - + + - - - - - +
| | | | | |
+-----------+ +-----------+ +-----------+

root@ltc-wspoon11:~# opal-gard list
ID | Error | Type | Path
-----------------------------------------------------------------------
00000001 | 00000000 | Manual | /Sys0/Node0/Proc1
00000002 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX0/Core0
00000003 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX0/Core1
00000004 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX0/Core0
00000005 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX0/Core1
00000006 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX0/Core1
00000007 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX0/Core0
00000008 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX1/Core0
00000009 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX1/Core1
0000000a | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX1/Core0
0000000b | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX1/Core1
0000000c | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX1/Core0
0000000d | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX1/Core1
0000000e | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX0/Core0
0000000f | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX0/Core1
00000010 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX1/Core0
00000011 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX1/Core1
00000012 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX0/Core0
00000013 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX0/Core1
00000014 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX1/Core0
00000015 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX1/Core1
=======================================================================

Then verified if all t...

Joseph Salisbury (jsalisbury) wrote :
description: updated
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-06-06 03:08 EDT-------
The patch has been accepted upstream and is available at:

https://git.kernel.org/powerpc/c/ac9816dcbab53c57bcf1d7b15370b0

Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Committed
Changed in ubuntu-power-systems:
status: Triaged → In Progress
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
bugproxy (bugproxy) on 2018-06-15
tags: added: verification-done-bionic
removed: verification-needed-bionic
Manoj Iyer (manjo) on 2018-06-18
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-24.26

---------------
linux (4.15.0-24.26) bionic; urgency=medium

  * linux: 4.15.0-24.26 -proposed tracker (LP: #1776338)

  * Bionic update: upstream stable patchset 2018-06-06 (LP: #1775483)
    - drm: bridge: dw-hdmi: Fix overflow workaround for Amlogic Meson GX SoCs
    - i40e: Fix attach VF to VM issue
    - tpm: cmd_ready command can be issued only after granting locality
    - tpm: tpm-interface: fix tpm_transmit/_cmd kdoc
    - tpm: add retry logic
    - Revert "ath10k: send (re)assoc peer command when NSS changed"
    - bonding: do not set slave_dev npinfo before slave_enable_netpoll in
      bond_enslave
    - ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
    - ipv6: sr: fix NULL pointer dereference in seg6_do_srh_encap()- v4 pkts
    - KEYS: DNS: limit the length of option strings
    - l2tp: check sockaddr length in pppol2tp_connect()
    - net: validate attribute sizes in neigh_dump_table()
    - llc: delete timers synchronously in llc_sk_free()
    - tcp: don't read out-of-bounds opsize
    - net: af_packet: fix race in PACKET_{R|T}X_RING
    - tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets
    - net: fix deadlock while clearing neighbor proxy table
    - team: avoid adding twice the same option to the event list
    - net/smc: fix shutdown in state SMC_LISTEN
    - team: fix netconsole setup over team
    - packet: fix bitfield update race
    - tipc: add policy for TIPC_NLA_NET_ADDR
    - pppoe: check sockaddr length in pppoe_connect()
    - vlan: Fix reading memory beyond skb->tail in skb_vlan_tagged_multi
    - amd-xgbe: Add pre/post auto-negotiation phy hooks
    - sctp: do not check port in sctp_inet6_cmp_addr
    - amd-xgbe: Improve KR auto-negotiation and training
    - strparser: Do not call mod_delayed_work with a timeout of LONG_MAX
    - amd-xgbe: Only use the SFP supported transceiver signals
    - strparser: Fix incorrect strp->need_bytes value.
    - net: sched: ife: signal not finding metaid
    - tcp: clear tp->packets_out when purging write queue
    - net: sched: ife: handle malformed tlv length
    - net: sched: ife: check on metadata length
    - llc: hold llc_sap before release_sock()
    - llc: fix NULL pointer deref for SOCK_ZAPPED
    - net: ethernet: ti: cpsw: fix tx vlan priority mapping
    - virtio_net: split out ctrl buffer
    - virtio_net: fix adding vids on big-endian
    - KVM: s390: force bp isolation for VSIE
    - s390: correct module section names for expoline code revert
    - microblaze: Setup dependencies for ASM optimized lib functions
    - commoncap: Handle memory allocation failure.
    - scsi: mptsas: Disable WRITE SAME
    - cdrom: information leak in cdrom_ioctl_media_changed()
    - m68k/mac: Don't remap SWIM MMIO region
    - block/swim: Check drive type
    - block/swim: Don't log an error message for an invalid ioctl
    - block/swim: Remove extra put_disk() call from error path
    - block/swim: Rename macros to avoid inconsistent inverted logic
    - block/swim: Select appropriate drive on device open
    - block/swim: Fix array bounds check
    - block/swim: Fix IO error at end of medium
    -...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu Cosmic):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released