[LTCTest][OPAL][OP920] cpupower idle-info is not listing stop4 and stop5 idle states when all CORES are guarded

Bug #1771780 reported by bugproxy
Affects | Status | Importance | Assigned to | Milestone
-----------------------------------------------------------------------
The Ubuntu-power-systems project | Fix Released | Critical | Canonical Kernel Team |
linux (Ubuntu) | Fix Released | Critical | Joseph Salisbury |
Bionic | Fix Released | Critical | Joseph Salisbury |
Cosmic | Fix Released | Critical | Joseph Salisbury |

Bug Description

== SRU Justification ==
During testing, IBM found that cpupower idle-info does not list the stop4 and
stop5 idle states when all CORES are guarded. IBM has submitted a patch
upstream, but it has not yet landed in linux-next or mainline. For that
reason, this SRU request is being sent as a SAUCE patch request.

== Fix ==
UBUNTU: SAUCE: cpuidle/powernv : init all present cpus for deep states

== Regression Potential ==
Low. Limited to powerpc.

== Test Case ==
A test kernel was built with this patch and tested by the original bug reporter.
The bug reporter states the test kernel resolved the bug.

== Comment: #0 - PAVAMAN SUBRAMANIYAM - 2018-05-16 04:07:59 ==
---Problem Description---
cpupower idle-info is not listing stop4 and stop5 idle states when all CORES are guarded

---uname output---
Linux ltc-wspoon11 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = P9

---Debugger---
A debugger is not configured

---Steps to Reproduce---
Install Ubuntu 18.04 on a P9 OpenPOWER machine.

root@ltc-wspoon11:~# uname -a
Linux ltc-wspoon11 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

root@ltc-wspoon11:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Then guard an entire processor, and guard all the cores in processor 0 except for a single core.

root@ltc-wspoon11:~# opal-gard list
 ID | Error | Type | Path
-----------------------------------------------------------------------
 00000001 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX0/Core0
 00000002 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX0/Core1
 00000003 | 00000000 | Manual | /Sys0/Node0/Proc1
 00000004 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX0/Core1
 00000005 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX0/Core0
 00000006 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX1/Core0
 00000007 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX1/Core1
 00000008 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX0/Core0
 00000009 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX0/Core1
 0000000a | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX1/Core0
 0000000b | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX1/Core1
 0000000c | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX0/Core0
 0000000d | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX0/Core1
 0000000e | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX1/Core0
 0000000f | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX1/Core1
 00000010 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX0/Core0
 00000011 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX0/Core1
 00000012 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX1/Core0
 00000013 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX1/Core1
 00000014 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX1/Core0
 00000015 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX1/Core1
 00000016 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ0/EX1/Core1
 00000017 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ0/EX1/Core0
=======================================================================
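
To see at a glance how many cores are guarded on each processor, a listing like the one above can be summarized with awk. This is only a sketch, assuming the column layout shown above (fields separated by `|`, with the path in the fourth column); the function name count_guarded_cores is hypothetical and not part of opal-gard.

```shell
#!/bin/sh
# Hypothetical helper: summarize `opal-gard list` output read from stdin.
# Prints one line per processor: "<ProcN> <count of individually guarded cores>".
# A record whose path ends at the processor itself (whole chip guarded)
# is reported as "<ProcN> all".
count_guarded_cores() {
    awk -F'|' '
        NF == 4 && $4 ~ /\/Proc[0-9]+/ {
            path = $4
            gsub(/[ \t\r]/, "", path)          # strip padding around the path
            n = split(path, part, "/")         # part[4] is the ProcN component
            proc = part[4]
            seen[proc] = 1
            if (n == 4)
                whole[proc] = 1                # e.g. /Sys0/Node0/Proc1
            else if (part[n] ~ /^Core[0-9]+$/)
                cores[proc]++                  # e.g. .../EQ1/EX0/Core0
        }
        END {
            for (p in seen)
                print p, ((p in whole) ? "all" : cores[p] + 0)
        }
    ' | sort
}
```

It could be used as `opal-gard list | count_guarded_cores`.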

Then execute the cpupower idle-info command to check the idle states being shown in the OS.

root@ltc-wspoon11:~# cpupower idle-info
CPUidle driver: powernv_idle
CPUidle governor: menu
analyzing CPU 0:

Number of idle states: 7
Available idle states: snooze stop0_lite stop0 stop1_lite stop1 stop2_lite stop2
snooze:
Flags/Description: snooze
Latency: 0
Usage: 774653
Duration: 7698954
stop0_lite:
Flags/Description: stop0_lite
Latency: 1
Usage: 2751
Duration: 11363825
stop0:
Flags/Description: stop0
Latency: 2
Usage: 2343
Duration: 915084
stop1_lite:
Flags/Description: stop1_lite
Latency: 5
Usage: 20
Duration: 1533
stop1:
Flags/Description: stop1
Latency: 5
Usage: 1103
Duration: 1016794
stop2_lite:
Flags/Description: stop2_lite
Latency: 10
Usage: 5
Duration: 765
stop2:
Flags/Description: stop2
Latency: 10
Usage: 113729
Duration: 2850877810
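
The same check can be made without cpupower by reading the cpuidle sysfs entries directly. The following is a minimal sketch, assuming the usual /sys/devices/system/cpu/cpuN/cpuidle/stateM/name layout; the function name missing_deep_states and its arguments are hypothetical and not part of any tool mentioned in this report.

```shell
#!/bin/sh
# Hypothetical helper: report which expected idle states a CPU does not expose.
# $1: cpuidle directory of one CPU (default: CPU 0 in sysfs)
# remaining arguments: state names to look for (default: stop4 stop5)
missing_deep_states() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpuidle}
    if [ "$#" -gt 0 ]; then shift; fi
    if [ "$#" -eq 0 ]; then set -- stop4 stop5; fi
    missing=""
    for want in "$@"; do
        found=no
        for f in "$dir"/state*/name; do
            [ -r "$f" ] || continue            # glob did not match anything
            if [ "$(cat "$f")" = "$want" ]; then
                found=yes
                break
            fi
        done
        if [ "$found" = no ]; then missing="$missing $want"; fi
    done
    # Print the missing names, space separated; the exit status is 0 when
    # every requested state is present and 1 otherwise.
    missing=${missing# }
    printf '%s\n' "$missing"
    [ -z "$missing" ]
}
```

Called with no arguments, `missing_deep_states` inspects CPU 0; on the system described here it would be expected to report both stop4 and stop5 as missing, since neither appears in the list above.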

Userspace tool common name: /usr/bin/cpupower

The userspace tool has the following bit modes: 64-bit

Userspace rpm: linux-tools-common

Userspace tool obtained from project website: na

== Comment: #8 - Akshay Adiga <email address hidden> - 2018-05-16 13:29:55 ==
The patch has been posted to the Linux mailing list:
https://patchwork.ozlabs.org/patch/914575/

bugproxy (bugproxy) wrote : dmesg log is attached

Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-167913 severity-critical targetmilestone-inin1804
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → Critical
tags: added: triage-g
Manoj Iyer (manjo)
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → Critical
tags: added: p9
Changed in ubuntu-power-systems:
status: New → Triaged
Changed in linux (Ubuntu):
status: New → Triaged
Changed in linux (Ubuntu Bionic):
status: New → Triaged
importance: Undecided → Critical
Changed in linux (Ubuntu Bionic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu):
assignee: Canonical Kernel Team (canonical-kernel-team) → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Bionic):
status: Triaged → In Progress
Changed in linux (Ubuntu):
status: Triaged → In Progress
Joseph Salisbury (jsalisbury) wrote :

I built a Bionic test kernel with the patch posted in the description. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1771780

Can you test this kernel and see if it resolves this bug?

Note about installing test kernels:
• If the test kernel is older than 4.15 (Bionic), you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15 (Bionic) or newer, you need to install the linux-image-unsigned, linux-modules and linux-modules-extra .deb packages.
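
The rule in the note above can be written as a small shell function. This is a sketch only: the function name test_kernel_packages is hypothetical, and it relies on `sort -V` (GNU coreutils) to compare version strings.

```shell
#!/bin/sh
# Hypothetical helper: print the .deb package name prefixes to install for a
# test kernel of the given version (for example "4.15.0-20" or "4.13.0-45").
test_kernel_packages() {
    version=$1
    # With `sort -V`, the lower of the two versions comes out first.
    lowest=$(printf '%s\n' "$version" 4.15 | sort -V | head -n 1)
    if [ "$lowest" = 4.15 ]; then
        # 4.15 (Bionic) or newer
        echo "linux-image-unsigned linux-modules linux-modules-extra"
    else
        # older than 4.15
        echo "linux-image linux-image-extra"
    fi
}
```

For example, `test_kernel_packages "$(uname -r)"` prints the prefixes that apply to the running kernel's series. Once the matching .deb files are downloaded into an otherwise empty directory, they would typically be installed together with `sudo dpkg -i *.deb`.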

Thanks in advance!

Changed in linux (Ubuntu Artful):
importance: Undecided → Critical
Changed in linux (Ubuntu Xenial):
importance: Undecided → Critical
no longer affects: linux (Ubuntu Artful)
no longer affects: linux (Ubuntu Xenial)
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-05-22 01:57 EDT-------
I have installed the Bionic test kernel with the patches provided on the machine.

root@ltc-wspoon11:~# uname -a
Linux ltc-wspoon11 4.15.0-20-generic #22~lp1771780 SMP Mon May 21 17:43:29 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Then guard an entire processor, and guard all the cores in processor 0 except for a single core.

root@ltc-wspoon11:~# ./probe_cpus.sh -L
CHIP ID: 0 CORE ID: 1 THREADS: 4 CPUs: 0 1 2 3
CHIP ID: 0 CORE ID: 2 THREADS: 4 CPUs: 4 5 6 7
CHIP ID: 0 CORE ID: 3 THREADS: 4 CPUs: 8 9 10 11

-----------------------------
p[0]
eq[0]
ex[0,1]
c[1,2,3]
-----------------------------

----------Processor Layout-------------------
p[0]
+---EQ00----+ +---EQ02----+ +---EQ04----+
| | | | | |
+ - - - - - + + - - - - - + + - - - - - +
|EX-0 C1 | | | | |
+ - - - - - + + - - - - - + + - - - - - +
|EX-1 C2 | | | | |
+ - - - - - + + - - - - - + + - - - - - +
|EX-1 C3 | | | | |
+-----------+ +-----------+ +-----------+

+---EQ01----+ +---EQ03----+ +---EQ05----+
| | | | | |
+ - - - - - + + - - - - - + + - - - - - +
| | | | | |
+ - - - - - + + - - - - - + + - - - - - +
| | | | | |
+ - - - - - + + - - - - - + + - - - - - +
| | | | | |
+-----------+ +-----------+ +-----------+

root@ltc-wspoon11:~# opal-gard list
ID | Error | Type | Path
-----------------------------------------------------------------------
00000001 | 00000000 | Manual | /Sys0/Node0/Proc1
00000002 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX0/Core0
00000003 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX0/Core1
00000004 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX0/Core0
00000005 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX0/Core1
00000006 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX0/Core1
00000007 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX0/Core0
00000008 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX1/Core0
00000009 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ3/EX1/Core1
0000000a | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX1/Core0
0000000b | 00000000 | Manual | /Sys0/Node0/Proc0/EQ2/EX1/Core1
0000000c | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX1/Core0
0000000d | 00000000 | Manual | /Sys0/Node0/Proc0/EQ1/EX1/Core1
0000000e | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX0/Core0
0000000f | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX0/Core1
00000010 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX1/Core0
00000011 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ4/EX1/Core1
00000012 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX0/Core0
00000013 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX0/Core1
00000014 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX1/Core0
00000015 | 00000000 | Manual | /Sys0/Node0/Proc0/EQ5/EX1/Core1
=======================================================================

Then verified if all t...

Joseph Salisbury (jsalisbury) wrote :
description: updated
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-06-06 03:08 EDT-------
The patch has been accepted upstream and is available at:

https://git.kernel.org/powerpc/c/ac9816dcbab53c57bcf1d7b15370b0

Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Committed
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: Triaged → In Progress
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
bugproxy (bugproxy)
tags: added: verification-done-bionic
removed: verification-needed-bionic
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-24.26

---------------
linux (4.15.0-24.26) bionic; urgency=medium

  * linux: 4.15.0-24.26 -proposed tracker (LP: #1776338)

  * Bionic update: upstream stable patchset 2018-06-06 (LP: #1775483)
    - drm: bridge: dw-hdmi: Fix overflow workaround for Amlogic Meson GX SoCs
    - i40e: Fix attach VF to VM issue
    - tpm: cmd_ready command can be issued only after granting locality
    - tpm: tpm-interface: fix tpm_transmit/_cmd kdoc
    - tpm: add retry logic
    - Revert "ath10k: send (re)assoc peer command when NSS changed"
    - bonding: do not set slave_dev npinfo before slave_enable_netpoll in
      bond_enslave
    - ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
    - ipv6: sr: fix NULL pointer dereference in seg6_do_srh_encap()- v4 pkts
    - KEYS: DNS: limit the length of option strings
    - l2tp: check sockaddr length in pppol2tp_connect()
    - net: validate attribute sizes in neigh_dump_table()
    - llc: delete timers synchronously in llc_sk_free()
    - tcp: don't read out-of-bounds opsize
    - net: af_packet: fix race in PACKET_{R|T}X_RING
    - tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets
    - net: fix deadlock while clearing neighbor proxy table
    - team: avoid adding twice the same option to the event list
    - net/smc: fix shutdown in state SMC_LISTEN
    - team: fix netconsole setup over team
    - packet: fix bitfield update race
    - tipc: add policy for TIPC_NLA_NET_ADDR
    - pppoe: check sockaddr length in pppoe_connect()
    - vlan: Fix reading memory beyond skb->tail in skb_vlan_tagged_multi
    - amd-xgbe: Add pre/post auto-negotiation phy hooks
    - sctp: do not check port in sctp_inet6_cmp_addr
    - amd-xgbe: Improve KR auto-negotiation and training
    - strparser: Do not call mod_delayed_work with a timeout of LONG_MAX
    - amd-xgbe: Only use the SFP supported transceiver signals
    - strparser: Fix incorrect strp->need_bytes value.
    - net: sched: ife: signal not finding metaid
    - tcp: clear tp->packets_out when purging write queue
    - net: sched: ife: handle malformed tlv length
    - net: sched: ife: check on metadata length
    - llc: hold llc_sap before release_sock()
    - llc: fix NULL pointer deref for SOCK_ZAPPED
    - net: ethernet: ti: cpsw: fix tx vlan priority mapping
    - virtio_net: split out ctrl buffer
    - virtio_net: fix adding vids on big-endian
    - KVM: s390: force bp isolation for VSIE
    - s390: correct module section names for expoline code revert
    - microblaze: Setup dependencies for ASM optimized lib functions
    - commoncap: Handle memory allocation failure.
    - scsi: mptsas: Disable WRITE SAME
    - cdrom: information leak in cdrom_ioctl_media_changed()
    - m68k/mac: Don't remap SWIM MMIO region
    - block/swim: Check drive type
    - block/swim: Don't log an error message for an invalid ioctl
    - block/swim: Remove extra put_disk() call from error path
    - block/swim: Rename macros to avoid inconsistent inverted logic
    - block/swim: Select appropriate drive on device open
    - block/swim: Fix array bounds check
    - block/swim: Fix IO error at end of medium
    -...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu Cosmic):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released