Ubuntu18.04: PowerNV - cpupower monitor will not work when cpu0 is offline

Bug #1743541 reported by bugproxy
Affects                            Status        Importance  Assigned to            Milestone
The Ubuntu-power-systems project   Fix Released  High        Canonical Kernel Team
linux (Ubuntu)                     Fix Released  High        Joseph Salisbury
Bionic                             Fix Released  High        Joseph Salisbury

Bug Description

== Comment: #0 - Shriya R. Kulkarni <email address hidden> - 2018-01-16 04:58:48 ==
Problem Description :
=============

cpupower monitor fails to show the stop states when cpu0 is taken offline.

Testing :
=====
root@ltc-wspoon12:~# echo 0 > /sys/devices/system/cpu/cpu0/online
root@ltc-wspoon12:~# cpupower monitor
WARNING: at least one cpu is offline
No HW Cstate monitors found
root@ltc-wspoon12:~# cpupower -c 12 monitor
WARNING: at least one cpu is offline
No HW Cstate monitors found
root@ltc-wspoon12:~#

root@ltc-wspoon12:~# echo 1 > /sys/devices/system/cpu/cpu0/online
root@ltc-wspoon12:~# cpupower -c 12 monitor
              |Idle_Stats
PKG |CORE|CPU | snoo | stop | stop | stop | stop
   0| 12| 12| 0.00| 0.00| 0.00| 0.00| 0.01
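
Before running `cpupower monitor`, it can help to confirm which CPUs are actually online. The sketch below is only an illustration and is not part of cpupower: it expands a kernel CPU-list string (the format used by `/sys/devices/system/cpu/online`, e.g. `1-159`) and reports whether cpu0 is in it. The function names `parse_cpu_list` and `cpu0_is_online` are hypothetical helpers introduced for this example.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU-list string such as "1-3,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpu0_is_online(online_list_text):
    """Return True if CPU 0 appears in the given CPU-list string."""
    return 0 in parse_cpu_list(online_list_text)


def read_online_list(path="/sys/devices/system/cpu/online"):
    """Read the online CPU list from sysfs (path is the usual location on Linux)."""
    with open(path) as handle:
        return handle.read()
```

On an affected machine, `cpu0_is_online(read_online_list())` would be expected to return `False` after the `echo 0 > /sys/devices/system/cpu/cpu0/online` step shown above.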

Details :
====
uname -a : Linux ltc-wspoon12 4.13.0-25-generic #29-Ubuntu SMP Mon Jan 8 21:15:55 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

OS : Ubuntu 18.04
Machine : Witherspoon ( DD2.1) and Boston (DD.01)

Patch :
====
Patch that fixes the issue : https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=dbdc468f35ee827cab2753caa1c660bdb832243a

== Comment: #1 - VIPIN K. PARASHAR <email address hidden> - 2018-01-16 05:09:40 ==
(In reply to comment #0)

> Patch :
> ====
> Patch that fixes the issue :
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/
> ?id=dbdc468f35ee827cab2753caa1c660bdb832243a

$ git log dbdc468f35ee827 -1
commit dbdc468f35ee827cab2753caa1c660bdb832243a
Author: Abhishek Goel <email address hidden>
Date: Wed Nov 15 14:10:02 2017 +0530

    cpupower : Fix cpupower working when cpu0 is offline

    cpuidle_monitor used to assume that cpu0 is always online which is not
    a valid assumption on POWER machines. This patch fixes this by getting
    the cpu on which the current thread is running, instead of always using
    cpu0 for monitoring which may not be online.

    Signed-off-by: Abhishek Goel <email address hidden>
    Signed-off-by: Shuah Khan <email address hidden>
$

Commit dbdc468f3 is available from 4.15-rc2 onwards.
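
The idea described in the commit message can be sketched as follows. This is a Python illustration of the selection logic only, not the patch itself (the real change is C code in cpupower). The helper name `pick_base_cpu` and its fallback to the lowest online CPU are assumptions made for this example; the commit message only says that the CPU the current thread is running on is used instead of cpu0.

```python
def pick_base_cpu(online_cpus, current_cpu=None):
    """Choose the CPU used as the reference for monitoring.

    Old behaviour (per the commit message): always CPU 0, which fails
    when cpu0 is offline. Here we prefer the CPU the caller is running
    on and, as an assumed fallback, take the lowest online CPU.
    """
    online = sorted(set(online_cpus))
    if not online:
        raise ValueError("no online CPUs")
    if current_cpu is not None and current_cpu in online:
        return current_cpu
    return online[0]
```

With cpu0 offline, `pick_base_cpu([1, 2, 3], current_cpu=2)` gives 2, whereas a hard-coded 0 would point at a CPU that cannot be queried.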

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-163623 severity-high targetmilestone-inin1804
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
tags: added: triage-g
Changed in linux (Ubuntu):
importance: Undecided → High
Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with commit dbdc468f35ee827c. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1743541

Can you test this kernel and see if it resolves this bug?

Changed in linux (Ubuntu):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Joseph Salisbury (jsalisbury)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: Triaged → In Progress
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-01-16 23:25 EDT-------
Hi,
I tried to set up this kernel: http://kernel.ubuntu.com/~jsalisbury/lp1743541/

To install the cpupower tool I need the linux-tools-<kernel version>-generic and linux-tools-common packages.

The link above provides only linux-tools-common_4.13.0-17.20_all.deb, so I am unable to test it.

ltc-wspoon12:~/shriya/kernel# dpkg -i linux-tools-common_4.13.0-17.20_all.deb
(Reading database ... 113696 files and directories currently installed.)
Preparing to unpack linux-tools-common_4.13.0-17.20_all.deb ...
Unpacking linux-tools-common (4.13.0-17.20) over (4.13.0-17.20) ...
Setting up linux-tools-common (4.13.0-17.20) ...
root@ltc-wspoon12:~/shriya/kernel# cpupower
The program 'cpupower' is currently not installed. You can install it by typing:
apt install linux-tools-common

Joseph Salisbury (jsalisbury) wrote :

Sorry for not posting the tools packages. They should be available now:
http://kernel.ubuntu.com/~jsalisbury/lp1743541

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-01-18 00:36 EDT-------
Hi,
Can you please provide ppc64el packages? I see only 'amd64' packages at the link mentioned above.

Example: http://kernel.ubuntu.com/~jsalisbury/lp1743541/linux-tools-4.13.0-17_4.13.0-17.20_amd64.deb

Thanks,
Shriya

Joseph Salisbury (jsalisbury) wrote :

There is a ppc64 tools package there now. Can you see if that package works for you?

Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: In Progress → Incomplete
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-02-22 03:29 EDT-------
Hi,

While trying to set up the kernel, I ran into this issue:

dpkg: dependency problems prevent configuration of linux-image-extra-4.13.0-17-generic:
linux-image-extra-4.13.0-17-generic depends on linux-image-4.13.0-17-generic; however:
Package linux-image-4.13.0-17-generic is not installed.

dpkg: error processing package linux-image-extra-4.13.0-17-generic (--install):
dependency problems - leaving unconfigured
Setting up linux-libc-dev:ppc64el (4.13.0-17.20) ...
Setting up linux-source-4.13.0 (4.13.0-17.20) ...
dpkg: dependency problems prevent configuration of linux-tools-4.13.0-17:
linux-tools-4.13.0-17 depends on libbinutils (<< 2.29.2); however:
Version of libbinutils:ppc64el on system is 2.30-5ubuntu1.

Thanks.

Dimitri John Ledkov (xnox) wrote :

Hi, if this is on bionic, you need to wait until the 23rd Feb respin is done, which I guess for you means Monday, unfortunately. 4.15 has now landed in bionic, but d-i was not rebuilt correctly in time, so there was a temporary mismatch between packages.

Changed in ubuntu-power-systems:
status: Incomplete → In Progress
Manoj Iyer (manjo) wrote :

IBM, could you please re-test with the latest bionic and report back here if the bug is still valid?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-03-06 06:19 EDT-------
Verified on the latest kernel:
===============
cpupower monitor works as expected and the issue is fixed.

root@ltc-wspoon4:~# echo 0 > /sys/devices/system/cpu/cpu0/online
root@ltc-wspoon4:~# cpupower monitor
WARNING: at least one cpu is offline
|Idle_Stats
PKG |CORE|CPU | snoo | stop | stop | stop | stop
0| 0| 1| 0.00| 0.00| 0.00| 0.00| 99.24
0| 0| 2| 0.00| 0.00| 0.00| 0.00| 58.79
0| 0| 3| 0.00| 0.00| 0.00| 0.00| 58.79
0| 4| 4| 0.00| 0.00| 0.00| 0.00| 59.77
0| 4| 5| 0.00| 0.00| 0.00| 0.00| 0.00
0| 4| 6| 0.00| 0.00| 0.00| 0.00| 0.00
0| 4| 7| 0.00| 0.00| 0.00| 0.00| 0.00
0| 8| 8| 0.00| 0.00| 0.00| 0.00| 12.08

uname -r : 4.15.0-10-generic
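
A verification run like the one above can also be checked programmatically. The following is a minimal sketch, assuming the pipe-separated layout shown in this report (PKG, CORE, CPU, then one residency percentage per idle state); `parse_monitor_rows` is a hypothetical helper, not part of cpupower. Header, warning, and truncated lines are skipped because they do not parse as numbers.

```python
def parse_monitor_rows(output):
    """Parse data rows of `cpupower monitor` output into a list of dicts."""
    rows = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4:
            continue  # warning line or the "|Idle_Stats" banner
        try:
            pkg, core, cpu = (int(value) for value in fields[:3])
            residency = [float(value) for value in fields[3:]]
        except ValueError:
            continue  # column header or a truncated row
        rows.append({"pkg": pkg, "core": core, "cpu": cpu, "residency": residency})
    return rows
```

A non-empty result with no entry for CPU 0 would match the fixed behaviour shown above: the monitor reports stop-state residency for the remaining CPUs even though cpu0 is offline.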

Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Released
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: In Progress → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-04-17 04:43 EDT-------
Verified on the latest kernel on Ubuntu 18.04: the issue is still reproducible
-----------------------------------------------------------------

root@ltc-boston27:~# uname -r
4.15.0-15-generic

root@ltc-boston27:~# cat /sys/devices/system/cpu/cpu0/online
0
root@ltc-boston27:~# cpupower monitor
WARNING: at least one cpu is offline
No HW Cstate monitors found

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-05-22 06:39 EDT-------
This issue is still reproducible with the latest kernel: 4.15.0-22-generic

root@ltc-wspoon12:~# cpupower monitor sleep 60
sleep took 60.03991 seconds and exited with status 0
|Idle_Stats
PKG |CORE|CPU | snoo | stop | stop | stop | stop | stop | stop | stop | stop
0| 0| 0| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 81.49| 6.52| 10.87
0| 0| 1| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 57.06| 1.50| 40.91
0| 0| 2| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.06| 0.00| 99.94
0| 0| 3| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.06| 0.00| 99.94
0| 4| 4| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 42.93| 0.00| 56.99
0| 4| 5| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 4.25| 0.00| 95.74
0| 4| 6| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.05| 0.00| 99.95
0| 4| 7| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.05| 0.00| 99.95
0| 8| 8| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.04| 0.00| 99.71
0| 8| 9| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.03| 0.00| 99.96
0| 8| 10| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 2.52| 0.00| 97.83
0| 8| 11| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.03| 0.00| 99.96
0| 12| 12| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 1.62| 0.00| 98.48
0| 12| 13| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.04| 0.00| 99.96
0| 12| 14| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.04| 0.00| 99.96
0| 12| 15| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.04| 0.00| 99.96
0| 24| 16| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 4.65| 0.00| 95.08
0| 24| 17| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.05| 0.00| 99.95
0| 24| 18| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.04| 0.00| 99.96
0| 24| 19| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.04| 0.00| 99.96
0| 28| 20| 0.18| 8.47| 0.00| 0.00| 0.00| 0.00| 90.43| 0.00| 0.63
0| 28| 21| 0.00| 0.00| 0.00| 0.00| 3.91| 0.00| 0.07| 0.00| 96.02
0| 28| 22| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.11| 0.00| 99.89
0| 28| 23| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.07| 0.00| 99.93
0| 40| 24| 0.21| 65.05| 0.00| 0.00| 0.00| 0.00| 2.77| 0.00| 31.63
0| 40| 25| 0.27| 11.39| 0.00| 0.00| 0.00| 0.00| 2.74| 0.15| 85.13
0| 40| 26| 0.02| 19.97| 66.05| 0.00| 0.00| 0.00| 0.00| 0.00| 13.62
0| 40| 27| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.06| 0.00| 99.94
0| 44| 28| 0.01| 12.42| 67.47| 0.00| 0.74| 0.00| 3.22| 0.81| 14.88
0| 44| 29| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.08| 0.00| 99.92
0| 44| 30| 0.47| 14.39| 0.00| 0.00| 0.00| 0.00| 84.32| 0.00| 0.50
0| 44| 31| 0.00| 2.38| 0.00| 0.00| 0.00| 0.00| 0.06| 0.00| 97.56
0| 56| 32| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.02| 0.00| 99.74
0| 56| 33| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.04| 0.00| 99.96
0| 56| 34| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.04| 0.00| 99.96
0| 56| 35| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 2.15| 0.00| 97.59
0| 60| 36| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.31| 4.89| 94.56
0| 60| 37| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 8.11| 0.00| 91.88
0| 60| 38| 0.00|...

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-05-22 06:45 EDT-------
(In reply to comment #15)
> IBM, could you please re-test with the latest bionic and report back here if
> the bug is still valid?

Hello Canonical,

This bug is still being seen with the latest kernels.
4.15.0-22-generic is the most recent kernel on which it has been seen.

Andrew Cloke (andrew-cloke) wrote :

This bug has now been closed in LP. If the issue is recurring with more recent kernels, could you raise a new defect referring back to this one (LP#1743541)?
Thanks.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-05-30 02:56 EDT-------
(In reply to comment #23)
>
> This bug has now been closed in LP. If the issue is recurring with more
> recent kernels, could you raise a new defect referring back to this one
> (LP#1743541)?
> Thanks.

Closing this bug in Bugzilla as verification failed.

Hi Shriya,

Please raise a new bug for this again as suggested by Canonical.
