powerstat -R erroneously reports no RAPL domains

Bug #1467014 reported by John Lenton
This bug affects 1 person
Affects: powerstat (Ubuntu)
Status: Invalid
Importance: High
Assigned to: Colin Ian King

Bug Description

I grabbed powerstat from git to check out the -t option, and got

$ ./powerstat -tR
Device does not have any RAPL domains, cannot power measure power usage.
$ sudo ./powerstat -tR
Device does not have any RAPL domains, cannot power measure power usage.

I think this is erroneous; powertop claims to be reading them fine:

sudo powertop
Loaded 750 prior measurements
RAPL device for cpu 0
RAPL device for cpu 0
RAPL device for cpu 0

on this arrandale i7.

Colin Ian King (colin-king) wrote :

Which kernel are you using?

Can you tar up the following:

tar cvf - /sys/devices/virtual/powercap/ > powercap.tar

and let me know if /sys/class/powercap exists on your machine.

Finally, which CPU do you have?

cat /proc/cpuinfo | grep CPU

Thanks!
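
For reference, a minimal Python sketch of the powercap check being asked for here. The helper name `list_powercap_zones` is made up for illustration; the `name` and `energy_uj` files are the standard per-zone attributes of the powercap sysfs interface. An empty result would presumably line up with powerstat finding no RAPL domains, although powerstat's own detection code is not shown in this thread.

```python
import os

def list_powercap_zones(root="/sys/class/powercap"):
    """Return (zone, name) pairs for every powercap zone that exposes energy_uj."""
    zones = []
    if not os.path.isdir(root):
        return zones  # powercap framework not present at all
    for entry in sorted(os.listdir(root)):
        zone_dir = os.path.join(root, entry)
        name_file = os.path.join(zone_dir, "name")
        energy_file = os.path.join(zone_dir, "energy_uj")
        # Only entries with both attributes count as measurable zones.
        if os.path.isfile(name_file) and os.path.isfile(energy_file):
            with open(name_file) as f:
                zones.append((entry, f.read().strip()))
    return zones

if __name__ == "__main__":
    found = list_powercap_zones()
    if not found:
        print("no powercap zones found")
    for zone, name in found:
        print(zone, name)
```

A directory that exists but is empty gives an empty list, the same as a missing directory.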

Changed in powerstat (Ubuntu):
importance: Undecided → High
assignee: nobody → Colin Ian King (colin-king)
status: New → In Progress
Changed in powerstat (Ubuntu):
status: In Progress → Incomplete
Colin Ian King (colin-king) wrote :

It seems that the RAPL kernel driver only supports the following CPUs:

        RAPL_CPU(0x2a, rapl_defaults_core),/* Sandy Bridge */
        RAPL_CPU(0x2d, rapl_defaults_core),/* Sandy Bridge EP */
        RAPL_CPU(0x37, rapl_defaults_atom),/* Valleyview */
        RAPL_CPU(0x3a, rapl_defaults_core),/* Ivy Bridge */
        RAPL_CPU(0x3c, rapl_defaults_core),/* Haswell */
        RAPL_CPU(0x3d, rapl_defaults_core),/* Broadwell */
        RAPL_CPU(0x3f, rapl_defaults_hsw_server),/* Haswell servers */
        RAPL_CPU(0x4f, rapl_defaults_hsw_server),/* Broadwell servers */
        RAPL_CPU(0x45, rapl_defaults_core),/* Haswell ULT */
        RAPL_CPU(0x4E, rapl_defaults_core),/* Skylake */
        RAPL_CPU(0x4C, rapl_defaults_atom),/* Braswell */
        RAPL_CPU(0x4A, rapl_defaults_atom),/* Tangier */
        RAPL_CPU(0x56, rapl_defaults_core),/* Future Xeon */
        RAPL_CPU(0x5A, rapl_defaults_atom),/* Annidale */

whereas powertop reads the MSRs directly (hence powertop needs root access). It detects that MSR_PKG_ENERGY_STATUS is readable, so it accesses the RAPL energy MSRs and can provide the RAPL stats. I'll chat to Intel about the lack of support for your processor in the RAPL driver if you can provide me with the exact CPU info.
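
A rough Python sketch of the same model check done from user space, assuming (as far as I understand the driver) that the RAPL_CPU() entries all refer to family 6. The set below is copied from the list quoted above; `parse_family_model` and `driver_lists_cpu` are hypothetical helper names, not anything from powerstat or the kernel.

```python
# Model numbers copied from the RAPL_CPU() list quoted in this comment.
RAPL_DRIVER_MODELS = {
    0x2A, 0x2D, 0x37, 0x3A, 0x3C, 0x3D, 0x3F,
    0x4F, 0x45, 0x4E, 0x4C, 0x4A, 0x56, 0x5A,
}

def parse_family_model(cpuinfo_text):
    """Return (family, model) of the first CPU listed in /proc/cpuinfo text."""
    family = model = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "cpu family" and family is None:
            family = int(value)
        elif key == "model" and model is None:  # not "model name"
            model = int(value)
        if family is not None and model is not None:
            break
    return family, model

def driver_lists_cpu(family, model):
    """True if the model appears in the quoted list (family 6 is an assumption)."""
    return family == 6 and model in RAPL_DRIVER_MODELS

if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        fam, mod = parse_family_model(f.read())
    print(fam, hex(mod) if mod is not None else None, driver_lists_cpu(fam, mod))
```

Model 0x25 (37 decimal), the Arrandale model number mentioned later in this thread, is not in the set, so `driver_lists_cpu(6, 0x25)` is false.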

Colin Ian King (colin-king) wrote :

Just to clarify, I currently believe this is a bug in the kernel driver, so I think once your CPU is enabled in the driver it will show up in powerstat.

John Lenton (chipaca) wrote :

I'm using 3.19.0-22-generic. I don't have a /sys/devices/virtual/powercap/. /sys/class/powercap exists but is empty.

$ grep CPU /proc/cpuinfo
model name : Intel(R) Core(TM) i7 CPU M 620 @ 2.67GHz
model name : Intel(R) Core(TM) i7 CPU M 620 @ 2.67GHz
model name : Intel(R) Core(TM) i7 CPU M 620 @ 2.67GHz
model name : Intel(R) Core(TM) i7 CPU M 620 @ 2.67GHz

Colin Ian King (colin-king) wrote :

RAPL was introduced on Sandy Bridge, so Arrandale does not support it; perhaps powertop is incorrect. Powertop just reads MSR 0x611 using:

        ret = read_msr(first_cpu, MSR_PKG_ENERY_STATUS, &value);
        if (ret > 0) {
                rapl_domains |= PKG_DOMAIN_PRESENT;
                RAPL_DBG_PRINT("Domain : PKG present\n");
        } else {
                RAPL_DBG_PRINT("Domain : PKG Not present\n");
        }

..and read_msr() in src/lib.cpp in powertop should return -1 if MSR 0x611 does not exist. So it is indeed curious that this does not fail. It's as if the MSR exists, but perhaps RAPL is not really supported correctly on Arrandale.
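
To make the expected behaviour concrete, here is a small Python sketch of an MSR read through the msr character device. It is not powertop's code: the function and the `path_template` parameter are illustrative, and it returns None where powertop's read_msr() is described above as returning -1. The one piece of real interface it relies on is that the file offset passed to pread selects the MSR address.

```python
import os
import struct

MSR_PKG_ENERGY_STATUS = 0x611

def read_msr(cpu, reg, path_template="/dev/cpu/{}/msr"):
    """Read one 64-bit MSR; return its value, or None if the read fails."""
    try:
        fd = os.open(path_template.format(cpu), os.O_RDONLY)
    except OSError:
        return None  # device missing (msr module not loaded) or no permission
    try:
        data = os.pread(fd, 8, reg)  # file offset selects the MSR address
    except OSError:
        return None  # e.g. EIO when the CPU rejects the read
    finally:
        os.close(fd)
    if len(data) != 8:
        return None  # short read, treat as failure
    return struct.unpack("<Q", data)[0]

if __name__ == "__main__":
    value = read_msr(0, MSR_PKG_ENERGY_STATUS)
    print("PKG domain present" if value is not None else "PKG domain not present")
```

A caller that treats only a successful 8-byte read as "domain present" should report the package domain as absent on a CPU where the read fails.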

CPU model number 06 25 (Arrandale) does not support RAPL. Tools such as turbostat detect the CPU ID and bail out for Arrandale; see lines 765-789 of http://sourceforge.net/p/openipmi/linux-ipmi/ci/a729617c58529be0be8faa22c5d45748bb0f12e5/tree/tools/power/x86/turbostat/turbostat.c.

I need to discuss this further with Intel.

Colin Ian King (colin-king) wrote :

Also, MSR 0x611 (MSR_PKG_ENERGY_STATUS) was only introduced in Sandy Bridge, according to "Table 35-12. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge" of the Intel® 64 and IA-32 Architectures Software Developer’s Manual.

I therefore believe powertop may be reading power MSRs that may be bogus for Arrandale.

Colin Ian King (colin-king) wrote :

What does the following do:

sudo modprobe msr
sudo rdmsr 0x611

John Lenton (chipaca) wrote :

john@fogey:~$ sudo rdmsr 0x611
rdmsr: CPU 0 cannot read MSR 0x00000611

John Lenton (chipaca) wrote :

Perhaps relevantly, here is the strace of powertop trying to read the MSR.

24773 write(1, "RAPL device for cpu 0\n", 22) = 22
24773 access("/dev/cpu/0/msr", R_OK) = 0
24773 open("/dev/cpu/0/msr", O_RDONLY) = 4
24773 pread(4, 0x7ffd966dbb08, 8, 1553) = -1 EIO (Input/output error)
24773 close(4) = 0
24773 access("/dev/cpu/0/msr", R_OK) = 0
24773 open("/dev/cpu/0/msr", O_RDONLY) = 4
24773 pread(4, 0x7ffd966dbb08, 8, 1561) = -1 EIO (Input/output error)
24773 close(4) = 0
24773 access("/dev/cpu/0/msr", R_OK) = 0
24773 open("/dev/cpu/0/msr", O_RDONLY) = 4
24773 pread(4, 0x7ffd966dbb08, 8, 1593) = -1 EIO (Input/output error)
24773 close(4) = 0
24773 access("/dev/cpu/0/msr", R_OK) = 0
24773 open("/dev/cpu/0/msr", O_RDONLY) = 4
24773 pread(4, 0x7ffd966dbb08, 8, 1601) = -1 EIO (Input/output error)
24773 close(4) = 0
24773 access("/dev/cpu/0/msr", R_OK) = 0
24773 open("/dev/cpu/0/msr", O_RDONLY) = 4
24773 pread(4, 0x7ffd966dbae8, 8, 1542) = -1 EIO (Input/output error)
24773 close(4) = 0
24773 access("/dev/cpu/0/msr", R_OK) = 0
24773 open("/dev/cpu/0/msr", O_RDONLY) = 4
24773 pread(4, 0x7ffd966dbae8, 8, 1542) = -1 EIO (Input/output error)
24773 close(4) = 0
24773 access("/dev/cpu/0/msr", R_OK) = 0
24773 open("/dev/cpu/0/msr", O_RDONLY) = 4
24773 pread(4, 0x7ffd966dbae8, 8, 1542) = -1 EIO (Input/output error)
24773 close(4) = 0
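
The pread offsets in the trace above are MSR addresses in decimal. A small Python sketch to decode them follows; the register names are the ones I know from the Intel SDM (only 0x611 is named in this thread), and the regular expression is just a guess at the strace line shape seen here, not a general strace parser.

```python
import re

# Names as given in the Intel SDM; only 0x611 is named in this thread.
RAPL_MSRS = {
    0x606: "MSR_RAPL_POWER_UNIT",
    0x611: "MSR_PKG_ENERGY_STATUS",
    0x619: "MSR_DRAM_ENERGY_STATUS",
    0x639: "MSR_PP0_ENERGY_STATUS",
    0x641: "MSR_PP1_ENERGY_STATUS",
}

# Matches e.g. "pread(4, 0x7ffd966dbb08, 8, 1553) = -1 EIO (...)".
PREAD_RE = re.compile(
    r"pread(?:64)?\(\d+, [^,]+, \d+, (\d+)\)[ \t]*=[ \t]*(-?\d+)(?:[ \t]+(\w+))?"
)

def decode_preads(strace_text):
    """Return (msr_address, msr_name, return_value, errno_name) per pread line."""
    decoded = []
    for m in PREAD_RE.finditer(strace_text):
        addr = int(m.group(1))
        name = RAPL_MSRS.get(addr, "unknown")
        decoded.append((addr, name, int(m.group(2)), m.group(3)))
    return decoded

if __name__ == "__main__":
    import sys
    for addr, name, ret, err in decode_preads(sys.stdin.read()):
        print(f"{addr:#x} {name}: ret={ret} errno={err}")
```

Read this way, the offsets 1553, 1561, 1593, 1601 and 1542 are 0x611, 0x619, 0x639, 0x641 and 0x606, and every one of those reads fails with EIO. The "RAPL device for cpu 0" line is written before any of them.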

Colin Ian King (colin-king) wrote :

OK, so the bug seems to be bogus reporting by powertop rather than powerstat.

John Lenton (chipaca) wrote :

Yes. Sorry to have you reading the manuals on the weekend!

Colin Ian King (colin-king) wrote :

OK, so let's close this. Perhaps you can file a bug against powertop instead?

summary: - powertat -R erroneously reports no RAPL domains
+ powerstat -R erroneously reports no RAPL domains
Changed in powerstat (Ubuntu):
status: Incomplete → Invalid