intel_pstate can be modified with service parameter

Bug #2016842 reported by Jiping Ma
This bug affects 1 person
Affects: StarlingX
Status: In Progress
Importance: Undecided
Assigned to: Jiping Ma

Bug Description

Brief Description
On an Ice Lake server, the frequency of isolated (isolcpus) cores seems to be capped at the base frequency.
Example: on a machine with a 2.0 GHz base frequency, configure the node as follows:

Use an Ice Lake CPU.
Use the low-latency profile.
Set some cores as isolated cores.
Enable the K8S CPU manager with the static policy (I am not sure this is mandatory to cause the issue).

Then check each core's frequency: the isolated cores' frequency seems to stick to the base frequency.

[root@application-isolated /]# grep "cpu MHz" /proc/cpuinfo
cpu MHz         : 3100.000
cpu MHz         : 3100.000 <- Other cores showed turbo boosted frequency
cpu MHz         : 3100.040 <- isol cpu. Stick to base frequency (2.0GHz)
cpu MHz         : 2934.800 <- isol cpu. Stick to base frequency (2.0GHz)
cpu MHz         : 2929.562
cpu MHz         : 2936.802
cpu MHz         : 2960.046
cpu MHz         : 2920.794
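
The check above can be automated. The following is a minimal sketch, not part of the original report: it parses the "cpu MHz" lines and lists the isolated CPUs whose reported frequency sits near the base frequency. The function names, the 50 MHz tolerance, and the assumption that /proc/cpuinfo lists CPUs in index order are all illustrative assumptions.

```python
import re


def parse_cpu_mhz(cpuinfo_text):
    """Return the 'cpu MHz' values found in /proc/cpuinfo text, in file order."""
    pattern = r"^cpu MHz\s*:\s*([\d.]+)"
    return [float(v) for v in re.findall(pattern, cpuinfo_text, re.MULTILINE)]


def cores_near_base(mhz_by_cpu, isolated, base_mhz, tolerance_mhz=50.0):
    """Return isolated CPU numbers whose reported MHz is within the tolerance of base.

    Assumes mhz_by_cpu[i] belongs to CPU i (hypothetical simplification:
    no offline CPUs, entries listed in index order).
    """
    return [
        cpu
        for cpu in sorted(isolated)
        if cpu < len(mhz_by_cpu) and abs(mhz_by_cpu[cpu] - base_mhz) <= tolerance_mhz
    ]
```

For example, passing the contents of /proc/cpuinfo to parse_cpu_mhz() and then calling cores_near_base(mhz, isolated={2, 3, 4, 5}, base_mhz=2000.0) would list the isolated cores that appear capped.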

[root@application-isolated /]# cat /sys/devices/system/cpu/cpu2/cpufreq/energy_performance_preference
performance
[root@application-isolated /]# cat /sys/devices/system/cpu/cpu2/cpufreq/scaling_driver
intel_pstate
[root@application-isolated /]# cat /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
performance
[root@application-isolated /]# cat /sys/devices/system/cpu/cpu2/cpufreq/base_frequency
2000000 <- Same as base frequency

I ran the "cpupower" command on an isolated core and noticed that cpupower failed to get the frequency value from both the hardware and the kernel.

controller-0:~/cpupinning/cpupinning$ cpupower -c 2 frequency-info
analyzing CPU 2:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 2
  CPUs which need to have their frequency coordinated by software: 2
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 800 MHz - 3.10 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 800 MHz and 3.10 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware   <- Failed to get value from HW
  current CPU frequency:  Unable to call to kernel <- Failed to get value from SW
  boost state support:
    Supported: yes
    Active: yes

On Ice Lake, HWP (Hardware P-States) is available and enabled by default, so I guess Linux 5.10 fails to get the value from HWP. I am therefore not sure whether the actual frequency really drops to 2.0 GHz or whether this is just a display issue.
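
One way to tell a real cap from a display issue is to compute the delivered frequency from the APERF and MPERF counters rather than trusting /proc/cpuinfo. The sketch below is an illustration under assumptions and was not part of the original investigation: it assumes the msr kernel module is loaded, root access, and that the base frequency approximates the MPERF reference rate. The helper names are hypothetical. Note that reading MSRs of an isolated core briefly interrupts that core.

```python
import os
import struct
import time

IA32_MPERF = 0xE7  # advances at a fixed reference rate while the core is in C0
IA32_APERF = 0xE8  # advances at the actually delivered rate while the core is in C0
MASK64 = (1 << 64) - 1


def read_msr(cpu, reg):
    """Read one 64-bit MSR through /dev/cpu/<cpu>/msr (needs root and the msr module)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def effective_mhz(base_mhz, aperf_delta, mperf_delta):
    """Average delivered frequency: base * dAPERF / dMPERF, or None if the core never ran."""
    if mperf_delta <= 0:
        return None
    return base_mhz * aperf_delta / mperf_delta


def sample_effective_mhz(cpu, base_mhz, interval_s=1.0):
    """Sample both counters twice and return the average busy frequency in MHz."""
    a0, m0 = read_msr(cpu, IA32_APERF), read_msr(cpu, IA32_MPERF)
    time.sleep(interval_s)
    a1, m1 = read_msr(cpu, IA32_APERF), read_msr(cpu, IA32_MPERF)
    # Mask the differences so a counter wrap-around still yields a positive delta.
    return effective_mhz(base_mhz, (a1 - a0) & MASK64, (m1 - m0) & MASK64)
```

If sample_effective_mhz(2, 2000.0) on a busy isolated core returns a value well above 2000, the core is boosting and the reported value is likely only a display problem.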

Severity

Major: If the actual frequency is capped at the base frequency.
Minor: If this is only a display issue.
Steps to Reproduce

1: Use an Ice Lake server such as CoyotePass.
2: Enable the K8S CPU manager with the "static" policy.
3: Assign some cores as isolated cores.
4: Unlock the host.
5: Check the isolated core numbers.
6: Check the core frequencies with  grep "cpu MHz" /proc/cpuinfo
   -> The isolated cores show the base frequency.
Expected Behavior

The isolated cores' frequency is also turbo boosted.

Actual Behavior
The isolated cores' frequency appears to stay at the base frequency.
Reproducibility

Reproducible

System Configuration

Any configuration with the low-latency profile (the standard profile does not show this issue). It only happens on Ice Lake based servers, which have the HWP (Hardware P-State) feature.
Load info (eg: 2022-03-10_20-00-07)

N/A

Last Pass

N/A

Alarms

No alarms

Test Activity

N/A

Workaround

Disable HWP-based P-state control with a kernel command-line option such as  intel_pstate=no_hwp  or  intel_pstate=passive .

Ex:
controller-0:/home/sysadmin# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.10.74-200.1644.tis.rt.el7.x86_64 root=UUID=6e7a84f0-1611-4b81-acd8-65c602a67ecd ro security_profile=standard module_blacklist=integrity,ima tboot=false crashkernel=512M biosdevname=0 console=tty0 iommu=pt usbcore.autosuspend=-1 selinux=0 enforcing=0 nmi_watchdog=0 softlockup_panic=0 softdog.soft_panic=1 intel_iommu=on user_namespace.enable=1 skew_tick=1 LANG=en_US.UTF-8 hugepagesz=1G hugepages=20 hugepagesz=2M hugepages=0 default_hugepagesz=1G irqaffinity=6-27,30-55,62-83,86-111 isolcpus=2-5,58-61 rcu_nocbs=2-27,30-55,58-83,86-111 nohz_full=2-5,58-61 kthread_cpus=0-1,28-29,56-57,84-85 audit=0 audit_backlog_limit=8192 nopti nospectre_v2 nospectre_v1 intel_pstate=no_hwp

As a result, the isolated cores' frequency also appears to be turbo boosted.

[root@isolated-besteffort /]# grep "cpu MHz" /proc/cpuinfo
cpu MHz         : 2899.990
cpu MHz         : 2899.991
cpu MHz         : 3500.000 <- isolated core
cpu MHz         : 2900.074 <- isolated core
cpu MHz         : 2900.020
cpu MHz         : 2899.992
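
To confirm that the workaround is active and to find out which cores are isolated, the boot command line shown above can be parsed. This is a minimal sketch with hypothetical helper names. It assumes plain numeric CPU lists such as "2-5,58-61" (no isolcpus flags like "managed_irq") and keeps only the last value when a key such as hugepagesz appears more than once.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None  # a repeated key keeps its last value
    return params


def expand_cpu_list(spec):
    """Expand a CPU list such as '2-5,58-61' into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        cpus.update(range(int(low), int(high or low) + 1))
    return cpus
```

Reading /proc/cmdline and calling parse_cmdline() on its contents gives the intel_pstate value under params.get("intel_pstate"), and expand_cpu_list(params["isolcpus"]) gives the isolated cores.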

But I am not sure how to persist this setting.

Jiping Ma (jma11)
Changed in starlingx:
assignee: nobody → Jiping Ma (jma11)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/stx-puppet/+/880712

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/880714

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/880712
Committed: https://opendev.org/starlingx/stx-puppet/commit/876446e35fd6273c2beb43359a82e40af52a0eaf
Submitter: "Zuul (22348)"
Branch: master

commit 876446e35fd6273c2beb43359a82e40af52a0eaf
Author: Jiping Ma <email address hidden>
Date: Sun Apr 9 19:33:14 2023 -0700

    New service parameter for intel_pstate

    This commit implements changing boot commandline intel_pstate using
    the service parameter mechanism. The new "intel_pstate" parameter
    gets stored as a system-wide service parameter and is instrumented
    for controller, worker and storage personalities.

    intel_pstate can be set to [disable, passive, force, no_hwp, hwp_only,
    support_acpi_ppc, per_cpu_perf_limits], and none is the default.
    https://www.kernel.org/doc/Documentation/admin-guide/pm/intel_pstate.rst

    The service parameter command will be used to change intel_pstate
    kernel boot parameters. The new value will take effect only after
    node unlock (reboot).

    Testing:
    PASS: Verify standard installation
    PASS: Verify service parameter configuration using valid/invalid values
    PASS: Verify intel_pstate boot parameters on all nodes

    Partial-Bug: 2016842
    Signed-off-by: Jiping Ma <email address hidden>
    Change-Id: I3c52e4b26ca8c657e2d62030d07f36eff30bb9e5
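
The valid/invalid value check mentioned in the commit message could look roughly like the sketch below. This is not the merged StarlingX code. The function name is hypothetical, the allowed values are copied from the commit message, and treating "none" or an empty value as "add no boot argument" is an assumption about what "none is the default" means.

```python
# Values listed in the commit message for the intel_pstate service parameter.
ALLOWED_INTEL_PSTATE = (
    "disable",
    "passive",
    "force",
    "no_hwp",
    "hwp_only",
    "support_acpi_ppc",
    "per_cpu_perf_limits",
)


def intel_pstate_boot_arg(value):
    """Return the kernel boot argument for a service parameter value.

    Returns '' for None, '' or 'none' (assumed to mean "leave the kernel default"),
    and raises ValueError for anything outside the allowed list.
    """
    if value is None or value in ("", "none"):
        return ""
    if value not in ALLOWED_INTEL_PSTATE:
        raise ValueError(f"invalid intel_pstate value: {value!r}")
    return f"intel_pstate={value}"
```

With this sketch, intel_pstate_boot_arg("no_hwp") yields the same argument that the workaround above appends to the command line by hand.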
