Bring back ondemand.service or switch kernel default governor for pstate - pstate now defaults to performance governor

Bug #1885730 reported by Julian Andres Klode on 2020-06-30
This bug affects 2 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu): status tracked in Groovy
  Focal: importance Undecided, Unassigned
  Groovy: importance Undecided, Unassigned
systemd (Ubuntu): status tracked in Groovy
  Focal: importance Undecided, Unassigned
  Groovy: importance Undecided, Unassigned

Bug Description

In a recent merge from Debian we lost ondemand.service, meaning all CPUs now run in Turbo all the time when idle, which is clearly suboptimal.

The discussion in bug 1806012 seems misleading, focusing on p-state vs other drivers, when in fact, the script actually set the default governor for the pstate driver on platforms that use pstate. Everything below only looks at systems that use pstate.

pstate has two governors: performance and powersave. performance runs the CPU at maximum frequency constantly, and powersave can be configured using various energy profiles:

- performance
- balanced performance
- balanced power
- power

It defaults to balanced performance, I think, but I'm not sure.
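
To check what a given machine actually reports, a minimal sketch along these lines could be used. It assumes the standard cpufreq sysfs layout; the CPUFREQ_ROOT variable and the show_cpufreq_policy function name are illustrative, not part of any package. The energy_performance_preference file is only present when intel_pstate manages P-states in hardware (HWP), so its absence is reported as "n/a".

```shell
# Root of the cpufreq sysfs tree. Overridable so the function can be
# pointed at a copy of the tree (CPUFREQ_ROOT is an illustrative name).
CPUFREQ_ROOT="${CPUFREQ_ROOT:-/sys/devices/system/cpu}"

# Print "<cpu> <governor> <energy profile>" for every CPU policy found.
# The energy profile column is "n/a" when the sysfs file does not exist.
show_cpufreq_policy() {
    for dir in "$CPUFREQ_ROOT"/cpu[0-9]*/cpufreq; do
        # Skip CPUs without a readable policy (also covers an unmatched glob).
        [ -r "$dir/scaling_governor" ] || continue
        cpu=$(basename "$(dirname "$dir")")
        gov=$(cat "$dir/scaling_governor")
        if [ -r "$dir/energy_performance_preference" ]; then
            epp=$(cat "$dir/energy_performance_preference")
        else
            epp="n/a"
        fi
        echo "$cpu $gov $epp"
    done
}
```

Calling show_cpufreq_policy prints one line per CPU, which makes it easy to see both the governor and the energy profile in effect.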

Whether performance governor is faster than powersave governor is not even clear. https://www.phoronix.com/scan.php?page=article&item=linux50-pstate-cpufreq&num=5 benchmarked them, but did not benchmark the individual energy profiles.

For a desktop/laptop, the expected behavior is the powersave governor with balanced_performance on AC and balanced_power on battery.
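
The AC/battery policy described above could be sketched roughly as follows. This is only an illustration under stated assumptions: POWER_SUPPLY_ROOT, on_ac_power and preferred_energy_profile are hypothetical names, and the kernel spells the two profiles balance_performance and balance_power in energy_performance_preference.

```shell
# Root of the power supply sysfs tree, overridable for testing
# (POWER_SUPPLY_ROOT is an illustrative name).
POWER_SUPPLY_ROOT="${POWER_SUPPLY_ROOT:-/sys/class/power_supply}"

# Succeed if any supply of type "Mains" reports online=1.
on_ac_power() {
    for supply in "$POWER_SUPPLY_ROOT"/*; do
        [ -r "$supply/type" ] && [ -r "$supply/online" ] || continue
        if [ "$(cat "$supply/type")" = "Mains" ] &&
           [ "$(cat "$supply/online")" = "1" ]; then
            return 0
        fi
    done
    return 1
}

# Print the energy profile matching the current power source:
# balance_performance on AC, balance_power otherwise.
preferred_energy_profile() {
    if on_ac_power; then
        echo balance_performance
    else
        echo balance_power
    fi
}
```

The printed value would then be written, as root, to each CPU's energy_performance_preference file. Note that this sketch treats a machine with no power_supply entries (a typical desktop) as if it were on battery; a real implementation would need to handle that case explicitly.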

I don't know about servers or VMs, but the benchmark series seems to indicate it does not really matter much performance wise.

I think most other distributions configure their kernels to use the powersave governor by default, whereas we configure it to use the performance governor and then switch it later in the boot to get the maximum performance during bootup. It's not clear to me that's actually useful.

Julian Andres Klode (juliank) wrote :

Someone probably needs to look at non-pstate systems as I have no idea about them.

summary: - Bring back ondemand.service - pstate now defaults to performance governor
+ Bring back ondemand.service or switch kernel default governor for pstate - pstate now defaults to performance governor

Sebastien Bacher (seb128) wrote :

Tagging rls-gg-incoming so it gets reviewed. This has a performance impact on desktop and ideally should have been discussed before landing rather than after the fact.

tags: added: rls-gg-incoming

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1885730

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Balint Reczey (rbalint) wrote :

The commit message removing ondemand.service has several bug references, too:
https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=65f46a7d14b335e5743350dbbc5b5ef1e72826f7

remove Ubuntu-specific ondemand.service
New processors handle scaling/throttling in internal firmware
(e.g. intel_pstate), and do not require OS config.

Additionally, nobody else does this, not even Debian.

And finally, this has caused problems for years, e.g.:

https://bugs.launchpad.net/ubuntu/+source/sysvinit/+bug/1497375
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1503773
https://bugs.launchpad.net/ubuntu/+source/sysvinit/+bug/1480320
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1579278
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1806012
https://bugs.launchpad.net/charm-sysconfig/+bug/1873028

IMO the kernel is a better place for setting the default governor properly and can even set different governors in cloud-specific kernels.
If the decision is to control the governor in user space in Ubuntu, I'd prefer a solution shipped in another package, because systemd does too many things already.

Dimitri John Ledkov (xnox) wrote :

@balint

The kernel has no facility to start up in one mode and later transition to another.

I think maybe we should measure the difference between "performance, then on demand" vs "balanced performance".

If the difference is not significant, maybe we can simply change the kernel default to "balanced performance"

Julian Andres Klode (juliank) wrote :

@rbalint As said before the kernel messages and bugs are irrelevant and wrong. They pretend like intel_pstate is different, when in fact it's this script that is configuring it here. And yes, it needs OS config.

Nor is it true that other distros do not do this; we just do it differently. We set the CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y option and then transition away from it, to powersave, at the end of the boot. Other distributions do not set CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE, and hence the kernel has powersave as the default.

By removing this script, we are diverging from other distros, not becoming closer to them - for pstates, anyway.

Changed in linux (Ubuntu):
status: Incomplete → New
status: New → Confirmed
tags: removed: rls-gg-incoming
Balint Reczey (rbalint) wrote :

@xnox @juliank IMO there is no real need for different boot time and post-boot governor.

I think where we would like to save power or be less noisy the (possibly) faster boot does not have huge impact on user satisfaction.

I agree that fans should not be on all the time in laptops/desktops.

If we agree that there is no need for separate boot time and post-boot governor then I think the proper place to set the right default is the kernel.

I'd love to hear the Kernel Team's opinion on the matter because there was quite a lot of discussion in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1579278 .

tags: added: id-5efdfa465220b783b19272c2
Balint Reczey (rbalint) wrote :

I have a freshly installed 20.10 system running on a 2012 MacBook Air (MBA 5,2) and it is completely silent and cold when being idle:

rbalint@chaos:~$ sudo cpupower frequency-info
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: Cannot determine or is not supported.
  hardware limits: 800 MHz - 2.80 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 800 MHz and 2.80 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 915 MHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes
    2600 MHz max turbo 4 active cores
    2600 MHz max turbo 3 active cores
    2600 MHz max turbo 2 active cores
    2800 MHz max turbo 1 active cores

@seb128, @juliank I'm not sure if there is anything to fix in the user space, but please report which laptops you experienced issues with. Those may need firmware/kernel fixes.

Changed in systemd (Ubuntu Groovy):
status: New → Invalid
Julian Andres Klode (juliank) wrote :

The performance governor is the right choice for servers, but not on non-server platforms. It's also not the default kernel setting; it was set because we have the ondemand.service in userspace that can change it back to ondemand (or well, we have the service because of that change in the kernel :D).

Fans do not necessarily spin, and you might not actually notice any significant change in power usage, but the expectation of a desktop user is that the CPU scales its frequencies down. Recent-ish Intel CPUs (Skylake+), like the one in a ThinkPad T480s, manage the pstates in hardware instead of in software like the old MacBook does, and they don't scale down.

If we compare this to Red Hat, what they do is CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y in RHEL and CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y in Fedora.

Power usage, at 3-6% CPU usage:

Powersave: I see 0.9-1.4W power usage on the cores
Performance: I see 1.6-2.5W
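
The comment does not say how these figures were obtained. One possible way to get comparable numbers, offered purely as an assumption, is to sample an Intel RAPL energy counter (for example an energy_uj file under /sys/class/powercap/, which may require root to read) twice and divide the difference by the interval. The helper name average_milliwatts is hypothetical.

```shell
# Average power in milliwatts between two energy_uj readings (microjoules)
# taken $3 seconds apart. Uses integer arithmetic; counter wraparound
# between the two readings is not handled.
#   $1 = first reading, $2 = second reading, $3 = interval in seconds
average_milliwatts() {
    echo $(( ($2 - $1) / ($3 * 1000) ))
}
```

For example, two readings 2,800,000 microjoules apart over 2 seconds give 1400 mW, i.e. 1.4 W.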

Julian Andres Klode (juliank) wrote :

Passing intel_pstate=disable_hwp on the kernel command line causes the kernel to scale the Core i5-8250U down to 1.6 GHz in performance mode, but that's still a bit off from the 900 MHz it scales down to in powersave mode.

I believe Windows also does not run the CPUs in performance mode by default on mobile devices (but in balanced or balanced performance), I don't know about stationary ones.

Performance governor on laptops should be restricted to gamemode.

Colin Ian King (colin-king) wrote :

The choice was made from running analysis on a wide range of Intel machines, old and new. We are trying to select the optimal choice for a wide range of CPUs for a wide range of use cases. Generally speaking, the intel-pstate governor has deeper understanding of the processor features and can access CPU metrics that can guide it to making an informed choice.

From our understanding, the intel-pstate driver should be the optimal choice for Intel Sandy Bridge CPUs onwards. The intel-pstate driver supports only the performance and powersave governors. In benchmarking we didn't observe much computational difference between the two once the CPU is fully loaded. However, when cranking the load up or down, one will discover that the performance setting is more responsive than powersave. The overall compute throughput when fully loaded is the same; it's just that powersave may take a little longer to crank up to full speed.

It makes sense to default to powersave for most scenarios, especially for laptop users.

Pre-Intel Sandy Bridge or non-x86 CPUs will default back to the non-intel pstate governor.

So, question:

Which kernel(s) are you referring to?

Julian Andres Klode (juliank) wrote :

@Colin: I agree with all of that.

Our kernel-side default is not powersave, but performance, across generic and oem, at the very least:

$ grep CPU_FREQ_DEFAULT_GOV_.*=y /boot/config-5.*
/boot/config-5.4.0-26-generic:CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
/boot/config-5.4.0-42-generic:CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
/boot/config-5.6.0-1018-oem:CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
/boot/config-5.6.0-1020-oem:CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y

We used to set that to powersave (and ondemand on non-pstate) in ondemand.service, but have since removed the service in groovy.
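
The selection rule described here (powersave on pstate, ondemand elsewhere) can be sketched as below. This is not the removed service's actual script, only an illustration; choose_governor is a hypothetical helper that takes a list in the format of the scaling_available_governors sysfs file.

```shell
# Print the governor to switch to, given a space-separated list of
# available governors as the single argument. Prefers ondemand and falls
# back to powersave, so intel_pstate (which only offers performance and
# powersave) ends up on powersave. Returns 1 if neither is available.
choose_governor() {
    for wanted in ondemand powersave; do
        for gov in $1; do
            if [ "$gov" = "$wanted" ]; then
                echo "$wanted"
                return 0
            fi
        done
    done
    return 1
}
```

A caller would pass the contents of scaling_available_governors for a CPU and, as root, write the result to that CPU's scaling_governor file.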

I believe the default governor kernel-side outside Ubuntu is usually CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND, which translates to ondemand pre-pstates and powersave on pstates (compare Fedora), whereas enterprise systems usually pick PERFORMANCE too (compare RHEL), probably because most distributions focus on normal end users and enterprise ones on servers and workstations. We don't have that distinction, of course, so I'm not sure what the best way out is: default to powersave/ondemand and make the server installer write performance, or vice versa, default to performance and make ubiquity configure powersave for desktop.

Dimitri John Ledkov (xnox) wrote :

@colin-king @juliank

It feels to me that the oem flavour should default to (powersave/ondemand), as it is more-or-less laptop kernel flavour.

I feel like generic kernel flavour should remain on performance.

I feel like we should have a unit, that for chassis=laptop turns on (powersave/ondemand). Possibly shipped in like procps package. Or there should be like graphical desktop integration to control this (aka game mode).
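
As a rough idea of the check such a unit could run, the sketch below reads the SMBIOS chassis type from sysfs. The names DMI_ROOT and is_laptop_chassis are illustrative, and the set of codes treated as "laptop" (8 Portable, 9 Laptop, 10 Notebook, 14 Sub Notebook, 31 Convertible, 32 Detachable) is an assumption about which form factors should qualify.

```shell
# Directory exposing DMI/SMBIOS identification data, overridable for
# testing (DMI_ROOT is an illustrative name).
DMI_ROOT="${DMI_ROOT:-/sys/class/dmi/id}"

# Succeed if the SMBIOS chassis type looks like a portable machine.
# Fails when the chassis type is unavailable, e.g. on systems without DMI.
is_laptop_chassis() {
    [ -r "$DMI_ROOT/chassis_type" ] || return 1
    case "$(cat "$DMI_ROOT/chassis_type")" in
        8|9|10|14|31|32) return 0 ;;
        *) return 1 ;;
    esac
}
```

A unit could run this check first and only switch to powersave/ondemand when it succeeds, leaving servers and other chassis types on the kernel default.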

Is there a per-chassis-type setting in the kernel? As in, something like CONFIG_WHEN_ON_LAPTOP_DEFAULT_GOV_* ?

Do above actionable things make sense?

Matthieu Baerts (matttbe) wrote :

Hello!

Regarding comment #8, I didn't have the same positive experience on my side. It was closer to what is described in comment #9. See bug 1889479 for more details.

I would suggest switching back to powersave/ondemand either with a new service or the kernel config. Having a dedicated service could be confusing for people who try to change the kernel settings. But it could be more flexible.

Cheers,
Matt

Dan Streetman (ddstreet) wrote :

> I would suggest switching back to powersave/ondemand either with a new service or the kernel config.

re: new service, the existing package cpufrequtils (and related package cpufreqd) provides a configurable service to manage governor settings (and other related settings). The old ondemand service was not configurable at all and caused quite a few unexpected problems, as well as 'battling' (overriding) the cpufrequtils service when it was installed.

> Having a dedicated service could be confusing for people who try to change the kernel settings.

indeed, it was, especially when there were multiple services to (try to) control the settings that conflicted with each other.

Dan Streetman (ddstreet) wrote :

> In benchmarking we didn't observe much computational difference between the two once the CPU is fully loaded. However, cranking up or cranking down the load one will discover that the performance setting is more responsive than powersave.

this is exactly the problem in production environments; workloads can be 'bursty' which can see not-insignificant performance reduction when using powersave. Many enterprise users even go so far as to disable C-states (and ASPM, and APST, etc...).

> It makes sense to default to powersave for most scenarios, especially for laptop users.

for laptop users, yeah. I question if 'most scenarios' is accurate.

Balint Reczey (rbalint) on 2020-08-03
Changed in systemd (Ubuntu Focal):
status: New → Fix Released
Balint Reczey (rbalint) wrote :

I've added the OEM Solutions Group team for awareness. I'm not sure what the final fix will be since servers' and desktops'/laptops' ideal default seem to be different, but most likely the certification tests should be adjusted if we don't end up restoring the previous behaviour of the ondemand.service unconditionally.

The latest LTS release, 20.04 is not affected so the certification test changes are probably not very urgent.
