Comment 6 for bug 1806012

Trent Lloyd (lathiat) wrote :

Confirmed that pcc-cpufreq *can* be used in preference to intel_pstate, even on a CPU that supports intel_pstate, if the firmware tables are set up to request it. One such server is an E5-2630 v3 HP DL360 G9 (shuckle).

On the default "dynamic" firmware setting you get driver=pcc-cpufreq + governor=ondemand; with the "OS Control" setting you get driver=intel_pstate + governor=powersave.
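
To check which combination a given machine ended up with, the standard cpufreq sysfs files can be read directly. The following is a minimal sketch, not something from the bug itself: the function name `read_cpufreq_policy` and its `sysfs_root` parameter are made up for illustration, and it assumes the usual `/sys/devices/system/cpu/cpuN/cpufreq/` layout.

```python
from pathlib import Path


def read_cpufreq_policy(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: (driver, governor)} for every CPU exposing cpufreq.

    `sysfs_root` is a parameter only so the function can be pointed at a
    test directory; on a real system the default is what you want.
    """
    result = {}
    # "cpu[0-9]*" skips sibling directories such as "cpufreq" and "cpuidle".
    for cpufreq_dir in sorted(Path(sysfs_root).glob("cpu[0-9]*/cpufreq")):
        driver_file = cpufreq_dir / "scaling_driver"
        governor_file = cpufreq_dir / "scaling_governor"
        if not (driver_file.is_file() and governor_file.is_file()):
            continue
        result[cpufreq_dir.parent.name] = (
            driver_file.read_text().strip(),
            governor_file.read_text().strip(),
        )
    return result


if __name__ == "__main__":
    for cpu, (driver, governor) in read_cpufreq_policy().items():
        print(f"{cpu}: driver={driver} governor={governor}")
```

Running it once under each firmware setting should show whether the driver really switches between pcc-cpufreq and intel_pstate on the hardware in question.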

As above, this would explain why the very poor performance is only seen without "OS Control" set, and even then only on some hardware. Since the firmware is in control of the CPU power states in pcc-cpufreq mode, the exact frequencies, the rate at which they are changed, etc. are partly under BIOS control. In addition, the kernel uses an entirely separate code path for deciding when and how to choose these frequencies.

Note that when pcc-cpufreq is in use, the startup script (xenial:/etc/init.d/ondemand, bionic:/lib/systemd/set-cpufreq) will use ondemand and not powersave (contrary to what the bug report description states). If a system using a cpufreq driver such as pcc-cpufreq or acpi-cpufreq is somehow getting the powersave governor set, this is a bug, but I haven't seen any case where that would be true so far.
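
The behaviour described above is consistent with a script that picks the first governor from a preference list that the driver actually offers. The sketch below illustrates that idea only: the function name `pick_boot_governor` and the preference order are assumptions, not copied from either startup script. It takes the contents of `scaling_available_governors` as a string.

```python
def pick_boot_governor(available, preference=("ondemand", "powersave")):
    """Return the first governor in `preference` that appears in `available`.

    `available` is the whitespace-separated text read from
    scaling_available_governors. The preference order is an assumption
    made for illustration. Returns None if nothing matches.
    """
    offered = available.split()
    for name in preference:
        if name in offered:
            return name
    return None
```

With a driver that lists ondemand, this returns ondemand. With intel_pstate, which offers only performance and powersave, it falls through to powersave, matching the two combinations observed on the DL360 G9.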

Also note that in Xenial, the ondemand script runs "sleep 60" before setting the governor, apparently to let most desktops boot to the login screen. So any method that tries to override this setting may fail on Xenial if it runs before the 60 seconds are up (e.g. /etc/rc.local, an init script, sysctl, etc.).

I did find that we have one other method of setting the governor: the ~canonical-bootstack/sysconfig charm, which had an option added to allow setting the governor to performance (though it doesn't default to that). This charm installs the cpufrequtils package, which also seems to default to 'ondemand'. However, if this charm was configured with governor=powersave on such a cpufreq system, we would expect very poor performance. Also, when configured with governor=performance on Xenial, it runs before the 'ondemand' script finishes its 60 second wait, so the change gets reverted after a reboot. It does work when first deployed, as long as no reboot is done. (Bug: https://bugs.launchpad.net/bootstack-ops/+bug/1822774)

To my mind this leaves two remaining questions:
 - Are we ever getting into a state where we have scaling-driver=pcc-cpufreq or acpi-cpufreq, but governor=powersave? Such a case is likely a bug. I haven't found any such case yet, unless someone deployed the sysconfig charm with governor=powersave explicitly set (which I have not ruled out).

 - Is there some specific hardware where scaling-driver=pcc-cpufreq and scaling-governor=ondemand performs poorly? I have yet to run a benchmark on my example hardware to find out.
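
For the first question, the suspect combination is easy to test for mechanically once the driver and governor of each CPU are known. This is a rough sketch with made-up names (`is_suspect`, `suspect_cpus`); it expects a mapping of CPU name to a (driver, governor) pair, however that was collected.

```python
# Drivers for which governor=powersave is considered a likely bug here.
SUSPECT_DRIVERS = {"pcc-cpufreq", "acpi-cpufreq"}


def is_suspect(driver, governor):
    """True for the combination the first question asks about."""
    return driver in SUSPECT_DRIVERS and governor == "powersave"


def suspect_cpus(policies):
    """Return the sorted CPU names whose (driver, governor) pair is suspect.

    `policies` maps a CPU name such as "cpu0" to a (driver, governor) tuple.
    """
    return sorted(
        cpu
        for cpu, (driver, governor) in policies.items()
        if is_suspect(driver, governor)
    )
```

intel_pstate with powersave is deliberately not flagged, since that is the expected result of the "OS Control" setting.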