Comment 1 for bug 1917813

dsmythies (dsmythies-linux-kernel-bugs) wrote:

Created attachment 294171
Graph of load sweep up and down at 347 Hertz.

Consider a steady-state periodic single-threaded workflow, with a work/sleep frequency of 347 Hertz and a load somewhere in the ~75% range at the steady-state operating point.
With the intel-cpufreq CPU frequency scaling driver, the powersave governor, and HWP disabled, it runs indefinitely without any issues.
With the acpi-cpufreq CPU frequency scaling driver and the ondemand governor, it runs indefinitely without any issues.
With the intel-cpufreq CPU frequency scaling driver, the powersave governor, and HWP enabled, it suffers from overruns.
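
A minimal sketch of such a workload follows (my assumed shape, not the actual test program used for the graphs): a single thread does a fixed quantum of work each period, then sleeps until the next absolute deadline, and flags an overrun whenever the work does not finish within the period. LOOPS is a hypothetical value that would need calibrating to give ~75% load at full CPU frequency.

/* periodic.c - build with: gcc -O2 -o periodic periodic.c */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define FREQ_HZ 347
#define LOOPS   1500000ULL  /* hypothetical; calibrate to ~75% of one period */

static int64_t ts_ns(const struct timespec *t)
{
    return (int64_t)t->tv_sec * 1000000000LL + t->tv_nsec;
}

int main(void)
{
    const int64_t period_ns = 1000000000LL / FREQ_HZ;
    struct timespec next, now;
    unsigned long overruns = 0;
    volatile uint64_t sink = 0;

    clock_gettime(CLOCK_MONOTONIC, &next);
    for (;;) {
        /* Fixed quantum of work: ~75% of the period at full CPU
         * frequency, much longer if the PLL is still spun down. */
        for (uint64_t i = 0; i < LOOPS; i++)
            sink += i;

        clock_gettime(CLOCK_MONOTONIC, &now);
        next.tv_nsec += period_ns;
        while (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec++;
        }
        if (ts_ns(&now) > ts_ns(&next))
            fprintf(stderr, "overrun %lu\n", ++overruns);

        /* Sleep for the remainder of the period (absolute deadline,
         * so timing errors do not accumulate). */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    return 0;
}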

Why?

For unknown reasons, HWP seems to incorrectly decide that the processor is idle and spins the PLL down to a very low frequency. Upon exit from the sleep portion of the periodic workflow, it then takes a very long time to ramp back up, on the order of 20 milliseconds (supporting data for that statement will be added in a later posting), so the periodic job is not able to complete its work before the next interval, whereas it normally has plenty of time to spare. Typical worst-case overruns are around 12 milliseconds, or several work/sleep periods (i.e. it takes a very long time to catch up).

The probability of this occurring is about 3%, but varies significantly. Obviously, the recovery time is also a function of EPP, but most of this work has been done with the default EPP of 128. I believe this to be a sampling and anti-aliasing issue, but cannot prove it because HWP is a black box. My best GUESS is:

If the periodic load is busy on a jiffy boundary, such that the tick is on,
then if it is sleeping at the next jiffy boundary, with a pending wake such that idle state 2 was used,
  then if the rest of the system was idle, such that HWP decides to spin down the PLL,
    then it is highly probable that upon exit from idle state 2 the PLL will be too slow to ramp up, and the task will overrun as a result.
Else everything will be fine.

For a 1000 Hz kernel, the above suggests that a work/sleep frequency of 500 Hz should behave in a binary way: either lots of overruns or none. It does.
For a 1000 Hz kernel, the above suggests that a work/sleep frequency of 333.333 Hz should behave in a binary way: either lots of overruns or none. It does.
Note: in all cases the sleep time has to be within the window of opportunity.
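
The arithmetic behind those predictions can be sketched as follows (my framing of the guess, nothing from HWP documentation): when the period is an integer multiple of the 1 millisecond jiffy, every period starts at the same phase relative to the tick, so either every period hits the bad alignment or none does; at 347 Hz the phase drifts each period and only occasionally lines up, hence the ~3% probability.

/* alias.c - build with: gcc -O2 -o alias alias.c -lm */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double jiffy_ms = 1.0;  /* 1000 Hz kernel tick */
    const double freqs_hz[] = { 500.0, 333.333, 347.0 };

    for (int i = 0; i < 3; i++) {
        double period_ms = 1000.0 / freqs_hz[i];
        /* Zero drift: the work/sleep pattern has a fixed phase
         * relative to the tick, so behaviour is binary.  Non-zero
         * drift: the phase sweeps across the jiffy, so the bad
         * alignment occurs only a fraction of the time. */
        double drift_ms = fmod(period_ms, jiffy_ms);

        printf("%8.3f Hz: period %.4f ms, drift per period %.4f ms\n",
               freqs_hz[i], period_ms, drift_ms);
    }
    return 0;
}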

Now, I cannot actually prove whether the idle state 2 involvement is a cause or a consequence, but the problem never occurs with that idle state disabled, albeit at the cost of significant power.
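
For reference, disabling an individual idle state is normally done through the cpuidle sysfs interface; a minimal sketch for idle state 2 on CPU 0 (run as root, and repeat for every CPU):

/* disable_state2.c */
#include <stdio.h>

int main(void)
{
    const char *path = "/sys/devices/system/cpu/cpu0/cpuidle/state2/disable";
    FILE *f = fopen(path, "w");

    if (!f) {
        perror(path);
        return 1;
    }
    fputc('1', f);  /* 1 = disable the state, 0 = re-enable it */
    fclose(f);
    return 0;
}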

Another way this issue can manifest is as an apparently extraordinary idle state exit latency, though that would be rather difficult to isolate as the root cause.

Processors tested:
Intel(R) Core(TM) i5-9600K CPU @ 3.70GHz (mine)
Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz (not mine)

HWP has been around for years, so why am I only reporting this now?

I never owned an HWP-capable processor before. My older i7-2600K-based test computer was getting a little old, so I built a new test computer. I noticed this issue the same day I first enabled HWP. That was months ago (notice the dates on the graphs that will eventually be added to this report), and I tried, repeatedly, to get help from Intel via the linux-pm e-mail list.

Now, given the above system response issue, a new test was developed to focus specifically on it, dubbed the "Inverse Impulse Response" test. It examines in great detail the CPU frequency rise time after a brief (less than 1 millisecond) gap in an otherwise continuous workflow. I'll attach graphs and details in subsequent postings to this bug report.
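
The basic shape of that test can be sketched as follows (my assumed reconstruction, not the actual test program): run a continuous load long enough for HWP to settle at a high frequency, insert a sub-millisecond gap, then time successive fixed work quanta; if the PLL was spun down during the gap, the early quanta take far longer than the steady-state ones.

/* inverse_impulse.c - build with: gcc -O2 -o inverse_impulse inverse_impulse.c */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define GAP_NS 500000L    /* 0.5 millisecond idle gap */
#define QUANTA 64         /* work quanta timed after the gap */
#define LOOPS  200000ULL  /* hypothetical small fixed work quantum */

static int64_t now_ns(void)
{
    struct timespec t;

    clock_gettime(CLOCK_MONOTONIC, &t);
    return (int64_t)t.tv_sec * 1000000000LL + t.tv_nsec;
}

static void work(void)
{
    volatile uint64_t sink = 0;

    for (uint64_t i = 0; i < LOOPS; i++)
        sink += i;
}

int main(void)
{
    struct timespec gap = { 0, GAP_NS };

    /* Continuous load: let HWP settle at a high CPU frequency. */
    for (int i = 0; i < 10000; i++)
        work();

    /* The "inverse impulse": a brief gap in the otherwise
     * continuous workflow. */
    nanosleep(&gap, NULL);

    /* Time each quantum after the gap; slow early quanta reveal
     * the CPU frequency rise time. */
    for (int i = 0; i < QUANTA; i++) {
        int64_t t0 = now_ns();

        work();
        printf("%2d: %lld ns\n", i, (long long)(now_ns() - t0));
    }
    return 0;
}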

While I believe this issue lies entirely within HWP, I have not been able to prove that the kernel did not somehow tell HWP to spin down.

Notes:

CPU affinity does not need to be forced, but sometimes is for data acquisition.
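
When affinity is forced, it is ordinary CPU pinning; a sketch using sched_setaffinity(2) (CPU 3 is a hypothetical choice), equivalent to running the workload under "taskset -c 3":

/* pin.c */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(3, &set);  /* hypothetical CPU choice */
    if (sched_setaffinity(0, sizeof(set), &set)) {
        perror("sched_setaffinity");
        return 1;
    }
    /* ... run the periodic workload from here ... */
    return 0;
}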

1000 Hz kernels were tested back to kernel 5.2; all failed.

Kernel 5.10-rc7 (I have yet to compile 5.10) also fails.

A 250 Hz kernel was tested, and it did not have this issue in this area. Perhaps it occurs elsewhere; I did not look.

Both the teo and menu idle governors were tested, and while both suffer from the unexpected CPU frequency drop, teo seems much worse. However, the failure points for both governors are repeatable.

The test computers were always checked for any throttling log sticky bits, and regardless, they were never anywhere even close to throttling.

Note, however, that every HWP-capable computer I was able to acquire data from had at least one of those sticky bits set after boot, so they need to be reset before any test that might want to examine them afterwards.
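
One way to reset them (a sketch, assuming the msr driver is loaded and using the architectural IA32_THERM_STATUS MSR at 0x19c, whose sticky log bits are software-clearable by writing 0; the package-level thermal status MSR has its own log bits and needs the same treatment):

/* clear_therm_log.c - run as root, after "modprobe msr" */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define IA32_THERM_STATUS 0x19c

int main(void)
{
    int fd = open("/dev/cpu/0/msr", O_WRONLY);
    uint64_t zero = 0;

    if (fd < 0) {
        perror("/dev/cpu/0/msr");
        return 1;
    }
    /* Writing 0 clears the sticky log bits; the status bits are
     * read-only.  Repeat for each CPU of interest. */
    if (pwrite(fd, &zero, sizeof(zero), IA32_THERM_STATUS) != sizeof(zero))
        perror("pwrite");
    close(fd);
    return 0;
}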