HWP and C1E are incompatible - Intel processors

Bug #1917813 reported by Doug Smythies
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)

Bug Description

Modern Intel processors (since Skylake) with HWP (Hardware P-state) control enabled and idle state 2, C1E, enabled can incorrectly drop the CPU frequency, with an extremely slow recovery time.

The fault is not within HWP itself, but within the internal idle detection logic. One difference between OS driven pstate control and HWP driven pstate control is that the OS knows the system was not actually idle, but HWP does not. Another difference is the incredibly sluggish recovery with HWP.

The problem only occurs when Idle State 2, C1E, is involved. Not all processors have the C1E idle state. The issue is independent of C1E auto-promotion, which is turned off in general, as far as I know.
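To check whether a given machine is even a candidate for this issue, one can look for a C1E idle state and for the HWP CPU flag. The following is a minimal sketch, not part of the original report. It assumes the Linux cpuidle sysfs layout (`/sys/devices/system/cpu/cpu0/cpuidle/stateN/name`), that the idle driver reports the state under the name `C1E` (as intel_idle typically does), and that HWP-capable processors list `hwp` in the `flags` line of `/proc/cpuinfo`. The helper names `find_idle_state` and `has_cpu_flag` are hypothetical.

```python
from pathlib import Path


def find_idle_state(name, cpuidle_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return the index of the idle state called `name`, or None if absent.

    Assumes the cpuidle sysfs layout: <cpuidle_dir>/stateN/name.
    """
    for state_dir in Path(cpuidle_dir).glob("state[0-9]*"):
        if (state_dir / "name").read_text().strip() == name:
            # Directory is named "stateN"; strip the prefix to get N.
            return int(state_dir.name[len("state"):])
    return None


def has_cpu_flag(flag, cpuinfo_text):
    """Return True if `flag` is listed in the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Compare whole tokens so "hwp" does not match "hwp_notify".
            return flag in line.split(":", 1)[1].split()
    return False
```

On a live system, `find_idle_state("C1E")` returning 2 together with `has_cpu_flag("hwp", Path("/proc/cpuinfo").read_text())` returning True would match the configuration described here. Note that the `hwp` flag only indicates that the processor supports HWP, not that it is currently in use.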

With all idle states enabled, the issue is rare. It would manifest itself in periodic workflows, and would be extremely difficult to isolate (it took me over half a year).

The purpose of this bug report is to link to the upstream bug report, where readers can find tons of detail. I'll also set it to confirmed, as it has already been verified on 4 different processor models, and I do not want the bot asking me for files that are not required.

Workarounds include:
. Don't use HWP.
. Disable idle state 2, C1E.
. Change the C1E idle state to use MWAIT 0x03 instead of MWAIT 0x01 (still in test; documentation on the MWAIT least significant nibble is scant).
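As an illustration of the second workaround, an idle state can be disabled at runtime through the cpuidle sysfs `disable` file of each CPU. The sketch below is not from the original report. It assumes the standard layout `/sys/devices/system/cpu/cpuN/cpuidle/stateM/disable`, needs root privileges on a real system, and does not persist across reboots. The function name `set_idle_state_disabled` is hypothetical. For the first workaround, the `intel_pstate=no_hwp` kernel command line parameter is one way to avoid HWP.

```python
from pathlib import Path


def set_idle_state_disabled(index, disabled, cpu_root="/sys/devices/system/cpu"):
    """Write 1 (disable) or 0 (enable) to stateN/disable on every CPU.

    Returns the list of sysfs paths that were written.
    Assumes the cpuidle layout <cpu_root>/cpuN/cpuidle/state<index>/disable.
    """
    written = []
    pattern = f"cpu[0-9]*/cpuidle/state{index}/disable"
    for path in sorted(Path(cpu_root).glob(pattern)):
        path.write_text("1" if disabled else "0")
        written.append(path)
    return written
```

Calling `set_idle_state_disabled(2, True)` as root would disable idle state 2 on all CPUs, which corresponds to C1E only if C1E is actually state 2 on that machine. Calling it with `False` re-enables the state.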

Changed in linux (Ubuntu):
status: New → Confirmed
summary: - HWP and C1E are incompatible - Intel prcoessors
+ HWP and C1E are incompatible - Intel processors
Changed in linux:
importance: Unknown → Medium
status: Unknown → Confirmed