MTRR test fails on Kaby Lake Processors

Bug #1781920 reported by Karin Willers
This bug affects 1 person
Affects              Status        Importance  Assigned to  Milestone
Firmware Test Suite  Fix Released  Undecided   Alex Hung

Bug Description

On a system with a Kaby Lake CPU, the 'fwts mtrr' test fails. Here is the relevant output from 'results.log':

mtrr: MTRR tests.
--------------------------------------------------------------------------------
MTRR overview
-------------
Reg 0: 0x0000000080000000 - 0x00000000ffffffff ( 2048 MB) Uncached
Reg 1: 0x000000007e000000 - 0x000000007fffffff ( 32 MB) Uncached
Reg 2: 0x000000007d800000 - 0x000000007dffffff ( 8 MB) Uncached

Test 1 of 3: Validate the kernel MTRR IOMEM setup.
Detected more than one type has been set. From Intel Software Developer Manual
11.11.4.1 MTRR Precedences, If two or more variable memory ranges match and one
of the memory types is UC, the UC memory type used. FWTS will check with type
UNCACHED here.
FAILED [MEDIUM] MTRRIncorrectAttr: Test 1, Memory range 0xc0000000 to 0xcfffffff
(0000:00:02.0) has incorrect attribute Uncached.

Test 2 of 3: Validate the MTRR setup across all processors.
PASSED: Test 2, All processors have the a consistent MTRR setup.

Test 3 of 3: Test for AMD MtrrFixDramModEn being cleared by the BIOS.
SKIPPED: Test 3, CPU is not an AMD, cannot test.

The reason for the failure seems to be the following:

"The reason you can't find most of the memory in the variable-range MTRRs on your Skylake system is that it is not there! The Haswell system has an IA32_MTRR_DEF_TYPE of '00', which maps to Uncached (and is the "traditional" way to program the MTRRs), while the Skylake system has an IA32_MTRR_DEF_TYPE of '06', which maps to the WriteBack type. So on the Skylake, any address range that is not mapped to an MTRR will be assumed to be a WriteBack type, and there is no need for the MTRRs to explicitly map system memory. (See Section 11.11 of Volume 3 of the Intel Architectures SW Developer's Manual, document 325384)."

(Quoted from https://software.intel.com/en-us/forums/software-tuning-performance-optimization-platform-monitoring/topic/720897)
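
The role of the default type in the quote above can be illustrated with a small lookup. This is a minimal Python sketch, not the fwts implementation: the function name `effective_type` and the data layout are assumptions made for illustration, and fixed-range MTRRs (below 1 MB) are ignored. It uses the three variable-range registers from the MTRR overview, falls back to the default type when no register matches, and lets UC win on overlap as described in section 11.11.4.1.

```python
# Illustrative sketch only; names and layout are assumptions, not fwts code.
UC = "Uncached"
WT = "Write-Through"
WB = "Write-Back"

# Variable-range MTRRs copied from the 'MTRR overview' in results.log,
# as (start, end, type) with inclusive end addresses.
VARIABLE_MTRRS = [
    (0x80000000, 0xFFFFFFFF, UC),  # Reg 0, 2048 MB
    (0x7E000000, 0x7FFFFFFF, UC),  # Reg 1, 32 MB
    (0x7D800000, 0x7DFFFFFF, UC),  # Reg 2, 8 MB
]


def effective_type(addr, mtrrs, default_type):
    """Return the memory type that applies to one physical address."""
    matches = {mtype for (start, end, mtype) in mtrrs if start <= addr <= end}
    if not matches:
        # No variable range covers the address: the default type applies.
        return default_type
    if UC in matches:
        # UC takes precedence over any other overlapping type.
        return UC
    if matches == {WT, WB}:
        # WT takes precedence over WB.
        return WT
    if len(matches) == 1:
        return next(iter(matches))
    # Any other overlap is undefined according to the manual.
    return None
```

With a default type of Write-Back (the '06' case from the quote), `effective_type(0xC0000000, VARIABLE_MTRRS, WB)` returns Uncached because Reg 0 covers that address, while an address in low system memory such as 0x10000000 falls through to Write-Back.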

Alex Hung (alexhung)
Changed in fwts:
assignee: nobody → Alex Hung (alexhung)
status: New → In Progress
Alex Hung (alexhung) wrote :

@Karin Willers

IA32_MTRR_DEF_TYPE is defined as MSR 255h in Table B-2 in Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3.

This MSR is used by fwts to get the default mtrr type, as in https://github.com/ColinIanKing/fwts/blob/master/src/bios/mtrr/mtrr.c#L185

Will you be able to check this MSR on your target systems?

Alex Hung (alexhung)
Changed in fwts:
status: In Progress → Incomplete
Alex Hung (alexhung) wrote :

Typo Correction to #1 - IA32_MTRR_DEF_TYPE is MSR 2FFh, not 255h
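
One way to check this MSR on Linux is through the `msr` kernel module, which exposes each CPU's registers as `/dev/cpu/N/msr`; reading 8 bytes at a file offset equal to the MSR number returns its value. The following Python sketch assumes that module is loaded and that the script runs as root. The helper name `read_msr` and its `path` parameter are hypothetical, introduced here for illustration.

```python
import os
import struct

IA32_MTRR_DEF_TYPE = 0x2FF  # MSR address from the correction above


def read_msr(msr, cpu=0, path=None):
    """Read one 64-bit MSR via the Linux msr device (needs root and 'modprobe msr')."""
    # 'path' is a hypothetical override, useful for testing against a plain file.
    path = path or f"/dev/cpu/{cpu}/msr"
    fd = os.open(path, os.O_RDONLY)
    try:
        # The file offset selects the MSR; each read returns 8 bytes.
        raw = os.pread(fd, 8, msr)
    finally:
        os.close(fd)
    if len(raw) != 8:
        raise OSError(f"short read for MSR {msr:#x}")
    # MSR values are returned as little-endian 64-bit integers.
    return struct.unpack("<Q", raw)[0]
```

On a target system, `hex(read_msr(IA32_MTRR_DEF_TYPE))` would then give the raw register value for CPU 0.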

Karin Willers (karin-n) wrote : Re: [Bug 1781920] Re: MTRR test fails on Kaby Lake Processors

Hello Alex,

I will do so after my vacation, in the week after next. I have access to both
Skylake and Kaby Lake machines.

Thanks, Karin

Alex Hung (alexhung) wrote :

This may be related not to the IA32_MTRR_DEF_TYPE MSR but to the MTRR registers.

The IA32_MTRR_DEF_TYPE MSR sets the default cache type for memory whose cache type is not specified by other means, as below:

=======================================
11.11.2.1 IA32_MTRR_DEF_TYPE MSR
The IA32_MTRR_DEF_TYPE MSR (named MTRRdefType MSR for the P6 family processors) sets the default properties of the regions of physical memory that are not encompassed by MTRRs.
=======================================

However, the bug description clearly shows that three MTRRs are in use and that their cache type is "uncached". FWTS also cites section 11.11.4.1, which indicates that MTRRs take precedence over the default type.

Karin Willers (karin-n) wrote :

The values of IA32_MTRR_DEF_TYPE on both Skylake and Kaby Lake are 0xc06. Please find attached some relevant system information for these processors. For reference, I included the same information for a Lenovo X220 laptop computer (Dual core Intel Core i5-2450M), where the 'fwts mtrr' run completes without errors.
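
The reported value 0xc06 can be broken down into its fields. The sketch below is a small Python decoder written for illustration (the function name `decode_mtrr_def_type` is an assumption); it relies on the field layout given in the Intel manual, where bits 7:0 hold the default type, bit 10 is the fixed-range enable flag (FE) and bit 11 is the MTRR enable flag (E).

```python
# Memory type encodings used by the MTRRs.
MTRR_TYPES = {
    0x00: "Uncacheable (UC)",
    0x01: "Write-Combining (WC)",
    0x04: "Write-Through (WT)",
    0x05: "Write-Protected (WP)",
    0x06: "Write-Back (WB)",
}


def decode_mtrr_def_type(value):
    """Split a raw IA32_MTRR_DEF_TYPE value into its documented fields."""
    return {
        # Bits 7:0: default memory type for ranges not covered by any MTRR.
        "default_type": MTRR_TYPES.get(value & 0xFF, "Reserved"),
        # Bit 10 (FE): fixed-range MTRRs enabled.
        "fixed_range_enable": bool(value & (1 << 10)),
        # Bit 11 (E): MTRRs enabled.
        "mtrr_enable": bool(value & (1 << 11)),
    }
```

For 0xc06 this yields a default type of Write-Back with both enable flags set, which matches the '06' default type described in the quoted forum post.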

Alex Hung (alexhung) wrote :

@Karin,

When checking iomem, fwts also checks whether the memory ranges of PCI(e) devices are prefetchable. If the ranges are prefetchable, they should not be uncached in MTRR.

For example, I got these failures on an Intel Skylake system:

FAILED [MEDIUM] MTRRIncorrectAttr: Test 1, Memory range 0xc0000000 to 0xcfffffff
(0000:01:00.0) has incorrect attribute Uncached.
FAILED [MEDIUM] MTRRIncorrectAttr: Test 1, Memory range 0xd0000000 to 0xd1ffffff
(0000:01:00.0) has incorrect attribute Uncached.

These are failures because "lspci -vn" (see below) shows that the ranges "0xc0000000 to 0xcfffffff" and "0xd0000000 to 0xd1ffffff" are prefetchable, but the MTRR covering them, "0x00000000c0000000 - 0x00000000ffffffff", is uncached:

00:01.0 0604: 8086:1901 (rev 07) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 122
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 00003000-00003fff
        Memory behind bridge: ec000000-ed0fffff
        Prefetchable memory behind bridge: 00000000c0000000-00000000d1ffffff
        Capabilities: <access denied>
        Kernel driver in use: pcieport

You can check whether this also applies to your tested systems, or upload the output of "lspci -vn" and I will take a look.

Karin Willers (karin-n) wrote :

Please find attached the output of the 'lspci -vn' command on the Kaby Lake machine.

I investigated further and found that the FWTS failures also occur on the customer's
older and other platforms with Intel Ivy Bridge and AMD Bald Eagle processors. So maybe this is a
consistent UEFI issue in setting up the MTRR registers?

Alex Hung (alexhung) wrote :

@Karin,

I checked the lspci log you uploaded, and the error indeed points to a prefetchable memory region as below:

00:02.0 0300: 8086:591d (rev 04) (prog-if 00 [VGA controller])
 DeviceName: Onboard - Video
 Subsystem: 1775:bc20
 Flags: bus master, fast devsel, latency 0, IRQ 317
 Memory at de000000 (64-bit, non-prefetchable) [size=16M]
 Memory at c0000000 (64-bit, prefetchable) [size=256M]
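
The comparison between the lspci output and the MTRR overview can be sketched in a few lines of Python. This is an illustration only and not how fwts implements the check: the regular expression, the helper names and the sample text are assumptions, and real lspci output may carry extra flags that this pattern does not handle. It extracts prefetchable "Memory at" ranges and flags those that overlap the uncached Reg 0 range from the bug description.

```python
import re

SIZE_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

# Matches lines such as: Memory at c0000000 (64-bit, prefetchable) [size=256M]
BAR_RE = re.compile(
    r"Memory at ([0-9a-f]+) \((?:32|64)-bit, (non-)?prefetchable\) "
    r"\[size=(\d+)([KMG]?)\]"
)


def prefetchable_ranges(lspci_text):
    """Return inclusive (start, end) ranges of prefetchable memory BARs."""
    ranges = []
    for m in BAR_RE.finditer(lspci_text):
        if m.group(2):  # the "non-" prefix is present: skip this BAR
            continue
        start = int(m.group(1), 16)
        size = int(m.group(3)) * SIZE_UNITS[m.group(4)]
        ranges.append((start, start + size - 1))
    return ranges


def overlaps(a, b):
    """True if two inclusive ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


# Two lines taken from the lspci excerpt above.
SAMPLE = """\
 Memory at de000000 (64-bit, non-prefetchable) [size=16M]
 Memory at c0000000 (64-bit, prefetchable) [size=256M]
"""

# Reg 0 from the MTRR overview in the bug description (Uncached).
UNCACHED_MTRR = (0x80000000, 0xFFFFFFFF)

flagged = [r for r in prefetchable_ranges(SAMPLE) if overlaps(r, UNCACHED_MTRR)]
for start, end in flagged:
    print(f"prefetchable range {start:#x} to {end:#x} is covered by an uncached MTRR")
```

For the sample lines, the only flagged range is 0xc0000000 to 0xcfffffff, the same range named in the MTRRIncorrectAttr failure.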

I also reviewed the results.log files I collected at the previous UEFI Plugfest; a few systems had similar failures (and many did not). So I guess there are other affected systems out there too.

I will work with chipset and firmware vendors so they are aware of this issue.

Alex Hung (alexhung) wrote :

The Intel® 64 and IA-32 Architectures Software Developer's Manual provides guidance on how to choose a memory type, as below:

================================================
11.3.2 Choosing a Memory Type
The simplest system memory model does not use memory-mapped I/O with read or write side effects, does not include a frame buffer, and uses the write-back memory type for all memory. An I/O agent can perform direct memory access (DMA) to write-back memory and the cache protocol maintains cache coherency.
A system can use strong uncacheable memory for other memory-mapped I/O, and should always use strong uncacheable memory for memory-mapped I/O with read side effects.
================================================

As the excerpt points out, uncacheable is not an invalid choice.

Alex Hung (alexhung) wrote :

An fwts patch was sent to the mailing list for review (http://patchwork.ozlabs.org/patch/998794/)

Alex Hung (alexhung)
Changed in fwts:
status: Incomplete → Opinion
status: Opinion → Fix Committed
Ivan Hu (ivan.hu) wrote :

Clarification from the Intel Graphics team:

"We do not have such requirement to keep GMADR BAR as "Write Combined". We only need GMADR to be WC for GOP to optimize BIOS logo displaying, it is as design to switch GMADR back to UC for other memory type utilization before transferring control to OS. BIOS need to set the GMADR to WC during POST for improving GOP performance, and restore it back to UC before boot to reduce the MTRR variable usage."

Alex Hung (alexhung)
Changed in fwts:
status: Fix Committed → Fix Released