Please try the attached patch. With it, we no longer enable ECRC generation or checking on a device that does not support them. Currently (without this patch), we *do* enable ECRC generation and checking whenever _HPX allows it, i.e., if the platform can support ECRC, regardless of what the device itself advertises.
The ACPI dump confirms my theory from comment #65 that the system supplies an _HPX method with PCI_ERR_CAP_ECRC_GENE and PCI_ERR_CAP_ECRC_CHKE set (see disassembly below).
Apparently the Intel Quick Assist card is at 85:00.0 ("Intel Corporation DH895XCC Series QAT"). Here's the path leading to it:
pci 0000:80:02.0: [8086:6f04] # Xeon D PCI Express Root Port 2
pci 0000:80:02.0: PCI bridge to [bus 83-86]
pci 0000:83:00.0: [10b5:8724] # PLX 8724 Upstream Port
pci 0000:83:00.0: PCI bridge to [bus 84-86]
pci 0000:84:00.0: [10b5:8724] # PLX 8724 Downstream Port
pci 0000:84:00.0: PCI bridge to [bus 85]
pci 0000:85:00.0: [8086:0435] # DH895XCC Series QAT
Here are the ECRC settings along the path:
80:02.0: AERCap: GenCap+ CGenEn+ ChkCap+ ChkEn+
83:00.0: AERCap: GenCap+ CGenEn+ ChkCap+ ChkEn+
84:00.0: AERCap: GenCap+ CGenEn+ ChkCap+ ChkEn+
85:00.0: AERCap: GenCap- CGenEn+ ChkCap- ChkEn+
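The same check can be done mechanically. Here is a minimal sketch that takes the AERCap flags exactly as quoted above and reports any Enable bit that is set without its matching Capable bit. The helper names (parse_flags, inconsistent) are made up for illustration and are not part of lspci or the kernel.

```python
import re

# AERCap flags as quoted above, keyed by device address.
AERCAP = {
    "80:02.0": "GenCap+ CGenEn+ ChkCap+ ChkEn+",
    "83:00.0": "GenCap+ CGenEn+ ChkCap+ ChkEn+",
    "84:00.0": "GenCap+ CGenEn+ ChkCap+ ChkEn+",
    "85:00.0": "GenCap- CGenEn+ ChkCap- ChkEn+",
}


def parse_flags(text):
    """Turn 'GenCap+ CGenEn-' into {'GenCap': True, 'CGenEn': False}."""
    return {name: sign == "+" for name, sign in re.findall(r"(\w+)([+-])", text)}


def inconsistent(flags):
    """Return the Enable flags that are set without the matching Capable flag."""
    bad = []
    if flags["CGenEn"] and not flags["GenCap"]:
        bad.append("CGenEn")
    if flags["ChkEn"] and not flags["ChkCap"]:
        bad.append("ChkEn")
    return bad


# Keep only the devices that have at least one inconsistent Enable bit.
suspect = {}
for dev, line in AERCAP.items():
    bad = inconsistent(parse_flags(line))
    if bad:
        suspect[dev] = bad

print(suspect)
```

Only 85:00.0 is reported, with both CGenEn and ChkEn flagged.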
This looks suspect because 85:00.0 claims that it does not support ECRC Generation ("GenCap-") or ECRC Checking ("ChkCap-"), yet we set the Enable bits for both features. The workaround in the initial report turns off ECRC checking in 80:02.0. I suspect that turning off ECRC generation and checking in 85:00.0, e.g., "setpci -s85:00.0 118.w=0" would also be a workaround. This patch should be the equivalent of this setpci command.
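To make the intent concrete, here is a rough model of the masking described above. It is not the attached patch, just an illustration in Python. The bit values are the PCI_ERR_CAP_* constants from the Linux pci_regs.h header; the function name mask_unsupported_ecrc is hypothetical. The register values are assumed from the lspci flags above, ignoring the First Error Pointer and other fields.

```python
# Bits in the AER Advanced Error Capabilities and Control register
# (values as defined in Linux include/uapi/linux/pci_regs.h).
PCI_ERR_CAP_ECRC_GENC = 0x20   # ECRC Generation Capable
PCI_ERR_CAP_ECRC_GENE = 0x40   # ECRC Generation Enable
PCI_ERR_CAP_ECRC_CHKC = 0x80   # ECRC Check Capable
PCI_ERR_CAP_ECRC_CHKE = 0x100  # ECRC Check Enable


def mask_unsupported_ecrc(reg):
    """Clear each ECRC Enable bit whose Capable bit is not set (hypothetical helper)."""
    if not reg & PCI_ERR_CAP_ECRC_GENC:
        reg &= ~PCI_ERR_CAP_ECRC_GENE
    if not reg & PCI_ERR_CAP_ECRC_CHKC:
        reg &= ~PCI_ERR_CAP_ECRC_CHKE
    return reg


# 85:00.0 as observed: GenCap- CGenEn+ ChkCap- ChkEn+ (assumed other bits zero).
qat_before = PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE
qat_after = mask_unsupported_ecrc(qat_before)

# 80:02.0 as observed: all four bits set, so nothing should change.
root_before = (PCI_ERR_CAP_ECRC_GENC | PCI_ERR_CAP_ECRC_GENE |
               PCI_ERR_CAP_ECRC_CHKC | PCI_ERR_CAP_ECRC_CHKE)
root_after = mask_unsupported_ecrc(root_before)

print(hex(qat_before), "->", hex(qat_after))
print(hex(root_before), "->", hex(root_after))
```

For 85:00.0 both Enable bits end up cleared, which is the state the setpci command forces by writing 0 to the word at 0x118 (presumably the AER capability at 0x100 plus the 0x18 offset of this register). Devices that do advertise the capabilities are left alone.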
Here's the _HPX disassembly from dsdt.dsl (extracted from comment #66):
Device (PCI0)
  ...
  Method (_HPX, 0, NotSerialized)  // _HPX: Hot Plug Parameter Extensions
  {
      Store ("_HPX", Debug)
      Name (SSDH, Package (0x01)
      {
          Package (0x12)
          {
              0x02,
              0x01,
              0xFC000FCF, // Uncorrectable Mask AND
              0x03A18000, // Uncorrectable Mask OR
              0xFC000FCF, // Uncorrectable Severity AND
              0x004E7030, // Uncorrectable Severity OR
              0xFFFF0E3E, // Correctable Mask AND
              0xF1C1,     // Correctable Mask OR
              0xFFFFFEBF, // AER AND
              0x0140,     // AER OR
              0xFFF1,     // Device Control AND
              0x0E,       // Device Control OR
              0xFFFF,     // Link Control AND
              0x00,       // Link Control OR
              0xFFFFC010, // Secondary Uncorrectable Severity AND
              0x1BC0,     // Secondary Uncorrectable Severity OR
              0xFFFFC010, // Secondary Uncorrectable Mask AND
              0x242F      // Secondary Uncorrectable Mask OR
          }
      })
      Store (SSDH, Debug)
      Return (SSDH)
  }
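For reference, here is how the "AER AND" / "AER OR" pair from the package above decodes. This is only a sketch: it assumes the usual _HPX Type 2 rule that each register is updated as (old & AND) | OR, and the apply_hpx and decode helpers are hypothetical names, not kernel functions. The bit values are the PCI_ERR_CAP_* constants from the Linux pci_regs.h header.

```python
# AER AND / AER OR values taken from the _HPX package above.
AER_AND = 0xFFFFFEBF
AER_OR = 0x0140

# ECRC bits in the AER Advanced Error Capabilities and Control register.
ECRC_BITS = {
    "PCI_ERR_CAP_ECRC_GENC": 0x20,   # ECRC Generation Capable
    "PCI_ERR_CAP_ECRC_GENE": 0x40,   # ECRC Generation Enable
    "PCI_ERR_CAP_ECRC_CHKC": 0x80,   # ECRC Check Capable
    "PCI_ERR_CAP_ECRC_CHKE": 0x100,  # ECRC Check Enable
}


def decode(value):
    """Names of the ECRC bits that are set in value (hypothetical helper)."""
    return [name for name, bit in ECRC_BITS.items() if value & bit]


def apply_hpx(reg, and_mask, or_mask):
    """Assumed _HPX Type 2 update rule: AND mask first, then OR mask."""
    return ((reg & and_mask) | or_mask) & 0xFFFFFFFF


# Bits the AND mask clears, and bits the OR mask then sets.
cleared = ~AER_AND & 0xFFFFFFFF
print(hex(cleared), decode(cleared))
print(hex(AER_OR), decode(AER_OR))

# A device with neither capability still ends up with both Enable bits set.
print(hex(apply_hpx(0x0, AER_AND, AER_OR)))
```

The AND mask clears exactly the two Enable bits and the OR mask sets them again, so every device this _HPX applies to gets ECRC generation and checking enabled, whether or not its Capable bits are set.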