AER: Corrected error received: id=00e0

Bug #1521173 reported by David Henningsson
This bug affects 151 people
Affects          Status    Importance  Assigned to  Milestone
Linux            Unknown   Medium
linux (Ubuntu)   Triaged   Medium      Unassigned
Xenial           Triaged   Medium      Unassigned

Bug Description

WORKAROUND: add pci=noaer to your kernel command line:

1) edit /etc/default/grub and add pci=noaer to the line starting with GRUB_CMDLINE_LINUX_DEFAULT. It will look like this:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=noaer"
2) run "sudo update-grub"
3) reboot
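
For anyone scripting step 1, here is a minimal Python sketch of the edit. It assumes /etc/default/grub holds a single double-quoted GRUB_CMDLINE_LINUX_DEFAULT line, as in the example above. The function name add_kernel_param is made up for illustration, and it only transforms text: writing the file back and running "sudo update-grub" remain manual steps.

```python
import re


def add_kernel_param(grub_text, param="pci=noaer"):
    """Return grub_text with param present in GRUB_CMDLINE_LINUX_DEFAULT.

    Hypothetical helper: it assumes the value is wrapped in double quotes
    and leaves every other line untouched.
    """
    pattern = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)

    def patch(match):
        params = match.group(2).split()
        if param not in params:  # do not add the parameter twice
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    return pattern.sub(patch, grub_text)


sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_kernel_param(sample))
```

Running the function a second time on its own output changes nothing, so it should be safe to re-run.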

----

My dmesg gets completely spammed with the following messages appearing over and over again. It stops after one S3 (suspend/resume) cycle; it only happens after a reboot.

[ 5315.986588] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[ 5315.987249] pcieport 0000:00:1c.0: can't find device of ID00e0
[ 5315.995632] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[ 5315.995664] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
[ 5315.995674] pcieport 0000:00:1c.0: device [8086:9d14] error status/mask=00000001/00002000
[ 5315.995683] pcieport 0000:00:1c.0: [ 0] Receiver Error
[ 5316.002772] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[ 5316.002811] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
[ 5316.002826] pcieport 0000:00:1c.0: device [8086:9d14] error status/mask=00000001/00002000
[ 5316.002838] pcieport 0000:00:1c.0: [ 0] Receiver Error
[ 5316.009926] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[ 5316.009964] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
[ 5316.009979] pcieport 0000:00:1c.0: device [8086:9d14] error status/mask=00000001/00002000
[ 5316.009991] pcieport 0000:00:1c.0: [ 0] Receiver Error
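
The id= field in these messages is a 16-bit PCI requester ID: 8 bits of bus, 5 bits of device, 3 bits of function. The sketch below decodes it into the bus:device.function form that lspci prints. The function name decode_aer_id is made up for illustration.

```python
def decode_aer_id(id_hex):
    """Decode a 16-bit PCI requester ID (as printed after id=) into bus:dev.fn."""
    value = int(id_hex, 16)
    bus = (value >> 8) & 0xFF      # upper 8 bits
    device = (value >> 3) & 0x1F   # next 5 bits
    function = value & 0x7         # lowest 3 bits
    return f"{bus:02x}:{device:02x}.{function:x}"


print(decode_aer_id("00e0"))  # 00:1c.0
```

So id=00e0 names the root port 00:1c.0 itself, which matches the pcieport address at the start of each line.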

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.2.0-19-generic 4.2.0-19.23 [modified: boot/vmlinuz-4.2.0-19-generic]
ProcVersionSignature: Ubuntu 4.2.0-19.23-generic 4.2.6
Uname: Linux 4.2.0-19-generic x86_64
ApportVersion: 2.19.2-0ubuntu8
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/pcmC0D0c: david 1502 F...m pulseaudio
 /dev/snd/controlC0: david 1502 F.... pulseaudio
CurrentDesktop: Unity
Date: Mon Nov 30 13:19:00 2015
EcryptfsInUse: Yes
HibernationDevice: RESUME=UUID=fe528b90-b4eb-4a20-82bd-6a03b79cfb14
InstallationDate: Installed on 2015-11-28 (2 days ago)
InstallationMedia: Ubuntu 16.04 LTS "Xenial Xerus" - Alpha amd64 (20151127)
MachineType: Dell Inc. Inspiron 13-7359
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.2.0-19-generic.efi.signed root=UUID=94d54f88-5d18-4e2b-960a-8717d6e618bb ro noprompt persistent quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-4.2.0-19-generic N/A
 linux-backports-modules-4.2.0-19-generic N/A
 linux-firmware 1.153
SourcePackage: linux
UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/07/2015
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 01.00.00
dmi.board.name: 0NT3WX
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr01.00.00:bd08/07/2015:svnDellInc.:pnInspiron13-7359:pvr:rvnDellInc.:rn0NT3WX:rvrA00:cvnDellInc.:ct9:cvr:
dmi.product.name: Inspiron 13-7359
dmi.sys.vendor: Dell Inc.

Revision history for this message
David Henningsson (diwic) wrote :
Revision history for this message
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.4 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4-rc3-wily

tags: added: kernel-da-key
Changed in linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
David Henningsson (diwic) wrote :

I've tried upgrading BIOS to 1.2.0 (latest version on Dell website) and also with the v4.4-rc3-wily kernel. The dmesg is still spammed with the same error.

tags: added: kernel-bug-exists-upstream
penalvch (penalvch)
tags: added: bios-outdated-1.2.0
tags: added: latest-bios-1.2.0
removed: bios-outdated-1.2.0
Revision history for this message
penalvch (penalvch) wrote :

David Henningsson, pending you've already tested and reproduced in 4.4-rc4, the issue you are reporting is an upstream one. Could you please report this upstream (TO Bjorn Helgaas CC linux-pci) via https://wiki.ubuntu.com/Bugs/Upstream/kernel ?

Please provide a direct URL to your post to the mailing list when it becomes available so that it may be tracked.

Also, could you quantify your description comment "My dmesg gets completely spammed with the following messages appearing over and over again."?

For example, it increases the log file size by 1MB per hour in comparison to when this doesn't happen?

Thank you for your understanding.

tags: added: kernel-bug-exists-upstream-4.4-rc3
Changed in linux (Ubuntu Xenial):
status: Confirmed → Triaged
Revision history for this message
David Henningsson (diwic) wrote :

The spam rate is 150 lines per second. With ~80 characters per line, that's about 50 MB per hour. As a very rough measure.
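
The arithmetic behind that estimate, as a small Python sketch. It assumes one byte per character and ignores syslog prefixes and log rotation; the function name is made up for illustration.

```python
def log_bytes_per_hour(lines_per_second, bytes_per_line):
    """Rough log growth per hour for a steady message rate."""
    return lines_per_second * bytes_per_line * 3600


growth = log_bytes_per_hour(150, 80)
print(f"{growth / 1e6:.1f} MB per hour")  # 43.2 MB per hour
```

That gives 43.2 MB per hour, in line with the "about 50 MB" rough figure.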

Revision history for this message
In , cspadijer (cspadijer-linux-kernel-bugs) wrote :

Created attachment 197891
Collection of outputs from X555U laptop

Good day.
I have updated this laptop to the latest vendor supplied BIOS 204 10/18/2015.

Attempted distribution: Ubuntu mate 15.10.
Had to use acpi=off boot parameter to install linux
Eventually found more hardware worked with the pci=nommconf boot parameter

With pci=nommconf the following still does not work:
- Realtek rtl8821ae 802.11ac PCIe wireless NIC will only run in 2.4GHz mode. 5GHz mode will not work.
- Laptop will not resume after suspend

Many boot errors show in dmesg:
ACPI: AE_NOT_FOUND errors
systemd: failed to insert module 'kdbus' function not implemented

If pci=nommconf is not used as a boot parameter, there is a looping PCIe error message that I can't break out of. From what I can read, it says:
printk messages dropped pcieport 0000:00:... id=00E5(Receiver ID)

In the attached file is the following when pci=nommconf boot parameter used:
sudo output of:
dmesg
uname -a
lspci -vvnn
dmidecode
Tarball of /proc/acpi directory

Note: I am unable to resume from hibernate; everything is frozen. So I am not able to attach a copy of /var/log/kern.log.0

Revision history for this message
Beanow (beanow) wrote :

Confirming same error messages on 4.2.0 kernel from jessie-backports with skylake i7-6700HQ. On pci port 0:1c:0, device ID [8086:a110].

According to lspci -tv this is connected to my Intel 3165 wireless card. Using a manually added ucode from https://wireless.kernel.org/en/users/Drivers/iwlwifi

Can you check with lspci -tv what device is connected to this pci slot?

Revision history for this message
Beanow (beanow) wrote :

Found in your udev file that the slot that triggers the messages also holds a wifi card: Realtek RTL8723BE PCIe Wireless Network Adapter.

So the common ground seems to be: 4.x kernel versions, PCIe wireless cards, an Intel PCIe bus, and a Skylake-series laptop CPU.

Revision history for this message
In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

There are a couple of problems here
1. "pci=nommconf" is needed to boot
2. tpm_crb driver calltrace in dmesg
3. ieee80211_tx calltrace in dmesg
4. hibernate failure

IMO, any of the first three problems may break hibernation, thus we should try to fix the first three issues separately and then check how hibernation goes on this laptop.

Move to PCI category to get Problem 1 fixed first.

Revision history for this message
In , bjorn (bjorn-linux-kernel-bugs) wrote :

Thank you very much for this report. It's a pretty serious problem when we can't boot at all.

"pcieport 0000:00:... id=00E5(Receiver ID)" looks like an AER message. Please try turning off AER with "pci=noaer". If you can boot with "pci=noaer" and without "pci=nommconf", please attach the dmesg log.

Here's a report of another similar AER problem:

  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1521173

Revision history for this message
In , cspadijer (cspadijer-linux-kernel-bugs) wrote :

Created attachment 198481
Updated dmesg with pci=noaer

It booted no problem after replacing pci=nommconf with pci=noaer as suggested. See updated dmesg.txt as requested.

Thanks!

Revision history for this message
David Henningsson (diwic) wrote : Re: Dmesg filled with "AER: Corrected error received"

Hi,

Indeed booting with pci=noaer (as suggested in the other bug) works
around this issue as well. I'll use that for the time being.

Thanks for working on it!

// David

On 2015-12-29 16:58, Bjorn Helgaas wrote:
> On Fri, Dec 18, 2015 at 11:30:33AM +0100, David Henningsson wrote:
>> Hi Linux PCI maintainers,
>>
>> My dmesg gets filled with a few lines repeated over and over again:
>>
>> pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
>> pcieport 0000:00:1c.0: can't find device of ID00e0
>> pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
>> pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected,
>> type=Physical Layer, id=00e0(Receiver ID)
>> pcieport 0000:00:1c.0: device [8086:9d14] error
>> status/mask=00000001/00002000
>> pcieport 0000:00:1c.0: [ 0] Receiver Error
>>
>> This happens 10-30 times per second (!), so dmesg fills up quickly.
>> The bug is present in both vanilla and Ubuntu kernels.
>
> This is a pretty obvious bug in our AER code. We normally clear
> correctable errors by writing the PCI_ERR_COR_STATUS register in
> handle_error_source(). The execution path looks like this:
>
> aer_isr_one_error
> aer_print_port_info
> if (find_source_device())
> aer_process_err_devices
> handle_error_source
> pci_write_config_dword(dev, PCI_ERR_COR_STATUS, ...)
>
> In this case, find_source_device() printed "can't find device of
> ID00e0" [sic] and returned false, so we don't call
> aer_process_err_devices(). The error is never cleared, so
> we discover it again and again.
>
> I'll work on fixing this. Incidentally, there's another report
> with similar symptoms here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=109691
>
> Bjorn
>

--
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic
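
To make the quoted analysis concrete, here is a toy Python model of the control flow Bjorn describes. It is not kernel code. The class, the function names, and the clear_without_source switch are all made up for illustration, and the switch is only one conceivable change, not the fix upstream chose.

```python
class RootPort:
    """Toy stand-in for a root port with one latched correctable error."""

    def __init__(self, source_found):
        self.cor_status = 0x00000001      # Receiver Error bit is set
        self.source_found = source_found  # models find_source_device()


def handle_interrupt(port, clear_without_source=False):
    """One pass of the handler: the status is only cleared on the 'found' path."""
    if port.source_found or clear_without_source:
        port.cor_status = 0  # models the PCI_ERR_COR_STATUS write


def count_reports(port, limit=1000, clear_without_source=False):
    """Count how many times the same error is reported before it is cleared."""
    reports = 0
    while port.cor_status and reports < limit:
        reports += 1
        handle_interrupt(port, clear_without_source)
    return reports


print(count_reports(RootPort(source_found=True)))    # 1
print(count_reports(RootPort(source_found=False)))   # 1000 (stopped by limit)
print(count_reports(RootPort(source_found=False), clear_without_source=True))  # 1
```

In the model, the "can't find device" case never clears the status, so the same error is reported until the artificial limit is hit, which is the flood seen in dmesg.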

Revision history for this message
In , bjorn (bjorn-linux-kernel-bugs) wrote :

Great, thank you! I understand the AER bug (see http://lkml.kernel.org/r/20151229155822.GA17321@localhost); now we just need to figure out a fix.

Revision history for this message
In , cspadijer (cspadijer-linux-kernel-bugs) wrote :

Excellent.
Thanks Bjorn.
Great to see you have isolated the problem.

All the best in 2016!

Any other details you require from me let me know I will update this post.

Cheers!

Revision history for this message
SqUe (sque) wrote :

Same error on Ubuntu GNOME 15.10 running 4.2, 4.3, or 4.4-rc8, and also on Debian testing with 4.3. I randomly get this kind of error:
[ 851.659186] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
[ 851.659208] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
[ 851.659219] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
[ 851.659227] pcieport 0000:00:1c.5: [ 0] Receiver Error (First)

Revision history for this message
SqUe (sque) wrote :

..continuing (pressed post by mistake)

I am on an Intel i5-6200U, and the PCI port is the one that the wireless card is connected to.

lspci -vt
-[0000:00]-+-00.0 Intel Corporation Sky Lake Host Bridge/DRAM Registers
           +-02.0 Intel Corporation Sky Lake Integrated Graphics
           +-14.0 Intel Corporation Device 9d2f
           +-14.2 Intel Corporation Device 9d31
           +-16.0 Intel Corporation Device 9d3a
           +-17.0 Intel Corporation Device 9d03
           +-1c.0-[01]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           +-1c.5-[02]----00.0 Intel Corporation Wireless 3165
           +-1d.0-[03]----00.0 Realtek Semiconductor Co., Ltd. Device 522a
           +-1f.0 Intel Corporation Device 9d48
           +-1f.2 Intel Corporation Device 9d21
           +-1f.3 Intel Corporation Device 9d70
           \-1f.4 Intel Corporation Device 9d23
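
When comparing reports like this one, it can help to pull the "root port -> device behind it" pairs out of the lspci -vt tree. Below is a rough Python sketch. The regular expression is an assumption based only on the tree layout shown in this thread, and the function name is made up for illustration.

```python
import re

# Matches lines such as "+-1c.5-[02]----00.0 Intel Corporation Wireless 3165".
PORT_LINE = re.compile(
    r"[+\\]-([0-9a-f]{2}\.[0-9a-f])"   # root port as device.function
    r"-\[[0-9a-f]{2}(?:-[0-9a-f]{2})?\]-+"  # secondary bus number(s)
    r"[0-9a-f]{2}\.[0-9a-f]\s+(.+)"    # downstream device and its name
)


def devices_behind_ports(lspci_tree):
    """Map each root port (e.g. '1c.5') to the name of the device behind it."""
    ports = {}
    for line in lspci_tree.splitlines():
        match = PORT_LINE.search(line)
        if match:
            ports[match.group(1)] = match.group(2).strip()
    return ports


SAMPLE = r"""
-[0000:00]-+-00.0 Intel Corporation Sky Lake Host Bridge/DRAM Registers
           +-1c.0-[01]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           +-1c.5-[02]----00.0 Intel Corporation Wireless 3165
           \-1f.4 Intel Corporation Device 9d23
"""

print(devices_behind_ports(SAMPLE)["1c.5"])  # Intel Corporation Wireless 3165
```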

The weird thing is that on some boots this error never appears, and on others it may show up early or late, and repeatedly.

Jan W (ubuntu-kiekerjan)
tags: added: kernel-bug-exists-upstream-4.4.1
removed: kernel-bug-exists-upstream-4.4-rc3
Jan W (ubuntu-kiekerjan)
tags: added: wily
Revision history for this message
Jordon Bedwell (envygeeks) wrote :

I still get this problem in Xenial as well... randomly but it happens.

Revision history for this message
In , bugs (bugs-linux-kernel-bugs) wrote :

Looks like I have this same problem (with the same hardware). Adding my name to the list, using Ubuntu's Xubuntu 15.10 distro. The pci=noaer works, although pci=nomsi also works.

Strangely enough, Knoppix 7.6.1 boots just fine. Hmmm...

Revision history for this message
Ehsan (azarnasab) wrote :

On 4.4.8-300.fc23.x86_64 with "Dell Inc. XPS 8900/0XJ8C4, BIOS 2.1.3 01/20/2016" and "i7-6700 CPU @ 3.40GHz (family: 0x6, model: 0x5e, stepping: 0x3)"

```text
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: can't find device of ID00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: device [8086:a110] error status/mask=00000001/00002000
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: [ 0] Receiver Error (First)
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: can't find device of ID00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: can't find device of ID00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: can't find device of ID00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: can't find device of ID00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: can't find device of ID00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: can't find device of ID00e0
May 05 14:02:57 dashesy.wavelet kernel: pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
```

That device is "PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #1 (rev f1)" and is used by "+-1c.0-[02]----00.0 Realtek Semiconductor Co., Ltd. RTL8723BE PCIe Wireless Network Adapter"
`pci=nomsi` solved the problem but so did `pci=noaer` which I will use for now.

I will gladly do debugging if there is a kernel to test.

Revision history for this message
In , cspadijer (cspadijer-linux-kernel-bugs) wrote :

Just an update.
confirmed Kelly Price's discovery: Knoppix 7.6.1 with kernel 4.2.6 boots fine.
Thanks Kelly.

I flash updated the BIOS to latest vendor supplied version 206 (2016/02/24).

Latest Ubuntu 16.04 with kernel 4.4 still has the same problem.

Revision history for this message
Abhishek Bhatia (abhigenie92) wrote :

I tried the suggestion of pci=nomsi but it doesn't fix it. Here are the complete details. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1588428

Revision history for this message
Abhishek Bhatia (abhigenie92) wrote :

Any progress on this bug?

Revision history for this message
e633 (e633) wrote :

Hello, I am affected too. Dell Latitude 3570, kernel 4.4.0-21-generic x86_64. In my case the problematic device seems to be the Qualcomm Atheros AR9462 Wireless Network Adapter. Everything seems to work, though.
Full PC specs: https://paste.debian.net/hidden/03da6511/

Error:
AER: Corrected error received: id=00e0
pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
pcieport 0000:00:1c.0: device [8086:9d14] error status/mask=00003000/00002000
pcieport 0000:00:1c.0: [12] Replay Timer Timeout
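
The status/mask pair can be read bit by bit. The Python sketch below uses the bit positions of the PCIe correctable error status register; the function name is made up for illustration, and the idea that only unmasked bits get a line of their own is inferred from the log above, where status bit 13 is set but only bit 12 is listed.

```python
# Bit positions in the PCIe AER Correctable Error Status register.
COR_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
}


def decode_status_mask(status_mask):
    """Split 'status/mask' hex text into (reported, masked) error names."""
    status_hex, mask_hex = status_mask.split("/")
    status, mask = int(status_hex, 16), int(mask_hex, 16)
    reported = [name for bit, name in COR_BITS.items()
                if status & ~mask & (1 << bit)]
    masked = [name for bit, name in COR_BITS.items()
              if status & mask & (1 << bit)]
    return reported, masked


print(decode_status_mask("00003000/00002000"))
print(decode_status_mask("00000001/00002000"))
```

For 00003000/00002000 this yields Replay Timer Timeout as reported and Advisory Non-Fatal Error as masked; for 00000001/00002000 it yields only Receiver Error.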

#lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Sky Lake Host Bridge/DRAM Registers [8086:1904] (rev 08)
00:02.0 VGA compatible controller [0300]: Intel Corporation Sky Lake Integrated Graphics [8086:1916] (rev 07)
00:14.0 USB controller [0c03]: Intel Corporation Device [8086:9d2f] (rev 21)
00:14.2 Signal processing controller [1180]: Intel Corporation Device [8086:9d31] (rev 21)
00:15.0 Signal processing controller [1180]: Intel Corporation Device [8086:9d60] (rev 21)
00:16.0 Communication controller [0780]: Intel Corporation Device [8086:9d3a] (rev 21)
00:17.0 SATA controller [0106]: Intel Corporation Device [8086:9d03] (rev 21)
00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:9d14] (rev f1)
00:1c.5 PCI bridge [0604]: Intel Corporation Device [8086:9d15] (rev f1)
00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:9d48] (rev 21)
00:1f.2 Memory controller [0580]: Intel Corporation Device [8086:9d21] (rev 21)
00:1f.3 Audio device [0403]: Intel Corporation Device [8086:9d70] (rev 21)
00:1f.4 SMBus [0c05]: Intel Corporation Device [8086:9d23] (rev 21)
01:00.0 Network controller [0280]: Qualcomm Atheros AR9462 Wireless Network Adapter [168c:0034] (rev 01)
02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)

#lspci -vt
-[0000:00]-+-00.0 Intel Corporation Sky Lake Host Bridge/DRAM Registers
           +-02.0 Intel Corporation Sky Lake Integrated Graphics
           +-14.0 Intel Corporation Device 9d2f
           +-14.2 Intel Corporation Device 9d31
           +-15.0 Intel Corporation Device 9d60
           +-16.0 Intel Corporation Device 9d3a
           +-17.0 Intel Corporation Device 9d03
           +-1c.0-[01]----00.0 Qualcomm Atheros AR9462 Wireless Network Adapter
           +-1c.5-[02]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           +-1f.0 Intel Corporation Device 9d48
           +-1f.2 Intel Corporation Device 9d21
           +-1f.3 Intel Corporation Device 9d70
           \-1f.4 Intel Corporation Device 9d23

pci=noaer helps.

Revision history for this message
Игорь (ifree92) wrote :

I have the same "spam" in my dmesg
And as mentioned above, I have an "Intel Corporation Wireless 3165" card connected.

So strange....

Revision history for this message
erika jonell (erika-jonell) wrote :

In order to suppress the error and boot at all, you must add pci=noaer to your kernel boot parameters. You can do it in the install launcher's GRUB menu or during boot, then regenerate your grub.cfg with it included for future boots.

This is not an Ubuntu-specific problem, as I can confirm it exists in other distros as well (Arch for one).

My belief is that it is an issue with Skylake chips and Intel-based motherboards, and with the south-bridge PCI support within the kernel itself.

(I have an i7 6700 and an H110 chipset)

Revision history for this message
Makda (makdamujji) wrote :

This is my dmesg output:
[ 121.716206] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00000000
[ 121.716209] pcieport 0000:00:1c.5: [ 0] Receiver Error (First)
[ 121.716216] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
[ 121.716616] pcieport 0000:00:1c.5: can't find device of ID00e5
[ 121.716619] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
[ 121.717092] pcieport 0000:00:1c.5: can't find device of ID00e5
[ 121.717109] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
[ 121.717129] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)

my lspci:
00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 08)
00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 07)
00:04.0 Signal processing controller: Intel Corporation Skylake Processor Thermal Subsystem (rev 08)
00:14.0 USB controller: Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller (rev 21)
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-LP Thermal subsystem (rev 21)
00:15.0 Signal processing controller: Intel Corporation Sunrise Point-LP Serial IO I2C Controller (rev 21)
00:15.1 Signal processing controller: Intel Corporation Sunrise Point-LP Serial IO I2C Controller (rev 21)
00:16.0 Communication controller: Intel Corporation Sunrise Point-LP CSME HECI (rev 21)
00:17.0 SATA controller: Intel Corporation Sunrise Point-LP SATA Controller [AHCI mode] (rev 21)
00:1c.0 PCI bridge: Intel Corporation Device 9d10 (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port (rev f1)
00:1c.5 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port (rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-LP LPC Controller (rev 21)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-LP PMC (rev 21)
00:1f.3 Audio device: Intel Corporation Sunrise Point-LP HD Audio (rev 21)
00:1f.4 SMBus: Intel Corporation Sunrise Point-LP SMBus (rev 21)
01:00.0 3D controller: NVIDIA Corporation Device 134e (rev a2)
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 10)
03:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8723BE PCIe Wireless Network Adapter

Most probably this has something to do with 6th-gen Intel and Realtek hardware.

Doug McMahon (mc3man)
tags: added: yakketywily
removed: wily
tags: added: wily yakkety
removed: yakketywily
Revision history for this message
JujuLand (alain-aupeix) wrote :

Same bug with a Dell XPS8900.

I can install 12.04, but it fails with 15.10 or 16.04.
Having installed 12.04 and updated to 14.04, I then updated to 16.04. It boots correctly, but syslog and kern.log fill with these messages until / is full (0 bytes free ...)

I tried to boot from the 16.04 DVD, but it was impossible ...

Is there any progress on this bug?

Thanks
A+

Revision history for this message
Bill Michaelson (t-launchpad-bill-from-net) wrote :

I seem to have this issue too, but related to a different device. Running 16.04 with 4.4.0-31-generic. New (used) machine, so very concerning. It ran fine for about an hour, then spontaneously started spewing this:

Jul 26 13:28:05 twin kernel: [ 8.837650] pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Jul 26 13:28:05 twin kernel: [ 8.837665] pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0018(Receiver ID)
Jul 26 13:28:05 twin kernel: [ 8.837675] pcieport 0000:00:03.0: device [8086:d138] error status/mask=00000001/00002000
Jul 26 13:28:05 twin kernel: [ 8.837685] pcieport 0000:00:03.0: [ 0] Receiver Error (First)

lspci -nn gives me a match against this:

00:03.0 PCI bridge [0604]: Intel Corporation Core Processor PCI Express Root Port 1 [8086:d138] (rev 11)

and booting with pci=noaer suppresses the messages with no apparent ill effects.

But I don't know what the message is supposed to mean, and I fear that I am suppressing a valid warning and gambling if I use the machine for serious work. Insights more than welcome. The machine is an ASUS G73Jh laptop (Intel Core i7-720QM @ 1.60GHz / Nehalem 45nm). TIA.

Revision history for this message
David Henningsson (diwic) wrote :

Out of curiosity: first, do all of you have the combination of Skylake + RTL8723BE, and second, do you experience (as I do) that wifi doesn't work very well (often loses connections etc)?

...as the errors seem to indicate some kind of physical error between the Skylake/Sunrise Point host controller and the wifi card.

Revision history for this message
David Henningsson (diwic) wrote :

Btw, I reported mine upstream long ago and got a response from upstream that "I've thought about this problem a bit, but realistically I don't have time to do the fix I'd like to do /.../ Anybody else who is interested should feel free to take a crack at it."

See http://permalink.gmane.org/gmane.linux.kernel.pci/48697

Also, some googling finds a few other reports with very similar symptoms, e.g.:

https://bugzilla.kernel.org/show_bug.cgi?id=111601

https://lkml.org/lkml/2015/9/2/573

description: updated
Revision history for this message
Fabio A. (falemagn) wrote :

Yes David, I've got your exact hw combination and indeed wifi sometimes seems to "get stuck".

A
    sudo modprobe -r rtl8723be

followed by

    sudo modprobe rtl8723be

does the trick of bringing the device back to life most of the time, though.

Revision history for this message
Makda (makdamujji) wrote :

WiFi can be fixed by this:

Create a conf file for Wifi:
sudo gedit /etc/modprobe.d/rtl8723be.conf

Write in it:
options rtl8723be fwlps=N ips=N

Save and reboot. WiFi will work fine now, but the AER error still floods the dmesg.

Revision history for this message
JujuLand (alain-aupeix) wrote :

I built the Dell XPS 8900 with Ubuntu 14.04, and it works fine.

I forgot to disable LTS upgrades, and the owner ran the upgrade.

The bug is still there, and I must redo a 14.04 install.

Grrr ....

Is somebody in charge of this bug, which is very old (since 15.04)?

Thanks
A+

Revision history for this message
Bjorn Helgaas (bjorn-helgaas) wrote :

Related problem report:
https://bugzilla.kernel.org/show_bug.cgi?id=109691

Brief analysis of AER issue:
http://lkml.kernel.org/r/20151229155822.GA17321@localhost

I did say in that analysis that I was going to work on fixing this, but I haven't had time. It would be great if somebody would jump in and help out.

Revision history for this message
JujuLand (alain-aupeix) wrote :

Hi, I had a look at the link you gave, and saw there is a way to boot using the pci=noaer parameter.

It's a good workaround while no other solution has been found, but is this method usable when booting from a live hd to install on an HD?

Thanks
A+

Revision history for this message
JujuLand (alain-aupeix) wrote :

Humm ... typo : booting on a live DvD, obviously :)

A+

Revision history for this message
John (jsalatas) wrote :

Same here. Also in a Dell XPS 8900 (Skylake + RTL8723BE) using kernel 4.4.0

Revision history for this message
Eduardo Montes de Oca Sanchez (ed-montesdeoca) wrote :

I have the same issue. I have an HP Star Wars Special Edition 15-an050nr:

edrendar@outrider-HP-Pavilion-Notebook:~$ tail -f /var/log/syslog
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.778017] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.778028] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.778032] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.778035] pcieport 0000:00:1c.5: [ 0] Receiver Error (First)
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.778041] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.778151] pcieport 0000:00:1c.5: can't find device of ID00e5
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.778296] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.778307] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.778310] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.778313] pcieport 0000:00:1c.5: [ 0] Receiver Error (First)
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.877828] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.877853] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.877864] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.877872] pcieport 0000:00:1c.5: [ 0] Receiver Error
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.877885] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.878542] pcieport 0000:00:1c.5: can't find device of ID00e5
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.878562] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.878587] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.878594] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.878600] pcieport 0000:00:1c.5: [ 0] Receiver Error
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.878611] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Nov 8 23:43:47 outrider-HP-Pavilion-Notebook kernel: [ 7275.879259] pcieport 0000:00:1c.5: can't find device of ID00e...

Revision history for this message
Daniel Jose (danieldsj) wrote :

I exhibited similar symptoms when installing Ubuntu 16.04.1 LTS on an Asus x541u VivoBook Max system. When performing the installation, the logs would fill up with these errors and eventually fail because of lack of disk space. I found the following thread helpful...
http://www.gossamer-threads.com/lists/linux/kernel/2250177

The workaround for me was to hold left SHIFT, edit the GRUB menu entry, and add the pcie_aspm=off kernel parameter to suppress the messages during the installation and every subsequent boot. Adding this option to the GRUB configuration after installing was the long-term workaround.

Revision history for this message
Ped (ped) wrote :

I'm slightly affected, or perhaps my kernel is already "fixed" to clear the error report correctly even when the device is not found internally (see the brief analysis in #27): I do see the AER error in dmesg periodically, but only about once every couple of minutes.

That is still unacceptable for me, so I used the "pci=noaer" workaround, which stops the messages from appearing.

Error log:
[ 487.987496] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[ 487.987503] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
[ 487.987505] pcieport 0000:00:1c.0: device [8086:a110] error status/mask=00000001/00002000
[ 487.987507] pcieport 0000:00:1c.0: [ 0] Receiver Error (First)

Further errors have the same 1c.0 address (Intel Corporation Wireless 3165) and details.
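
The id= field in these messages is a standard 16-bit PCI requester ID (8-bit bus, 5-bit device, 3-bit function), which is why id=00e0 lines up with the 00:1c.0 port. Here is a small sketch of the decoding. The function name decode_aer_id is only illustrative.

```python
def decode_aer_id(aer_id: str) -> str:
    """Turn the 16-bit hex id from an AER message into bus:device.function."""
    value = int(aer_id, 16)
    bus = (value >> 8) & 0xFF      # upper 8 bits: bus number
    device = (value >> 3) & 0x1F   # next 5 bits: device number
    function = value & 0x07        # lowest 3 bits: function number
    return f"{bus:02x}:{device:02x}.{function:x}"


for aer_id in ("00e0", "00e5", "00e8"):
    print(aer_id, "->", decode_aer_id(aer_id))
```

The three sample ids are the ones that appear in logs in this report; they decode to 00:1c.0, 00:1c.5 and 00:1d.0, matching the pcieport addresses printed next to them.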

Kernel version: 4.4.0-59-generic

CPU: Intel(R) Core(TM) i5-6300HQ CPU @ 2.30GHz

# lspci -vt
-[0000:00]-+-00.0 Intel Corporation Sky Lake Host Bridge/DRAM Registers
           +-01.0-[01]----00.0 NVIDIA Corporation GM107M [GeForce GTX 960M]
           +-02.0 Intel Corporation Skylake Integrated Graphics
           +-14.0 Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
           +-14.2 Intel Corporation Sunrise Point-H Thermal subsystem
           +-16.0 Intel Corporation Sunrise Point-H CSME HECI #1
           +-17.0 Intel Corporation Sunrise Point-H SATA Controller [AHCI mode]
           +-1c.0-[02]----00.0 Intel Corporation Wireless 3165
           +-1c.3-[03]----00.0 Qualcomm Atheros Killer E2400 Gigabit Ethernet Controller
           +-1f.0 Intel Corporation Sunrise Point-H LPC Controller
           +-1f.2 Intel Corporation Sunrise Point-H PMC
           +-1f.3 Intel Corporation Sunrise Point-H HD Audio
           \-1f.4 Intel Corporation Sunrise Point-H SMBus

MSI Notebook GP62 6QF-678XCZ

Revision history for this message
mohican (mohican) wrote :

Hello,
same bug on Asus R556UB-DM217T (live session)

I was able to install using pci=noaer

It also seems to be associated with a sound bug (no input sound from the integrated webcam mic)
sound device : HDA Intel PCH, Realtek ALC256

Revision history for this message
pakman (phill-phillk) wrote :

Not sure if this merits a mention, as I encountered it on a CentOS install with Anaconda. I booted with the flag specified and the errors didn't pile up. Hardware is a Dell XPS. I can provide more info if needed.

Revision history for this message
PanPetr (javacentrum) wrote :

The same issue: Lubuntu 16.04 on an HP ProBook 470 G3 writes the following to kernel.log and then freezes completely

Mar 24 09:02:09 localhost kernel: [ 6972.305728] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Mar 24 09:02:09 localhost kernel: [ 6972.305749] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
Mar 24 09:02:09 localhost kernel: [ 6972.305760] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
Mar 24 09:02:09 localhost kernel: [ 6972.305768] pcieport 0000:00:1c.5: [ 0] Receiver Error (First)
Mar 24 09:03:12 localhost kernel: [ 7035.298073] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Mar 24 09:03:12 localhost kernel: [ 7035.298083] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
Mar 24 09:03:12 localhost kernel: [ 7035.298087] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
Mar 24 09:03:12 localhost kernel: [ 7035.298089] pcieport 0000:00:1c.5: [ 0] Receiver Error
Mar 24 09:04:15 localhost kernel: [ 7098.238955] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Mar 24 09:04:15 localhost kernel: [ 7098.238979] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
Mar 24 09:04:15 localhost kernel: [ 7098.238992] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
Mar 24 09:04:15 localhost kernel: [ 7098.239001] pcieport 0000:00:1c.5: [ 0] Receiver Error

Revision history for this message
Davide (davide-maraschio93) wrote :

The same issue: Ubuntu 16.04.2 on an Asus N552VW-FY136T writes the following to kernel.log and then freezes completely

Mar 24 09:02:09 localhost kernel: [ 6972.305728] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
pcieport 0000:00:1c.5 PCIe Bus Error: severity=corrected, type=physical layer, id=00e4(Receiver 12)
pcieport 0000:00:1c.5 device[8086:a112] error status/mask=00000001/000020000

The workarounds described here don't work for me.

Revision history for this message
Davide (davide-maraschio93) wrote :

My kernel version is 4.8

Revision history for this message
Davide (davide-maraschio93) wrote :

I've reinstalled Ubuntu and now it boots. I ran dmesg and these messages still appear:

[ 0.875431] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 0.875438] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 0.875440] pcieport 0000:00:1c.4: device [8086:a114] error status/mask=00000100/00002000
[ 0.875442] pcieport 0000:00:1c.4: [ 8] RELAY_NUM Rollover
[ 0.879660] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 0.879667] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 0.879669] pcieport 0000:00:1c.4: device [8086:a114] error status/mask=00000100/00002000
[ 0.879670] pcieport 0000:00:1c.4: [ 8] RELAY_NUM Rollover
[ 0.911313] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 0.911319] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 0.911320] pcieport 0000:00:1c.4: device [8086:a114] error status/mask=00000100/00002000
[ 0.911321] pcieport 0000:00:1c.4: [ 8] RELAY_NUM Rollover
[ 0.923536] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 0.923542] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 0.923543] pcieport 0000:00:1c.4: device [8086:a114] error status/mask=00000100/00002000
[ 0.923544] pcieport 0000:00:1c.4: [ 8] RELAY_NUM Rollover

wanghuan (fredwanghuan)
Changed in linux (Ubuntu):
status: Triaged → New
Revision history for this message
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Greg Lutostanski (lutostag) wrote :

Hitting this with zesty
Linux doe 4.10.0-19-generic #21-Ubuntu SMP Thu Apr 6 17:04:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Revision history for this message
Patrik Wallström (pawal) wrote :

I got this today with Zesty as well:

pawal@lakrobot:~$ lspci -vt
-[0000:00]-+-00.0 Intel Corporation Device 5904
           +-02.0 Intel Corporation Device 5916
           +-04.0 Intel Corporation Skylake Processor Thermal Subsystem
           +-14.0 Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
           +-14.2 Intel Corporation Sunrise Point-LP Thermal subsystem
           +-15.0 Intel Corporation Sunrise Point-LP Serial IO I2C Controller #0
           +-15.1 Intel Corporation Sunrise Point-LP Serial IO I2C Controller #1
           +-16.0 Intel Corporation Sunrise Point-LP CSME HECI #1
           +-1c.0-[01-39]----00.0-[02-39]--+-00.0-[03]--
           |                               +-01.0-[04-38]--
           |                               \-02.0-[39]----00.0 Intel Corporation DSL6340 USB 3.1 Controller [Alpine Ridge]
           +-1c.4-[3a]----00.0 Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
           +-1c.5-[3b]----00.0 Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader
           +-1d.0-[3c]----00.0 Device 1c5c:1284
           +-1f.0 Intel Corporation Device 9d58
           +-1f.2 Intel Corporation Sunrise Point-LP PMC
           +-1f.3 Intel Corporation Device 9d71
           \-1f.4 Intel Corporation Sunrise Point-LP SMBus

pawal@lakrobot:~$ uname -a
Linux lakrobot 4.10.0-20-generic #22-Ubuntu SMP Thu Apr 20 09:22:42 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

I have the same issue on the Razer Blade 2017 - the kernel log is flooded with messages.

Disabling PCIe Active State Power Management helps:

GRUB_CMDLINE_LINUX_DEFAULT="quiet button.lid_init_state=open pcie_aspm=off"

Tested that on 4.11.0-041100rc8-generic.

Revision history for this message
Eudald (reaven) wrote :

Same here with 17.04 Kernel: 4.10.0-20-generic. Computer hangs completely after some random time. It's happened since I updated to 17.04.

I'm going to check whether the workaround prevents the computer from freezing.

Revision history for this message
Eudald (reaven) wrote :

@dmitriis are you sure this flag disables PCIe Active State Power Management? I set it and I still see errors:
May 5 13:09:43 evo kernel: [ 673.810810] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
May 5 13:09:43 evo kernel: [ 673.810821] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
May 5 13:09:43 evo kernel: [ 673.810829] pcieport 0000:00:1c.0: device [8086:a110] error status/mask=00000001/00002000
May 5 13:09:43 evo kernel: [ 673.810833] pcieport 0000:00:1c.0: [ 0] Receiver Error (First)

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Eudald, I am sure. Tested multiple times.

# edit /etc/default/grub
sudo update-grub
sudo shutdown -r now

and you should be good.

https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt

 pcie_aspm=  [PCIE] Forcibly enable or disable PCIe Active State Power
             Management.
     off     Disable ASPM.
     force   Enable ASPM even on devices that claim not to support it.
             WARNING: Forcing ASPM on may cause system lockups.
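
One way to confirm that the parameter was actually picked up after update-grub and a reboot is to look at the command line the running kernel was booted with, which Linux exposes in /proc/cmdline. This is a hedged sketch: the helper name kernel_params and the sample string are made up for illustration, and the parser is deliberately simple (it does not handle quoted values containing spaces).

```python
def kernel_params(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None.

    Simplified on purpose: quoted values that contain spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")  # split on the first '=' only
        params[name] = value if sep else None
    return params


# Made-up sample; on a real system use: open("/proc/cmdline").read()
sample = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 ro quiet pcie_aspm=off"
params = kernel_params(sample)
print("pcie_aspm =", params.get("pcie_aspm"))
print("pci =", params.get("pci"))
```

If the pcie_aspm entry is missing from the real /proc/cmdline, the GRUB change did not take effect (for example because update-grub was not run).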

Revision history for this message
jbeale (jpbeale) wrote :

Same problem here. Both Ubuntu Desktop 16.04, and 17.04 with kernel 4.10.0-19-generic #21-Ubuntu SMP Thu Apr 6 17:04:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

This is on i7-6700TE @ 2.4GHz, 32 GB RAM, 500 GB SSD with Nvidia GeForce GT710 video card and Realtek 8821AE PCI-E wireless adaptor.
The "pci=noaer" in the GRUB file does work for me to eliminate the error log spam. Before doing that fix, I had an extremely high error rate (every few microseconds) so /var/log/kern.log grew over 10 GB in just a few minutes after bootup. A brief excerpt:

[ 283.805239] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
[ 283.805243] pcieport 0000:00:1d.0: can't find device of ID00e8
[ 283.805256] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
[ 283.805260] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
[ 283.805262] pcieport 0000:00:1d.0: device [8086:a119] error status/mask=00000001/00002000
[ 283.805263] pcieport 0000:00:1d.0: [ 0] Receiver Error (First)
[ 283.805281] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
[ 283.805287] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
[ 283.805288] pcieport 0000:00:1d.0: device [8086:a119] error status/mask=00000001/00002000
[ 283.805290] pcieport 0000:00:1d.0: [ 0] Receiver Error (First)
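
To put a number on the flood, the reports can be counted per port together with the time span they cover. This is a rough Python sketch under the assumption that the lines look like the excerpt above (dmesg-style timestamp in brackets). The names AER_LINE and summarize are illustrative only.

```python
import re
from collections import Counter

# Matches "[ 283.805239] pcieport 0000:00:1d.0: AER: Corrected error received"
# and the "Multiple Corrected error received" variant.
AER_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+pcieport (?P<port>\S+): AER: "
    r"(?:Multiple )?Corrected error received"
)


def summarize(lines):
    """Return ({port: count}, seconds between first and last matched report)."""
    counts = Counter()
    stamps = []
    for line in lines:
        m = AER_LINE.search(line)
        if m:
            counts[m.group("port")] += 1
            stamps.append(float(m.group("ts")))
    span = max(stamps) - min(stamps) if stamps else 0.0
    return counts, span


excerpt = [
    "[ 283.805239] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8",
    "[ 283.805243] pcieport 0000:00:1d.0: can't find device of ID00e8",
    "[ 283.805256] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8",
    "[ 283.805281] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8",
]
counts, span = summarize(excerpt)
print(dict(counts), f"over {span * 1e6:.0f} microseconds")
```

Because the pattern is applied with search(), the same function also works on syslog lines that carry a date and hostname prefix before the bracketed timestamp.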

Revision history for this message
jbeale (jpbeale) wrote :

Note: before doing the workaround, my Realtek 8821AE wifi module did work and would connect to the network, despite the high rate of errors going to the log. After implementing the workaround, no more errors but the wifi doesn't work (it can see the network but won't connect to it).

Revision history for this message
Daniel Mulholland (dan-mulholland) wrote :

FWIW, running Kubuntu 17.04 with kernel 4.12.0-041200rc4-generic on a Dell XPS 9360 (late 2016) model, I get this error:

[ 9825.550655] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 9825.550661] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 9825.550664] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 9825.550666] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
[ 9825.846925] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 9825.846951] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 9825.846966] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 9825.846974] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
[ 9825.852701] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 9825.852715] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 9825.852724] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 9825.852730] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
[ 9826.680756] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 9826.680767] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 9826.680774] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 9826.680780] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
[ 9826.938346] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 9826.938362] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 9826.938370] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 9826.938375] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
[ 9828.079556] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 9828.079566] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 9828.079573] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 9828.079577] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
[ 9828.278507] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 9828.278531] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 9828.278548] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 9828.278559] pcieport 0000:00:1c.4: [12] Replay Timer Timeout

I think this is likely the same issue as in this thread, however the type is "Data Link Layer" rather than "Physical Layer". Happy to run diagnostics on suggestion.
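
The difference between the two types comes from which bit is set in the status word of the "error status/mask" line: the PCIe Correctable Error Status register uses bit 0 for Receiver Error and bit 12 for Replay Timer Timeout, and mask 00002000 is bit 13 (Advisory Non-Fatal Error). Below is a sketch of that decoding. The layer grouping is my understanding of how the kernel labels these bits and should be read as an assumption, and the names are illustrative.

```python
# Bit positions in the PCIe AER Correctable Error Status register.
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}
PHYSICAL_BITS = {0}
DATA_LINK_BITS = {6, 7, 8, 12}


def layer_of(bit: int) -> str:
    """Assumed grouping: anything not physical or data link is transaction."""
    if bit in PHYSICAL_BITS:
        return "Physical Layer"
    if bit in DATA_LINK_BITS:
        return "Data Link Layer"
    return "Transaction Layer"


def decode_status(status_hex: str):
    """Return [(bit, name, layer)] for every bit set in the hex status word."""
    status = int(status_hex, 16)
    return [
        (bit, CORRECTABLE_BITS.get(bit, "Unknown"), layer_of(bit))
        for bit in range(32)
        if (status >> bit) & 1
    ]


for word in ("00000001", "00001000", "00002000"):
    print(word, decode_status(word))
```

Applied to the logs in this report, status 00000001 decodes to a Physical Layer Receiver Error and 00001000 to a Data Link Layer Replay Timer Timeout, so both variants are correctable link-level errors reported by the same mechanism.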

Revision history for this message
Vladiszavlyev Gergo (gergo-ruszki) wrote :

I have an ASUS N552VW laptop, and updating the BIOS to version 300 helped solve this issue.

Revision history for this message
Vladiszavlyev Gergo (gergo-ruszki) wrote :

Follow-up: the errors disappeared only for a few reboots. At first only a subset of them came back; today all of the previously observed error messages appeared again during boot.

Revision history for this message
Stephan Rügamer (sruegamer) wrote :

Just saw this message for the first time:

Ubuntu Artful (Devel) latest packages.

Jul 21 10:15:18 sruegamer-xps13 kernel: [ 3949.594315] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
Jul 21 10:15:18 sruegamer-xps13 kernel: [ 3949.594323] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
Jul 21 10:15:18 sruegamer-xps13 kernel: [ 3949.594329] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
Jul 21 10:15:36 sruegamer-xps13 kernel: [ 3967.818653] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
Jul 21 10:15:36 sruegamer-xps13 kernel: [ 3967.818666] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
Jul 21 10:15:36 sruegamer-xps13 kernel: [ 3967.818671] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
Jul 21 10:15:36 sruegamer-xps13 kernel: [ 3967.818674] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
Jul 21 10:16:01 sruegamer-xps13 gnome-terminal-[6368]: Unable to load blank_cursor from the cursor theme
Jul 21 10:16:43 sruegamer-xps13 kernel: [ 4034.583189] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
Jul 21 10:16:43 sruegamer-xps13 kernel: [ 4034.583201] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
Jul 21 10:16:43 sruegamer-xps13 kernel: [ 4034.583205] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
Jul 21 10:16:43 sruegamer-xps13 kernel: [ 4034.583207] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
Jul 21 10:16:46 sruegamer-xps13 kernel: [ 4037.002195] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
Jul 21 10:16:46 sruegamer-xps13 kernel: [ 4037.002201] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
Jul 21 10:16:46 sruegamer-xps13 kernel: [ 4037.002204] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
Jul 21 10:16:46 sruegamer-xps13 kernel: [ 4037.002206] pcieport 0000:00:1c.4: [12] Replay Timer Timeout

Laptop is Dell XPS 13 Dev Edition (9360)

Revision history for this message
Daniel Mulholland (dan-mulholland) wrote : Re: [Bug 1521173] Re: AER: Corrected error received: id=00e0

I have experienced the same issue.

Linux kernel 4.12 (easily installed using Ukuu
http://www.omgubuntu.co.uk/2017/02/ukuu-easy-way-to-install-mainline-kernel-ubuntu)
completely resolved this issue with the PCIe bus for me.

Revision history for this message
jon anoter (jon8899888) wrote :

I have the same issue on an Asus x550vx laptop (with Nvidia GTX 950M), i7-7700HQ quad-core, on Ubuntu 16.04.3 with Linux kernel 4.10.0-32-generic:

Aug 21 07:45:05 kernel: [170968.303385] pcieport 0000:00:1c.2: AER: Multiple Corrected error received: id=00e2
Aug 21 07:45:05 kernel: [170968.304027] pcieport 0000:00:1c.2: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e2(Receiver ID)
Aug 21 07:45:05 kernel: [170968.304030] pcieport 0000:00:1c.2: device [8086:a112] error status/mask=00000001/00002000
Aug 21 07:45:05 kernel: [170968.304032] pcieport 0000:00:1c.2: [ 0] Receiver Error (First)
Aug 21 07:45:05 kernel: [170968.304044] pcieport 0000:00:1c.2: AER: Corrected error received: id=00e2
Aug 21 07:45:05 kernel: [170968.304691] pcieport 0000:00:1c.2: can't find device of ID00e2
:1c.2: AER: Corrected error received: id=00e2

Revision history for this message
Logi Leifsson (logileifs) wrote :

Also affecting me on Ubuntu 14.04 Asus UX305CA.

Could this be the reason my computer could not resume from suspend anymore?

After a very recent system update my computer never resumed fully from suspend, and after a hard restart I got an apportcheckresume error. The only thing I noticed was the same error described here, so I wonder whether it has been preventing my computer from resuming after suspend.

Revision history for this message
Julian Alarcon (julian-alarcon) wrote :

Still happening with Ubuntu 17.10 kernel Linux P01A30136 4.12.0-12-generic #13-Ubuntu SMP Thu Aug 17 16:13:25 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux and updated BIOS.

Aug 28 10:55:08 LAPTOPNAME kernel: [ 5632.967441] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Aug 28 10:55:08 LAPTOPNAME kernel: [ 5632.967445] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
Aug 28 10:55:08 LAPTOPNAME kernel: [ 5632.967447] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
Aug 28 10:55:08 LAPTOPNAME kernel: [ 5632.967449] pcieport 0000:00:1c.5: [ 0] Receiver Error (First)

Revision history for this message
@Spazm (granny-launchpad) wrote :

Hit by this today after updating packages on Ubuntu 17.10 running on a Dell 9360.
This upgraded the kernel to '4.13.0-11-generic #12-Ubuntu SMP'.

The same pcieport messages as dan-mulholland was seeing.

[ 1423.748011] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 1423.748015] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 1423.748017] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
[ 1428.702571] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 1428.702577] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
...

% lspci | grep 1c.4

00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #5 (rev f1)

Will try with a stock kernel.

Revision history for this message
jon anoter (jon8899888) wrote :

I just wanted to report that I am on an Asus X550V (Kaby Lake i7-7700HQ CPU with Nvidia GeForce GTX 950M) and the workaround in the first paragraph worked:

"Note: Current workaround is to add pci=noaer to your kernel command line:

1) edit /etc/default/grub and add pci=noaer to the line starting with GRUB_CMDLINE_LINUX_DEFAULT. It will look like this:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=noaer"
2) run "sudo update-grub"
3) reboot"
That fixed my problem. No more `PCIe Bus Error` and `AER error` messages now.

(I also used `sudo find /var/log -type f -name "*.gz" -delete` to remove old log files and enabled `logrotate`, because I had over 100 GB (!) of those spam `pcie error` log messages.)

Revision history for this message
spike speigel (frail-knight) wrote :

Just now experiencing this on Ubuntu 17.10. Never saw this before. Dell XPS 13 DE 9360 w/ Kaby Lake CPU.

tags: added: artful
Revision history for this message
spike speigel (frail-knight) wrote :

[ 6283.204650] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 6283.204661] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 6283.204671] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 6283.204677] pcieport 0000:00:1c.4: [12] Replay Timer Timeout

Revision history for this message
Tim Ritberg (xpert-reactos) wrote :

Still same here. Updated from 17.04 to 17.10:
pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e5(Transmitter ID)
pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00001000/00002000
pcieport 0000:00:1c.5: [12] Replay Timer Timeout

Skylake i5-6200U
Aspire E5-574G

Revision history for this message
Carlos (cjclm7) wrote :

Same issue on Linux Ubuntu Server 16.04
Intel Skylake i5-6400
Asus Motherboard Z270 Prime
GPU on PCIe MSI RX 580

Revision history for this message
Marcos Alano (mhalano) wrote :

I'm using Ubuntu 17.10 on an i7 Skylake, and with the "pci=noaer" tip the message goes away. Now I need to find out what this option means to see if I'm losing something.

Revision history for this message
Marcos Alano (mhalano) wrote :

This message only occurs for me when I set the "Fastboot" option in the BIOS to "Minimal" instead of "Thorough". I think Linux isn't ready yet for the Fastboot feature.

Revision history for this message
Jinyu LIU (liujinyu) wrote :

I got the same issue with a Dell XPS

* Ubuntu 17.10
* Linux SimonUbuntu 4.13.0-16-generic #19-Ubuntu SMP Wed Oct 11 18:35:14 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

* syslog
Nov 7 00:59:08 SimonUbuntu kernel: [ 1701.635733] pcieport 0000:00:1c.2: AER: Corrected error received: id=00e2
Nov 7 00:59:08 SimonUbuntu kernel: [ 1701.635744] pcieport 0000:00:1c.2: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e2(Receiver ID)
Nov 7 00:59:08 SimonUbuntu kernel: [ 1701.635750] pcieport 0000:00:1c.2: device [8086:a292] error status/mask=00000001/00002000
Nov 7 00:59:08 SimonUbuntu kernel: [ 1701.635754] pcieport 0000:00:1c.2: [ 0] Receiver Error (First)

* dmesg
[ 1994.498700] pcieport 0000:00:1c.2: AER: Corrected error received: id=00e2
[ 1994.498705] pcieport 0000:00:1c.2: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e2(Receiver ID)
[ 1994.498707] pcieport 0000:00:1c.2: device [8086:a292] error status/mask=00000001/00002000
[ 1994.498708] pcieport 0000:00:1c.2: [ 0] Receiver Error (First)

$ lspci -v -s 1c.2
00:1c.2 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #3 (rev f0) (prog-if 00 [Normal decode])
 Flags: bus master, fast devsel, latency 0, IRQ 122
 Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
 Memory behind bridge: df300000-df3fffff
 Capabilities: <access denied>
 Kernel driver in use: pcieport
 Kernel modules: shpchp

$ lspci
00:00.0 Host bridge: Intel Corporation Device 591f (rev 05)
00:01.0 PCI bridge: Intel Corporation Skylake PCIe Controller (x16) (rev 05)
00:02.0 Display controller: Intel Corporation HD Graphics 630 (rev 04)
00:14.0 USB controller: Intel Corporation 200 Series PCH USB 3.0 xHCI Controller
00:15.0 Signal processing controller: Intel Corporation 200 Series PCH Serial IO I2C Controller #0
00:15.1 Signal processing controller: Intel Corporation 200 Series PCH Serial IO I2C Controller #1
00:16.0 Communication controller: Intel Corporation 200 Series PCH CSME HECI #1
00:17.0 SATA controller: Intel Corporation 200 Series PCH SATA controller [AHCI mode]
00:1c.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #2 (rev f0)
00:1c.2 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #3 (rev f0)
00:1c.3 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #4 (rev f0)
00:1d.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #9 (rev f0)
00:1e.0 Signal processing controller: Intel Corporation 200 Series PCH Serial IO UART Controller #0
00:1f.0 ISA bridge: Intel Corporation 200 Series PCH LPC Controller (Z270)
00:1f.2 Memory controller: Intel Corporation 200 Series PCH PMC
00:1f.3 Audio device: Intel Corporation 200 Series PCH HD Audio
00:1f.4 SMBus: Intel Corporation 200 Series PCH SMBus Controller
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
02:00.0 USB controller: ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller
03:00.0 Network controller: Intel Corporation Wireless 3165 (rev 79)
04:00.0 Ethernet controller: Qualcomm Atheros QCA8171 Gigabit E...

Revision history for this message
Kai-Heng Feng (kaihengfeng) wrote :

> On 7 Nov 2017, at 1:07 AM, Jinyu LIU <email address hidden> wrote:
>
> I got same issue with DELL XPS
Jinyu LIU,

Can you file a new bug? Thanks.

Kai-Heng

Revision history for this message
Sasha Stadnik (evrybiont) wrote :

I had the same problem on Linux Lubuntu 17.10 (4.13.0-16-generic)
Asus X550VXK Intel i7, Nvidia Geforce GTX 950M

First of all, I had a black screen; temporarily adding "nomodeset" in the GRUB menu solved it.

pci=noaer helped me to get rid of "PCIe Bus Error: severity=Corrected, type=Physical Layer"

After that I had Wi-Fi problems (the connection dropped often, etc.); the posts below helped me fix them:
https://forum.manjaro.org/t/wifi-fails-time-to-time-rtl8821ae/28914/9
The same fix is described at https://medium.com/@elmaxx/rtl8821ae-wifi-drivers-in-ubuntu-16-04-4c1286524afa

Revision history for this message
Marcos Alano (mhalano) wrote :

I went into the BIOS and set the "Fastboot" option to "Thorough". Could
people check which value is selected for this option on their machines,
change it, and see whether the error persists? Could someone help me with that?

Revision history for this message
gotcha (pjusto) wrote :

Hi Marcos, I was getting this error on a Precision 5520 when I plugged in a TB16 docking station. I am running Xubuntu 16.04, fully updated. No peripheral was functional.

After setting the fast boot option to Auto, it worked!

Cheers...

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Marcos,

#68

Regardless of fastboot on/off I get the same behavior without pcie_aspm=off

➜ ~ uname -r
4.13.0-16-generic

Revision history for this message
Bougron (francis-bougron) wrote :

Hello,
Today I have seen these 4 lines repeated many, many times in a syslog trace of Ubuntu 17.10:

Nov 20 00:07:53 sat-XPS-15-9560 kernel: [ 590.905484] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
Nov 20 00:07:53 sat-XPS-15-9560 kernel: [ 590.905513] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
Nov 20 00:07:53 sat-XPS-15-9560 kernel: [ 590.905522] pcieport 0000:00:1c.0: device [8086:a110] error status/mask=00003000/00002000
Nov 20 00:07:53 sat-XPS-15-9560 kernel: [ 590.905528] pcieport 0000:00:1c.0: [12] Replay Timer Timeout
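
For anyone trying to read the "error status/mask" pair in lines like the ones above, here is a minimal sketch of how the two hex values can be decoded. It assumes the bit layout of the AER Correctable Error Status register as given in the PCI Express specification; the function name decode_correctable is made up for illustration and is not part of any tool.

```python
# Bit positions in the PCIe AER Correctable Error Status register.
# Names follow the PCI Express Base Specification (assumed layout).
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_correctable(status_hex, mask_hex):
    """Return (bit, name, masked) for every bit set in the status value."""
    status = int(status_hex, 16)
    mask = int(mask_hex, 16)
    result = []
    for bit in range(32):
        if status & (1 << bit):
            name = CORRECTABLE_BITS.get(bit, "Unknown/reserved")
            masked = bool(mask & (1 << bit))
            result.append((bit, name, masked))
    return result


# Values taken from the log excerpt above: status/mask=00003000/00002000
for bit, name, masked in decode_correctable("00003000", "00002000"):
    print(bit, name, "(masked)" if masked else "(reported)")
```

With the values from this log, bit 12 (Replay Timer Timeout) is set and unmasked, which matches the "[12] Replay Timer Timeout" line, while bit 13 is set but masked.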

Revision history for this message
Bougron (francis-bougron) wrote :

Sorry
   I saw this

Revision history for this message
Kai-Heng Feng (kaihengfeng) wrote :

Bougron, please file a new bug.

Revision history for this message
Rolands Kusiņš (tower98) wrote :

@Bougron, did you register a new bug? I failed to find one... If you did, could you please share the new number?

I got a new laptop, and it seems that I'm having the same issue.

Nov 22 10:05:50 tower9-xps15 kernel: [ 110.580978] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Nov 22 10:05:50 tower9-xps15 kernel: [ 110.580990] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e8(Transmitter ID)
Nov 22 10:05:50 tower9-xps15 kernel: [ 110.581008] pcieport 0000:00:1d.0: device [8086:a118] error status/mask=00001000/00002000
Nov 22 10:05:50 tower9-xps15 kernel: [ 110.581009] pcieport 0000:00:1d.0: [12] Replay Timer Timeout

$ uname -r
4.13.0-16-generic

ps:
root 4177 0.0 0.0 26804 4740 pts/2 S+ 10:15 0:00 | | \_ /usr/bin/perl /var/lib/dpkg/info/linux-headers-4.13.0-17-generic.postin
root 4178 0.0 0.0 4468 896 pts/2 S+ 10:15 0:00 | | \_ run-parts --verbose --exit-on-error --arg=4.13.0-17-generic --arg=/
root 4179 0.0 0.0 4608 1708 pts/2 S+ 10:15 0:00 | | \_ /bin/sh /usr/lib/dkms/dkms_autoinstaller start 4.13.0-17-generi
root 4184 0.0 0.0 12936 1024 pts/2 S+ 10:15 0:00 | | \_ plymouth --ping

$ sudo lspci -v
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
 Subsystem: Dell Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
 Flags: bus master, fast devsel, latency 0
 Capabilities: [e0] Vendor Specific Information: Len=10 <?>

00:01.0 PCI bridge: Intel Corporation Skylake PCIe Controller (x16) (rev 05) (prog-if 00 [Normal decode])
 Flags: bus master, fast devsel, latency 0, IRQ 16
 Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
 I/O behind bridge: 0000e000-0000efff
 Memory behind bridge: ec000000-ed0fffff
 Prefetchable memory behind bridge: 00000000c0000000-00000000d1ffffff
 Capabilities: [88] Subsystem: Dell Skylake PCIe Controller (x16)
 Capabilities: [80] Power Management version 3
 Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
 Capabilities: [a0] Express Root Port (Slot+), MSI 00
 Capabilities: [100] Virtual Channel
 Capabilities: [140] Root Complex Link
 Capabilities: [d94] #19
 Kernel driver in use: pcieport
 Kernel modules: shpchp

00:02.0 VGA compatible controller: Intel Corporation Device 591b (rev 04) (prog-if 00 [VGA controller])
 Subsystem: Dell Device 07be
 Flags: bus master, fast devsel, latency 0, IRQ 135
 Memory at eb000000 (64-bit, non-prefetchable) [size=16M]
 Memory at 80000000 (64-bit, prefetchable) [size=256M]
 I/O ports at f000 [size=64]
 [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
 Capabilities: [40] Vendor Specific Information: Len=0c <?>
 Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
 Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
 Capabilities: [d0] Power Management version 2
 Capabilities: [100] Process Address Space ID (PASID)
 Capabilities: [200] Address Translation Service (ATS)
 Capabilities: [300] Page Request Interface (PRI)
 Kernel driver in use: i915
 Kernel modules: i915

00:0...

Revision history for this message
Rolands Kusiņš (tower98) wrote :

Sorry, not enough coffee this morning. The ps output was meant for a frozen kernel update...

information type: Public → Public Security
information type: Public Security → Public
information type: Public → Public Security
information type: Public Security → Public
Revision history for this message
Bruno Randolf (br1-l) wrote :

Setting "Fastboot" to "Thorough" in the BIOS (v2.4.2) of my XPS 13 9360 fixed this error.

Revision history for this message
spike speigel (frail-knight) wrote :

I'm not seeing this spammed in dmesg. Only maybe once per boot, but I'm seeing the following on my 9360 running Ubuntu 17.10:

[ 4649.396767] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[ 4649.396784] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 4649.396800] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 4649.396811] pcieport 0000:00:1c.4: [12] Replay Timer Timeout
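
As a side note on the "id=00e4" field: it appears to be the 16-bit PCI requester ID, that is, bus, device, and function packed into one value. A minimal sketch of the unpacking, assuming the standard 8/5/3-bit split (the function name decode_requester_id is hypothetical):

```python
def decode_requester_id(id_hex):
    """Split a 16-bit PCI requester ID into bus:device.function notation."""
    value = int(id_hex, 16)
    bus = (value >> 8) & 0xFF      # upper 8 bits
    device = (value >> 3) & 0x1F   # next 5 bits
    function = value & 0x07        # lowest 3 bits
    return f"{bus:02x}:{device:02x}.{function:x}"


print(decode_requester_id("00e4"))  # 00:1c.4
```

This is why id=00e4 shows up on port 0000:00:1c.4 here, and id=00e8 on 0000:00:1d.0 in other reports in this thread: the ID names the root port itself.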

Revision history for this message
Thorsten Munsch (thorsten-munsch) wrote :

Still present in (X)ubuntu 17.10 with kernel 4.13.0-32-generic.

The trigger is the onboard Realtek network chip on my Gigabyte GA-AB350 Gaming 3 (AMD Ryzen) mainboard:

+-01.3-[01-05]--+-00.0 Advanced Micro Devices, Inc. [AMD] USB 3.1 XHCI Controller
           | +-00.1 Advanced Micro Devices, Inc. [AMD] Device 43b7
           | \-00.2-[02-05]--+-00.0-[03]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           | +-01.0-[04]--
           | \-04.0-[05]--

Revision history for this message
Thorsten Munsch (thorsten-munsch) wrote :

I noticed weird network problems on this system as well, when plugging in a USB3 external hard disk, and on Friday even when I plugged in my mobile phone just to charge the battery.

Don't know if this is connected in some way. Yesterday I updated the UEFI/BIOS and will watch if this is still happening.

Revision history for this message
angelalberto (flkangel) wrote :

I had the same messages. To hide them I use Fastboot and pcie_aspm=off. I think this only hides the messages; Wi-Fi works correctly.

Revision history for this message
roussel geoffrey (roussel-geoffrey) wrote :

I had the same problem and fix #9 worked for me (adding "pci=noaer").

I'm on Ubuntu 17.10 on a HP Pavilion laptop 14-008nf and all hardware seems to be working.
I was flooded with this (it took a lot of disk space because it was logged constantly):

akem@akem-HP-Pavilion-Notebook:~$ tail /var/log/syslog
Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950330] pcieport 0000:00:1d.0: device [8086:9d1b] error status/mask=00000001/00002000
Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950337] pcieport 0000:00:1d.0: [ 0] Receiver Error (First)
Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950348] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950746] pcieport 0000:00:1d.0: can't find device of ID00e8
Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950751] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8
Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950762] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e8(Receiver ID)
Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950770] pcieport 0000:00:1d.0: device [8086:9d1b] error status/mask=00000001/00002000
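
To get a feel for how bad the flood is and which port it comes from, the log lines can be tallied per port. The following is a rough sketch only; the regular expression is an assumption based on the message format seen in this thread, and count_aer_by_port is a made-up helper name.

```python
import re
from collections import Counter

# Matches the "AER: Corrected error received" line and captures the port address.
AER_LINE = re.compile(
    r"pcieport ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"AER: (?:Multiple )?Corrected error received"
)


def count_aer_by_port(lines):
    """Count corrected-error notifications per root port."""
    counts = Counter()
    for line in lines:
        match = AER_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the syslog excerpt above.
sample = [
    "Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950337] pcieport 0000:00:1d.0: [ 0] Receiver Error (First)",
    "Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950348] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8",
    "Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950746] pcieport 0000:00:1d.0: can't find device of ID00e8",
    "Mar 20 19:00:03 akem-HP-Pavilion-Notebook kernel: [20571.950751] pcieport 0000:00:1d.0: AER: Corrected error received: id=00e8",
]
print(count_aer_by_port(sample))
```

In practice the lines would come from iterating over /var/log/syslog or the output of dmesg rather than from a hard-coded list.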

Revision history for this message
Luca (zapduke) wrote :

Like Thorsten, I have a Ryzen workstation with, incidentally, the same network chipset but a different motherboard (ASRock AB350M Pro4):

1f:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller

I think this error message is generic and different types of problems hide behind it. In my case, every once in a while my PC slowly freezes. Before it freezes I see the AER error, and after I have seen a lot of them it freezes.

Adding "pcie_aspm=off" makes the error messages disappear, and the PC then freezes silently or shows "r8169:rtl_counters_cond == 1" before freezing.

Revision history for this message
Warner (warner-veltman) wrote :

I can confirm similar issues on AMD Threadripper.

The issues went away on Ubuntu 17.10 by adding "pcie_aspm=off" to the GRUB kernel command line, but were re-introduced by upgrading to 18.04. Strangely, the error now occurs in both the 4.13 and 4.15 kernels.

One solution I found is to set PCIe to 2.0 instead of the default 3.0 in BIOS (but this comes at a slight performance cost).

Do we know if this will be assigned / solved soon?

May 4 13:09:38 TR-Ubuntu kernel: [ 76.552730] pcieport 0000:00:01.1: [12] Replay Timer Timeout
May 4 13:09:38 TR-Ubuntu kernel: [ 76.563746] dpc 0000:00:01.1:pcie010: DPC containment event, status:0x1f00 source:0x0000
May 4 13:09:38 TR-Ubuntu kernel: [ 76.563759] pcieport 0000:00:01.1: AER: Multiple Corrected error received: id=0000
May 4 13:09:38 TR-Ubuntu kernel: [ 76.563788] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Transmitter ID)
May 4 13:09:38 TR-Ubuntu kernel: [ 76.563790] pcieport 0000:00:01.1: device [1022:1453] error status/mask=00001080/00006000
May 4 13:09:38 TR-Ubuntu kernel: [ 76.563792] pcieport 0000:00:01.1: [ 7] Bad DLLP
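
Since the reports in this thread differ mainly in the "type=" and "(... ID)" parts of the "PCIe Bus Error" line, it can help to pull those fields out when comparing logs. A small sketch, assuming the line format seen in this thread (newer kernels seem to omit the "id=" part, so it is treated as optional); parse_bus_error is a hypothetical helper:

```python
import re

BUS_ERROR = re.compile(
    r"PCIe Bus Error: severity=(?P<severity>[^,]+), "
    r"type=(?P<layer>[^,]+), "
    r"(?:id=(?P<id>[0-9a-f]{4}))?\((?P<agent>[^)]+)\)"
)


def parse_bus_error(line):
    """Return the fields of a 'PCIe Bus Error' line as a dict, or None."""
    match = BUS_ERROR.search(line)
    return match.groupdict() if match else None


# Line copied from the log excerpt above.
line = ("May 4 13:09:38 TR-Ubuntu kernel: [ 76.563788] pcieport 0000:00:01.1: "
        "PCIe Bus Error: severity=Corrected, type=Data Link Layer, "
        "id=0009(Transmitter ID)")
print(parse_bus_error(line))
```

"Physical Layer" with "Receiver ID" and "Data Link Layer" with "Transmitter ID" are the two combinations that keep coming up in this bug.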

Revision history for this message
M (manudv7) wrote :

I have this error on my Asus X541U, with Ubuntu 18.04, please solve it.

It generates an endless list with this error:

PCIe Bus Error: severity=Corrected, type=Physical Layer,
id=00e5(Receiver ID) device [8086:9d15] error status/mask=00000001/00002000

And I can't log in.

And it is related to the Wi-Fi board of my PC, which is the following: Realtek Semiconductor Co., Ltd. RTL8723BE PCIe Wireless Network Adapter

M (manudv7)
Changed in linux (Ubuntu):
status: Confirmed → In Progress
M (manudv7)
Changed in linux (Ubuntu):
status: In Progress → Confirmed
Revision history for this message
Dave Howson (dave.sohan) wrote :

What is the current status of this issue?
I am facing it on my MSI laptop and I'm not sure if disabling interrupts or turning off active-state power management is the right solution?

Revision history for this message
Riko Naka (rikonaka) wrote :

I have had the same problem on my computer since I upgraded to the latest Linux kernel, 4.4.0-128-generic, and my sound card does not work as usual.

Jun 18 01:48:00 home kernel: [ 353.142183] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
Jun 18 01:48:00 home kernel: [ 353.142194] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
Jun 18 01:48:00 home kernel: [ 353.142197] pcieport 0000:00:1c.0: device [8086:a115] error status/mask=00000001/00002000
Jun 18 01:48:00 home kernel: [ 353.142200] pcieport 0000:00:1c.0: [ 0] Receiver Error

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Revision history for this message
Phillip Sz (phillip-sz) wrote :

Where is this fixed?

Revision history for this message
asusbios (asusbios) wrote :

This is not fixed for me. Asus x541u

Revision history for this message
asusbios (asusbios) wrote :

Just an update on this. I tried Fedora Rawhide with the 4.18 kernel and the Ubuntu Cosmic Cuttlefish daily build, and the issue is present there too.

I am unable to change the status of this bug back to confirmed.

Revision history for this message
Luiz (lmfranco) wrote :

Let me tell my story:

I bought a Dell Inspiron 7000 series with Windows 10.

I replaced it with Ubuntu 18.04.

I did some BIOS upgrades; my current BIOS is the newest.

One day I opened my notebook and did an SSD upgrade.

My mistake was unplugging the battery, because one pin got bent.

I noticed in Ubuntu that the notebook never charged the battery to FULL. Even with one pin missing, the battery should charge to LESS than the full design capacity (3684000), and it was charging to "full" (3403000).

Ubuntu showed CHARGING all the time and never fully charged.

The PCI AER corrected error occurs, as I could see in the dmesg log.

With lspci I identified the PCI hardware:
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #5 (rev f1)

Another problem I noticed is that the BIOS date was wrong when I restored the factory BIOS settings. For one BIOS upgrade I downloaded the BIOS file for my notebook model from another country's site. I'm not sure whether this caused the RTC bug.

I am not completely sure of the exact sequence that led to the solution, so I am reporting the WHOLE long story as it happened.

In my opinion it could be a TIMER bug. One PCI AER dmesg message mentions the replay timer.

What I did:

1 - Downloaded the BIOS from the LOCAL vendor site. The last time I upgraded, I had downloaded it from the vendor's site for another country, over a proxy.

2 - Cleared the setup password (set it to blank). I cleared the only password I had defined, the admin password.

3 - Restarted and flashed the BIOS downloaded for my model's locale.

4 - Reset CMOS/NVRAM to factory defaults.

5 - Adjusted the time in the BIOS and synced it with Ubuntu via the command: timedatectl set-local-rtc 1 --adjust-system-clock. This uses the BIOS hardware clock (RTC), not NTP servers. I had tried hwclock --systohc with no success. Restarted. At this point Ubuntu shows the fully-charged message in the battery status of the system menu, even though the battery is not charging to its full design capacity.

6 - Left my notebook turned on, unplugged from power, until the battery drained completely.

7 - Plugged in AC power and turned on the notebook.

The error still occurs, but according to the dmesg logs not at boot time. My /etc/default/grub file has GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"; I am not using any custom kernel parameter.

Conclusion
It's not a Linux bug. In my experience it can be related to BIOS/time and date/CMOS/NVRAM. Maybe my battery is short-circuited. I still have not tested with the battery disconnected. I set my notebook up to run primarily on AC power. When I run a game the error shows up in the dmesg log, not right after booting the system.

The final step was to let my battery drain with the notebook switched on.

Revision history for this message
Luiz (lmfranco) wrote :

The initial error report is (several dmesg log messages):

[ 13.078695] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[ 13.078697] pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
[ 13.078698] pcieport 0000:00:1c.4: [12] Replay Timer Timeout

My wifi card is:
02:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)

All Qualcomm Atheros QCA6174 dmesg log messages:
[ 11.539350] ath10k_pci 0000:02:00.0: enabling device (0000 -> 0002)
[ 11.540180] ath10k_pci 0000:02:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 11.821802] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:02:00.0.bin failed with error -2
[ 11.821813] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/cal-pci-0000:02:00.0.bin failed with error -2
[ 11.825550] ath10k_pci 0000:02:00.0: qca6174 hw3.2 target 0x05030000 chip_id 0x00340aff sub 1028:0310
[ 11.825552] ath10k_pci 0000:02:00.0: kconfig debug 0 debugfs 1 tracing 1 dfs 0 testmode 0
[ 11.825976] ath10k_pci 0000:02:00.0: firmware ver WLAN.RM.4.4.1-00079-QCARMSWPZ-1 api 6 features wowlan,ignore-otp crc32 fd869beb
[ 11.896449] ath10k_pci 0000:02:00.0: board_file api 2 bmi_id N/A crc32 20d869c3
[ 12.560585] ath10k_pci 0000:02:00.0: Unknown eventid: 118809
[ 12.563589] ath10k_pci 0000:02:00.0: Unknown eventid: 90118
[ 12.564353] ath10k_pci 0000:02:00.0: htt-ver 3.47 wmi-op 4 htt-op 3 cal otp max-sta 32 raw 0 hwcrypto 1
[ 12.655660] ath10k_pci 0000:02:00.0 wlp2s0: renamed from wlan0
[ 13.448044] ath10k_pci 0000:02:00.0: Unknown eventid: 118809
[ 13.451050] ath10k_pci 0000:02:00.0: Unknown eventid: 90118

My ath10k_pci tried to load:
[ 11.821802] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:02:00.0.bin failed with error -2
[ 11.821813] ath10k_pci 0000:02:00.0: Direct firmware load for ath10k/cal-pci-0000:02:00.0.bin failed with error -2

The PCI AER message disappears when I disable the Qualcomm Atheros QCA6174 Wi-Fi. Note that I keep the Bluetooth firmware load enabled in the BIOS; it is referenced by the same driver, ath10k_pci.

The Wi-Fi card is connected, I think, to a PCIe slot, maybe PCIe x1.

My "fix" is to disable Wi-Fi firmware loading in the BIOS and use Ethernet.

With the Wi-Fi firmware disabled, the PCI AER error is gone, even when gaming.

My notebook's battery problem is physical; maybe the vendor's support can do something.
Maybe one solution related to the Wi-Fi firmware is to take the Windows 10 driver/firmware and add it to the Linux kernel driver tree after testing. The firmware is packaged with the Wi-Fi driver and only needs to be extracted. Another fix could be to upgrade the kernel. My version is the default kernel from the Ubuntu 18.04 repository, 4.15.0-23-generic.

Bluetooth works great.

Revision history for this message
Dimitrios Menounos (dmenounos) wrote :

I have a Dell Inspiron 5570 with Intel i7-8550U CPU. I face the same problem with the Ubuntu 16.04 OEM install and a fresh Kubuntu 18.04 install.

22/7/18 3:01 Μ.Μ. kernel pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
22/7/18 3:01 Μ.Μ. kernel pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e5(Transmitter ID)
22/7/18 3:01 Μ.Μ. kernel pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00001000/00002000
22/7/18 3:01 Μ.Μ. kernel pcieport 0000:00:1c.5: [12] Replay Timer Timeout

$ lspci -tv
-[0000:00]-+-00.0 Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
           +-02.0 Intel Corporation UHD Graphics 620
           +-04.0 Intel Corporation Skylake Processor Thermal Subsystem
           +-14.0 Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
           +-14.2 Intel Corporation Sunrise Point-LP Thermal subsystem
           +-15.0 Intel Corporation Sunrise Point-LP Serial IO I2C Controller #0
           +-16.0 Intel Corporation Sunrise Point-LP CSME HECI #1
           +-17.0 Intel Corporation Sunrise Point-LP SATA Controller [AHCI mode]
           +-1c.0-[01]----00.0 Advanced Micro Devices, Inc. [AMD/ATI] Topaz XT [Radeon R7 M260/M265 / M340/M360 / M440/M445]
           +-1c.4-[02]----00.0 Realtek Semiconductor Co., Ltd. RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller
           +-1c.5-[03]----00.0 Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter
           +-1f.0 Intel Corporation Device 9d4e
           +-1f.2 Intel Corporation Sunrise Point-LP PMC
           +-1f.3 Intel Corporation Sunrise Point-LP HD Audio
           \-1f.4 Intel Corporation Sunrise Point-LP SMBus

I haven't tried the pci=noaer solution yet. However, judging from https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt, that only disables the error reporting and thus is not a proper fix.
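
When comparing reports it is useful to know whether one of the workarounds mentioned in this thread is already active on the running kernel. A minimal sketch that inspects a kernel command line string (on a live system this would be the contents of /proc/cmdline); the function name active_workarounds is made up, and only the two parameters discussed here are checked:

```python
def active_workarounds(cmdline):
    """List which of pci=noaer / pcie_aspm=off appear in a kernel command line."""
    found = []
    for param in cmdline.split():
        key, _, value = param.partition("=")
        # pci= accepts a comma-separated option list, e.g. pci=nomsi,noaer
        if key == "pci" and "noaer" in value.split(","):
            found.append("pci=noaer")
        elif key == "pcie_aspm" and value == "off":
            found.append("pcie_aspm=off")
    return found


# Example command line; the BOOT_IMAGE and root values are placeholders.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet splash pci=noaer"
print(active_workarounds(example))  # ['pci=noaer']
```

An empty list means neither parameter is set, so any AER messages in dmesg are not being suppressed by these options.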

Revision history for this message
Leonidas S. Barbosa (leosilvab) wrote :

I have the same issue on my Dell Inspiron 5378 i7.

lspci -vt
-[0000:00]-+-00.0 Intel Corporation Device 5904
           +-02.0 Intel Corporation Device 5916
           +-04.0 Intel Corporation Skylake Processor Thermal Subsystem
           +-13.0 Intel Corporation Device 9d35
           +-14.0 Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
           +-14.2 Intel Corporation Sunrise Point-LP Thermal subsystem
           +-15.0 Intel Corporation Sunrise Point-LP Serial IO I2C Controller
           +-15.1 Intel Corporation Sunrise Point-LP Serial IO I2C Controller
           +-16.0 Intel Corporation Sunrise Point-LP CSME HECI
           +-17.0 Intel Corporation Sunrise Point-LP SATA Controller [AHCI mode]
           +-1c.0-[01]----00.0 Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
           +-1f.0 Intel Corporation Device 9d58
           +-1f.2 Intel Corporation Sunrise Point-LP PMC
           +-1f.3 Intel Corporation Device 9d71
           \-1f.4 Intel Corporation Sunrise Point-LP SMBus

It also seems my Wi-Fi card is struggling; it's quite annoying.

Dmesg info:

[ 7024.543968] acpi INT3400:00: Unsupported event [0x86]
[ 7127.808824] acpi INT3400:00: Unsupported event [0x86]
[ 7527.461667] wlp1s0: deauthenticating from 10:62:d0:9d:dc:b2 by local choice (Reason: 3=DEAUTH_LEAVING)
[ 7532.477432] wlp1s0: authenticate with 10:62:d0:9d:dc:b2
[ 7532.527885] wlp1s0: send auth to 10:62:d0:9d:dc:b2 (try 1/3)
[ 7532.529562] wlp1s0: authenticated
[ 7532.531874] wlp1s0: associate with 10:62:d0:9d:dc:b2 (try 1/3)
[ 7532.535501] wlp1s0: RX AssocResp from 10:62:d0:9d:dc:b2 (capab=0x411 status=0 aid=3)
[ 7532.537945] wlp1s0: associated
[ 7627.582766] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[ 7627.582785] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
[ 7627.582801] pcieport 0000:00:1c.0: device [8086:9d14] error status/mask=00001000/00000000
[ 7627.582815] pcieport 0000:00:1c.0: [12] Replay Timer Timeout
[ 7732.910629] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[ 7732.910649] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
[ 7732.910662] pcieport 0000:00:1c.0: device [8086:9d14] error status/mask=00001000/00000000
[ 7732.910670] pcieport 0000:00:1c.0: [12] Replay Timer Timeout
[ 7733.522628] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[ 7733.522648] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
[ 7733.522661] pcieport 0000:00:1c.0: device [8086:9d14] error status/mask=00001000/00000000
[ 7733.522669] pcieport 0000:00:1c.0: [12] Replay Timer Timeout
[ 7761.990278] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[ 7761.990300] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
[ 7761.990313] pcieport 0000:00:1c.0: device [8086:9d14] error status/mask=00001000/00000000
[ 7761.990323] pcieport 0000:00:1c.0: [12] Replay Timer Timeout
[ 7821.073917] pcieport 0000:00:1c.0: AER: Corrected error received:...

Revision history for this message
Leonidas S. Barbosa (leosilvab) wrote :

I'm in Xenial : Linux 4.15.0-29-generic #31~16.04.1-Ubuntu

Revision history for this message
Lucas Czepaniki (lucas.czpnk) wrote :

I'm also having this issue on my Dell Inspiron 14 7472.

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.1 LTS
Release: 18.04
Codename: bionic

$ uname -a
Linux bionic 4.15.0-32-generic #35-Ubuntu SMP Fri Aug 10 17:58:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ dmesg
[ 2196.965458] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00001000/00002000
[ 2196.965466] pcieport 0000:00:1c.5: [12] Replay Timer Timeout
[ 2197.399555] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
[ 2197.399566] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e5(Transmitter ID)
[ 2197.399569] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00001000/00002000
[ 2197.399571] pcieport 0000:00:1c.5: [12] Replay Timer Timeout
[ 2197.644496] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
[ 2197.644506] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e5(Transmitter ID)
[ 2197.644509] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00001000/00002000
[ 2197.644511] pcieport 0000:00:1c.5: [12] Replay Timer Timeout
[ 2198.273044] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
[ 2198.273053] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e5(Transmitter ID)
[ 2198.273056] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00001000/00002000
[ 2198.273058] pcieport 0000:00:1c.5: [12] Replay Timer Timeout
[ 2198.274547] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
[ 2198.274557] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e5(Transmitter ID)
[ 2198.274561] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00001000/00002000
[ 2198.274564] pcieport 0000:00:1c.5: [12] Replay Timer Timeout
...

and it goes like this for a ton of lines.
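
To put a number on "a ton of lines", the dmesg timestamps can be used to estimate how often the error fires. This is only a sketch under the assumption that the input is plain dmesg output with "[ seconds.micros]" prefixes; aer_rate is a hypothetical helper name.

```python
import re

# Captures the dmesg timestamp of each "AER: Corrected error received" line.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\] .*AER: Corrected error received")


def aer_rate(lines):
    """Return corrected errors per second between first and last event, or None."""
    times = [float(m.group(1)) for m in map(TIMESTAMP.match, lines) if m]
    if len(times) < 2:
        return None
    span = times[-1] - times[0]
    return (len(times) - 1) / span if span > 0 else None


# Lines copied from the dmesg excerpt above.
sample = [
    "[ 2197.399555] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5",
    "[ 2197.399571] pcieport 0000:00:1c.5: [12] Replay Timer Timeout",
    "[ 2197.644496] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5",
    "[ 2198.273044] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5",
    "[ 2198.274547] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5",
]
print(aer_rate(sample))
```

For the excerpt above this comes out at a few events per second, which explains how quickly the logs fill up.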

Revision history for this message
Amr Elbeleidy (beleidy) wrote :

*First linux bug report*

I am also affected by this issue on Bionic (4.15.0-33) on a Dell Aurora R6.

The processor is an i7-7700K (Kaby Lake); I get the issue on the PCI bus connected to the Intel Corporation Wireless 3165 card.

I see someone said a fix is in place, but cannot find where the fix has been released.

Revision history for this message
C de-Avillez (hggdh2) wrote :

Reverting to Triaged on the Ubuntu task. Also making clear there is a workaround (pci=noaer) for it.

Changed in linux (Ubuntu):
status: Fix Released → Triaged
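
For reference, the edit that the workaround in the bug description asks for (adding pci=noaer to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub) can be sketched as a text transformation. This is an illustration only: add_kernel_param is a made-up helper, it works on a string rather than on the real file, and "sudo update-grub" plus a reboot are still needed afterwards, as the description says.

```python
import re

# Matches the double-quoted GRUB_CMDLINE_LINUX_DEFAULT assignment.
CMDLINE = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)


def add_kernel_param(grub_text, param):
    """Return grub_text with param appended to GRUB_CMDLINE_LINUX_DEFAULT."""
    def repl(match):
        current = match.group(2).split()
        if param not in current:  # do not add the parameter twice
            current.append(param)
        return match.group(1) + " ".join(current) + match.group(3)

    return CMDLINE.sub(repl, grub_text)


before = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_kernel_param(before, "pci=noaer"))
```

Running the function a second time on its own output leaves the line unchanged, so it is safe to apply repeatedly.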
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

The mainline kernel is now at v4.19-rc6. It might be worth testing this kernel to see if the bug has been fixed upstream. It can be downloaded from:

 http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.19-rc6

C de-Avillez (hggdh2)
description: updated
Revision history for this message
StoatWblr (stoatwblr) wrote :

This is also present in later distro versions, right up to Cosmic.

In my case it manifests on Supermicro and Intel 7500/5500/5520/X58-based servers when QLogic QLE2562 fibre optic cards are used, and _ONLY_ with QLogic cards; nothing else seems to trigger it.

As with the Wi-Fi cards on laptops, an S3 cycle stops it.

Revision history for this message
PedroCorreia (pmfernandez) wrote :

I'm still having this issue with my Dell Inspiron 5570 and a fully updated Ubuntu 18.04 install.
I have had this issue since I bought it (6 months ago), but now it's even worse. Right now my wifi keeps disconnecting and a hard reboot is required to make it work again.

Also, the wifi indicator on the top panel keeps showing a question mark instead of the wifi icon.

Restarting Network Manager results in an inability to detect any wifi connections.

I have several errors in dmesg when this happens, like ( failed to wake target to writing ... )

Revision history for this message
jack lemon (mb0087) wrote :

I'm also experiencing this bug with the latest debian testing release:
Linux lemon 4.19.0-1-amd64 #1 SMP Debian 4.19.12-1 (2018-12-22) x86_64 GNU/Linux

Revision history for this message
Shaheed Haque (srhaque-i) wrote :

I am seeing this on an updated Cosmic with a Dell Inspiron 5570. Kernel version is presently 4.18.0-15.16.

Revision history for this message
vmc (vmclark) wrote :

I get the PCIe errors on all of the X, L, and K Ubuntus (Xubuntu, Lubuntu, Kubuntu) on Disco 19.04:
====
[ 3056.549121] pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
[ 3056.549136] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 3056.549147] pcieport 0000:00:1c.0: device [8086:a33d] error status/mask=00001000/00002000
[ 3056.549154] pcieport 0000:00:1c.0: [12] Timeout
================

00:1c.0 0604: 8086:a33d (rev f0) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 122
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 00003000-00003fff
        Memory behind bridge: a2100000-a21fffff
        Prefetchable memory behind bridge: 00000000a0000000-00000000a00fffff
        Capabilities: <access denied>
        Kernel driver in use: pcieport
=================
lspci -s 000:00:1c.0
00:1c.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port (rev f0)
=================
$ lspci -vt
-[0000:00]-+-00.0 Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers
           +-02.0 Intel Corporation UHD Graphics 630 (Desktop)
           +-08.0 Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th Gen Core Processor Gaussian Mixture Model
           +-12.0 Intel Corporation Cannon Lake PCH Thermal Controller
           +-14.0 Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller
           +-14.2 Intel Corporation Cannon Lake PCH Shared SRAM
           +-14.3 Intel Corporation Wireless-AC 9560 [Jefferson Peak]
           +-16.0 Intel Corporation Cannon Lake PCH HECI Controller
           +-17.0 Intel Corporation SATA Controller [RAID mode]
           +-1c.0-[01]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           +-1f.0 Intel Corporation Device a308
           +-1f.3 Intel Corporation Cannon Lake PCH cAVS
           +-1f.4 Intel Corporation Cannon Lake PCH SMBus Controller
           \-1f.5 Intel Corporation Cannon Lake PCH SPI Controller

Revision history for this message
Mohamed Salama (gray.hat.enigma) wrote :

I have experienced the same bug on my HP Pavilion laptop, and I think the problem is an incompatibility between the RTL8723BE PCIe wireless card and the Linux kernel,
causing it to log a PCIe error endlessly on every boot/reboot of the system, which makes the log files grow to a huge size on disk!

I am not in favour of suppressing the warning at startup with GRUB default parameters (pci=nomsi and pci=noaer). I think this may endanger the system if a serious error arises in the future, and in any case the wifi card doesn't function properly because of the error.

So I think it might be better to disable it permanently and use another wifi device (for example, a USB adapter).
# Of course this is a "temp" solution until there is a kernel fix for this, but it will do the trick

Steps:
# In /etc/modprobe.d/blacklist.conf add this line at the end of the file then reboot the system
blacklist rtl8723be

# After rebooting to make sure the driver is disabled execute this
lsmod | grep rtl

# To get the kernel module name for the card, simply execute
lspci -nnk
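
As a rough illustration of the last step, the sketch below pulls the "Kernel driver in use" entry for each device out of `lspci -nnk` text. The helper name `drivers_in_use` and the `SAMPLE` text are made up for this example (the sample is hand-written, not captured from an affected machine), so treat it as a sketch rather than a tested tool.

```python
# Hand-written sample in the usual `lspci -nnk` layout (illustrative only).
SAMPLE = (
    "02:00.0 Network controller [0280]: Realtek Semiconductor Co., Ltd. "
    "RTL8723BE PCIe Wireless Network Adapter [10ec:b723]\n"
    "\tKernel driver in use: rtl8723be\n"
    "\tKernel modules: rtl8723be\n"
)


def drivers_in_use(lspci_nnk: str) -> dict:
    """Map each PCI address to the driver named on its 'Kernel driver in use' line."""
    drivers = {}
    current = None
    for line in lspci_nnk.splitlines():
        if line and not line[0].isspace():
            # Unindented lines start a new device; the first field is its address.
            current = line.split(" ", 1)[0]
        elif current and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


if __name__ == "__main__":
    print(drivers_in_use(SAMPLE))
```

The module name reported this way is what would go on the `blacklist` line; devices without a bound driver simply do not appear in the result.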

Revision history for this message
Pablo Palácios (ppalacios) wrote :

I've got the same problem using Arch Linux with the latest kernel on a Dell computer with an i7 and a PCIe network card as well. I found this thread on the Red Hat Bugzilla very helpful:

https://bugzilla.redhat.com/show_bug.cgi?id=681017

I was able to solve my problem by explicitly disabling ASPM in my BIOS. From the factory it was set to auto, which could perhaps result in "the device has no support for ASPM, but let's enable ASPM anyway" behavior that confuses the kernel.

Brad Figg (brad-figg)
tags: added: cscc
Revision history for this message
Willem Hobers (whobers) wrote :

Seeing this on Linux LAPTOP 5.0.0-25-generic #26~18.04.1-Ubuntu SMP Thu Aug 1 13:51:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux, running xubuntu 18.04.3.

description: Notebook
    product: Aspire A315-53 (0000000000000000)
    vendor: Acer

     *-pci
          description: Host bridge
          product: Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
          vendor: Intel Corporation
          physical id: 100
          bus info: pci@0000:00:00.0
          version: 08
          width: 32 bits
          clock: 33MHz
          configuration: driver=skl_uncore
          resources: irq:0

If there's any other info I can provide, please let me know.

Revision history for this message
Utkarsh (ashubisht) wrote :

I am getting the same issue with Ubuntu 19.04.
I am using HP Probook 440G3 having i7-6500U processor and RTL8723BE network card.
Any ideas on when this issue is planned to be patched?

Getting this info from Windows counterpart, as I am still getting issues after applying nomsi

Name                                        LocationInfo                     UINumber
----                                        ------------                     --------
Realtek RTL8723BE 802.11 bgn Wi-Fi Adapter  PCI bus 3, device 0, function 0  5
Realtek PCIe GBE Family Controller          PCI bus 2, device 0, function 0  4
Realtek PCIE CardReader                     PCI bus 4, device 0, function 0  8

Revision history for this message
V-Mark (vertesmark) wrote :

I have a similar problem, but I get "Timeout".
ACER Nitro 5 - Ubuntu 19.04 fresh install.
Intel(R) Core(TM) i7-8750H

The following is spammed every few minutes (for example):
Sep 25 03:09:06 mark-Nitro-AN515-52 kernel: [ 4326.007230] pcieport 0000:00:1d.5: AER: Corrected error received: 0000:00:1d.5
Sep 25 03:09:06 mark-Nitro-AN515-52 kernel: [ 4326.007248] pcieport 0000:00:1d.5: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
Sep 25 03:09:06 mark-Nitro-AN515-52 kernel: [ 4326.007256] pcieport 0000:00:1d.5: device [8086:a335] error status/mask=00001000/00002000
Sep 25 03:09:06 mark-Nitro-AN515-52 kernel: [ 4326.007261] pcieport 0000:00:1d.5: [12] Timeout

Spamming means: sometimes one every 2-4 minutes, sometimes an hour goes by without any spam.

>lspci -vt
-[0000:00]-+-00.0 Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers
           +-01.0-[01-05]--+-00.0 NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile]
           |               \-00.1 NVIDIA Corporation GP107GL High Definition Audio Controller
           +-02.0 Intel Corporation UHD Graphics 630 (Mobile)
           +-08.0 Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th Gen Core Processor Gaussian Mixture Model
           +-12.0 Intel Corporation Cannon Lake PCH Thermal Controller
           +-14.0 Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller
           +-14.2 Intel Corporation Cannon Lake PCH Shared SRAM
           +-14.3 Intel Corporation Wireless-AC 9560 [Jefferson Peak]
           +-15.0 Intel Corporation Device a368
           +-15.1 Intel Corporation Device a369
           +-16.0 Intel Corporation Cannon Lake PCH HECI Controller
           +-17.0 Intel Corporation Device a353
           +-1d.0-[06]----00.0 Device 1cc1:8201
           +-1d.5-[07]--+-00.0 Realtek Semiconductor Co., Ltd. RTL8411B PCI Express Card Reader
           |            \-00.1 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           +-1e.0 Intel Corporation Device a328
           +-1f.0 Intel Corporation Device a30d
           +-1f.3 Intel Corporation Cannon Lake PCH cAVS
           +-1f.4 Intel Corporation Cannon Lake PCH SMBus Controller
           \-1f.5 Intel Corporation Cannon Lake PCH SPI Controller

Revision history for this message
Luigi Calligaris (luigicalligaris) wrote :

Dell Inspiron P74G, Kubuntu 19.04 Disco, kernel 5.0.0-13-generic.

I'm affected as well by this bug, with ~50 lines per minute of errors in the syslog.

I only noticed the issue recently on my Kubuntu 18.04 LTS setup (around October 2019). Since then I have upgraded to 19.04, but with no improvement. My errors in dmesg are of the same form as stated above, with two recurring types of error status:

pcieport 0000:00:1c.4: AER: Corrected error received: 0000:00:1c.4
pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00001000/00002000
pcieport 0000:00:1c.4: [12] Timeout

pcieport 0000:00:1c.4: AER: Corrected error received: 0000:00:1c.4
pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
pcieport 0000:00:1c.4: device [8086:9d14] error status/mask=00003000/00002000
pcieport 0000:00:1c.4: [12] Timeout
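
To make the two `status/mask` values easier to compare, here is a minimal sketch that decodes them. The bit positions are those of the AER Correctable Error Status register as I understand the PCIe specification; the function name `decode_correctable` is invented for this example.

```python
# Bit positions in the PCIe AER Correctable Error Status / Mask registers.
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_correctable(status_mask: str) -> dict:
    """Decode a 'STATUS/MASK' hex pair as printed in the 'error status/mask=' line."""
    status_hex, mask_hex = status_mask.split("/")
    status, mask = int(status_hex, 16), int(mask_hex, 16)
    reported, masked = [], []
    for bit, name in sorted(CORRECTABLE_BITS.items()):
        if status & (1 << bit):
            # A status bit whose mask bit is also set is masked.
            (masked if mask & (1 << bit) else reported).append(name)
    return {"reported": reported, "masked": masked}


if __name__ == "__main__":
    for value in ("00001000/00002000", "00003000/00002000"):
        print(value, decode_correctable(value))
```

Under that reading, both values carry the bit 12 replay timer timeout, and `00003000` additionally has bit 13 (Advisory Non-Fatal Error) set, which the `00002000` mask covers. That would be consistent with both log entries listing only `[12] Timeout`.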

That pcie port is shown to be connected to the Atheros WiFi of the laptop:

+-1c.4-[02]----00.0 Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter

The output of lshw for it is:

        *-pci:1
             description: PCI bridge
             product: Sunrise Point-LP PCI Express Root Port #5
             vendor: Intel Corporation
             physical id: 1c.4
             bus info: pci@0000:00:1c.4
             version: f1
             width: 32 bits
             clock: 33MHz
             capabilities: pci pciexpress msi pm normal_decode bus_master cap_list
             configuration: driver=pcieport
             resources: irq:123 memory:d5000000-d51fffff
           *-network
                description: Wireless interface
                product: QCA6174 802.11ac Wireless Network Adapter
                vendor: Qualcomm Atheros
                physical id: 0
                bus info: pci@0000:02:00.0
                logical name: wlp2s0
                version: 32
                serial: [edited for privacy]
                width: 64 bits
                clock: 33MHz
                capabilities: pm msi pciexpress bus_master cap_list ethernet physical wireless
                configuration: broadcast=yes driver=ath10k_pci driverversion=5.0.0-13-generic firmware=RM.4.4.1.c2-00057-QCARMSWP-1 ip=192.168.0.71 latency=0 link=yes multicast=yes wireless=IEEE 802.11
                resources: irq:131 memory:d5000000-d51fffff

I cannot find an ASPM disable option in my BIOS setup.

A guy named Dennis E. Mungai dug into the issue last year (link below), and his temporary fix (turning off the report bit for AER Corrected errors) worked for me, without the need to turn off AER for the whole system.

https://gist.github.com/Brainiarc7/3179144393747f35e5155fdbfd675554
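
For readers who want to see what "turning off the report bit" amounts to, here is a hedged sketch. It assumes the bit in question is Correctable Error Reporting Enable, bit 0 of the PCIe Device Control register at offset 0x08 of the PCI Express capability, and that `setpci` accepts the `CAP_EXP+offset` and `value:mask` forms described in its man page. It is not necessarily the exact command used in the linked gist, and the helper names are invented. The script only builds the command string; it does not touch any hardware.

```python
CORRECTABLE_REPORTING_ENABLE = 0x0001  # assumed: bit 0 of the Device Control register


def without_correctable_reporting(devctl: int) -> int:
    """Return the 16-bit Device Control value with correctable error reporting cleared."""
    return devctl & ~CORRECTABLE_REPORTING_ENABLE & 0xFFFF


def setpci_command(bdf: str) -> str:
    """Build (but do not run) a setpci call that clears only bit 0 on the given port."""
    # value:mask form: only the bits set in the mask are meant to be changed.
    return f"setpci -s {bdf} CAP_EXP+0x8.w=0x0000:0x{CORRECTABLE_REPORTING_ENABLE:04x}"


if __name__ == "__main__":
    print(hex(without_correctable_reporting(0x000F)))
    print(setpci_command("00:1c.4"))
```

A register write like this would probably not survive a reboot, so it would have to be reapplied at startup; check `man setpci` before running anything as root.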

I find it interesting that for most of us this issue affects laptop WiFi cards from different vendors.

information type: Public → Public Security
information type: Public Security → Public
Revision history for this message
Ricardo S O Leite (ricsdeol) wrote :

Hi, Dell G3 3579 (086F)
Ubuntu 20.04

LOG:
[72720.138307] pcieport 0000:00:1d.6: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[72720.138314] pcieport 0000:00:1d.6: AER: device [8086:a336] error status/mask=00001000/00002000
[72720.138320] pcieport 0000:00:1d.6: AER: [12] Timeout

PCI INFO:
➜ sudo lspci -v
00:00.0 Host bridge: Intel Corporation 8th Gen Core 4-core Processor Host Bridge/DRAM Registers [Coffee Lake H] (rev 07)
 DeviceName: Onboard - Other
 Subsystem: Dell 8th Gen Core 4-core Processor Host Bridge/DRAM Registers [Coffee Lake H]
 Flags: bus master, fast devsel, latency 0
 Capabilities: [e0] Vendor Specific Information: Len=10 <?>
 Kernel driver in use: skl_uncore

00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 07) (prog-if 00 [Normal decode])
 Flags: bus master, fast devsel, latency 0, IRQ 122
 Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
 I/O behind bridge: 00004000-00004fff [size=4K]
 Memory behind bridge: a3000000-a40fffff [size=17M]
 Prefetchable memory behind bridge: 0000000090000000-00000000a1ffffff [size=288M]
 Capabilities: [88] Subsystem: Dell Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16)
 Capabilities: [80] Power Management version 3
 Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
 Capabilities: [a0] Express Root Port (Slot+), MSI 00
 Capabilities: [100] Virtual Channel
 Capabilities: [140] Root Complex Link
 Capabilities: [d94] Secondary PCI Express
 Kernel driver in use: pcieport

00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 630 (Mobile) (prog-if 00 [VGA controller])
 DeviceName: Onboard - Video
 Subsystem: Dell UHD Graphics 630 (Mobile)
 Flags: bus master, fast devsel, latency 0, IRQ 130
 Memory at a2000000 (64-bit, non-prefetchable) [size=16M]
 Memory at 80000000 (64-bit, prefetchable) [size=256M]
 I/O ports at 5000 [size=64]
 Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
 Capabilities: [40] Vendor Specific Information: Len=0c <?>
 Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
 Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
 Capabilities: [d0] Power Management version 2
 Capabilities: [100] Process Address Space ID (PASID)
 Capabilities: [200] Address Translation Service (ATS)
 Capabilities: [300] Page Request Interface (PRI)
 Kernel driver in use: i915
 Kernel modules: i915

00:04.0 Signal processing controller: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem (rev 07)
 DeviceName: Onboard - Other
 Subsystem: Dell Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem
 Flags: fast devsel, IRQ 16
 Memory at a4610000 (64-bit, non-prefetchable) [size=32K]
 Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
 Capabilities: [d0] Power Management version 3
 Capabilities: [e0] Vendor Specific Information: Len=0c <?>
 Kernel driver in use: proc_thermal
 Kernel modules: processor_thermal_device

00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
 DeviceName:...

Revision history for this message
Martin Vernay (magean) wrote :

I've been getting a similar problem on a Leopard GP73-8RE laptop from MSI, on Ubuntu 20.04 as well as 19.10 and 18.04.

This is the message that spammed in my system journal, causing it to inflate very rapidly to ludicrous proportions:

    22:36:51 kernel: alx 0000:03:00.0: AER: [ 7] BadDLLP
    22:36:51 kernel: alx 0000:03:00.0: AER: device [1969:e0a1] error status/mask=00000080/00002000
    22:36:51 kernel: alx 0000:03:00.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)

I tried the following kernel parameters:

-`pci=nomsi`: this also disables removable devices... not an option.

-`pci=noaer` : disables advanced error reporting without fixing the errors themselves. It works insofar as it suppresses the message flood. However, this is akin to "shooting the messenger": it also prevents troubleshooting other, potentially more serious, errors that won't be reported as well. Plus, letting errors occur continuously might not be the optimal solution, even though these errors are apparently getting corrected.

-`pci=nommconf`: this gets rid of the errors, and so far hasn't had any undesirable side effect. I'll report back if I notice any.

Someone on reddit has also suggested `pcie_aspm=off` :
https://www.reddit.com/r/linuxquestions/comments/g8pbku/any_undesirable_side_effects_of_pcinommconf/foq8eut/

But I haven't tried it myself.
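
Any of these parameters goes on the `GRUB_CMDLINE_LINUX_DEFAULT` line of /etc/default/grub, as in the workaround in the bug description. The sketch below shows one way to append a parameter to that line without duplicating it. The helper name `add_kernel_param` is hypothetical, and the function only transforms text; writing the file back and running `sudo update-grub` are left to the reader.

```python
import re

# Matches the GRUB_CMDLINE_LINUX_DEFAULT="..." line and captures the quoted value.
_CMDLINE = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)


def add_kernel_param(grub_text: str, param: str) -> str:
    """Return grub_text with param appended to GRUB_CMDLINE_LINUX_DEFAULT if missing."""

    def repl(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    return _CMDLINE.sub(repl, grub_text, count=1)


if __name__ == "__main__":
    before = 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    print(add_kernel_param(before, "pcie_aspm=off"), end="")
```

Calling it a second time with the same parameter leaves the line unchanged, which makes it easier to switch between the options above while testing.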

Revision history for this message
Martin Vernay (magean) wrote :

So, although `pci=nommconf` gets rid of the error flood, it apparently does cause some collateral damage. After a few days with this kernel parameter, the person who uses the laptop on a daily basis reported a decrease in responsiveness and stability, with occasional stutters if I understood correctly. Then I was called to help with a black screen. And indeed, there was nothing to be done but a hard power-off. I couldn't even access a tty. At that point I decided to stop the experiment and reverted to `pci=noaer`; the system then returned to its normal behavior.

I am now trying `pcie_aspm=off`. That apparently gets rid of the errors as well. Hopefully the trade-off is limited to less efficient power saving, which doesn't matter as the laptop is nearly always connected to a power source. Besides, if the error messages were any indication, ASPM did not work correctly anyway; so, potentially, power management won't get worse (what's there to lose by disabling a malfunctioning feature?).

Revision history for this message
Paul Menzel (paulmenzel) wrote :

@magean, I believe you are having a different issue here, so please create a separate bug report, and, as you reproduced this with Linux 5.4 (also try https://kernel.ubuntu.com/~kernel-ppa/), contact <email address hidden> and the PCI subsystem maintainers directly, and attach `dmesg` to your message for starters.

Revision history for this message
smiki (micouk) wrote :

My investigation came to the same conclusion as #109 (but I'm on 20.04 with the latest kernel, so this is still relevant).

My configuration is as follows.
If more information or logs are needed, please let me know.
--

Dell Latitude 7389
miki@DL-7389:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04 LTS"
miki@DL-7389:~$ uname -a
Linux DL-7389 5.4.0-31-generic #35-Ubuntu SMP Thu May 7 20:20:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

dmesg gets spammed by these error reports:
[116522.584941] pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
[116522.584959] pcieport 0000:00:1c.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[116522.584967] pcieport 0000:00:1c.0: AER: device [8086:9d17] error status/mask=00001000/00002000
[116522.584973] pcieport 0000:00:1c.0: AER: [12] Timeout
[116533.643718] pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
[116533.643735] pcieport 0000:00:1c.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[116533.643744] pcieport 0000:00:1c.0: AER: device [8086:9d17] error status/mask=00001000/00002000
[116533.643751] pcieport 0000:00:1c.0: AER: [12] Timeout
[116559.755644] pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
[116559.755655] pcieport 0000:00:1c.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[116559.755658] pcieport 0000:00:1c.0: AER: device [8086:9d17] error status/mask=00001000/00002000
[116559.755660] pcieport 0000:00:1c.0: AER: [12] Timeout

Device causing it seems to be Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter:

miki@DL-7389:~$ sudo lspci -t -v
-[0000:00]-+-00.0 Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
           +-02.0 Intel Corporation HD Graphics 620
           +-04.0 Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem
           +-13.0 Intel Corporation Sunrise Point-LP Integrated Sensor Hub
           +-14.0 Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
           +-14.2 Intel Corporation Sunrise Point-LP Thermal subsystem
           +-15.0 Intel Corporation Sunrise Point-LP Serial IO I2C Controller #0
           +-15.1 Intel Corporation Sunrise Point-LP Serial IO I2C Controller #1
           +-15.2 Intel Corporation Sunrise Point-LP Serial IO I2C Controller #2
           +-16.0 Intel Corporation Sunrise Point-LP CSME HECI #1
           +-1c.0-[01]----00.0 Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
           +-1d.0-[02]----00.0 Toshiba Corporation Device 0116
           +-1f.0 Intel Corporation Sunrise Point LPC Controller/eSPI Controller
           +-1f.2 Intel Corporation Sunrise Point-LP PMC
           +-1f.3 Intel Corporation Sunrise Point-LP HD Audio
           \-1f.4 Intel Corporation Sunrise Point-LP SMBus

lspci -v detailed output for this device (strange that the serial number is read as zeros):

01:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapte...


Revision history for this message
piscvau (piscvau) wrote :

On an MSI GE73 PC, installation of Xubuntu 18.04 fails, as does Xubuntu 19.10. The PC was returned to MSI under warranty. The hardware is fine and there is no problem with Windows.
With the latest BIOS it is now impossible to boot the PC from an ISO USB key for versions 18.04 and 19.10.
With Xubuntu 20.04 the PC boots, but after logging in to a session the system crashes.

Revision history for this message
Paul Menzel (paulmenzel) wrote :

@piscvau, the original report is not about a crash, so your issue is unrelated. Please create a separate report for the crash on Ubuntu 20.04. (Also mention there whether it is a system crash or a hang. Does the Num Lock key still work? Can you switch to a virtual console with Ctrl + Alt + F4? Can you still ping the system over the network?) Good luck!

Revision history for this message
Luis A (peppapig123) wrote :

This bug still affects the install process today, with Ubuntu 20.04 LTS, and the same with the Kubuntu installer.

Revision history for this message
fermulator (fermulator) wrote :

"me too" - Dell Latitude w/ a Dell WD16 USB-C dock. After recent firmware updates it got significantly worse and nearly never properly re-attaches after suspend/resume.

Dell 5400 Latitude:
(0.1.9.1=same, 0.1.7.4=older, 0.1.6.5=older, 0.1.5.1=older, 0.1.4.2=older)

dock:
```
No upgrades for RTS5413 in Dell dock, current is 01.21: 01.21=same
No upgrades for RTS5487 in Dell dock, current is 01.47: 01.47=same
No upgrades for WD19, current is 01.00.00.00: 01.00.00.00=same
No upgrades for Package level of Dell dock, current is 01.00.04.01: 01.00.04.01=same
No upgrades for VMM5331 in Dell dock, current is 05.03.10: 05.03.10=same
```

spammed by
```
[Tue Sep 15 18:50:06 2020] pcieport 0000:00:1d.0: AER: Corrected error received: 0000:00:1d.0
[Tue Sep 15 18:50:06 2020] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[Tue Sep 15 18:50:06 2020] pcieport 0000:00:1d.0: device [8086:9db1] error status/mask=00001000/00002000
[Tue Sep 15 18:50:06 2020] pcieport 0000:00:1d.0: [12] Timeout
```

```
lspci -v | grep 9db1
00:1d.0 PCI bridge: Intel Corporation Device 9db1 (rev f0) (prog-if 00 [Normal decode])
```

Revision history for this message
Bill Duetschler (bikergeek) wrote :

Still an issue for me on Ubuntu 20.10 "Groovy".

Revision history for this message
Wren Turkal (wt-penguintechs-org) wrote :

I have this same problem on a Dell XPS 13 9360 that shipped with Ubuntu 16.04 preloaded. My dmesg logs look identical to what I am seeing above.

Revision history for this message
Wren Turkal (wt-penguintechs-org) wrote :

And FWIW, I have fully upgraded all firmware and also tried both Ubuntu 20.10 and Fedora 33. All of these systems show the same behavior.

Revision history for this message
Wren Turkal (wt-penguintechs-org) wrote :

I also tried all LTS Ubuntus back to 16.04. They all get this log message a lot.

Revision history for this message
Nivedita Singhvi (niveditasinghvi) wrote :

Seen this as well -- although I don't believe it's causing any
problems that we know of -- sure does look right now like it's
only noise in the logs.

Revision history for this message
In , rbelli97 (rbelli97-linux-kernel-bugs) wrote :

Hello to all. I have the same problem, and this has affected me for a long time now. I described it in detail here, with output, videos, photos etc:

https://ubuntuforums.org/showthread.php?t=2460318

I hope this adds useful information to draw attention to the bug in question.

Changed in linux:
importance: Unknown → Medium
status: Unknown → Confirmed
Revision history for this message
Tobias Schönberg (tobias47n9e) wrote :

Since upgrading from Ubuntu 20.10 to 21.04 I get this message like every second in journalctl:

Apr 09 13:00:28 tobias-MS-7C37 kernel: pcieport 0000:00:03.1: AER: Multiple Corrected error received: 0000:00:00.0
Apr 09 13:00:28 tobias-MS-7C37 kernel: pcieport 0000:00:03.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
Apr 09 13:00:28 tobias-MS-7C37 kernel: pcieport 0000:00:03.1: device [1022:1453] error status/mask=00001100/00006000
Apr 09 13:00:28 tobias-MS-7C37 kernel: pcieport 0000:00:03.1: [ 8] Rollover
Apr 09 13:00:28 tobias-MS-7C37 kernel: pcieport 0000:00:03.1: [12] Timeout
Apr 09 13:00:29 tobias-MS-7C37 kernel: pcieport 0000:00:03.1: AER: Corrected error received: 0000:00:00.0
Apr 09 13:00:29 tobias-MS-7C37 kernel: pcieport 0000:00:03.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
Apr 09 13:00:29 tobias-MS-7C37 kernel: pcieport 0000:00:03.1: device [1022:1453] error status/mask=00001000/00006000
Apr 09 13:00:29 tobias-MS-7C37 kernel: pcieport 0000:00:03.1: [12] Timeout

Revision history for this message
Paul Menzel (paulmenzel) wrote :

Everyone affected, please at least attach the output of `lspci -nn` and `dmesg`, and give details of your system.

As this bug has gotten long, and the causes range from firmware and firmware configuration to hardware issues, it would be better to open a separate report directly upstream, after testing the current Linux kernel from the Ubuntu PPA repository [1].

[1]: https://kernel.ubuntu.com/~kernel-ppa/

Revision history for this message
Tobias Schönberg (tobias47n9e) wrote :

lspci -nn

00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex [1022:1450]
00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) I/O Memory Management Unit [1022:1451]
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
20:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Matisse Switch Upstream [1022:57ad]
21:00.0 PCI bridge [0604]: Advan...

Revision history for this message
Paul Menzel (paulmenzel) wrote :

Please create a separate bug report, as the error type is different from the original report here. Also, in the new report (best upstream), give more information (firmware version, extension cards, …), and also *attach* (not paste) the output of `lspci -tvnn` and `sudo lspci -vvxxx`.

Revision history for this message
In , pmenzel+bugzilla.kernel.org (pmenzel+bugzilla.kernel.org-linux-kernel-bugs) wrote :

As the ASUS X541UVK is a different device, please create a new bug report with all the necessary information included/attached.

Revision history for this message
In , bjorn (bjorn-linux-kernel-bugs) wrote :

Riccardo, would you mind booting with just "pci=noaer" to see if that works around the problem? Your photo at https://i.imgur.com/PPZ49lL.jpg suggests that it might.

Revision history for this message
Riccardo Belli (rbelli97) wrote :

I just created the new bug report as suggested, here:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1944752

Revision history for this message
In , naveennaidu479 (naveennaidu479-linux-kernel-bugs) wrote :

Created attachment 299043
Patch for the AER message spew

Hello Folks,

I have been working on a patch for the AER message spew. I have a potential patch ready for the problem, but unfortunately, I do not have a system that outputs the same AER errors so I am unable to test it out.

It would really help if anyone could please test this patch and see if it solved the AER message spew.

Thanks,
Naveen Naidu

Revision history for this message
In , naveennaidu479 (naveennaidu479-linux-kernel-bugs) wrote :

(In reply to Naveen Naidu from comment #11)
> Created attachment 299043 [details]
> Patch for the AER message spew
>
> Hello Folks,
>
> I have been working on a patch for the AER message spew. I have a potential
> patch ready for the problem, but unfortunately, I do not have a system that
> outputs the same AER errors so I am unable to test it out.
>
> It would really help if anyone could please test this patch and see if it
> solved the AER message spew.
>
> Thanks,
> Naveen Naidu

Forgot to mention! This patch would make the "pci=noaer" unnecessary.

tags: added: patch
Revision history for this message
In , cspadijer (cspadijer-linux-kernel-bugs) wrote :

Created attachment 299047
attachment-6460-0.html

Hi Naveen.
Absolutely, I can test.
I can try it out this weekend.

Chris

Revision history for this message
In , naveennaidu479 (naveennaidu479-linux-kernel-bugs) wrote :

Comment on attachment 299043
Patch for the AER message spew

I apologize, please ignore this patch. I realized there is a bug in it. I have fixed it now and will upload the fix. I do not know how to delete this patch, so I'll upload a new one. Apologies again for the inconvenience ^^'

Revision history for this message
In , naveennaidu479 (naveennaidu479-linux-kernel-bugs) wrote :

Created attachment 299071
Patch for the AER message spew

This is the correct patch. Please use this and ignore the previous patch.

Revision history for this message
In , naveennaidu479 (naveennaidu479-linux-kernel-bugs) wrote :

Created attachment 299073
Patch for the AER message spew

Revision history for this message
Naveen Naidu (theprophet26) wrote :

This is the correct patch for the AER message spew.

Revision history for this message
In , cspadijer (cspadijer-linux-kernel-bugs) wrote :

Created attachment 299081
attachment-4100-0.html

Okay sounds good.
I will try it soon.

Chris

Revision history for this message
In , cspadijer (cspadijer-linux-kernel-bugs) wrote :

Are you good with me using kernel 5.11.0-37-generic, or would you prefer I use a different kernel?
The X555U is currently running Linux Mint 20.2 Cinnamon.

FYI:
I tried removing pci=noaer and it does boot now (without your patch).
It has been a while since I last tried removing pci=noaer, and new kernels get installed all the time, so I am not sure which kernel first allowed it to boot without that parameter.
However, there are still many errors on boot.

dmesg --level=err,warn
[    0.105337] x86/cpu: VMX (outside TXT) disabled by BIOS
[    0.110761] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
[    0.110761]  #3
[    0.114598] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[    0.135583] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.RP01.PXSX], AE_NOT_FOUND (20201113/psargs-330)
[    0.135597] ACPI Error: Skipping While/If block (20201113/psloop-427)
[    0.527786] tpm_crb MSFT0101:00: [Firmware Bug]: ACPI region does not cover the entire command/response buffer. [mem 0xfed40000-0xfed4087f flags 0x200] vs fed40080 f80
[    0.527874] tpm_crb MSFT0101:00: [Firmware Bug]: ACPI region does not cover the entire command/response buffer. [mem 0xfed40000-0xfed4087f flags 0x200] vs fed40080 f80
[    0.736009] i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
[    0.738042] platform eisa.0: EISA: Cannot allocate resource for mainboard
[    0.738044] platform eisa.0: Cannot allocate resource for EISA slot 1
[    0.738045] platform eisa.0: Cannot allocate resource for EISA slot 2
[    0.738046] platform eisa.0: Cannot allocate resource for EISA slot 3
[    0.738048] platform eisa.0: Cannot allocate resource for EISA slot 4
[    0.738049] platform eisa.0: Cannot allocate resource for EISA slot 5
[    0.738050] platform eisa.0: Cannot allocate resource for EISA slot 6
[    0.738051] platform eisa.0: Cannot allocate resource for EISA slot 7
[    0.738052] platform eisa.0: Cannot allocate resource for EISA slot 8
[    1.268806] r8169 0000:02:00.0: can't disable ASPM; OS doesn't have ASPM control
[    1.329939] i2c_hid i2c-ELAN1000:00: supply vdd not found, using dummy regulator
[    1.329973] i2c_hid i2c-ELAN1000:00: supply vddl not found, using dummy regulator
[    1.611704] ata1.00: supports DRM functions and may not be fully accessible
[    1.613394] ata1.00: supports DRM functions and may not be fully accessible
[    5.726419] elan_i2c i2c-ELAN1000:00: supply vcc not found, using dummy regulator
[    6.376762] nvidia: loading out-of-tree module taints kernel.
[    6.376775] nvidia: module license 'NVIDIA' taints kernel.
[    6.376776] Disabling lock debugging due to kernel taint
[    6.884240] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.63.01 Tue Aug  3 20:44:16 UTC 2021
[    6.958699] nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
[    8.533945] ACPI Warning: \_SB.PCI0.RP01.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20201113/nsarguments-61)

Chris

Revision history for this message
In , pmenzel+bugzilla.kernel.org (pmenzel+bugzilla.kernel.org-linux-kernel-bugs) wrote :

(In reply to cspadijer from comment #18)
> Are you good with me using kernel: 5.11.0-37-generic  or would you
> prefer I use a different kernel?
> The X555U is currently running Linux Mint 20.2 Cinnamon.
>
> FYI:
> I tried removing pci=noaer and it does boot now (without your patch).
> It has been a while since I tried removing pci=noaer and new kernels get
> installed all the time so not sure what kernel first started allowing it
> to boot without needing that line.
> However, there are still many errors on boot.

The original bug seems to be solved now. As there are over ten comments already, could you mark it as fixed, and create new issues?

> dmesg --level=err,warn
> [    0.105337] x86/cpu: VMX (outside TXT) disabled by BIOS
> [    0.110761] MDS CPU bug present and SMT on, data leak possible. See
> https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for
> more details.

Is GNU/Linux applying the latest microcode updates?

> [    0.110761]  #3

Cosmetic error.

> [    0.114598] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
> [    0.135583] ACPI BIOS Error (bug): Could not resolve symbol
> [\_SB.PCI0.RP01.PXSX], AE_NOT_FOUND (20201113/psargs-330)
> [    0.135597] ACPI Error: Skipping While/If block (20201113/psloop-427)
> [    0.527786] tpm_crb MSFT0101:00: [Firmware Bug]: ACPI region does not
> cover the entire command/response buffer. [mem 0xfed40000-0xfed4087f flags
> 0x200] vs fed40080 f80
> [    0.527874] tpm_crb MSFT0101:00: [Firmware Bug]: ACPI region does not
> cover the entire command/response buffer. [mem 0xfed40000-0xfed4087f flags
> 0x200] vs fed40080 f80

Firmware issues.

> [    0.736009] i8042: PNP: PS/2 appears to have AUX port disabled, if this is
> incorrect please boot with i8042.nopnp

Can be ignored.

> [    0.738042] platform eisa.0: EISA: Cannot allocate resource for mainboard
> [    0.738044] platform eisa.0: Cannot allocate resource for EISA slot 1
> [    0.738045] platform eisa.0: Cannot allocate resource for EISA slot 2
> [    0.738046] platform eisa.0: Cannot allocate resource for EISA slot 3
> [    0.738048] platform eisa.0: Cannot allocate resource for EISA slot 4
> [    0.738049] platform eisa.0: Cannot allocate resource for EISA slot 5
> [    0.738050] platform eisa.0: Cannot allocate resource for EISA slot 6
> [    0.738051] platform eisa.0: Cannot allocate resource for EISA slot 7
> [    0.738052] platform eisa.0: Cannot allocate resource for EISA slot 8

Is there an EISA slot?

> [    1.268806] r8169 0000:02:00.0: can't disable ASPM; OS doesn't have ASPM
> control

Can be ignored.

> [    1.329939] i2c_hid i2c-ELAN1000:00: supply vdd not found, using dummy
> regulator
> [    1.329973] i2c_hid i2c-ELAN1000:00: supply vddl not found, using dummy
> regulator

Please contact the Linux folks about this. But first try the latest Linux mainline version.

> [    1.611704] ata1.00: supports DRM functions and may not be fully
> accessible
> [    1.613394] ata1.00: supports DRM functions and may not be fully
> accessible
> [    5.726419] elan_i2c i2c-ELAN1000:00: supply vcc not found, using dummy
> regulator
> [    6.376762] nvidia: loading out-of-tree module taints kernel.
> [    6.376...


Revision history for this message
In , cspadijer (cspadijer-linux-kernel-bugs) wrote :

Created attachment 299107
attachment-9243-0.html

Hi Paul.

Okay, yes. I will mark this as fixed and open new reports for the other issues you identified as Linux issues. Thanks for your help.

For the firmware issues should I be reaching out to the vendors?

Chris

Revision history for this message
In , pmenzel+bugzilla.kernel.org (pmenzel+bugzilla.kernel.org-linux-kernel-bugs) wrote :

[Please remove the quote next time from your reply. If you look at the Web interface, the comments get needlessly long because of that.]

(In reply to cspadijer from comment #20)

[…]

> Okay yes.  I will mark as fixed and open up new for other issues you
> clarified as linux.  Thanks for your help.

Thank you.

> For the firmware issues should I be reaching out to the vendors?

Yes, only the vendors can fix the firmware, unless you use FLOSS firmware, for example coreboot-based firmware.

Unfortunately, my track record of getting vendors to fix their firmware is not so good, as you are only one customer using this weird operating system and not Microsoft Windows. But fingers crossed.

Additionally you might want to point them to the Firmware Test Suite (FWTS) [1].

[1]: https://wiki.ubuntu.com/FirmwareTestSuite/

Revision history for this message
In , cspadijer (cspadijer-linux-kernel-bugs) wrote :

Created attachment 299109
attachment-15734-0.html

Okay great.

Thanks for the link to FirmwareTestSuite.

Chris

Revision history for this message
In , cspadijer (cspadijer-linux-kernel-bugs) wrote :

A kernel released since 4.2.0-22-generic has resolved the issue on this make/model of laptop.
The laptop now boots successfully without the pci=nommconf boot parameter.

Changed in linux:
status: Confirmed → Unknown
Revision history for this message
Narcis Garcia (narcisgarcia) wrote :

One more case:

- Hardware: Mainboard "Asus Prime B-560M-A"

- Software: Debian GNU/Linux 11 (bullseye); Kernel Linux 5.10.0-10-amd64

- systemd-journald messages:
Jan 12 09:57:53 system systemd-journald[79944]: Missed 12 kernel messages
░░ Subject: Journal messages have been missed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ Kernel messages have been lost as the journal system has been unable
░░ to process them quickly enough.

- Kernel messages (dmesg) that overwhelm systemd-journald:
[19209.926816] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[19209.926817] pcieport 0000:00:1c.5: device [8086:43bd] error status/mask=00000001/00002000
[19209.926817] pcieport 0000:00:1c.5: [ 0] RxErr
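
To see which port is responsible for most of the spam, the corrected-error lines can be counted per device. This is only a rough sketch: the function name `count_aer_by_device` is made up for illustration, and it assumes the log lines look like the "PCIe Bus Error: severity=Corrected" lines quoted in this report.

```shell
#!/bin/sh
# count_aer_by_device
# Reads kernel log lines on stdin and prints "<count> <device>" for every
# pcieport device that logged a "severity=Corrected" PCIe bus error.
# (Hypothetical helper; assumes the message format shown in this report.)
count_aer_by_device() {
    grep 'severity=Corrected' \
        | sed -n 's/.*pcieport \([0-9a-f:.]*\): .*/\1/p' \
        | sort \
        | uniq -c \
        | awk '{print $1, $2}'
}
```

It could be used as `sudo dmesg | count_aer_by_device`. A single device dominating the counts would match what the reports in this thread describe.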

Workaround: add "pcie_aspm=off" to the GRUB_CMDLINE_LINUX parameter in /etc/default/grub, run "sudo update-grub", and reboot.
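
The edit to /etc/default/grub can also be scripted. The sketch below assumes the file contains a double-quoted GRUB_CMDLINE_LINUX="..." line; the function name `add_kernel_param` is hypothetical, and it is safer to try it on a copy of the file first.

```shell
#!/bin/sh
# add_kernel_param FILE PARAM
# Appends PARAM to the GRUB_CMDLINE_LINUX="..." line in FILE unless that
# line already contains it. GRUB_CMDLINE_LINUX_DEFAULT is left untouched.
# (Hypothetical helper; assumes a double-quoted value on a single line.)
add_kernel_param() {
    file="$1"
    param="$2"
    # Nothing to do if the parameter is already present on that line.
    if grep -q "^GRUB_CMDLINE_LINUX=\".*${param}" "$file"; then
        return 0
    fi
    tmp="$(mktemp)" || return 1
    # Insert the parameter just before the closing quote.
    sed "s|^\(GRUB_CMDLINE_LINUX=\"[^\"]*\)\"|\1 ${param}\"|" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, copy /etc/default/grub to a scratch file, run `add_kernel_param scratch-file pcie_aspm=off`, check the result, copy it back with sudo, then run "sudo update-grub" and reboot as described above.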

Revision history for this message
Bjorn Helgaas (bjorn-helgaas) wrote :

Is this still an issue? If so, can somebody add a complete dmesg log and "sudo lspci -vv" output from a current kernel?

Revision history for this message
Noah Bowman (eksistenze) wrote :

Here is the output from my fresh install of Xubuntu 22.04 LTS.

Revision history for this message
Xavier (xav46) wrote :

Hi there !

The same errors are spamming my logs and my console...
This is a (nearly) fresh install of Ubuntu Server 22.04.4 LTS on an Asus Pro Q670M-C-CSM motherboard. The kernel is 5.15.0-97-generic.
The "pci=noaer" GRUB workaround does the job, but I'd rather not sweep the dust under the carpet ;-)
The output of dmesg and lspci -vv is attached in case it helps.

Revision history for this message
Xavier (xav46) wrote :
Revision history for this message
Bjorn Helgaas (bjorn-helgaas) wrote :

Thank you! "pci=noaer" definitely sweeps this dust under the carpet, and it would be much better to avoid that.

Narcis (comment #157) reported that "pcie_aspm=off" is a workaround and is much more specific than "pci=noaer".

If we can collect complete dmesg and "sudo lspci -vv" output when booting with and without "pcie_aspm=off", there might be a clue about something we're doing wrong with ASPM.
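
One possible way to keep the two sets of logs apart is to name the files after the boot configuration. The sketch below is an assumption-laden illustration: `boot_tag` and `collect_boot_logs` are hypothetical names, and the output file names are arbitrary.

```shell
#!/bin/sh
# boot_tag CMDLINE
# Prints "aspm-off" if the kernel command line contains pcie_aspm=off,
# otherwise "aspm-default". (Hypothetical helper for naming log files.)
boot_tag() {
    case " $1 " in
        *" pcie_aspm=off "*) echo "aspm-off" ;;
        *) echo "aspm-default" ;;
    esac
}

# collect_boot_logs
# Saves dmesg and "lspci -vv" output for the current boot into files whose
# names record whether pcie_aspm=off was in effect.
collect_boot_logs() {
    tag="$(boot_tag "$(cat /proc/cmdline)")"
    sudo dmesg > "dmesg-$tag.txt"
    sudo lspci -vv > "lspci-$tag.txt"
}
```

Running `collect_boot_logs` once after booting with "pcie_aspm=off" and once without it would produce four files that can be attached to this report.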
