Repeated (spurious?) pcieport errors in logs (PCIe Bus Error)

Bug #671979 reported by Pauli Virtanen
This bug affects 3 people
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Expired  Undecided   Unassigned

Bug Description

On a newly bought laptop (Asus G53JW) running Ubuntu 10.10 (linux-image-2.6.35-22-generic), the logs fill up with messages like this:

[ 1108.493365] pcieport 0000:00:03.0: AER: Corrected error received: id=0018
[ 1108.493378] pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0018(Receiver ID)
[ 1108.493385] pcieport 0000:00:03.0: device [8086:d138] error status/mask=00000001/00001100
[ 1108.493391] pcieport 0000:00:03.0: [ 0] Receiver Error

They appear shortly after boot. Everything nevertheless seems to work. However, the logs fill up with these messages and can quickly swell to gigabytes in size.
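For anyone reading these lines by hand, here is a minimal sketch (not part of the original report) of how the fields can be decoded. It assumes that `id` is a standard PCIe requester ID (bus in the high byte, then 5 bits of device and 3 bits of function) and that the status/mask words follow the AER Correctable Error Status register layout. The function names are hypothetical.

```python
# Bit positions in the PCIe AER Correctable Error Status/Mask registers.
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_requester_id(req_id):
    """Turn a 16-bit requester ID into the bus:device.function form."""
    bus = (req_id >> 8) & 0xFF
    dev = (req_id >> 3) & 0x1F
    fn = req_id & 0x7
    return f"{bus:02x}:{dev:02x}.{fn:x}"


def decode_correctable(word):
    """List the names of the bits set in a correctable status or mask word."""
    return [name for bit, name in sorted(CORRECTABLE_BITS.items())
            if word & (1 << bit)]


print(decode_requester_id(0x0018))      # 00:03.0
print(decode_correctable(0x00000001))   # ['Receiver Error']
print(decode_correctable(0x00001100))   # ['REPLAY_NUM Rollover', 'Replay Timer Timeout']
```

With the values from the log above, id=0018 decodes to 00:03.0, the same address as the reporting pcieport device, and status 00000001 is bit 0, matching the "[ 0] Receiver Error" line.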

Possibly related: I've also experienced hangs of about 20 s when using the proprietary Nvidia drivers for heavy 3D graphics. However, these errors also appear without those drivers (and without the "nvidia" module ever having been loaded); the attached debug info is from a boot without the proprietary drivers.

I'll try to follow up with tests on the mainline kernel.

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: linux-image-generic 2.6.35.22.23
Regression: No
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.35-22.35-generic 2.6.35.4
Uname: Linux 2.6.35-22-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.23.
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC259 Analog [ALC259 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: pauli 3114 F.... pulseaudio
 /dev/snd/seq: timidity 2730 F.... timidity
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xdc600000 irq 53'
   Mixer name : 'Realtek ALC259'
   Components : 'HDA:10ec0269,10431083,00100100'
   Controls : 16
   Simple ctrls : 9
Card1.Amixer.info:
 Card hw:1 'NVidia'/'HDA NVidia at 0xd6080000 irq 17'
   Mixer name : 'Nvidia ID 11'
   Components : 'HDA:10de0011,10de0101,00100100'
   Controls : 0
   Simple ctrls : 0
Card1.Amixer.values:

Date: Sat Nov 6 22:55:18 2010
HibernationDevice: RESUME=UUID=f5613820-aab4-45b0-a68f-19397460bb5c
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release amd64 (20101007)
MachineType: ASUSTeK Computer Inc. G53JW
ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.35-22-generic root=/dev/mapper/main-root ro single
ProcEnviron:
 PATH=(custom, no user)
 LANG=fi_FI.UTF-8
 SHELL=/bin/bash
RelatedPackageVersions: linux-firmware 1.38
SourcePackage: linux
dmi.bios.date: 10/05/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: G53JW.209
dmi.board.asset.tag: ATN12345678901234567
dmi.board.name: G53JW
dmi.board.vendor: ASUSTeK Computer Inc.
dmi.board.version: 1.0
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: ASUSTeK Computer Inc.
dmi.chassis.version: 1.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrG53JW.209:bd10/05/2010:svnASUSTeKComputerInc.:pnG53JW:pvr1.0:rvnASUSTeKComputerInc.:rnG53JW:rvr1.0:cvnASUSTeKComputerInc.:ct10:cvr1.0:
dmi.product.name: G53JW
dmi.product.version: 1.0
dmi.sys.vendor: ASUSTeK Computer Inc.

Revision history for this message
Pauli Virtanen (pauli-virtanen) wrote :

These messages also seem to appear under the mainline kernel, linux-image-2.6.37-999-generic 2.6.37-999.201011060905:

[ 192.339504] pcieport 0000:00:03.0: AER: Corrected error received: id=0018
[ 192.339518] pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0018(Receiver ID)
[ 192.339525] pcieport 0000:00:03.0: device [8086:d138] error status/mask=00000001/00001100
[ 192.339530] pcieport 0000:00:03.0: [ 0] Receiver Error (First)

I'm not sure yet whether the rate at which they occur can become high under some conditions, but it seems likely.

tags: removed: needs-upstream-testing
Pauli Virtanen (pauli-virtanen) wrote :

Similar messages were also noted in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/577702 and https://bugs.launchpad.net/ubuntu/+source/linux/+bug/321412

However, it's probably clearest to keep this bug report separate, since in those cases there were other problems as well.

Pauli Virtanen (pauli-virtanen) wrote :

Adding pci=nommconf to the boot options makes these messages vanish.
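To confirm that the option is actually in effect after a reboot, one can look at the kernel command line. The snippet below is a minimal sketch with hypothetical function names; it assumes the running kernel exposes its command line at /proc/cmdline, and it uses the ProcCmdLine value from the bug description as the sample input.

```python
def has_kernel_param(cmdline, param):
    """Return True if `param` appears as a whole token in the command line."""
    return param in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Sample taken from the ProcCmdLine field of this report.
example = ("BOOT_IMAGE=/vmlinuz-2.6.35-22-generic "
           "root=/dev/mapper/main-root ro single")

print(has_kernel_param(example, "pci=nommconf"))                    # False
print(has_kernel_param(example + " pci=nommconf", "pci=nommconf"))  # True
```

On a live system, `has_kernel_param(read_cmdline(), "pci=nommconf")` performs the same check against the booted kernel.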

Jeremy Foshee (jeremyfoshee) wrote :

Hi Pauli,

This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 671979

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Pauli Virtanen (pauli-virtanen) wrote :

I haven't seen these errors any more in recent Ubuntu kernels. "linux-image-2.6.35-24-generic 2.6.35-24.42" functions OK, so either something has changed in my system (no idea what it could be; no h/w changes), or the issue has been fixed.

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Dan Kegel (dank) wrote :

I saw this recently on my system.

Ubuntu 12.04 LTS \n \l
Linux i7 3.2.0-25-generic-pae #40-Ubuntu SMP Wed May 23 22:11:24 UTC 2012 i686 i686 i386 GNU/Linux

Jun 22 15:43:15 i7 kernel: [ 2945.323854] pcieport 0000:00:03.0: AER: Corrected error received: id=0018
Jun 22 15:43:15 i7 kernel: [ 2945.323865] pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0018(Receiver ID)
Jun 22 15:43:15 i7 kernel: [ 2945.323869] pcieport 0000:00:03.0: device [8086:340a] error status/mask=00000001/00002000
Jun 22 15:43:15 i7 kernel: [ 2945.323873] pcieport 0000:00:03.0: [ 0] Receiver Error

or sometimes

Jun 22 16:03:36 i7 kernel: [ 4162.758805] pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Jun 22 16:03:36 i7 kernel: [ 4162.758811] pcieport 0000:00:03.0: can't find device of ID0018
Jun 22 16:03:36 i7 kernel: [ 4162.758822] pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
Jun 22 16:03:36 i7 kernel: [ 4162.758830] pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0018(Receiver ID)
Jun 22 16:03:36 i7 kernel: [ 4162.758837] pcieport 0000:00:03.0: device [8086:340a] error status/mask=00000001/00002000
Jun 22 16:03:36 i7 kernel: [ 4162.758850] pcieport 0000:00:03.0: [ 0] Receiver Error

This repeated about a bazillion times. It has since stopped. Very odd.
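To put a number on "a bazillion", the reports can be counted per device. This is a minimal sketch, not something used in the thread: the regular expression is an assumption based only on the message format quoted above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches both "Corrected error received" and "Multiple Corrected error received".
AER_RE = re.compile(
    r"pcieport (\S+): AER: (?:Multiple )?Corrected error received")


def count_aer_reports(lines):
    """Count AER corrected-error reports per reporting PCIe port."""
    counts = Counter()
    for line in lines:
        match = AER_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Jun 22 15:43:15 i7 kernel: [ 2945.323854] pcieport 0000:00:03.0: AER: Corrected error received: id=0018",
    "Jun 22 16:03:36 i7 kernel: [ 4162.758805] pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018",
    "Jun 22 16:03:36 i7 kernel: [ 4162.758811] pcieport 0000:00:03.0: can't find device of ID0018",
    "Jun 22 16:03:36 i7 kernel: [ 4162.758822] pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018",
]

print(count_aer_reports(sample))  # Counter({'0000:00:03.0': 3})
```

The function accepts any iterable of lines, so an open log file can be passed in directly; which file holds the kernel messages depends on the system's syslog configuration.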

Dan Kegel (dank) wrote :

It hasn't stopped. In fact, it got worse; with current Ubuntu, the system won't even finish booting.
If I boot into Ubuntu 11.10, I can still boot, but syslog uses 100% of one core
writing those messages to the log.

http://forums.opensuse.org/english/get-technical-help-here/hardware/468132-device-8086-340a-error-status.html
suggests moving the nvidia card to another slot, but that didn't help here.
It's possible the problems started when I switched to an old 8400GT card. I'll try to
swap in a different card.

Dan Kegel (dank) wrote :

I switched to a GT220; didn't help with Ubuntu 12.04, but Ubuntu 11.10 seems
happy at the moment, /var/log/kern isn't growing.

Dan Kegel (dank) wrote :

I think I've recovered. I booted into rescue mode, did
  dpkg -r nvidia-current
then booted normally, updated the system, then installed nvidia-current-updates.
Happy happy joy joy, at least for now.

Daniel Jose (danieldsj) wrote :

I saw similar symptoms when installing Ubuntu 16.04.1 LTS on an Asus x541u VivoBook Max system. During the installation, the logs would fill up with these errors and the installation would eventually fail for lack of disk space. I found the following thread helpful...
http://www.gossamer-threads.com/lists/linux/kernel/2250177

The workaround for me was to hold left SHIFT at boot, edit the grub menu entry and add the pcie_aspm=off kernel parameter, which suppresses the messages during the installation. Adding the same option to the grub configuration after installing was the long-term workaround for every subsequent boot.
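As an illustration of the long-term workaround, the sketch below appends a parameter to the default kernel command line in a GRUB configuration file. It rests on assumptions not stated in the comment above: that the system uses GRUB 2 with a `GRUB_CMDLINE_LINUX_DEFAULT="..."` line in /etc/default/grub, and that `sudo update-grub` is run afterwards to regenerate the boot menu. The function name is hypothetical and the code only transforms text; it does not touch any file.

```python
import re

# Assumed layout: a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line.
CMDLINE_RE = re.compile(
    r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)


def add_kernel_param(grub_text, param):
    """Append `param` to GRUB_CMDLINE_LINUX_DEFAULT unless already present."""
    def repl(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    return CMDLINE_RE.sub(repl, grub_text, count=1)


sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_kernel_param(sample, "pcie_aspm=off"))
```

Applying the function twice leaves the text unchanged after the first call, so it is safe to rerun. The same approach would work for pci=nommconf, mentioned earlier in this report.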

Bjorn Helgaas (bjorn-helgaas) wrote :

Is this still an issue? If so, can somebody add a complete dmesg log and "sudo lspci -vv" output from a current kernel?

Bjorn Helgaas (bjorn-helgaas) wrote :

Also, if you do see this issue, some have reported that "pcie_aspm=off" avoids the errors. If that's the case for you, see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2043665/comments/6 and help me investigate it!
