GPU lockup (ESR: 0x00000010 PGTBL_ER: 0x00000010) - i915 unused but triggers apport on multi-gpu system unless blacklisted

Bug #752940 reported by William Shotts
This bug affects 10 people
Affects                     Status        Importance  Assigned to  Milestone
module-init-tools (Ubuntu)  Invalid       Undecided   Unassigned
xdiagnose (Ubuntu)          Fix Released  Wishlist    Unassigned

Bug Description

Binary package hint: xserver-xorg-video-intel

I don't know why this is a problem. My system has an on-board Intel graphics system but I'm using a PCI Nvidia card (GeForce 6200) and Nvidia driver. This problem appears shortly after logging in every time.

ProblemType: Crash
DistroRelease: Ubuntu 11.04
Package: xserver-xorg-video-intel 2:2.14.0-4ubuntu6
ProcVersionSignature: Ubuntu 2.6.38-8.41-generic 2.6.38.2
Uname: Linux 2.6.38-8-generic i686
NonfreeKernelModules: nvidia
.proc.driver.nvidia.gpus.0: Error: [Errno 21] Is a directory: '/proc/driver/nvidia/gpus/0'
.proc.driver.nvidia.registry: Binary: ""
.proc.driver.nvidia.version:
 NVRM version: NVIDIA UNIX x86 Kernel Module 270.30 Fri Feb 25 14:34:41 PST 2011
 GCC version: gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu1)
Architecture: i386
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: None
DRM.card0.VGA.1:
 status: disconnected
 enabled: disabled
 dpms: On
 modes:
 edid-base64:
Date: Wed Apr 6 16:09:44 2011
DistUpgraded: Fresh install
DistroCodename: natty
DistroVariant: ubuntu
DkmsStatus:
 nvidia-current, 270.30, 2.6.38-7-generic, i686: installed
 nvidia-current, 270.30, 2.6.38-8-generic, i686: installed
DuplicateSignature: (ESR: 0x00000010 PGTBL_ER: 0x00000010)
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
GraphicsCard:
 Subsystem: Dell Device [1028:0160]
 nVidia Corporation NV44A [GeForce 6200] [10de:0221] (rev a1) (prog-if 00 [VGA controller])
   Subsystem: eVga.com. Corp. Device [3842:b399]
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Beta 1 i386 (20110329.1)
InterpreterPath: /usr/bin/python2.7
JockeyStatus: xorg:nvidia_current - NVIDIA accelerated graphics driver (Proprietary, Enabled, In use)
Lsusb:
 Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Dell Computer Corporation Dimension 2400
ProcCmdline: /usr/bin/python /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.38-8-generic root=UUID=b763f8ed-c48a-4cfe-824b-4014ed459a52 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 xserver-xorg 1:7.6+4ubuntu3
 libdrm2 2.4.23-1ubuntu6
 xserver-xorg-video-intel 2:2.14.0-4ubuntu6
Renderer: Unknown
SourcePackage: xserver-xorg-video-intel
Title: GPU lockup (ESR: 0x00000010 PGTBL_ER: 0x00000010)
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

XorgConf:
 Section "Device"
  Identifier "Default Device"
  Option "NoLogo" "True"
 EndSection
dmi.bios.date: 12/02/2003
dmi.bios.vendor: Dell Computer Corporation
dmi.bios.version: A05
dmi.board.name: 0F5949
dmi.board.vendor: Dell Computer Corp.
dmi.board.version: A01
dmi.chassis.type: 15
dmi.chassis.vendor: Dell Computer Corporation
dmi.modalias: dmi:bvnDellComputerCorporation:bvrA05:bd12/02/2003:svnDellComputerCorporation:pnDimension2400:pvr:rvnDellComputerCorp.:rn0F5949:rvrA01:cvnDellComputerCorporation:ct15:cvr:
dmi.product.name: Dimension 2400
dmi.sys.vendor: Dell Computer Corporation
version.compiz: compiz 1:0.9.4git20110322-0ubuntu5
version.libdrm2: libdrm2 2.4.23-1ubuntu6
version.libgl1-mesa-dri: libgl1-mesa-dri 7.10.1-0ubuntu3
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental 7.10.1-0ubuntu3
version.libgl1-mesa-glx: libgl1-mesa-glx 7.10.1-0ubuntu3
version.nvidia-graphics-drivers: nvidia-graphics-drivers N/A
version.xserver-xorg: xserver-xorg 1:7.6+4ubuntu3
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.14.0-0ubuntu4
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.14.0-4ubuntu6
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110107+b795ca6e-0ubuntu6

William Shotts (bshotts) wrote :
tags: removed: need-duplicate-check
Bryce Harrington (bryce) wrote :

This type of bug (with ESR: 0x00000010) typically occurs when the intel kernel driver is present while another kernel driver (in this case nvidia) is loaded. But I need to know more about how your system is set up to understand this, as your description is a bit vague...

First tell me about how you have your system set up - why do you have both intel and nvidia loaded? If you're using nvidia why did you not shut off intel?

Did you experience a gpu lockup (where the graphics stop updating, only the mouse moves) or did you get the error dialog popup but without any symptoms or issues?

Did you upgrade recently? If so, when you ran maverick on this same configuration, were you experiencing this same bug?

Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Incomplete
William Shotts (bshotts) wrote :

Hi Bryce,

This is a fresh install of 11.04 (Beta 1, daily image from yesterday, and Xubuntu Beta 2 installed a few minutes ago). The problem happens every time. My system is a Dell Dimension 2400 with onboard Intel graphics set to "auto" in BIOS (the only other selection is "enable") and a PCI bus Nvidia card (the monitor is attached to the Nvidia card). I was curious too why the installer seems to configure the system to load the Intel driver when the Intel graphics are not in use.

I get the error dialog without any overt symptoms other than the dialog repeating each time I attempt to report the problem. This problem is new (it did not happen in 10.04, 10.10, or Debian 6).

As a side note, when I ran Ubuntu Beta 1 from the live media, Unity worked, and it continued to work with both the experimental driver and the proprietary Nvidia driver after the initial installation, but it stopped working after the first set of updates. I assumed that my card had been blacklisted.

William Shotts (bshotts) wrote :

Subsequent investigation reveals that if I add the i915 module to the /etc/modprobe.d/blacklist.conf file and reboot the problem goes away. Checking the Debian 6 partition on the same machine shows that Debian does not attempt to load both modules, so the question is why does Ubuntu?
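
For anyone checking their own system, here is a minimal sketch of testing for the blacklist entry described above. It assumes the usual modprobe.d syntax (one `blacklist <module>` entry per line, lines starting with `#` ignored, `-` and `_` interchangeable in module names). The function name `is_blacklisted` and the sample text are hypothetical, and the function works on text passed in rather than reading /etc/modprobe.d/blacklist.conf itself.

```python
def is_blacklisted(conf_text, module):
    """Return True if conf_text has an active 'blacklist <module>' line."""
    # modprobe makes no distinction between '-' and '_' in module names.
    wanted = module.replace("-", "_")
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if (len(parts) == 2 and parts[0] == "blacklist"
                and parts[1].replace("-", "_") == wanted):
            return True
    return False


# Hypothetical file contents, for illustration only.
sample = """# framebuffer drivers
blacklist vesafb
# blacklist nouveau
blacklist i915
"""

if __name__ == "__main__":
    print(is_blacklisted(sample, "i915"))
```

To check a real system, the contents of the blacklist file could be read and passed in as `conf_text`.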

Bryce Harrington (bryce) wrote :

To answer your last question first, the reason Ubuntu tries to load both modules is that it has a different kernel module loading process, which was introduced to support both fastboot/upstart and flicker-free graphical boot/plymouth. But there are (obviously) some bugs in the module loading logic when multiple video cards are present, such as in your case.

Specifically, the presence of multiple kernel graphics drivers trips up the i915 kernel module and causes a GPU event to be triggered; however, the driver is able to realize there isn't actually a serious problem, so it just resets the GPU and continues. That triggering is nevertheless enough to fire up apport to prompt you to report a bug.

So, aside from the kernel module loading logic being a bit flimsy in this case, there isn't anything specifically broken that you need to worry about. The logic should perhaps be cleaned up better (we already have bugs filed to this end).

Bryce Harrington (bryce) wrote :

The apport prompts will stop being displayed once we release, and apport is turned back off by default. That will make this issue appear to go away, at least until oneiric development gets under way.

I'm filing a wishlist bug against xdiagnose to include logic in the apport hook (once they've been moved into the xdiagnose package) to check for this particular failure mode and avoid filing bug reports (or to file the bug against module-init-tools).
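
A rough sketch of what such a check might look like. This is not the actual xdiagnose hook: the names `is_spurious_lockup` and `loaded_modules` are hypothetical, the test is limited to the ESR value and the i915-plus-nvidia combination seen in this report, and the module list is assumed to be /proc/modules-style text (module name in the first column).

```python
import re

# Matches the signature format used in this report's title.
SIGNATURE = re.compile(r"ESR: (0x[0-9a-fA-F]+) PGTBL_ER: (0x[0-9a-fA-F]+)")


def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text."""
    return {line.split()[0]
            for line in proc_modules_text.splitlines() if line.strip()}


def is_spurious_lockup(title, proc_modules_text, other_drivers=("nvidia",)):
    """Guess whether a 'GPU lockup' report is the multi-driver false alarm.

    Assumption: the false alarm shows ESR 0x10 while i915 is loaded
    alongside another graphics kernel driver.
    """
    match = SIGNATURE.search(title)
    if match is None or int(match.group(1), 16) != 0x10:
        return False
    modules = loaded_modules(proc_modules_text)
    return "i915" in modules and any(d in modules for d in other_drivers)
```

A hook using something like this could skip the report, or refile it against module-init-tools, whenever the function returns True.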

Changed in xdiagnose (Ubuntu):
importance: Undecided → Wishlist
status: New → Triaged
summary: - GPU lockup (ESR: 0x00000010 PGTBL_ER: 0x00000010)
+ GPU lockup (ESR: 0x00000010 PGTBL_ER: 0x00000010) - i915 unused but
+ triggers apport on multi-gpu system unless blacklisted
Bryce Harrington (bryce) wrote :

I'm moving this bug to module-init-tools since the issue is in how the kernel modules are being loaded prior to X coming up.

I'll also subscribe apw since he's tinkered with the code previously.

affects: xserver-xorg-video-intel (Ubuntu) → module-init-tools (Ubuntu)
Changed in module-init-tools (Ubuntu):
status: Incomplete → New
Bryce Harrington (bryce) wrote :

Bug #767475 shows a single-gpu case where i915 is just conflicting with (I guess) vesafb.

Changed in module-init-tools (Ubuntu):
status: New → Confirmed
Bryce Harrington (bryce) wrote :

In Precise we're noticing that these fake gpu hangs don't occur like they did in oneiric. I suspect those fixes address this case as well. Reopen if you can reproduce this under precise.

Changed in xdiagnose (Ubuntu):
status: Triaged → Fix Released
dino99 (9d9) wrote :

This release reached end of life a long time ago and is no longer supported.

Changed in module-init-tools (Ubuntu):
status: Confirmed → Invalid