[gen4 mesa] GPU hang

Bug #1182954 reported by Scott Moser
This bug affects 2 people
Affects: xserver-xorg-video-intel (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

I upgraded to saucy yesterday and rebooted, and within 30 minutes of use X hung on me.
It looked like it might be a regression of bug 1097315, which occurred during raring, so I booted into a raring kernel. This bug report is being filed from the raring kernel (3.8.0-21-generic).

ProblemType: Bug
DistroRelease: Ubuntu 13.10
Package: xserver-xorg-core 2:1.13.3-0ubuntu9
ProcVersionSignature: Ubuntu 3.8.0-21.32-generic 3.8.8
Uname: Linux 3.8.0-21-generic x86_64
ApportVersion: 2.10.1-0ubuntu1
Architecture: amd64
CompizPlugins: [core,bailer,detection,composite,opengl,compiztoolbox,decor,snap,commands,mousepoll,grid,move,place,imgpng,session,vpswitch,resize,regex,gnomecompat,unitymtgrabhandles,wall,resizeinfo,animation,workarounds,fade,scale,expo,ezoom,unityshell]
Date: Wed May 22 11:51:46 2013
DistUpgraded: 2013-05-20 08:14:41,880 DEBUG openCache()
DistroCodename: saucy
DistroVariant: ubuntu
EcryptfsInUse: Yes
ExtraDebuggingInterest: Yes
GraphicsCard:
 Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a42] (rev 07) (prog-if 00 [VGA controller])
   Subsystem: Lenovo Device [17aa:20e4]
InstallationDate: Installed on 2011-10-19 (581 days ago)
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
MachineType: LENOVO 7417CTO
MarkForUpload: True
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.8.0-21-generic root=UUID=f9832678-e9fb-41c5-8edb-5edd5200ed0a ro quiet splash vt.handoff=7
SourcePackage: xorg-server
UpgradeStatus: Upgraded to saucy on 2013-05-20 (2 days ago)
dmi.bios.date: 12/06/2010
dmi.bios.vendor: LENOVO
dmi.bios.version: 7UET91WW (3.21 )
dmi.board.name: 7417CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7UET91WW(3.21):bd12/06/2010:svnLENOVO:pn7417CTO:pvrThinkPadT400:rvnLENOVO:rn7417CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 7417CTO
dmi.product.version: ThinkPad T400
dmi.sys.vendor: LENOVO
version.compiz: compiz 1:0.9.9~daily13.04.18.1~13.04-0ubuntu1
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.45-1
version.libgl1-mesa-dri: libgl1-mesa-dri 9.1.1-0ubuntu3
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 9.1.1-0ubuntu3
version.xserver-xorg-core: xserver-xorg-core 2:1.13.3-0ubuntu9
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.7.3-0ubuntu2b2
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:7.1.0-0ubuntu2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.21.6-0ubuntu4
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.7-0ubuntu1
xserver.bootTime: Mon May 20 10:49:10 2013
xserver.configfile: default
xserver.logfile: /var/log/Xorg.0.log
xserver.version: 2:1.13.3-0ubuntu9
xserver.video_driver: intel

Scott Moser (smoser) wrote :

These files were collected after dmesg indicated there might be more info in 'i915_error_state', with messages like:
  [176339.808126] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
  [176339.808141] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state

So I copied both of the following and attached them here:
  sudo cp /sys/kernel/debug/dri/0/i915_error_state 0.i915_error_state
  sudo cp /sys/kernel/debug/dri/64/i915_error_state 64.i915_error_state

I also collected dmesg again, as it had more info now than it did when 'ubuntu-bug'
collected it.
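
For anyone gathering the same files, here is a minimal sketch that generalizes the two cp commands above: it copies every i915_error_state found under the debugfs DRI directory and names each copy after its DRI minor number. The function name collect_i915_state and its two optional arguments are assumptions made for illustration; only the /sys/kernel/debug/dri/<minor>/i915_error_state paths come from this report.

```shell
# collect_i915_state [DRI_ROOT] [OUT_DIR]
# Hypothetical helper (not part of any package): copy each
# DRI_ROOT/<minor>/i915_error_state to OUT_DIR/<minor>.i915_error_state
# and print how many files were copied.
collect_i915_state() {
    local dri_root="${1:-/sys/kernel/debug/dri}"
    local out_dir="${2:-.}"
    local state minor found=0

    mkdir -p "$out_dir" || return 1
    for state in "$dri_root"/*/i915_error_state; do
        # If the glob matched nothing, the literal pattern is left behind; skip it.
        [ -e "$state" ] || continue
        minor="$(basename "$(dirname "$state")")"
        cp "$state" "$out_dir/$minor.i915_error_state" || return 1
        found=$((found + 1))
    done
    echo "$found"
}
```

debugfs is normally readable only by root, so this would be run from a root shell, for example as collect_i915_state /sys/kernel/debug/dri ./hang-logs, followed by dmesg > ./hang-logs/dmesg.txt to capture the kernel log at the same time.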

Scott Moser (smoser) wrote :

It just hung again on 3.9.0-2-generic (after maybe 2 minutes of uptime).
This time the mouse cursor moves, but focus doesn't change and keyboard input is ignored.
The screen will dim, then mouse movement brings it back to full brightness.

Nothing showed up in dmesg, even after waiting well over 120 seconds.

Timo Aaltonen (tjaalton)
affects: xorg-server (Ubuntu) → xserver-xorg-video-intel (Ubuntu)
Chris Wilson (ickle)
summary: - xorg hang on i915
+ [gen4 mesa] GPU hang
Scott Moser (smoser) wrote :

For searchability / ease, here is the dmesg content after failure:
[176880.872168] INFO: task Xorg:2332 blocked for more than 120 seconds.
[176880.872178] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[176880.872184] Xorg D ffff88013bc13f40 0 2332 2314 0x00400004
[176880.872194] ffff880120f2dc60 0000000000000082 ffff8801202dae80 ffff880120f2dfd8
[176880.872204] ffff880120f2dfd8 ffff880120f2dfd8 ffffffff81c15440 ffff8801202dae80
[176880.872212] ffff88012fb2d000 ffff88012f626268 ffff88012f221800 0000000000000000
[176880.872220] Call Trace:
[176880.872239] [<ffffffff816ca029>] schedule+0x29/0x70
[176880.872322] [<ffffffffa01149f5>] intel_crtc_wait_for_pending_flips+0x75/0xd0 [i915]
[176880.872333] [<ffffffff8107dc40>] ? finish_wait+0x80/0x80
[176880.872377] [<ffffffffa0117455>] i9xx_crtc_disable+0x65/0x1c0 [i915]
[176880.872422] [<ffffffffa011c9ef>] intel_crtc_update_dpms+0x6f/0xa0 [i915]
[176880.872467] [<ffffffffa011caca>] intel_encoder_dpms+0x1a/0x30 [i915]
[176880.872511] [<ffffffffa011efa8>] intel_connector_dpms+0x38/0x70 [i915]
[176880.872558] [<ffffffffa0015990>] drm_mode_obj_set_property_ioctl+0x330/0x340 [drm]
[176880.872595] [<ffffffffa00159d0>] drm_mode_connector_property_set_ioctl+0x30/0x40 [drm]
[176880.872628] [<ffffffffa0004559>] drm_ioctl+0x4e9/0x5b0 [drm]
[176880.872668] [<ffffffffa00159a0>] ? drm_mode_obj_set_property_ioctl+0x340/0x340 [drm]
[176880.872679] [<ffffffff8114a1bd>] ? kzfree+0x2d/0x30
[176880.872689] [<ffffffff8117c88d>] ? kfree+0xdd/0x110
[176880.872699] [<ffffffff81085c1a>] ? lg_local_unlock+0x1a/0x20
[176880.872707] [<ffffffff811b2e19>] ? mntput_no_expire+0x49/0x160
[176880.872716] [<ffffffff811a5969>] do_vfs_ioctl+0x99/0x570
[176880.872725] [<ffffffff811956be>] ? ____fput+0xe/0x10
[176880.872734] [<ffffffff8107a0dc>] ? task_work_run+0xac/0xe0
[176880.872741] [<ffffffff811a5ed1>] sys_ioctl+0x91/0xb0
[176880.872751] [<ffffffff816d37dd>] system_call_fastpath+0x1a/0x1f
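
To find entries like this one in a longer log, a small filter can pull out just the hung-task warnings. This is a sketch only: list_hung_tasks is a hypothetical helper name, and the pattern assumes the "INFO: task NAME:PID blocked for more than N seconds." wording shown in the dmesg output above.

```shell
# list_hung_tasks: hypothetical helper. Reads dmesg-style text on stdin and
# prints "NAME PID SECONDS" for every hung-task warning it finds.
list_hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p'
}
```

Piping the log through it (dmesg | list_hung_tasks) would print one line per warning; for the trace above that line is "Xorg 2332 120".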

Scott Moser (smoser) wrote :

Just a comment: I reproduced this just now with:

$ dpkg -S /boot/vmlinuz-3.9.0-3-generic
linux-image-3.9.0-3-generic: /boot/vmlinuz-3.9.0-3-generic
$ dpkg-query --show linux-image-3.9.0-3-generic
linux-image-3.9.0-3-generic 3.9.0-3.8

$ dpkg-query --show "*xorg*intel"
xserver-xorg-video-intel 2:2.21.6-0ubuntu4

Chris Wilson (ickle) wrote :

The fixes for the kernel deadlock should be in 3.10; the root cause of the primary bug is in mesa.

Scott Moser (smoser) wrote :

Chris, thanks.
  Just for reference, I just reproduced this with the following kernel and xserver versions:

$ dpkg-query --show linux-image-3.9.0-6-generic xserver-xorg-video-intel
linux-image-3.9.0-6-generic 3.9.0-6.13
xserver-xorg-video-intel 2:2.21.9-0ubuntu2

Booting the old kernel (3.8.0-21-generic) doesn't seem to show the problem.

Scott Moser (smoser) wrote :

The following appears to have fixed the issue. I've been running it for about 6 hours now.

$ uname -r
3.10.0-2-generic

$ dpkg-query --show linux-image-$(uname -r) xserver-xorg-video-intel
linux-image-3.10.0-2-generic 3.10.0-2.10
xserver-xorg-video-intel 2:2.21.9-0ubuntu2
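
A quick way to check whether a given machine is already on a kernel at or past 3.10, where the deadlock fixes were expected to land, is a version comparison like the sketch below. kernel_at_least is a hypothetical helper name, and it relies on the version ordering of GNU sort -V. It says nothing about the mesa side of the bug.

```shell
# kernel_at_least HAVE WANT
# Hypothetical helper: succeeds (exit 0) when version HAVE sorts at or after
# version WANT under GNU "sort -V" ordering, and fails otherwise.
kernel_at_least() {
    local have="$1" want="$2"
    # The smaller of the two versions sorts first; if that is WANT,
    # then HAVE is at least WANT.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}
```

For example, kernel_at_least "$(uname -r)" 3.10 && echo "kernel is 3.10 or newer" would print the message on the 3.10.0-2-generic kernel above, but not on 3.9.0-6-generic or 3.8.0-21-generic.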

Chris Wilson (ickle) wrote :

A new version of mesa is available for saucy. Can you please retest?

Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for xserver-xorg-video-intel (Ubuntu) because there has been no activity for 60 days.]

Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → Expired