8086:2a02 [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung

Bug #946899 reported by Matt Zimmerman on 2012-03-05
This bug affects 135 people
Affects            Status       Importance   Assigned to   Milestone
Linux              Incomplete   High
Linux Mint         Incomplete   Undecided    Unassigned
xf86-video-intel   Confirmed    Medium
linux (Fedora)     Won't Fix    Undecided
linux (Ubuntu)                  High         Unassigned

Bug Description

Since upgrading to 12.04 beta, I've seen this happen twice. The symptoms are:

- The screen freezes
- The backlight turns off

At that point I have to reboot to get my display back. In syslog, I see:

Mar 4 23:09:18 perseus kernel: [ 3751.612064] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 4 23:09:18 perseus kernel: [ 3751.612076] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
Mar 4 23:09:18 perseus kernel: [ 3751.613658] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 114040 at 114004, next 114118)
Mar 4 23:09:18 perseus kernel: [ 3751.637049] [drm:init_ring_common] *ERROR* render ring initialization failed ctl 00000000 head 00000000 tail 00000000 start 00000000

followed by a lot of things like:

Mar 4 23:09:18 perseus kernel: [ 3751.700662] WARNING: at /build/buildd/linux-3.2.0/drivers/gpu/drm/i915/intel_display.c:793 intel_enable_pipe+0x14a/0x150 [i915]()
Mar 4 23:09:18 perseus kernel: [ 3751.700667] Hardware name: 6465CTO
Mar 4 23:09:18 perseus kernel: [ 3751.700670] PLL state assertion failure (expected on, current off)

Mar 4 23:09:18 perseus kernel: [ 3751.812158] WARNING: at /build/buildd/linux-3.2.0/drivers/gpu/drm/i915/intel_display.c:930 assert_pipe+0x75/0x80 [i915]()
Mar 4 23:09:18 perseus kernel: [ 3751.812165] Hardware name: 6465CTO
Mar 4 23:09:18 perseus kernel: [ 3751.812170] pipe B assertion failure (expected on, current off)

then:

Mar 4 23:09:19 perseus kernel: [ 3753.044086] [drm:intel_lvds_enable] *ERROR* timed out waiting for panel to power on
Mar 4 23:09:20 perseus kernel: [ 3753.671451] [drm:i915_wait_request] *ERROR* something (likely vbetool) disabled interrupts, re-enabling
Mar 4 23:09:20 perseus kernel: [ 3753.684603] [drm:i915_wait_request] *ERROR* something (likely vbetool) disabled interrupts, re-enabling
Mar 4 23:09:20 perseus kernel: [ 3753.704594] [drm:i915_wait_request] *ERROR* something (likely vbetool) disabled interrupts, re-enabling
Mar 4 23:09:20 perseus kernel: [ 3753.724594] [drm:i915_wait_request] *ERROR* something (likely vbetool) disabled interrupts, re-enabling
Mar 4 23:09:20 perseus kernel: [ 3753.744595] [drm:i915_wait_request] *ERROR* something (likely vbetool) disabled interrupts, re-enabling
Mar 4 23:09:20 perseus kernel: [ 3753.764606] [drm:i915_wait_request] *ERROR* something (likely vbetool) disabled interrupts, re-enabling
Mar 4 23:09:20 perseus kernel: [ 3753.784607] [drm:i915_wait_request] *ERROR* something (likely vbetool) disabled interrupts, re-enabling
Mar 4 23:09:20 perseus kernel: [ 3753.804618] [drm:i915_wait_request] *ERROR* something (likely vbetool) disabled interrupts, re-enabling
Mar 4 23:09:20 perseus kernel: [ 3753.824606] [drm:i915_wait_request] *ERROR* something (likely vbetool) disabled interrupts, re-enabling
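
To see at a glance which driver functions are reporting errors in an excerpt like the one above, the log can be tallied per function. This is only a minimal sketch: `count_drm_errors` and `SAMPLE` are illustrative names, not part of any tool, and the pattern assumes the `[drm:function] *ERROR*` prefix format seen in this report.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpts in this report:
#   ... [drm:<function>] *ERROR* <message>
DRM_ERROR_RE = re.compile(r"\[drm:(?P<func>\w+)\] \*ERROR\*")

# A few lines copied from the syslog excerpt in the bug description.
SAMPLE = """\
Mar 4 23:09:18 perseus kernel: [ 3751.612064] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 4 23:09:18 perseus kernel: [ 3751.613658] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 114040 at 114004, next 114118)
Mar 4 23:09:18 perseus kernel: [ 3751.637049] [drm:init_ring_common] *ERROR* render ring initialization failed ctl 00000000 head 00000000 tail 00000000 start 00000000
Mar 4 23:09:20 perseus kernel: [ 3753.671451] [drm:i915_wait_request] *ERROR* something (likely vbetool) disabled interrupts, re-enabling
"""


def count_drm_errors(log_text):
    """Count '*ERROR*' lines per reporting drm/i915 function."""
    return Counter(m.group("func") for m in DRM_ERROR_RE.finditer(log_text))


if __name__ == "__main__":
    for func, n in count_drm_errors(SAMPLE).most_common():
        print(f"{n:4d}  {func}")
```

Feeding it the contents of a full syslog or `dmesg` dump instead of `SAMPLE` should show whether the hangcheck error is the first event or one of many follow-on errors.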

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-17-generic 3.2.0-17.27
ProcVersionSignature: Ubuntu 3.2.0-17.27-generic 3.2.6
Uname: Linux 3.2.0-17-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 1.94-0ubuntu1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: mdz 2458 F.... pulseaudio
 /dev/snd/controlC1: mdz 2458 F.... pulseaudio
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfe020000 irq 49'
   Mixer name : 'Analog Devices AD1984'
   Components : 'HDA:11d41984,17aa20bb,00100400'
   Controls : 31
   Simple ctrls : 19
Card1.Amixer.info:
 Card hw:1 'Q9000'/'Logitech, Inc. QuickCam Pro 9000 at usb-0000:00:1a.7-3, high speed'
   Mixer name : 'USB Mixer'
   Components : 'USB046d:0990'
   Controls : 2
   Simple ctrls : 1
Card1.Amixer.values:
 Simple mixer control 'Mic',0
   Capabilities: cvolume cvolume-joined cswitch cswitch-joined penum
   Capture channels: Mono
   Limits: Capture 0 - 3072
   Mono: Capture 0 [0%] [18.00dB] [off]
Card29.Amixer.info:
 Card hw:29 'ThinkPadEC'/'ThinkPad Console Audio Control at EC reg 0x30, fw 7KHT24WW-1.08'
   Mixer name : 'ThinkPad EC 7KHT24WW-1.08'
   Components : ''
   Controls : 1
   Simple ctrls : 1
Card29.Amixer.values:
 Simple mixer control 'Console',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Date: Sun Mar 4 23:12:05 2012
HibernationDevice: RESUME=UUID=bc555036-0252-42e8-804b-b34dc22bbcd4
MachineType: LENOVO 6465CTO
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcEnviron:
 TERM=xterm
 LC_COLLATE=C
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/zsh
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-17-generic root=UUID=305dde78-d20a-4248-aaf4-09447b7c5791 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-17-generic N/A
 linux-backports-modules-3.2.0-17-generic N/A
 linux-firmware 1.71
SourcePackage: linux
UpgradeStatus: Upgraded to precise on 2012-03-04 (0 days ago)
WpaSupplicantLog:

dmi.bios.date: 01/21/2008
dmi.bios.vendor: LENOVO
dmi.bios.version: 7LETB0WW (2.10 )
dmi.board.name: 6465CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7LETB0WW(2.10):bd01/21/2008:svnLENOVO:pn6465CTO:pvrThinkPadT61:rvnLENOVO:rn6465CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 6465CTO
dmi.product.version: ThinkPad T61
dmi.sys.vendor: LENOVO

Matt Zimmerman (mdz) wrote :

Thank you for taking the time to file a bug report on this issue.

However, given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We have noted that there is a newer version of the development kernel than the one you last tested when this issue was found. Please test again with the newer kernel and indicate in the bug if this issue still exists or not.

You can update to the latest development kernel by simply running the following commands in a terminal window:

    sudo apt-get update
    sudo apt-get upgrade

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

If you want this bot to quit automatically requesting kernel tests, add a tag named: bot-stop-nagging.

 Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: New → Confirmed
status: Confirmed → Incomplete
tags: added: kernel-request-3.2.0-18.28
Changed in linux (Ubuntu):
importance: Undecided → Medium

Hi Brad,

It's still happening, multiple times per day, with 3.2.0-18.28.

On Mon, Mar 05, 2012 at 07:31:01AM -0000, Brad Figg wrote:
> Thank you for taking the time to file a bug report on this issue.
>
> However, given the number of bugs that the Kernel Team receives during
> any development cycle it is impossible for us to review them all.
> Therefore, we occasionally resort to using automated bots to request
> further testing. This is such a request.
>
> We have noted that there is a newer version of the development kernel
> than the one you last tested when this issue was found. Please test
> again with the newer kernel and indicate in the bug if this issue still
> exists or not.
>
> You can update to the latest development kernel by simply running the
> following commands in a terminal window:
>
> sudo apt-get update
> sudo apt-get upgrade
>
> If the bug still exists, change the bug status from Incomplete to
> Confirmed. If the bug no longer exists, change the bug status from
> Incomplete to Fix Released.
>
> If you want this bot to quit automatically requesting kernel tests, add
> a tag named: bot-stop-nagging.
>
> Thank you for your help, we really do appreciate it.

--
 - mdz

tags: added: bot-stop-nagging
Changed in linux (Ubuntu):
status: Incomplete → Confirmed

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.3 kernel[1] (Not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag(Only that one tag, please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed by the mainline kernel, please add the following tag 'kernel-fixed-upstream-KERNEL-VERSION'. For example, if kernel version 3.3-rc6 fixed the issue, the tag would be: 'kernel-fixed-upstream-v3.3-rc6'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.3-rc6-precise/

tags: added: needs-upstream-testing
Phoenix (phoenix-dominion) wrote :

Sorry for being desperate to get rid of the problem. I installed the 3.3 kernel, but it didn't work; X still froze. Then I remembered that I had added the kernel option nolapic to be able to boot at all. New kernel, new luck: I removed that option from GRUB and so far it looks very good.

Umbrella Dish (floritiv) wrote :

I am experiencing this problem on Ubuntu 12.04 (linux-3.2.0-24-generic), though without the nolapic boot option. A mouse hang of several seconds is a sure sign that the error is about to happen: the TFT goes dark and recovers, goes dark and recovers, then graphics acceleration appears to be disabled, fonts are rendered with glitches, and I have to log out and back in. The problem seems to be gone in the new KDE session with desktop effects on ... until I boot again.

 My dmesg succeeding the error is:

[ 63.965111] [<ffffffff8106712f>] warn_slowpath_common+0x7f/0xc0
[ 63.965112] [<ffffffff8106718a>] warn_slowpath_null+0x1a/0x20
[ 63.965116] [<ffffffffa00779e4>] __gen6_gt_wait_for_fifo+0x94/0xa0 [i915]
[ 63.965121] [<ffffffffa0078065>] i915_write32+0xe5/0xf0 [i915]
[ 63.965126] [<ffffffffa00b4752>] gen6_ring_put_irq+0xa2/0xc0 [i915]
[ 63.965131] [<ffffffffa00b47c8>] gen6_render_ring_put_irq+0x18/0x20 [i915]
[ 63.965136] [<ffffffffa0089ba7>] i915_wait_request+0x1b7/0x560 [i915]
[ 63.965138] [<ffffffff8108aec0>] ? add_wait_queue+0x60/0x60
[ 63.965143] [<ffffffffa0089f82>] i915_gem_object_wait_rendering+0x32/0x40 [i915]
[ 63.965148] [<ffffffffa008ed3d>] i915_gem_execbuffer_sync_rings+0xdd/0x160 [i915]
[ 63.965153] [<ffffffffa008ef2e>] i915_gem_execbuffer_move_to_gpu+0x16e/0x200 [i915]
[ 63.965157] [<ffffffffa008f65b>] i915_gem_do_execbuffer.isra.8+0x69b/0x940 [i915]
[ 63.965163] [<ffffffffa009f349>] ? intel_mark_busy+0xd9/0x110 [i915]
[ 63.965168] [<ffffffffa008fdc3>] i915_gem_execbuffer2+0xa3/0x270 [i915]
[ 63.965172] [<ffffffffa00165d4>] drm_ioctl+0x444/0x510 [drm]
[ 63.965177] [<ffffffffa008fd20>] ? i915_gem_execbuffer+0x420/0x420 [i915]
[ 63.965179] [<ffffffff8101dbd4>] ? restore_user_xstate+0x54/0xa0
[ 63.965181] [<ffffffff81189cfa>] do_vfs_ioctl+0x8a/0x340
[ 63.965183] [<ffffffff8118a041>] sys_ioctl+0x91/0xa0
[ 63.965185] [<ffffffff81664a82>] system_call_fastpath+0x16/0x1b
[ 63.965186] ---[ end trace 20690ee302d12d8d ]---
[ 68.296329] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 68.297855] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 68.311854] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 11846 at 11845, next 11847)
[ 68.315568] [drm:init_ring_common] *ERROR* render ring initialization failed ctl 00000000 head 00000000 tail 00000000 start 00000000
[ 68.319138] [drm:init_ring_common] *ERROR* gen6 bsd ring initialization failed ctl 00000000 head 00000000 tail 00000000 start 00000000
[ 68.322179] [drm:init_ring_common] *ERROR* blt ring initialization failed ctl 00000000 head 00000000 tail 00000000 start 00000000
[ 70.159916] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 70.161991] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 11847 at 11844, next 11848)
[ 70.162096] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
[ 70.162100] [drm:i915_reset] *ERROR* Failed to reset chip.

jrierab (jrierab) wrote :

I am suffering from this bug and similar ones related to what seems to be a regression in the kernel.

I have 3 similar systems with the following configurations:

1- Work PC, with Precise updated from Oneiric since Beta 2, i5-2400. Does not seem to be affected by the bug.

2- Main home PC, with Precise updated from Oneiric since Beta 2, i5-2500K. The bug and complete system hangs occur so often that it is nearly unusable. This also happens with the latest mainline kernel 3.4-rc4-precise (http://kernel.ubuntu.com/~kernel-ppa/mainline/). However, if I boot with kernel 3.0.17 from Oneiric, the bug does not appear. I have been working with it for nearly two days without a single hang. So, this may be a workaround.

3- Same home PC, fresh Ubuntu Precise distribution installed in a clean partition, same i5-2500K. I use it as a test platform. The bug has occurred there, and the attached dmesg belongs to this clean system. Nothing beyond the default install is present, save the Precise updates.

The bug produces several 2-3 second black screens, each followed by 10-20 seconds of normal operation, normally ending in a complete desktop hang which requires a full reset. Sometimes the hang does not occur, but all window decorations and the Unity bar disappear (as if the window manager were dead).

The bug appears randomly, but more often when switching between workspaces and with Firefox open.

jrierab (jrierab) wrote :

Just after filing the comment above, the window decorations disappeared. As I had a terminal open, here is the dmesg from just after.

jrierab (jrierab) wrote :

And a screen capture...

jrierab (jrierab) on 2012-05-01
tags: removed: kernel-request-3.2.0-18.28 needs-upstream-testing
Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report at bugzilla.kernel.org [1]? That will allow the upstream developers to examine the issue, and may provide a quicker resolution to the bug.

If you are comfortable with opening a bug upstream, it would be great if you could report back the upstream bug number in this bug report. That will allow us to link this bug to the upstream report.

[1] https://wiki.ubuntu.com/Bugs/Upstream/kernel

tags: added: kernel-bug-exists-upstream

Just an update on the behavior of this bug.

About a week ago, there was an update installed by Update Manager that resolved (for me) the issue of Unity crashing. The problem with the screen resolution spontaneously changing was still present, but at least it was usable.

Yesterday, Update Manager installed a new set of updates, and the regression is back. It's gone back to nearly unusable.

I'm going to have to break down and reinstall 11.10 and hope that the video issues with 12.04 get resolved pretty soon.

While running Firefox makes the breakage happen extremely quickly, in the last day I have seen it happen even with nothing but Thunderbird running. In this scenario, the Unity decorations and launcher bar went away, AND all menus became empty. A menu would still draw a box, but the box was empty.

jrierab (jrierab) wrote :

As suggested, I reported the bug upstream. It is still present in kernel v3.4-rc7-precise and also in the latest drm-intel-experimental (http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-experimental/2012-05-14-precise/).

You can follow it from https://bugzilla.kernel.org/show_bug.cgi?id=43267

Sorry for the delay, but I have been quite busy recently and did not have the time to test the latest kernel versions until today (and I didn't want to report without being sure the bug was still present upstream).

Changed in linux:
importance: Unknown → High
status: Unknown → Incomplete
Changed in linux (Ubuntu):
status: Confirmed → Triaged
Anton Anikin (anton-anikin) wrote :

Same problem with Ubuntu 12.10 and 3.4 kernel

jrierab (jrierab) wrote :

Good news at last !!! It seems that the source of the problem has been identified upstream.

Until it is fully solved, however, you can try a workaround. Just edit the config file (1st option for burg, 2nd for grub):

$ gksudo gedit /etc/default/burg &
$ gksudo gedit /etc/default/grub &

and add the "i915.i915_enable_rc6=0" option to GRUB_CMDLINE_LINUX_DEFAULT. It should look something like:

GRUB_CMDLINE_LINUX_DEFAULT="i915.i915_enable_rc6=0 quiet splash"

Then, update boot files with (1st option for burg, 2nd for grub):

$ sudo update-burg
$ sudo update-grub

That's it. Reboot and enjoy your new kernels!
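
For anyone who would rather script the edit than open an editor, the change to GRUB_CMDLINE_LINUX_DEFAULT could be sketched as below. This is an illustration only: `add_kernel_param` is a made-up helper name, it works on the file contents as a string, and it assumes the variable is written on a single double-quoted line as in the example above. Writing the result back to /etc/default/grub and running update-grub are still up to you.

```python
import re

# Assumes one line of the form: GRUB_CMDLINE_LINUX_DEFAULT="..."
CMDLINE_RE = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)


def add_kernel_param(config_text, param="i915.i915_enable_rc6=0"):
    """Return config_text with param prepended to the default kernel
    command line, unless it is already present."""

    def _patch(match):
        options = match.group(2).split()
        if param not in options:
            options.insert(0, param)
        return match.group(1) + " ".join(options) + match.group(3)

    return CMDLINE_RE.sub(_patch, config_text)


if __name__ == "__main__":
    before = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    print(add_kernel_param(before), end="")
```

The helper is idempotent, so running it twice does not add the option a second time. After rebooting, the option should show up in the "Kernel command line:" entry at the top of the kernel log.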

Changed in linux:
status: Incomplete → In Progress
Simon Kingsley (scjk) wrote :

This is still an issue for me. Very annoying.

I have seven new network monitoring machines just built with 12.04 LTS that are impacted by this bug. A solution would be great.

versus167 (wingdvd-2008) wrote :

This issue affects me too. :(

+1; also hit this, on a fresh 12.04.1 install. Will try workaround given above.

Changed in linux:
status: In Progress → Incomplete
Changed in linux:
status: Incomplete → Fix Released
Changed in linux:
status: Fix Released → Confirmed
Changed in linux:
status: Confirmed → Incomplete
Pieter (diepes) wrote :

On the Intel open source site:

https://01.org/linuxgraphics/documentation/how-get-last-batch-buffer-gpu-hang

They suggest capturing the output of /sys/kernel/debug/dri/0/i915_error_state
cat /sys/kernel/debug/dri/0/i915_error_state > i915_error_state

I attached it here in the hope that it could help resolve the problem.
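
Since several commenters in this thread found only "no error state collected" in that file, it may help to check for that before attaching anything. A rough sketch, assuming debugfs is mounted at /sys/kernel/debug and the script runs as root; `has_error_state` and `save_error_state` are hypothetical helper names:

```python
from pathlib import Path

# Path used by commenters in this thread; the kernel message itself
# mentions /debug/dri/0, so adjust to wherever debugfs is mounted.
ERROR_STATE = Path("/sys/kernel/debug/dri/0/i915_error_state")
EMPTY_MARKER = "no error state collected"


def has_error_state(text):
    """True if the file contents hold a captured error state."""
    return text.strip() != EMPTY_MARKER


def save_error_state(dest, source=ERROR_STATE):
    """Copy the error state to dest; return False if none was captured."""
    text = Path(source).read_text(errors="replace")
    if not has_error_state(text):
        return False
    Path(dest).write_text(text)
    return True
```

Calling `save_error_state("i915_error_state")` right after a hang, before rebooting, gives the same file as the `cat` redirect above, but skips the copy when there is nothing useful to attach.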

Fabio Marconi (fabiomarconi) wrote :

reproduced something similar on 3.8.0-3, see bug 1112871
---
Ubuntu Bug Squad volunteer triager
http://wiki.ubuntu.com/BugSquad

tags: added: raring
Pieter (diepes) wrote :

$ uname -a
Linux t420 3.8.0-030800-generic #201302181935 SMP Tue Feb 19 00:36:19 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

from kernel ppa
Still hangs

Feb 20 11:10:37 t420 kernel: [ 9795.543780] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung

Attached
$ sudo cat /sys/kernel/debug/dri/0/i915_error_state > i915_error_state-20130220

Robert Hughes (robert-4) wrote :

I read somewhere that this bug was fixed, so I checked the CHANGES in the new kernel builds. Chris Wilson and Daniel Vetter, among others, are working on a lot of fixes for this driver, and I was hoping it was fixed somewhere among all those contributions. Upgrading reduced the *rate* of the error, but it came back today. When working with Blender or Eclipse, programs that use a lot of memory, the system suddenly freezes. The mouse cursor is still movable, but the rest is frozen. My system details and the logs related to the drm:i915_hangcheck_hung ERROR are printed below.

uname -a:
Linux robtu 3.8.0-030800rc7-generic #201302081635 SMP Fri Feb 8 21:57:43 UTC 2013 i686 i686 i686 GNU/Linux

syslog:
Feb 21 23:34:41 robtu kernel: [233139.352461] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung

kern.log:
Feb 21 23:34:41 robtu kernel: [233139.352461] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Feb 21 23:34:41 robtu kernel: [233139.352468] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state

cat /debug/dri/0/i915_error_state:
no error state collected

dmesg:
[ 16.946953] i915 0000:00:02.0: setting latency timer to 64
[ 16.947317] i915 0000:00:02.0: irq 42 for MSI/MSI-X
[ 17.934259] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[ 17.934261] i915 0000:00:02.0: registered panic notifier
[ 18.234714] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0

lspci:
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)

lshw:
             description: VGA compatible controller
             product: 2nd Generation Core Processor Family Integrated Graphics Controller
             vendor: Intel Corporation
             physical id: 2
             bus info: pci@0000:00:02.0
             version: 09
             width: 64 bits
             clock: 33MHz
             capabilities: msi pm vga_controller bus_master cap_list rom
             configuration: driver=i915 latency=0
             resources: irq:42 memory:c0000000-c03fffff memory:b0000000-bfffffff ioport:2000(size=64)

BTW, this guy has claimed he has fixed the problem, anyone with "kernelpowers" here want to take a look?
http://www.quineloop.com/2012/05/26/intel-i915-gpu-hung-linux.html

Hello,

I have the same problem here with a freeze of all the screen except the cursor.

I have nvidia optimus handling a Geforce 520M and an intel video chip.

uname -a :
3.5.0-26-generic #40-Ubuntu SMP Tue Feb 26 19:57:24 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

dmesg :
[drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
atl1c 0000:07:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update.

attachment: /sys/kernel/debug/dri/0/i915_error_state

Alexander Adam (7ql6) wrote :

Had just the same problem on

LSB Version: core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch
Description: Ubuntu 12.10

In dmesg I found:

[29668.344498] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[29668.344505] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state

but in my case no /debug/dri/0/i915_error_state could be found. (The bug reporting GUI tool asked whether to send the data to Launchpad; I think this could be the reason for the missing file. Does it remove the file(s) afterwards?)

The interesting point is that I got into this "state" when trying to watch a Railscast episode in the browser and pressing the fullscreen button. The screen went black, so I minimized again and everything was fine again. After doing this a few times, the exception window came up.
GUI operations are very slow now and many elements won't repaint properly (e.g. when scrolling).

I used chromium 24.0.1312.56-0ubuntu0.12.10.3 (if it is somehow relevant).

Xorg.log seems to be interesting ((EE) [mi] EQ overflowing…) you can find it attached.

While I have a ASUS Zenbook UX31A my graphics card is

00:02.0 VGA compatible controller [0300]: Intel Corporation 3rd Gen Core processor Graphics Controller [8086:0166] (rev 09) (prog-if 00 [VGA controller])
 Subsystem: ASUSTeK Computer Inc. Device [1043:1517]
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
 Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 0
 Interrupt: pin A routed to IRQ 44
 Region 0: Memory at f7800000 (64-bit, non-prefetchable) [size=4M]
 Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
 Region 4: I/O ports at f000 [size=64]
 Expansion ROM at <unassigned> [disabled]
 Capabilities: <access denied>
 Kernel driver in use: i915
 Kernel modules: i915

Hans (old-man999) wrote :

Seems related to this: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1135759

I tried all the kernel versions from before this bug started happening (3.5.24 and earlier), with no success.
"i915.i915_enable_rc6=0" does not really make a difference here, although the GPU isn't hanging that often.

Stan Schymanski (schymans) wrote :

Same here, tried "i915.i915_enable_rc6=0", even upgraded to Kernel 3.8.2, but the crash reports just keep coming, with or without freezes and hard shutdowns.

Stan Schymanski (schymans) wrote :

Just to give an idea of the frequency of the GPU hangs, I pasted below a part from my syslog. After the 3rd hang in a row, the system became unresponsive and I had to do an emergency shut-down (Alt+SysRq, R, E, I, S, U , B). Hope this will help with the troubleshooting.

Mar 11 21:45:03 sppc26 kernel: [ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.5.0-26-generic root=UUID=5083e04c-1bad-44bf-a241-c839914a697a ro crashkernel=384M-2G:64M,2G-:128M quiet splash i915.i915_enable_rc6=0
.
.
.
Mar 11 21:47:07 sppc26 kernel: [ 159.875052] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 11 21:47:07 sppc26 kernel: [ 159.875056] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
Mar 11 21:47:07 sppc26 kernel: [ 159.878208] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
Mar 11 21:47:20 sppc26 kernel: [ 172.694397] CPU2: Package power limit notification (total events = 1)
Mar 11 21:47:20 sppc26 kernel: [ 172.694399] CPU3: Package power limit notification (total events = 1)
Mar 11 21:47:20 sppc26 kernel: [ 172.694400] CPU1: Package power limit notification (total events = 1)
Mar 11 21:47:20 sppc26 kernel: [ 172.694401] CPU0: Package power limit notification (total events = 1)
Mar 11 21:47:20 sppc26 kernel: [ 172.705373] CPU3: Package power limit normal
Mar 11 21:47:20 sppc26 kernel: [ 172.705374] CPU1: Package power limit normal
Mar 11 21:47:20 sppc26 kernel: [ 172.705393] CPU2: Package power limit normal
Mar 11 21:47:20 sppc26 kernel: [ 172.705394] CPU0: Package power limit normal
Mar 11 21:57:19 sppc26 kernel: [ 770.979630] CPU1: Package power limit notification (total events = 1068)
Mar 11 21:57:19 sppc26 kernel: [ 770.979633] CPU3: Package power limit notification (total events = 1068)
Mar 11 21:57:19 sppc26 kernel: [ 770.979635] CPU2: Package power limit notification (total events = 1068)
Mar 11 21:57:19 sppc26 kernel: [ 770.979637] CPU0: Package power limit notification (total events = 1068)
Mar 11 21:57:19 sppc26 kernel: [ 770.990639] CPU1: Package power limit normal
Mar 11 21:57:19 sppc26 kernel: [ 770.990641] CPU2: Package power limit normal
Mar 11 21:57:19 sppc26 kernel: [ 770.990642] CPU3: Package power limit normal
Mar 11 21:57:19 sppc26 kernel: [ 770.990643] CPU0: Package power limit normal
Mar 11 21:58:51 sppc26 kernel: [ 863.575435] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 11 21:58:51 sppc26 kernel: [ 863.575750] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
Mar 11 22:05:38 sppc26 kernel: [ 1270.493341] CPU2: Package power limit notification (total events = 1578)
Mar 11 22:05:38 sppc26 kernel: [ 1270.493344] CPU1: Package power limit notification (total events = 1578)
Mar 11 22:05:38 sppc26 kernel: [ 1270.493346] CPU3: Package power limit notification (total events = 1578)
Mar 11 22:05:38 sppc26 kernel: [ 1270.493347] CPU0: Package power limit notification (total events = 1578)
Mar 11 22:05:38 sppc26 kernel: [ 1270.504379] CPU2: Package power limit normal
Mar 11 22:05:38 sppc26 kernel: [ 1270.504380] CPU0: Package power limit normal
Mar 11 22:05:38 sppc26 kernel: [ 1270.504399] CPU1: Package power ...
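
The spacing between hangs in a paste like this can be computed from the bracketed kernel uptime stamps. The following is a minimal sketch; `hang_intervals` and `SAMPLE` are illustrative names, and it assumes all lines come from a single boot, since the uptime counter restarts after a reboot.

```python
import re

# Matches the hangcheck line under both function names seen in this
# thread (i915_hangcheck_elapsed and i915_hangcheck_hung) and captures
# the kernel uptime in seconds.
HANG_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] \[drm:i915_hangcheck_\w+\] \*ERROR\* Hangcheck timer elapsed"
)

# Lines copied from the syslog excerpt above.
SAMPLE = """\
Mar 11 21:47:07 sppc26 kernel: [ 159.875052] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 11 21:47:07 sppc26 kernel: [ 159.878208] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
Mar 11 21:58:51 sppc26 kernel: [ 863.575435] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
"""


def hang_intervals(log_text):
    """Seconds of uptime between consecutive 'GPU hung' lines."""
    times = [float(t) for t in HANG_RE.findall(log_text)]
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    print(hang_intervals(SAMPLE))
```

For the two hangs shown in the sample, the gap works out to roughly 704 seconds, a little under twelve minutes.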


Stan Schymanski (schymans) wrote :

For completeness, below is an example from the syslog under kernel 3.8.2 (an emergency reset was needed again). I also tried kernel 2.6.38-13, which did not result in the same "GPU hung" message but produced lots of other crash reports, so I gave up quite quickly.

Mar 12 23:01:48 sppc26 kernel: [ 0.000000] Linux version 3.8.2-030802-generic (root@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201303031906 SMP Mon Mar 4 00:07:09 UTC 2013
Mar 12 23:01:48 sppc26 kernel: [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.8.2-030802-generic root=UUID=5083e04c-1bad-44bf-a241-c839914a697a ro crashkernel=384M-2G:64M,2G-:128M quiet splash i915.i915_enable_rc6=0
.
.
.
Mar 12 23:09:37 sppc26 kernel: [ 505.650930] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 12 23:09:37 sppc26 kernel: [ 505.650935] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
Mar 12 23:12:46 sppc26 kernel: [ 695.152169] SysRq : This sysrq operation is disabled.

Pieter (diepes) wrote :

Is this kernel bug the same i915 problem ?
https://bugs.freedesktop.org/show_bug.cgi?id=54226

Alessio (alessio) wrote :

Ubuntu 12.04.2LTS on Intel i3-2120 with integrated hd2000 gpu

I never had this problem before, but after yesterday's kernel update (3.5.0-26-generic), dmesg starts to show these error messages after a few minutes:

[ 175.345408] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 175.345412] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 175.348230] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
[ 200.244562] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 200.244844] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off

After about an hour Xorg hangs and I can only move the mouse pointer.

I attached /sys/kernel/debug/dri/0/i915_error_state
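
The "Enabling RC6 states" line in logs like this one is a quick way to tell whether RC6 was actually on at the time of a hang, which matters when judging the i915.i915_enable_rc6=0 workaround discussed earlier in the thread. A small sketch, assuming the exact message format shown above; `rc6_states` is a hypothetical helper name:

```python
import re

# Assumed message format, as it appears in the logs in this thread:
#   [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
RC6_RE = re.compile(
    r"Enabling RC6 states: RC6 (on|off), RC6p (on|off), RC6pp (on|off)"
)


def rc6_states(log_line):
    """Return {'RC6': bool, 'RC6p': bool, 'RC6pp': bool}, or None if the
    line is not an 'Enabling RC6 states' message."""
    match = RC6_RE.search(log_line)
    if match is None:
        return None
    names = ("RC6", "RC6p", "RC6pp")
    return {name: state == "on" for name, state in zip(names, match.groups())}


if __name__ == "__main__":
    line = "[ 175.348230] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off"
    print(rc6_states(line))
```

In the log above all three states are off and the GPU still hangs, which matches the reports that the workaround does not help on every machine.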

Andrew Seguin (aseguo) wrote :

Just a small "happens to me too" comment, with one small extra detail about kernel versions:

After Ubuntu 12.04 with automatic updates picked up kernel 3.2.0-39-generic, all computers on our campus (20+ with the problem) with Pentium G840 for CPU (Intel i915 graphics) started having that problem.

We resolved the problem temporarily by installing linux-image-3.2.0-38-generic and removing linux-image-3.2.0-39-generic.

The old fix described in the kernel bug report was not working for us (kernel command line option i915.i915_enable_rc6=0)

I experienced the same today. I am on a Linux Mint 14 system, with the latest kernel update installed.

I get the following in kern.log
[drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off

Output of uname -a
Linux redemption 3.5.0-26-generic #42-Ubuntu SMP Fri Mar 8 23:18:20 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Andrew Inishev (inish777) wrote :

Ubuntu 12.10. The display randomly freezes for a few seconds.

Mar 21 22:20:55 laptop kernel: [ 8539.287011] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 21 22:20:55 laptop kernel: [ 8539.287232] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off

uname -a:

Linux laptop 3.5.0-26-generic #42-Ubuntu SMP Fri Mar 8 23:18:20 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Also I have ATI RADEON HD 6300 as a discrete videocard.

/kernel/debug/vgaswitcheroo/switch:
0:IGD:+:Pwr:0000:00:02.0
1:DIS: :Pwr:0000:01:00.0

Ben (bhubu) wrote :

Same problem: the display has been freezing for a few seconds at a time for the past few days.

Ubuntu 12.10

Mar 22 18:22:02 bhnbu kernel: [28761.486703] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung

Linux bhnbu 3.5.0-26-generic #42-Ubuntu SMP Fri Mar 8 23:18:20 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Ben (bhubu) wrote :

Looks like a regression in 3.5.0-26. Booting 3.5.0-25 (Linux bhnbu 3.5.0-25-generic #39-Ubuntu SMP Mon Feb 25 18:26:58 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux) does not show the problematic behaviour on my system.

Ben (bhubu) wrote :

I am taking back my last comment.

Mar 23 17:10:59 bhnbu kernel: [ 3477.463901] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung

:(

Changed in linux (Ubuntu):
importance: Medium → High

Welcome Pedro Villavicencio to Ubuntu! - (¡Re-bienvenido Pedro Villavicencio a Ubuntu!) :D

Acid 303 (acid-303) wrote :

I have the same error with a "Sony Vaio SVE1512C6EW" (Ubuntu 12.10 - 64 Bit).

See my German post on ubuntuusers.de:

http://forum.ubuntuusers.de/topic/error-hangcheck-timer-elapsed-gpu-hung-sony-va/

Is there a workaround for this problem?

thx,
Acid

tags: added: regression-update
Changed in linuxmint:
status: New → Confirmed

Chipset is whatever is in the i3-2350M processor, HD3000 or whatever it's called.
system architecture: i686
libdrm: 2.4.45
mesa: 9.1.3
xf86-video-intel: 2.21.8
X.Org X Server: 1.14.1, Build Operating System: Linux 3.8.7
uname -r: 3.9.4-1-ARCH
Linux distribution: Arch Linux
Reproducible: not really; so far it has only happened to me while playing games. It happens rarely and without an obvious triggering event.

Last entry:
...
Jun 02 20:49:41 eeyore devmon[1128]: partition: [1]
Jun 02 23:12:45 eeyore kernel: [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Jun 02 23:12:45 eeyore kernel: [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
Jun 02 23:12:57 eeyore systemd-logind[1132]: Power key pressed.
Jun 02 23:12:57 eeyore systemd-logind[1132]: Powering Off...
...

$ cat /sys/kernel/debug/dri/0/i915_error_state
no error state collected

What happens is quite simple: the whole system seems to be frozen and almost nothing reacts, certainly nothing that could help me collect further information. The problem seems to have been present for a while; it caused me to report this: https://bugs.freedesktop.org/show_bug.cgi?id=61411.
However, the first few times I played relatively demanding games and the laptop got rather warm, this time the game was far less demanding and the machine was not warmer than usual.

If there's any further information I can provide to help fix this, please tell me.

Created attachment 84132
event capture

These are the contents of /sys/kernel/debug/dri/0/i915_error_state after the gpu hung event occurred.

Created attachment 84134
xrandr --verbose

Bug description:
I seem to be having the same issue. This occurred several times in a short period while using chromium on a second monitor. I set the secondary monitor up with: xrandr --output eDP1 --right-of HDMI1

System environment:
-- chipset: i7-4700MQ with the HD 4600 gpu
-- system architecture: 64-bit (x86_64)
-- xf86-video-intel: 2.21.14-2
-- xserver: 1.14.2-2
-- mesa: 9.1.6-1
-- libdrm: 2.4.46-2
-- kernel: 3.10.6-2-ARCH
-- Linux distribution: Arch
-- Machine or mobo model: Toshiba Satellite P70-A [PSPLNU-01Q006]
-- Display connector: hdmi

dmesg:
[drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
[drm:kick_ring] *ERROR* Kicking stuck wait on blitter ring

I attached the contents of i915_error_state

Dual head display seemed to work fine earlier this month when I installed this system. In the last week or so I run into the same issues anytime the second display is connected.
[ 356.614203] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 356.614208] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
[ 356.621884] [drm:kick_ring] *ERROR* Kicking stuck wait on blitter ring
[ 452.710408] Watchdog[966]: segfault at 0 ip 00007ff58f002938 sp 00007ff57ca5f010 error 6 in chromium[7ff58e229000+503a000]
[ 458.656344] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 458.656380] [drm:kick_ring] *ERROR* Kicking stuck wait on blitter ring
[ 470.681128] Watchdog[1146]: segfault at 0 ip 00007fc67a291938 sp 00007fc667cee010 error 6 in chromium[7fc6794b8000+503a000]
[ 473.721812] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 483.709854] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 483.709868] [drm:kick_ring] *ERROR* Kicking stuck wait on blitter ring
[ 541.776618] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 541.776654] [drm:kick_ring] *ERROR* Kicking stuck wait on blitter ring
[ 553.800507] Watchdog[1201]: segfault at 0 ip 00007f83ca2c6938 sp 00007f83b7d23010 error 6 in chromium[7f83c94ed000+503a000]
[ 556.808672] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 556.808693] [drm:kick_ring] *ERROR* Kicking stuck wait on blitter ring
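
For anyone comparing how often these hangs recur, a minimal Python sketch along the following lines could pull the hangcheck timestamps out of dmesg-style text like the above and report the gaps between them. The function names and the regular expression are illustrative assumptions, not part of any i915 tooling; the pattern only targets the `[ seconds] [drm:i915_hangcheck_...] *ERROR*` line shape seen in this report.

```python
import re

# Matches lines such as:
#   [ 356.614203] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
# The pattern is an assumption based on the log lines quoted in this report.
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:i915_hangcheck_\w+\]\s+\*ERROR\*"
)


def hang_timestamps(log_text):
    """Return kernel timestamps (seconds since boot) of hangcheck errors."""
    return [float(m.group("ts")) for m in HANG_RE.finditer(log_text)]


def hang_intervals(log_text):
    """Return the gaps, in seconds, between consecutive hangcheck errors."""
    stamps = hang_timestamps(log_text)
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]
```

Feeding it the saved output of `dmesg` (or a syslog excerpt, since the pattern is not anchored to the start of the line) gives one interval per pair of consecutive hangs; `kick_ring` and segfault lines are ignored.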

Correction: I just realized that it's fine as long as I don't put Chromium on the second display. My statement "seemed to work fine earlier this month" was actually due to that.

peterbo (peterbo) wrote :

schymans: I am running 13.04 so have not tested on 12.10. The crash you describe sounds a bit worse than the gpu hang but I obviously cannot be sure. My laptop never rebooted when the hang occurred though.

I must admit that I had one hang with rc5 after 7 days, however the error message was different than the usual "Hangcheck timer elapsed". It was something with a "ring" - cannot remember any more of it. It happened after having dual monitors connected, suspending my laptop and then opening it without any monitors connected. Am on rc6 now and no problems yet.

I cannot seem to force the hang, whether by running glxspheres, dragging windows fast across the desktop, or playing with compiz effects/settings.

Thanks, peterbo. Sorry, got confused. Of course, I'm running it on
13.04, too. I have had these really weird crashes before (rebooting by
itself) and they went away after one of the kernel upgrades, maybe even
when upgrading from 12.10 to 13.04. Got the Hangcheck timer elapsed ones
instead. That is, sometimes, when I return to my laptop, it is turned
off even when I'm pretty sure I left it on, so maybe they have just been
happening while idle. Now with the rc5 kernel I got the original crash
(without error log) within hours again, so I got discouraged from trying
any further. It's my work laptop, running dual monitors, and I really
can't afford to experiment too much as I need it to run and do its job.
I still can't believe that this is such a persistent bug (several
bugs?). The Linux kernel used to be considered bomb-proof! Apparently
its bugs, too...


Another data point; I had posted over at http://ubuntuforums.org/showthread.php?t=2168780, but recently found this bug based on kernel crash syslog entries.

Anyway, I'm on a Macbook 4,2 FWIW:

$ uname -a
Linux cle-mba 3.8.0-29-generic #42-Ubuntu SMP Tue Aug 13 19:40:39 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

partial syslog:
Aug 22 05:26:55 cle-mba kernel: [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.8.0-29-generic root=UUID=748df707-f110-49e0-b603-7ca1439341d8 ro quiet splash resume=/dev/sda3 vt.handoff=7
Aug 22 13:17:49 cle-mba kernel: [78499.977886] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Aug 22 13:17:49 cle-mba kernel: [78499.977892] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state

Attached i915_error_state from the same event.

Note that I used to have i915.i915_enable_rc6=0 in my grub options, but removed it in an attempt to eliminate the hangs/crashes based on recommendations on a random forum thread. :-P Behaviour has been unaffected; I get a GPU hang or window manager crash (i.e. have to "recover" via tty1) ~every other day.

Stan Schymanski (schymans) wrote :

Just confirming that the kernel under http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.11-rc5-saucy/ is not usable for me. My laptop crashes unrecoverably within hours and does not leave any traces in syslog. I am about to remove it from the kernel list again.

3.11.0-031100rc5-generic from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/current/ works slightly better for me, but I still get similar crashes every couple of days.

Stan Schymanski (schymans) wrote :

Oops, wrong copy and paste, sorry!
3.11.0-031100rc5-generic from http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.11-rc5-saucy/ crashes several times a day
3.11.0-994-generic from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/current/ crashes every couple of days on my machine

Stan Schymanski (schymans) wrote :

After half a day of working on 3.8.13-030813-generic, I got the "[drm:i915_hangcheck_hung]" again. At least it was sort of recoverable, as I was able to log into another tty and back up i915_error_state (attached). I hope it helps someone diagnose the problem. When I returned to tty7, the screen was still flickering whenever I moved the mouse, and it didn't seem to respond to any clicks, so I had to log out with CTRL+ALT+DEL, and then I could log back in without problems.

Olivier Febwin (febcrash) wrote :

[24033.443586] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[24033.443592] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state

crash@Dell-Latitude-E6520:~$ uname -a
Linux Dell-Latitude-E6520 3.8.0-30-generic #43-Ubuntu SMP Wed Aug 21 21:07:22 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

rb.eng (rb-engch) wrote :

Perhaps this is a related issue:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/946899

As I noted in the comments for that issue:

This issue impacts me multiple times a week. Sometimes the mouse continues to work, but other times the mouse freezes as well. I have not yet tried to open a terminal as suggested by bduncan (comment #13), but I am usually able to switch to the first terminal and kill compiz.

My freeze often occurs when switching workspaces. I often have thunderbird and chromium running laterally in desktop 1 and 2 respectively (of four total) and use an external drive on either a laptop or a netbook with an external display attached. The problem exists regardless of the hardware I am using.

When the freeze occurs and I can get to tty1, killing compiz is my workaround. I usually use sudo top to find the compiz pid and kill it with signal 3. When I switch back to tty7, the freeze clears, compiz restarts, and after a few moments (perhaps a minute) I am back working with the applications that were open. When the freeze occurs and the keyboard becomes unresponsive (so I cannot switch to tty1), I must do a hard reboot to continue working.

I have looked into the log files and searched for any instance of compiz logging the freeze. All I have found is the indication I have killed the process. Compiz does not seem to be tied into the apport system for reporting. I expect there are several more users with this problem but the freezes go unreported as users have no idea what to do about it.

Unfortunately these unknown freezes (unknown to development community) will continue until these problems are given priority and addressed. Bottom line is that such freezes are like the blue screen of death and give potential adopters of ubuntu a crummy experience and all the more reason to switch back to something more familiar or "stable".

Compiz has made my user experience a challenging one. To that end I have enjoyed lighter interfaces including crunchbang and trisquel. I have been reluctant to go beyond the 12.04 LTS version as I am concerned my hardware is not up to the task. Compiz IMO provides me an awful experience and these problems give me pause to encourage others to the platform.

I implore those developers with more understanding of the freeze issue to put some heat on this problem and investigate what is going on and fix this. To ignore this further may render any other work on the ubuntu project to be moot if the user experience continues to suffer, IMO. I am happy to contribute what I can to help address this but I do not have deep knowledge of the compiz system.

rb.eng (rb-engch) wrote :

Correct link in comment #96 should read

Perhaps this is a related issue:
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-ati/+bug/987498

rb

monomakh (monomakh) wrote :

Bug 987498 really is a duplicate of this one. On my home PC with NVIDIA drivers there are no problems, but at work, with Intel, compiz freezes several times a day.

Olivier Febwin (febcrash) wrote :

Same problem on Ubuntu 13.10 Saucy

[14026.619116] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
[14026.635119] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x1dd7000 ctx 1) at 0x1dd71d8
crash@Dell-Latitude-E6520:~$ uname -a
Linux Dell-Latitude-E6520 3.11.0-11-generic #17-Ubuntu SMP Tue Oct 1 19:42:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

André Panisson (panisson) wrote :

Same problem on my Dell XPS 15. The interesting thing is that it happened while navigating Google Maps in Chrome.
I switched to a terminal using CTRL+ALT+F1 and back. After switching back, the interface responded for a few seconds and then hung again. Before it hung again, Chrome was able to show the message "Rats! WebGL hit a snag".

dmesg:
[84044.751322] Watchdog[8060]: segfault at 0 ip 00007fa0e8b11f2e sp 00007fa0d8f7b4e0 error 6 in chrome[7fa0e5406000+56da000]
[84046.087148] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[84046.087153] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state

uname -a
Linux xxx 3.8.0-31-generic #46-Ubuntu SMP Tue Sep 10 20:03:44 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

André Panisson (panisson) wrote :

By the way, switching to the terminal, killing compiz and switching back again was able to restore the interface without having to restart lightdm and all my applications.

I also have sometimes hangups where CTRL+ALT+F1 and afterwards
CTRL+ALT+F7 brings back GUI response, but very rarely. I am using
Chrome only for Hangouts, for everything else I am using Firefox.

 uname -a
Linux athena2 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 cat /var/log/syslog|grep i915
Oct 21 12:18:59 athena2 kernel: [253454.206467] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
Oct 21 12:18:59 athena2 kernel: [253454.206470] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
Oct 21 12:18:59 athena2 kernel: [253454.217043] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x75b40000 ctx 2) at 0x75b40220

André Panisson (panisson) wrote :

I can consistently reproduce the "*ERROR* Hangcheck timer elapsed... GPU hung" message in the following conditions: using the new Google Maps interface with Chrome and WebGL hardware acceleration enabled.
After a few seconds navigating in the Maps interface, the browser crashes and Unity freezes. I'm able to restore Unity by switching to the terminal, killing Compiz and switching back again.
After restoring Unity, it seems that Chrome automatically disables hardware acceleration. You can see, attached, the results of about:gpu before and after the crash. After this point, Maps navigation does not crash the browser. However, by restarting the browser, hardware acceleration is back again, and I'm able to reproduce the same behavior.

Olivier Febwin (febcrash) wrote :
summary: - [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
+ 8086:2a02 [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer
+ elapsed... GPU hung
tags: added: bios-outdated-2.30
Olivier Febwin (febcrash) wrote :

crash@Dell-Latitude-E6520:~$ dmesg|grep i915
[ 23.254078] i915 0000:00:02.0: setting latency timer to 64
[ 23.283563] i915 0000:00:02.0: irq 44 for MSI/MSI-X
[ 24.347643] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[ 24.347644] i915 0000:00:02.0: registered panic notifier
[ 24.787777] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[ 5825.839301] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
[ 5825.839311] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
[ 5825.842976] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0xfeb5000 ctx 1) at 0xfeb51d8
crash@Dell-Latitude-E6520:~$ uname -a
Linux Dell-Latitude-E6520 3.11.0-14-generic #21-Ubuntu SMP Tue Nov 12 17:04:55 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Oleg Yaroshevych (brainunit) wrote :

Folks, I'm on an HP ProBook 6360b, and I experienced this issue on Ubuntu 13.04 every day. I switched my kernel to 3.11.6 from saucy, and the problem has gone away. It works well with 13.04; I have had an uptime of about a month.

➜ ~ uname -a
Linux comp_name 3.11.6-031106-generic #201310181453 SMP Fri Oct 18 18:54:15 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Exact image name is "linux-image-3.11.6-031106-generic_3.11.6-031106.201310181453_amd64.deb"

Good luck!

tags: added: needs-bisect

Affected. Ubuntu 13.04. I just opened a terminal application. Music was playing, but could not do anything.

Linux PC 3.11.5-031105-generic #201310132235 SMP Mon Oct 14 02:35:52 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Legimet (legimet) wrote :

I just got the following:
[drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
[drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
[drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x6bcd5000 ctx 1) at 0x6bcd51c8
init: tty1 main process ended, respawning

I am using Trisquel 6.0, based on 12.04, with the lts-saucy enablement stack.
Linux hostname 3.11.0-17-generic #0trisquel1 SMP Fri Feb 21 10:26:23 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Espen Nilsen (espen-k-nilsen) wrote :

Not sure if this is relevant, but I had a similar problem:
------------------
Mar 11 16:09:21 bytesize-desktop kernel: [ 1141.077075] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
Mar 11 16:09:21 bytesize-desktop kernel: [ 1141.077095] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
Mar 11 16:09:21 bytesize-desktop kernel: [ 1141.082863] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x6043000 ctx 1) at 0x60431c8
------------------

The fix for me was to change the display drivers to the Intel 01.org drivers. This bumped Mesa from 9.2.1 to 10.0.0.

From what I could tell, it seemed to be Sandy Bridge related, and apparently it was fixed in Mesa 9.2.3.
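
If the hang really was addressed in Mesa 9.2.3, as guessed above, one quick way to see whether a given machine is at or past that version is to compare version tuples. This is only a sketch: it assumes the version string is copied from something like the "OpenGL version string" line of `glxinfo` (typically of the form `3.0 Mesa 9.2.1`), and `parse_mesa_version` and `mesa_at_least` are hypothetical helper names.

```python
import re


def parse_mesa_version(text):
    """Extract a Mesa version such as '9.2.1' from text as a tuple of ints.

    Prefers a number that follows the word 'Mesa'; otherwise falls back to
    the first dotted number found in the text.
    """
    match = re.search(r"Mesa\s+(\d+(?:\.\d+)+)", text)
    if match is None:
        match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def mesa_at_least(installed, minimum="9.2.3"):
    """Return True if the installed Mesa version is >= the minimum version."""
    return parse_mesa_version(installed) >= parse_mesa_version(minimum)
```

Tuples compare element by element, so 10.0.0 correctly sorts after 9.2.3, which a plain string comparison would get wrong.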

Changed in xserver-xorg-video-intel:
importance: Unknown → Medium
status: Unknown → Confirmed

It would appear some related action is going on in bug #54226

Bug is still there, Linux Kernel 3.13.0-37-generic #64-Ubuntu SMP x86_64

MSI Z87-G55 + Intel Core i7 4770 + integrated graphics:
Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
Drivers: Loading /usr/lib/xorg/modules/drivers/intel_drv.so

I am still experiencing freezes of the graphical interface lasting about 3 to 5 seconds, associated with the following dmesg output:

[drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... blitter ring idle

visi0nary (confusionbay91) wrote :

Have you guys tried setting the i915.semaphores option to 1? That fixed it for me!

Robert Hrovat (robi-hipnos) wrote :

I tried on 12 computers and it didn't help. I'm updating BIOS on all computers since we have mostly first versions of BIOS that came with computers. I'm also disabling Intel Virtualisation Support in BIOS that some forums suggested.

This is really crazy. We had zero problems with freezing on Ubuntu 10.04, but since 12.04 it's happening all the time.

Chris Rainey (ckrzen) wrote :

It is also recommended to _disable_ VT-d (Intel Virtualization Technologies) in the BIOS when using the:

i915.semaphores=1

boot option.

This increases the stability of the stack / system, according to Intel's release notes for this option.

This fixed my DELL Inspiron 3646 using a J2900 cpu BayTrail / ValleyView SoC on Ubuntu 15.04.

Robert Hrovat (robi-hipnos) wrote :

Chris, I did that on all machines and maybe there are fewer crashes, but it is still far from stable.

Chris Rainey (ckrzen) wrote :

OK,

I'm getting better results with the following settings:

ALREADY TRIED:

Disable VT-d (Intel Virtualization Technologies) in the BIOS

/etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_pstate=disable"

then

$ sudo update-grub

and

/etc/modprobe.d/i915.conf:

options i915 semaphores=1

then

$ sudo update-initramfs -u

NOT TRIED:

/etc/X11/xorg.conf.d/20-intel.conf
 Section "Device"
    Identifier "old intel stuff"
    Driver "intel"
    Option "Shadow" "True"
    Option "DRI" "false"
 EndSection

/etc/X11/xorg.conf.d/20-intel.conf

Section "Device"
   Identifier "Intel Graphics"
   Driver "intel"
   Option "NoAccel" "True"
EndSection

/usr/share/X11/xorg.conf.d/20-intel.conf:

Section "Device"
   Identifier "Intel Graphics"
   Driver "intel"
   Option "AccelMethod" "uxa"
EndSection
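
To double-check after a reboot that settings like the ones above actually took effect, something like this Python sketch could be used. It assumes the standard `/proc/cmdline` file and that the running kernel's i915 module still exposes a `semaphores` parameter under `/sys/module/i915/parameters/` (newer kernels may not); the helper names are made up for illustration.

```python
from pathlib import Path


def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (e.g. 'quiet') map to None. Quoted values that
    contain spaces are not handled by this simple sketch.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return a loaded module's parameter value, or None if it is not exposed."""
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None
```

For example, `parse_cmdline(Path("/proc/cmdline").read_text()).get("intel_pstate")` should come back as `"disable"` if the GRUB change was picked up, and `read_module_param("i915", "semaphores")` returns `None` when the parameter does not exist on the running kernel.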

Robert Hrovat (robi-hipnos) wrote :

Chris, currently I get better results with your hints:
Adding intel_pstate=disable and moving i915 semaphores to modprobe.d

Changed in linux (Fedora):
importance: Unknown → Undecided
status: Unknown → Won't Fix
Chris Rainey (ckrzen) wrote :

Ubuntu 12.04 (precise) reached end-of-life on April 28, 2017.

See this document for currently supported Ubuntu releases:
https://wiki.ubuntu.com/Releases

We appreciate that this bug may be old and you might not be interested in discussing it any more. But if you are then please upgrade to the latest Ubuntu version and re-test. If you then find the bug is still present in the newer Ubuntu version, please add a comment here telling us which new version it is in and change the bug status to Confirmed.

Changed in linuxmint:
status: Confirmed → Incomplete
Changed in linux (Ubuntu):
status: Triaged → Incomplete