[GM965] GPU lockup running OpenGL applications on Dell Vostro 1510

Bug #1388612 reported by Andrea Bini
This bug affects 3 people
Affects        Status        Importance  Assigned to  Milestone
Mesa           Fix Released  Medium
mesa (Ubuntu)  Triaged       Medium      Unassigned

Bug Description

Hello, I'm experiencing a GPU lockup with Ubuntu 14.10 64-bit on a Dell Vostro 1510 laptop with Intel graphics. The problem also occurred with Ubuntu 14.04 and is always reproducible with glmark2 on the LiveCD: running the "ideas" benchmark (glmark2 -b ideas) triggers the lockup within a few seconds. Other OpenGL applications also cause the lockup (e.g. FooBillard++, my self-made SDL applications). The memory test passed and the BIOS is at the latest version. I don't know whether this is relevant, but I don't think the problem is caused by failing hardware, because the machine works fine on Windows. I tried the latest Intel upstream kernel (http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-next/current/) and upstream X.org (https://launchpad.net/~xorg-edgers/+archive/ubuntu/ppa), with no change.

[Personal Notes]
I'm new to the Ubuntu world. I hope I've done things right; if not, please tell me and I will resend the report or supply more information. I created the bug report with "ubuntu-bug xorg --save" from another machine via SSH while the affected machine was frozen. That seemed more logical to me than running ubuntu-bug while the system is working fine. I collected other information via SSH during the same lockup: dmesg, Xorg.0.log, i915_error_state. The latter is attached.
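
The collection steps described above could be scripted roughly as follows. This is only a sketch: the function name collect_gpu_hang_logs and the output file names are made up for illustration, and the error-state path is the /sys/class/drm/card0/error file that the X server log asks for. On some kernels the same data is exposed as i915_error_state under debugfs instead.

```shell
# Sketch: gather GPU-hang diagnostics into one directory.
# Intended to be run over SSH while the display is frozen.
# collect_gpu_hang_logs and the output file names are hypothetical.
collect_gpu_hang_logs() {
    outdir=${1:-gpu-hang-logs}
    mkdir -p "$outdir" || return 1

    # Kernel ring buffer; may need root on systems that restrict dmesg.
    dmesg > "$outdir/dmesg.txt" 2>/dev/null || true

    # X server log and the i915 error state, copied only if readable.
    for src in /var/log/Xorg.0.log /sys/class/drm/card0/error; do
        if [ -r "$src" ]; then
            cat "$src" > "$outdir/$(basename "$src").txt" 2>/dev/null || true
        fi
    done
}
```

For example, after logging in over SSH during the freeze, "collect_gpu_hang_logs ~/hang-logs" (directory name is arbitrary) leaves the files ready to attach to the report.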

ProblemType: Bug
DistroRelease: Ubuntu 14.10
Package: xorg 1:7.7+7ubuntu2
ProcVersionSignature: Ubuntu 3.16.0-24.32-generic 3.16.4
Uname: Linux 3.16.0-24-generic x86_64
NonfreeKernelModules: wl
ApportVersion: 2.14.7-0ubuntu8
Architecture: amd64
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
Date: Sun Nov 2 19:24:31 2014
DistUpgraded: 2014-10-25 12:05:51,479 DEBUG enabling apt cron job
DistroCodename: utopic
DistroVariant: ubuntu
DkmsStatus:
 bcmwl, 6.30.223.248+bdcom, 3.13.0-37-generic, x86_64: installed
 bcmwl, 6.30.223.248+bdcom, 3.16.0-23-generic, x86_64: installed
 bcmwl, 6.30.223.248+bdcom, 3.16.0-24-generic, x86_64: installed
ExtraDebuggingInterest: Yes
GraphicsCard:
 Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) [8086:2a02] (rev 0c) (prog-if 00 [VGA controller])
   Subsystem: Dell Device [1028:0273]
InstallationDate: Installed on 2014-10-11 (22 days ago)
InstallationMedia: Ubuntu 14.04.1 LTS "Trusty Tahr" - Release amd64 (20140722.2)
MachineType: Dell Inc. Vostro1510
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.16.0-24-generic root=UUID=9b3612b7-c4bf-4840-885b-9fae75d6342c ro splash quiet vt.handoff=7
SourcePackage: xorg
UpgradeStatus: Upgraded to utopic on 2014-10-25 (8 days ago)
dmi.bios.date: 03/18/2009
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A15
dmi.board.name: 0M277C
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnDellInc.:bvrA15:bd03/18/2009:svnDellInc.:pnVostro1510:pvrNull:rvnDellInc.:rn0M277C:rvr:cvnDellInc.:ct8:cvrN/A:
dmi.product.name: Vostro1510
dmi.product.version: Null
dmi.sys.vendor: Dell Inc.
version.compiz: compiz 1:0.9.12+14.10.20140918-0ubuntu1
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.56-1
version.libgl1-mesa-dri: libgl1-mesa-dri 10.3.0-0ubuntu3
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 10.3.0-0ubuntu3
version.xserver-xorg-core: xserver-xorg-core 2:1.16.0-1ubuntu1
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.9.0-1ubuntu2
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:7.4.0-2ubuntu2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.914-1~exp1ubuntu4
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.11-1ubuntu2
xserver.bootTime: Sun Nov 2 18:41:51 2014
xserver.configfile: default
xserver.errors:
 intel(0): Detected a hung GPU, disabling acceleration.
 intel(0): When reporting this, please include /sys/class/drm/card0/error and the full dmesg.
xserver.logfile: /var/log/Xorg.0.log
xserver.outputs:
 product id 15873
 vendor LPL
xserver.version: 2:1.16.0-1ubuntu1

Revision history for this message
In , Haineb (haineb) wrote :

Created attachment 101821
/sys/class/drm/card0/error as requested

Normal browsing, no other significant applications running. Crash has occurred 3+ times. Kernel remains up after display crash. Able to use other vtys normally, but restarting mdm after display crash results in full kernel lock.

Google Chrome 35.0.1916.153

Linux host3 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Revision history for this message
In , Haineb (haineb) wrote :

Unable to consistently replicate with any specific use case.

Revision history for this message
In , Haineb (haineb) wrote :

Created attachment 101823
dmesg tail after GPU crash

Revision history for this message
In , Haineb (haineb) wrote :

Issue replicated. Browsing in chrome again. No GPU crash reported this time. I'm lost on what component is truly causing the failure at this point.

[ 18.736862] init: plymouth-upstart-bridge main process ended, respawning
[ 2738.952528] perf samples too long (2509 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 4610.808077] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... render ring idle
[ 4616.535139] Watchdog[2389]: segfault at 0 ip 00007fae73e88b4e sp 00007fae638ae670 error 6 in chrome[7fae6fe97000+4f39000]
[ 4638.603458] Watchdog[4588]: segfault at 0 ip 00007f2827619b4e sp 00007f281703f670 error 6 in chrome[7f2823628000+4f39000]
[ 4648.644352] Watchdog[4627]: segfault at 0 ip 00007f2c06ceeb4e sp 00007f2bf6714670 error 6 in chrome[7f2c02cfd000+4f39000]

Revision history for this message
Andrea Bini (andrea-bini) wrote :
Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 83627 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 84781 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 84971 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 85344 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 85406 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 85656 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 85726 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 85749 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 85824 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 86108 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 85874 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 86167 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 81394 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

I've just marked a lot of bugs as duplicates of this one, so that we can hopefully get to the bottom of the Great Gen4 Chrome Hang of 2014.

Firstly, apologies if your bug is not actually the same bug as all of the rest. It's hard to determine that in advance.

It looks like this report was the first, and it's from the end of June. What changed when you started experiencing the hang? Did you update Mesa, the kernel, or Chrome itself?

Does reverting to a version of one of these from, say, April avoid the problem? If so, that's great and we should be able to bisect it.

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 83807 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 84561 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 85249 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 78483 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 74094 has been marked as a duplicate of this bug. ***

Revision history for this message
In , mikbini (mikbini) wrote :

Well, it never worked for me (I'm the reporter of bug 85824) but I tried chrome only very recently (v 38.0.2125.111).

On the other hand, I have a case that reproduces bug 85824 100% of the time: boot, log in, launch Chrome, go to the Chrome store, don't touch anything, wait a few moments ... video hangs.

Revision history for this message
In , Haineb (haineb) wrote :

The mainboard that this graphics chipset was embedded in has failed and I no longer have it available for testing. (Failure seemed to be DIMM related; probably not relevant to this bug.)

I rolled out Mint 17 with Chrome perhaps two weeks after its release/two weeks before opening this report. These intermittent failures started happening immediately. The failures persisted throughout all subsequent system/Chrome updates until the board died a few weeks ago.

Unfortunately, I was not able to identify any cases under which this set of packages worked stably on this graphics chipset.

(The new system with different mainboard is running the same packages; just moved the hard disc over. No failures have occurred since swapping to a different mainboard/graphics chipset.)

Revision history for this message
In , Jarno Suni (jarnos) wrote :

I have upgraded chromium-browser from 37.0.2062.120 to 38.0.2125.111 recently, but my OS (Ubuntu Trusty) provides only version 34 as the alternative version to downgrade to.

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

I downgraded chromium from the current 38 version to the 34.0.1847.116 Built on Ubuntu 14.04, running on LinuxMint 17 aura (260972). It was available from the official repositories selecting the option to force version in synaptic.

Since I made it, I never experienced the gpu hang again.

PS: I'm running Linux Mint 17. On previous versions this bug didn't exist.

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

That leads me to the conclusion that the problem doesn't reside on intel's driver, but on chromium itself.

Revision history for this message
In , txtsd (thexerothermicsclerodermoid) wrote :

lagreca can you please visit the Google Chrome Store and multiple random videos on Youtube and see if Chrome still doesn't hang?

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

txtsd, I've already done that. I have browsed multiple tabs on YouTube, other video websites, the Outlook website (which made the hang happen when clicking the instant messenger button), and so on.

The bug just doesn't happen on Chromium Version 34.0.1847.116 Built on Ubuntu 14.04, running on LinuxMint 17 aura (260972).

I have tested it with and without the chromium ffmpeg extra codecs package.

I have tested flash videos and html5 videos.

There is no bug with this version.

Of course I'd like to use the newer versions of Chrome and Chromium.

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

I've entered chrome store while testing other tabs.

It's ok.

There's no gpu hang anymore.

Revision history for this message
In , Alexsecret (alexsecret) wrote :

lagreca, I think Chrome was still using software acceleration for page rendering in version 34. It hadn't changed it to hardware acceleration yet. I am not 100% sure though.
Another factor not mentioned here is that Chrome is not the only app that causes the crash. glmark2, an OpenGL benchmark app, causes exactly the same issue when it reaches the "ideas" test.

Revision history for this message
In , txtsd (thexerothermicsclerodermoid) wrote :

I've encountered the bug with github's Atom too.

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

Alex,

But I checked the box to enable hardware acceleration when available.

If there are other programs affected, then I must agree with you that it's really a driver's bug.

But the Chromium version I installed works for me.

Revision history for this message
In , Alexsecret (alexsecret) wrote :

It used to work for all of us, I remember it too. The bug first appeared in version 36 and it was only a couple of times in 4 months. It got worse in 37 and finally it's permanent in 38.

They keep changing things. The fact that hardware acceleration is enabled, doesn't mean that all functions are enabled too. If you type chrome://gpu on your search bar, you will see a more thorough list of what is actually enabled and what's not yet. They keep certain functions disabled till they become stable.

On my system, the Web Store crashes the GPU each and every time I go there without disabling hardware acceleration first. It's not random. The Web Store has no videos. Hotmail does it too, but randomly. The Google Chrome download page caused it once too. It has to be something else, or there may be 2 or 3 factors causing this.

Revision history for this message
In , Alexsecret (alexsecret) wrote :

(In reply to Alex from comment #34)
> It used to work for all of us, I remember it too. The bug first appeared in
> version 36 and it was only a couple of times in 4 months. It got worse in
> 37 and finally it's permanent in 38.
>
> They keep changing things. The fact that hardware acceleration is enabled,
> doesn't mean that all functions are enabled too. If you type chrome://gpu
> on your search bar, you will see a more thorough list of what is actually
> enabled and what's not yet. They keep certain functions disabled till they
> become stable.
>
> On my system, the Web store is crashing the GPU each and every time I go
> there without disabling hardware acceleration first. It's not random. The
> web store has no videos. hotmail is doing it too but randomly. Google
> Chrome download page caused it once too. It has to be something else or
> there maybe 2 or 3 factors causing this.

4 months is probably wrong. It was a long time though. :)

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

Created attachment 109501
The content of "chrome://gpu " in Chrome 38.

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

Created attachment 109502
The content of "chrome://gpu " in Chromium 34.

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

I attached two pdf files showing the content of chrome://gpu both in 38 and 34 versions.

Revision history for this message
In , Johny-quest (johny-quest) wrote :

Hello,

I have tried to find, from the apt-get logs, the first version of the kernel, Intel drivers, or Chrome that was causing the problems:

root@Dell-LD830:~# grep -i Start-Date apt/* | perl -n -e 'BEGIN{my %hn;} $hn{$1}->{$2} = 1 if m/(\w+\.?\w*\.?\w*\.?\w*):.*?:\s+(\d+-\d+-\d+)/imgs; END{ print $_.": ".join(", ", sort keys %{$hn{$_}})."\n" for (sort(keys %hn)); }'

history.log: 2014-11-02, 2014-11-11
history.log.1: 2014-10-02, 2014-10-13, 2014-10-16, 2014-10-20
history.log.10: 2014-01-06, 2014-01-16
history.log.11: 2013-12-01, 2013-12-04, 2013-12-08, 2013-12-18, 2013-12-27, 2013-12-30
history.log.12: 2013-11-17, 2013-11-24
history.log.2: 2014-09-09, 2014-09-21, 2014-09-26
history.log.3: 2014-08-24
history.log.4: 2014-07-03, 2014-07-19
history.log.5: 2014-06-08
history.log.6: 2014-05-28
history.log.7: 2014-04-15
history.log.8: 2014-03-01, 2014-03-05, 2014-03-09, 2014-03-11, 2014-03-16
history.log.9: 2014-02-06, 2014-02-10, 2014-02-22

The GPU hang bug first occurred about 2 weeks before I reported it here, so I should check versions before 2014-06-12. The bug was probably introduced in one of these upgrades:

history.log.5: 2014-06-08
history.log.6: 2014-05-28

Here are the versions that I was using at that time (extracted from the logs):

From history.log.9:
Start-Date: 2014-02-06 19:28:45
Commandline: apt-get upgrade ...
linux-image-3.11.0-15-generic:i386 (3.11.0-15.23, 3.11.0-15.25)
linux-firmware:i386 (1.116, 1.116.1)
xserver-xorg-video-intel:i386 (2.99.904-0ubuntu2, 2.99.904-0ubuntu2.1)
flashplugin-installer:i386 (11.2.202.335ubuntu0.13.10.1, 11.2.202.336ubuntu0.13.10.1)
...

From history.log.7:
Start-Date: 2014-04-15 16:41:31
Commandline: apt-get upgrade ...
google-chrome-stable:i386 (33.0.1750.152-1, 34.0.1847.116-1)
...

From history.log.6:
Start-Date: 2014-05-28 21:05:02
Commandline: apt-get upgrade ...
linux-firmware:i386 (1.127, 1.127.2)
linux-image-extra-3.13.0-24-generic:i386 (3.13.0-24.46, 3.13.0-24.47)
linux-libc-dev:i386 (3.13.0-24.46, 3.13.0-27.50)
linux-headers-3.13.0-24:i386 (3.13.0-24.46, 3.13.0-24.47)
linux-headers-3.13.0-24-generic:i386 (3.13.0-24.46, 3.13.0-24.47)
linux-image-3.13.0-24-generic:i386 (3.13.0-24.46, 3.13.0-24.47)
...

From history.log.5:
Start-Date: 2014-06-08 20:15:24
Commandline: apt-get upgrade ...
libegl1-mesa:i386 (10.1.0-4ubuntu5, 10.1.3-0ubuntu0.1)
libosmesa6:i386 (10.1.0-4ubuntu5, 10.1.3-0ubuntu0.1)
libegl1-mesa-drivers:i386 (10.1.0-4ubuntu5, 10.1.3-0ubuntu0.1)
libwayland-egl1-mesa:i386 (10.1.0-4ubuntu5, 10.1.3-0ubuntu0.1)
...

For further reference I will include whole apt-get history as zip file.
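
A plainer way to get the same per-file list of upgrade dates is sketched below, using grep and awk instead of the Perl one-liner. The helper name apt_upgrade_dates is made up, and the sketch assumes the history logs are uncompressed, like the copies in the apt/ directory used above.

```shell
# Sketch: list the dates on which apt ran, per history log file.
# apt_upgrade_dates is a hypothetical helper name; it assumes the logs
# are uncompressed plain text.
apt_upgrade_dates() {
    dir=${1:-/var/log/apt}
    grep -H '^Start-Date:' "$dir"/history.log* 2>/dev/null |
        awk '{
            split($1, f, ":")          # f[1] = path of the log file
            n = split(f[1], p, "/")    # p[n] = file name without directory
            print p[n] ": " $2         # $2   = date part of Start-Date
        }' | sort -u
}
```

Running "apt_upgrade_dates apt" prints one "file: date" line per upgrade run, which makes it easier to spot which log to open for a given window of dates.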

Revision history for this message
In , Johny-quest (johny-quest) wrote :

Created attachment 109507
apt-get logs to bisect changes

Added apt.zip log files;

Revision history for this message
In , Johny-quest (johny-quest) wrote :

Created attachment 109509
more logs from current GPU crash

Today I was able to reproduce this bug with:

Linux Dell-LD830 3.13.0-39-generic #66-Ubuntu SMP Tue Oct 28 13:31:23 UTC 2014 i686 i686 i686 GNU/Linux

Google Chrome 38.0.2125.111

and glmark2 installed via apt-get install:

glmark2-data:i386 (2012.08-0ubuntu2, automatic), glmark2:i386 (2012.08-0ubuntu2)

I'm attaching logs for more information.

Revision history for this message
In , Johny-quest (johny-quest) wrote :

Created attachment 109510
When crash happens (screenshot from glmark2 demo)

I was able to reproduce the error using the glmark2 demo on a Dell D830.

I also have another PC with an Intel GPU (ThinkPad T430s). I tried to reproduce the error on the ThinkPad's Ivy Bridge mobile GPU, but the bug does not occur there. I'm attaching a screenshot to show the exact moment in the demo when the crash happens (the screenshot is from the ThinkPad Ivy Bridge, not the Dell D830).

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 83423 has been marked as a duplicate of this bug. ***

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

Any news?

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

Why is Intel ignoring us?

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

We should file a class action!

Revision history for this message
In , Mattst88 (mattst88) wrote :

Good grief. Take it easy. I work for Intel.

Revision history for this message
In , shacharr (shacharr) wrote :

The glmark2 issue might be related to https://www.libreoffice.org/bugzilla/show_bug.cgi?id=85367 .
It is reproducing on my machine as well (GLmark2 ideas crashes the GPU, "render ring stuck"). Can provide the error file/log files if needed. Can't bisect, as for me the issue showed up when jumping from ubuntu 10.04 to ubuntu 14.04.
However, I'm not sure that the glmark issue and the chrome issue are related.
Matt Turner, any way I can help diagnose/resolve the issue?

Revision history for this message
In , Daniel-ffwll (daniel-ffwll) wrote :

*** Bug 85958 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Coladict (coladict) wrote :

Created attachment 109667
Error log dump

Sending my latest dump from the crash as well.
I'll give it a few tries on my home computer as well tonight.

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

I've found this topic which seems to be related to our bug:
https://bugs.archlinux.org/task/38518?project=1

It says:
"Description:
The new xf86-video-intel (2.99.907-1) causes the display to crash. Reverting to previous version (2.21.15-2) resolves the issue."

Revision history for this message
In , Alexsecret (alexsecret) wrote :

This bug exists in driver version 2.99.911 too. This is the latest one. I have installed this driver using the graphics installer offered by Intel.

I don't think there's any way for us to go back to version 2.21.15-2. There's nowhere in the Ubuntu repos.

Intel will have to correct it eventually since reverting to older drivers is not the right solution for any driver, is it? ;)

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

(In reply to Alex from comment #52)
> This bug exists in driver version 2.99.911 too. This is the latest one. I
> have installed this driver using the graphics installer offered by Intel.
>
> I don't think there's any way for us to go back to version 2.21.15-2.
> There's nowhere in the Ubuntu repos.
>
> Intel will have to correct it eventually since reverting to older drivers is
> not the right solution for any driver, is it? ;)

You're right, Alex.

I only hope that it helps to identify when the bug first happened and why. Maybe it's enough to eliminate it.

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

(In reply to Matt Turner from comment #47)
> Good grief. Take it easy. I work for Intel.

Matt, what then can you tell us about Intel's efforts to solve this issue?

Revision history for this message
In , Alexsecret (alexsecret) wrote :

Ok. There's an update for our case here.

Yesterday I installed and tested the new Chrome 39. If we re-enable hardware acceleration on it and visit the Web store, the screen does not switch off any more BUT, there is a huge BUT... The driver switches off hardware acceleration for the whole system after that and unless the system is rebooted, it stays off. This of course means that we have to keep hardware acceleration off in Chrome if we don't want to have the driver switch it off for the entire system.

It would be nice if Intel could solve this issue soon.

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

(In reply to Alex from comment #55)
> Ok. There's an update for our case here.
>
> Yesterday I installed and tested the new Chrome 39. If we re-enable
> hardware acceleration on it and visit the Web store, the screen does not
> switch off any more BUT, there is a huge BUT... The driver switches off
> hardware acceleration for the whole system after that and unless the system
> is rebooted, it stays off. This of course means that we have to keep
> hardware acceleration off in Chrome if we don't want to have the driver
> switch it off for the entire system.
>
> It would be nice if Intel could solve this issue soon.

It's up to intel then.

Revision history for this message
In , shacharr (shacharr) wrote :

I did a few tests to see if I can narrow down the cause of the issue. Here are my findings so far:

- It seems that the bug is triggered only if AccelMethod is SNA. Setting the AccelMethod to UXA seems to be hiding the issue with chrome/chromium. Issue still shows up when running glmark2 -b ideas (though this might be a different issue)

- I tried changing the Xorg/XFree driver versions. I used the freedesktop git, and went all the way back to 2.20.0, where SNA was officially introduced. Bug is reproducing there as well. I attach a small patch to make the code from 2.20.0 compile on modern Xorg version.

- Last April was when Ubuntu released their new "long term support" version. This (and its derivatives) might explain part of the spike in bug reports at that point, as a large number of people upgraded and ended up with SNA enabled by default in their distro

- The ArchLinux page on Intel graphics ( https://wiki.archlinux.org/index.php/Intel_graphics ) contains a few pointers to additional tweaking knobs to try out. I'm going to try them next when I have some free time.
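
For anyone wanting to repeat the SNA versus UXA comparison from the first point, a minimal sketch of the X configuration follows. The helper name intel_accel_conf, the Identifier string, and the target file name are assumptions; "AccelMethod" is the intel driver option referred to above.

```shell
# Sketch: print an xorg.conf.d snippet selecting the intel driver's
# acceleration method. intel_accel_conf is a hypothetical helper name.
intel_accel_conf() {
    method=${1:-uxa}    # "uxa" or "sna"
    cat <<EOF
Section "Device"
    Identifier "Intel Graphics"
    Driver     "intel"
    Option     "AccelMethod" "$method"
EndSection
EOF
}
```

The output could be saved with something like "intel_accel_conf uxa | sudo tee /etc/X11/xorg.conf.d/20-intel.conf" (file name is an assumption), followed by a restart of the X server. Whether the switch makes any difference appears to vary between machines.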

Revision history for this message
In , shacharr (shacharr) wrote :

Created attachment 109788
Compilation helper for old code versions, to assist future bisects (though it seems the issue is not bisectable)

Revision history for this message
In , Alexsecret (alexsecret) wrote :

(In reply to shachar from comment #57)
> I did few tests to see if I can narrow down the cause of the issue, here are
> my insights so far:
>
> - It seems that the bug is triggered only if AccelMethod is SNA. Setting the
> AccelMethod to UXA seems to be hiding the issue with chrome/chromium. Issue
> still shows up when running glmark2 -b ideas (though this might be a
> different issue)
>
> - I tried changing the Xorg/XFree driver versions. I used the freedesktop
> git, and went all the way back to 2.20.0, where SNA was officially
> introduced. Bug is reproducing there as well. I attach a small patch to make
> the code from 2.20.0 compile on modern Xorg version.
>
> - Last April was when Ubuntu released their new "long term support" version,
> this (and derivatives) might explain part of the spike in the bug reports at
> this point, as large number of people jumped ship to have SNA enabled by
> default in their distro
>
> - The ArchLinux page on Intel graphics (
> https://wiki.archlinux.org/index.php/Intel_graphics ) contains few pointers
> to additional tweaking knobs to try out. Going to try them next when I have
> some free time.

I'm really surprised to see that switching to UXA really helped you because that exact test was the very first I did when the bug first appeared and it didn't work at all. Actually, I didn't even have to visit the Web Store in order to produce it. I just attempted to watch a random video on Youtube and the screen went black at once. I didn't even need a second test. :)

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

I tried changing accel mode to uxa again. Surprise, it doesn't work for me. The bug persists. Chrome still crashes X11.

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

Intel must intervene as soon as possible.

Revision history for this message
In , Idr (idr) wrote :

There is some evidence (see bug #85267) that always_flush_cache=true or always_flush_batch=true may help.

always_flush_cache=true always_flush_batch=true chromium-browser

This suggests that we're missing a flush somewhere... but finding where is like finding a needle in all the haystacks. :(

Revision history for this message
In , Idr (idr) wrote :

(In reply to Ian Romanick from comment #62)
> There is some evidence (see bug #85267) that always_flush_cache=true or
                      Oops... bug #85367

> always_flush_batch=true may help.
>
> always_flush_cache=true always_flush_batch=true chromium-browser
>
> This suggests that we're missing a flush somewhere... but finding where is
> like finding a needle in all the haystacks. :(

Revision history for this message
In , Alexsecret (alexsecret) wrote :

So, are you suggesting that we create a launcher using this as command line?

always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome

or

always_flush_cache=true always_flush_batch=true glmark2

I am not sure I understand correctly. :)

Revision history for this message
In , Mattst88 (mattst88) wrote :

(In reply to Alex from comment #64)
> So, are you suggesting that we create a launcher using this as command line?
>
> always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome

Yes.

Revision history for this message
In , Alexsecret (alexsecret) wrote :

Ok. I tested all three, Chrome, glmark2 and glmark2-es2 running them from the terminal using your suggestion.

All three are working fine, the web store is loading fine, videos play ok.

I wish I knew how to include those two settings ("always_flush_cache=true always_flush_batch=true") in a launcher command line, though, since the only way I can launch Chrome this way is through a terminal, and that terminal becomes crowded with output after a while.

I think this trick can be used till the final fix is released.
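
One way to avoid the crowded terminal is a small wrapper, sketched here. The function name run_with_flush is made up; the two variables are the ones suggested earlier in this thread.

```shell
# Sketch: run any command with the two Mesa options from this thread
# set in its environment. run_with_flush is a hypothetical name.
run_with_flush() {
    always_flush_cache=true always_flush_batch=true "$@"
}

# Example (not executed here): start Chrome detached from the terminal
# and discard its console output.
#   run_with_flush /usr/bin/google-chrome-stable >/dev/null 2>&1 &
```

In a desktop launcher the same thing can usually be expressed by prefixing the command with env, for example "Exec=env always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome-stable %U". The exact launcher file and the %U argument are assumptions about a typical setup.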

Revision history for this message
In , RafaelFelipe.SL (espingardapreta) wrote :

(In reply to Alex from comment #66)
> Ok. I tested all three, Chrome, glmark2 and glmark2-es2 running them from
> the terminal using your suggestion.
>
> All three are working fine, the web store is loading fine, videos play ok.
>
> I wish I knew how to incluse those two "always_flush_cache=true
> always_flush_batch=true" in a launcher command line though since the only
> way I can launch chrome this way, is through a terminal and that terminal
> becomes so crowded with data after a while.
>
> I think this trick can be used till the final fix is released.

Yes, it really helps!!! Everything's fine with this workaround.

Revision history for this message
In , Idr (idr) wrote :

Another method is to set those options in either the user or system drirc. I'm posting this from my phone, so I'll leave finding the details as an exercise for the reader. :)
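
A possible shape for such a drirc entry is sketched below. This is untested on the hardware discussed here: the helper name print_flush_drirc is made up, and the driver name "i965" and the catch-all application name "Default" are assumptions about how Mesa identifies this chipset and matches applications.

```shell
# Sketch: print a drirc fragment that enables both options for every
# application using the i965 driver. print_flush_drirc is hypothetical.
print_flush_drirc() {
    cat <<'EOF'
<driconf>
    <device screen="0" driver="i965">
        <application name="Default">
            <option name="always_flush_cache" value="true" />
            <option name="always_flush_batch" value="true" />
        </application>
    </device>
</driconf>
EOF
}
```

Redirecting the output to ~/.drirc would overwrite any existing user settings, so back that file up first. Not everyone in this thread got the drirc route to work, so treat this as a starting point rather than a confirmed fix.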

Revision history for this message
In , Alexsecret (alexsecret) wrote :

(In reply to Ian Romanick from comment #68)
> Another method is to set those options in either the user or system drirc.
> I'm posting this from my phone, so I'll leave finding the details as an
> exercise for the reader. :)

First of all thank you for showing us this workaround. I read about drirc and found about driconf too and how we can set those options. Using this syntax though: "always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome-stable & exit" runs chrome and closes the terminal right after and I guess this seems to be a better way to handle a temporary situation than changing dri settings and then changing them back to what they were when a fix is released.

I'd like to ask what the drawbacks for these two settings are though. Are they slowing things down for the specific app that's using them for example?

Revision history for this message
In , Alexsecret (alexsecret) wrote :

Today, kernel 3.16 was officially released for 14.04 LTS in the Canonical repos. Do you think installing it would solve the problem we're facing or would we still have to use the same workaround anyway please?

Revision history for this message
In , Ryan C. Underwood (nemesis-icequake) wrote :

Don't bother, unless it contains a special patch it doesn't help (also tested with 3.17.x and 3.18-rc5)

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 86757 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Daniel-ffwll (daniel-ffwll) wrote :

Just aside to everyone suffering from gen4 gpu hangs: If your kernel doesn't manage to reset the gpu after a hang please grab the latest drm-intel-nightly branch from http://cgit.freedesktop.org/drm-intel It has fixed up gen4 gpu reset from Ville. I'll try to get this into 3.19 (but the drm subsystem merge window is kinda gone already).

Revision history for this message
In , Nikola Kovacevic (nikolak) wrote :

I can confirm that starting chrome as "always_flush_cache=true always_flush_batch=true google-chrome" mitigates the issue. I can not seem to set that option in ~/.drirc file though, neither by changing default settings or adding chrome as an application using driconf tool, so if someone could post more info on how to do that until the issue gets fixed I'd really appreciate it.

Revision history for this message
In , Nikola Kovacevic (nikolak) wrote :

(In reply to Daniel Vetter from comment #73)
> Just aside to everyone suffering from gen4 gpu hangs: If your kernel doesn't
> manage to reset the gpu after a hang please grab the latest
> drm-intel-nightly branch from http://cgit.freedesktop.org/drm-intel It has
> fixed up gen4 gpu reset from Ville. I'll try to get this into 3.19 (but the
> drm subsystem merge window is kinda gone already).

It works - as in it doesn't crash the GPU, but screen flickers (goes black quickly and then shows content again) and system becomes unresponsive until the page is finished rendering.

Revision history for this message
In , Alexsecret (alexsecret) wrote :

(In reply to nikolak from comment #75)
> (In reply to Daniel Vetter from comment #73)
> > Just aside to everyone suffering from gen4 gpu hangs: If your kernel doesn't
> > manage to reset the gpu after a hang please grab the latest
> > drm-intel-nightly branch from http://cgit.freedesktop.org/drm-intel It has
> > fixed up gen4 gpu reset from Ville. I'll try to get this into 3.19 (but the
> > drm subsystem merge window is kinda gone already).
>
> It works - as in it doesn't crash the GPU, but screen flickers (goes black
> quickly and then shows content again) and system becomes unresponsive until
> the page is finished rendering.

What you're describing is the way they've programmed the new Chrome 39 to handle the problem. This happens if you don't apply the "always_flush..." settings or if they don't work for some reason.

The point is that if the symptoms you're describing happen, it means that the driver is switching off hardware acceleration completely for the whole system and it uses software rendering for all apps afterwards. That's why you can see the contents afterwards.

You can verify this by checking the Xorg.0.log file in /var/log. You will see a line near the end of the file saying "Disabling hardware acceleration...".

Personally, I'm using this method to start Chrome on my system which is Ubuntu based. I am opening a terminal and I'm pasting the line "always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome-stable & exit". After pressing ENTER, the terminal is closed automatically and Chrome starts as usual.

Revision history for this message
In , Nikola Kovacevic (nikolak) wrote :

(In reply to Alex from comment #76)
[...]
> The point is that if the symptoms you're describing happen, it means that
> the driver is switching off hardware acceleration completely for the whole
> system and it uses software rendering for all apps afterwards. That's why
> you can see the contents afterwards.
>
> You can verify this by checking the Xorg.0.log file in /var/log. You will
> see a line near the end of the file saying "Disabling hardware
> acceleration...".

Doesn't seem to be it. Xorg.0.log doesn't show anything like that.

The only error I see is in my syslog:

Nov 28 20:13:20 mint kernel: [ 7497.992041] [drm] stuck on render ring
Nov 28 20:13:20 mint kernel: [ 7497.993305] [drm] GPU HANG: ecode 4:0:0x9f47f9fd, in chrome [4630], reason: Ring hung, action: reset
Nov 28 20:13:20 mint kernel: [ 7498.016208] drm/i915: Resetting chip after gpu hang
Nov 28 20:13:21 mint kernel: [ 7498.445210] ------------[ cut here ]------------
Nov 28 20:13:21 mint kernel: [ 7498.445258] WARNING: CPU: 1 PID: 1830 at /home/apw/COD/linux/drivers/gpu/drm/drm_irq.c:1081 drm_wait_one_vblank+0x125/0x130 [drm]()
Nov 28 20:13:21 mint kernel: [ 7498.445262] vblank not available on crtc 1, ret=-22
...etc

And if it did turn off hardware rendering it probably wouldn't happen on each page refresh like it does currently.

> Personally, I'm using this method to start Chrome on my system which is
> Ubuntu based. I am opening a terminal and I'm pasting the line
> "always_flush_cache=true always_flush_batch=true
> /usr/bin/google-chrome-stable & exit". After pressing ENTER, the terminal
> is closed automatically and Chrome starts as usual.

Starting chrome with those parameters does the trick, but it doesn't work if chrome gets started by clicking a link in an external application, for example.
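
One way to cover launches that do not go through a terminal is a small wrapper script that sets the variables itself. The following is only a sketch under assumptions: make_wrapper is a hypothetical helper name, and the destination and target paths are whatever you pass in.

```shell
#!/bin/sh
# make_wrapper DEST TARGET
# Hypothetical helper: writes an executable script DEST that sets both
# workaround variables and then replaces itself with TARGET, passing
# all arguments (such as a URL) through unchanged.
make_wrapper() {
    cat > "$1" <<EOF
#!/bin/sh
export always_flush_cache=true
export always_flush_batch=true
exec "$2" "\$@"
EOF
    chmod +x "$1"
}
```

For example, make_wrapper ~/bin/google-chrome-stable /usr/bin/google-chrome-stable, with ~/bin ahead of /usr/bin in PATH, would be one way to try it. Whether a given external application finds the browser through PATH rather than through a .desktop file is not something this thread establishes.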

Hopefully a better fix, either from Intel or Google, will make it into the official repositories.

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 87000 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Warrickguy-n (warrickguy-n) wrote :

Has anyone had success in either creating a desktop launcher or configuring .drirc to set the always_flush_cache=true and always_flush_batch=true parameters?

Revision history for this message
In , txtsd (thexerothermicsclerodermoid) wrote :

(In reply to Warren from comment #79)
> Has anyone had success in either creating a desktop launcher or configuring
> .drirc to set the always_flush_cache=true and always_flush_batch=true
> parameters?

I use this in my chrome.desktop
Exec=env always_flush_cache=true always_flush_batch=true /usr/bin/google-chrome-unstable %U

But my chrome has been failing to paint tabs, and refreshing them does nothing. I have to open a new tab and enter the same links, or duplicate the tabs.

Revision history for this message
In , Alexsecret (alexsecret) wrote :

(In reply to txtsd from comment #80)
> (In reply to Warren from comment #79)
> > Has anyone had success in either creating a desktop launcher or configuring
> > .drirc to set the always_flush_cache=true and always_flush_batch=true
> > parameters?
>
> I use this in my chrome.desktop
> Exec=env always_flush_cache=true always_flush_batch=true
> /usr/bin/google-chrome-unstable %U
>
> But my chrome has been failing to paint tabs, and refreshing them does
> nothing. I have to open a new tab and enter the same links, or duplicate the
> tabs.

This "unstable" I see in the path, wouldn't have something to do with that, would it?

Revision history for this message
In , Warrickguy-n (warrickguy-n) wrote :

(In reply to txtsd from comment #80)
> (In reply to Warren from comment #79)
> > Has anyone had success in either creating a desktop launcher or configuring
> > .drirc to set the always_flush_cache=true and always_flush_batch=true
> > parameters?
>
> I use this in my chrome.desktop
> Exec=env always_flush_cache=true always_flush_batch=true
> /usr/bin/google-chrome-unstable %U
>
> But my chrome has been failing to paint tabs, and refreshing them does
> nothing. I have to open a new tab and enter the same links, or duplicate the
> tabs.

Thank you. I am unfamiliar with the env parameter for the Exec= entry. I had tried basically the same Exec= statement as your suggestion, less the env, and it failed. After adding the env it works fine. Thanks again.
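
The likely reason env is needed: an Exec= line is not run through a shell, so a leading VAR=value word is taken as the program name instead of an assignment, whereas env is a real program that sets the variables and then runs the command. A minimal sketch of patching a launcher this way follows. The helper name add_flush_env is hypothetical, and it assumes the Exec= lines start with an absolute path.

```shell
#!/bin/sh
# add_flush_env FILE
# Hypothetical helper: prints FILE with every "Exec=/absolute/path ..."
# line rewritten to go through env with both workaround variables set.
# Lines that already start with "Exec=env" are left unchanged.
add_flush_env() {
    sed 's|^Exec=\(/.*\)|Exec=env always_flush_cache=true always_flush_batch=true \1|' "$1"
}
```

The result can be redirected into a per-user copy of the launcher, for example under ~/.local/share/applications/, so that the packaged file stays untouched. The exact launcher file name depends on the installed Chrome package.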

Revision history for this message
In , txtsd (thexerothermicsclerodermoid) wrote :

(In reply to Alex from comment #81)
> (In reply to txtsd from comment #80)
> > (In reply to Warren from comment #79)
> > > Has anyone had success in either creating a desktop launcher or configuring
> > > .drirc to set the always_flush_cache=true and always_flush_batch=true
> > > parameters?
> >
> > I use this in my chrome.desktop
> > Exec=env always_flush_cache=true always_flush_batch=true
> > /usr/bin/google-chrome-unstable %U
> >
> > But my chrome has been failing to paint tabs, and refreshing them does
> > nothing. I have to open a new tab and enter the same links, or duplicate the
> > tabs.
>
> This "unstable" I see in the path, wouldn't have something to do with that,
> would it?

Yea, I run the dev version, and that was actually a bug in the previous version.

Revision history for this message
In , Alexsecret (alexsecret) wrote :

Guys, this is a bit irrelevant but since all the driver updates and stuff are done using these servers, I thought I'd post these errors they report for the past two days during apt-get update:
--------------------------------------------------
Err https://download.01.org trusty/main amd64 Packages
  server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none
Err https://download.01.org trusty/main i386 Packages
  server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none
Ign https://download.01.org trusty/main Translation-en_US
Ign https://download.01.org trusty/main Translation-en
Fetched 738 kB in 34s (21.4 kB/s)
W: Failed to fetch https://download.01.org/gfx/ubuntu/14.04/main/dists/trusty/main/binary-amd64/Packages server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

W: Failed to fetch https://download.01.org/gfx/ubuntu/14.04/main/dists/trusty/main/binary-i386/Packages server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

E: Some index files failed to download. They have been ignored, or old ones used instead.
--------------------------------------------------
Is anyone else facing this issue please? Do we know anything about it?

Revision history for this message
In , Ryan C. Underwood (nemesis-icequake) wrote :

*** Bug 86847 has been marked as a duplicate of this bug. ***

Revision history for this message
Felix Schwarz (felix-schwarz) wrote :

This is a bug in the Intel Mesa driver. Unfortunately I have little hope that it will be fixed soon: "This suggests that we're missing a flush somewhere... but finding where is like finding a needle in all the haystacks. :(" (https://bugs.freedesktop.org/show_bug.cgi?id=80568)

Anyways, as far as Ubuntu is concerned this seems to be the same as bug 1382673, bug 1385810 and bug 1394424.

affects: xorg-server → mesa
Changed in mesa:
importance: Unknown → Medium
status: Unknown → Confirmed
Revision history for this message
In , Kenxeth (kenxeth) wrote :

Hi all. I believe this should be fixed with Mesa master - specifically, this commit:

commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b
Author: Kenneth Graunke <email address hidden>
Date: Sat Jan 17 23:21:15 2015 -0800

    i965: Work around mysterious Gen4 GPU hangs with minimal state changes.

If you're able to test with Mesa master, I'd appreciate any reports of whether this solved the problem for you. It seems to have helped for me.

Changed in mesa:
status: Confirmed → Fix Released
Revision history for this message
In , Dariuskellermann-7 (dariuskellermann-7) wrote :

Updated the mesa package to version 10.4.3 today, which includes the specified commit. Problem is now fixed for me. Thank you very much!

Revision history for this message
Andrea Bini (andrea-bini) wrote :

Thank you Felix for the attention given to my report. I'll comment about the fix that has been released directly on bugs.freedesktop.org. The issue is not solved in my case.

Revision history for this message
In , Andrea Bini (andrea-bini) wrote :

(In reply to Kenneth Graunke from comment #86)
> Hi all. I believe this should be fixed with Mesa master - specifically,
> this commit:
>
> commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b
> Author: Kenneth Graunke <email address hidden>
> Date: Sat Jan 17 23:21:15 2015 -0800
>
> i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
>
> If you're able to test with Mesa master, I'd appreciate any reports of
> whether this solved the problem for you. It seems to have helped for me.

Hi all, I'm the reporter of this bug https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/1388612 that was redirected here.

I've tested the fix on my machine updating the Xorg stack from https://launchpad.net/~xorg-edgers/+archive/ubuntu/ppa A packaged version of Mesa with the fix was released there yesterday only. Unfortunately, even though I'd like to, I'm not able to update the system by myself as soon as a change is committed. Cloning the Mesa repository, building and installing it is not enough, right? Is there a "faster" method than the one I've used?

Anyway, the glmark2 ideas benchmark works fine now but, in my case, the issue is not solved yet. An application written by myself using SDL (https://www.libsdl.org/), which before the fix used to cause the hang most of the time, now terminates due to a failing assertion in Mesa.

The most frequent is:
SDLApp: ../../../../src/mesa/vbo/vbo_exec_draw.c:222: vbo_exec_bind_arrays: Assertion `exec->vtx.bufferobj->Mappings[MAP_INTERNAL].Pointer' failed.

Sometimes this happens:
SDLApp: ../../../../src/mesa/vbo/vbo_exec_draw.c:278: vbo_exec_vtx_unmap: Assertion `exec->vtx.buffer_ptr != ((void *)0)' failed.

The application works fine on another machine with a different chip.

The billiard game FooBillard++ (http://foobillardplus.sourceforge.net/) still causes the hang immediately.
The workaround of using always_flush_cache=true and always_flush_batch=true works neither with my application nor with FooBillard++. I was only able to use it to keep the ideas benchmark running prior to the fix. I've not tested Chrome yet but I can do it if it would be useful.

Thank you Kenneth for the fix and for any further help. I'd really like to use Linux on this machine for my developments. If I can help somehow let me know.

Revision history for this message
In , Janus (ysangkok+launchpad) wrote :

Created attachment 112888
foobilliardplus showing display corruption before crash

foobilliardplus official x86_64 binaries for ubuntu 11.10 running on ubuntu 14.10 with xorg-edgers (mesa build from January 25th, 2015)

Revision history for this message
In , Janus (ysangkok+launchpad) wrote :

The issue is resolved in Chrome for me; thank you Kenneth, your patch makes it so that I do not have this issue every day on YouTube. However, foobilliardplus does make the GPU hang. See my attachment for a screenshot showing display corruption.

Can we reopen this?

Revision history for this message
In , Mattst88 (mattst88) wrote :

(In reply to Janus Troelsen from comment #90)
> The issue is resolved in Chrome for me; thank you Kenneth, your patch makes
> it so that I do not have this issue every day on YouTube. However,
> foobilliardplus does make the GPU hang. See my attachment for a screenshot
> showing display corruption.
>
> Can we reopen this?

We're up to 90 comments, and the foobillardplus crash must be different from the one you confirmed is fixed. Let's file a new bug.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xorg (Ubuntu):
status: New → Confirmed
Changed in xorg (Ubuntu):
importance: Undecided → Medium
importance: Medium → Critical
Revision history for this message
In , Jamie Jackson (jamiejackson) wrote :

What's the edgers/package-based workaround?

I tried the following (on Linux Mint 17):

sudo apt-add-repository ppa:xorg-edgers/ppa && sudo apt-get update && sudo apt-get upgrade

...and I got some new packages, but also saw that some were held back:

The following packages have been kept back:
  libegl1-mesa libgbm1 libgl1-mesa-dri:i386 libgl1-mesa-dri libqt5gui5
  libwayland-egl1-mesa libxatracker2 lxc-docker mintupdate python-cupshelpers
  system-config-printer-gnome

After restarting the system, I still get a Firefox crash or black screen when hitting http://www.google.com/chrome/ (which I think was part of the same problem--I don't think I ever had a 100% reproducible Chrome/YouTube test case).

Revision history for this message
In , Ryan C. Underwood (nemesis-icequake) wrote :

You didn't actually install the packages that include the fix. :-) Try dist-upgrade.

Timo Aaltonen (tjaalton)
affects: xorg (Ubuntu) → mesa (Ubuntu)
Changed in mesa (Ubuntu):
status: Confirmed → Fix Committed
status: Fix Committed → In Progress
importance: Critical → High
Revision history for this message
In , I544cman2000 (i544cman2000) wrote :

(In reply to Ryan Underwood from comment #93)
> You didn't actually install the packages that include the fix. :-) Try
> dist-upgrade.

Newbie here. I tried the same thing as Jamie Jackson but it keeps crashing.
I have a freshly installed Ubuntu 14.04
This happens everytime when I try to visit "google.com/chrome" with Firefox and when I try to watch a YouTube video with Google Chrome.

Revision history for this message
In , Alexsecret (alexsecret) wrote :

I installed the latest available drivers and Mesa 10.5.0 from xorg-edgers today (Jan 28, 2015) on my system and the results SO FAR and after quite extensive and repetitive tests are:

glmark2 : fixed
glmark2-es2 : fixed

Chrome
webstore : fixed
www.google.com/chrome : fixed
youtube videos : fixed

I will continue running tests though, since the new Chrome 40 seems to have screen drawing issues when I change to a different workspace and back to the one Chrome is in and every time Chrome is minimized and restored. Clicking anywhere on the desktop and back on the Chrome window refreshes it though and it goes back to normal.

Thanks anyway! ;-)

Revision history for this message
In , Ryan C. Underwood (nemesis-icequake) wrote :

> Newbie here. I tried the same thing as Jamie Jackson but it keeps crashing.
> I have a freshly installed Ubuntu 14.04

I'm not sure if this means that you have also not actually upgraded to the fixed packages as Jamie Jackson indicated. Please dpkg -l libgl1-mesa-dri and check that the installed version is equivalent to the mesa version listed on xorg-edgers: https://launchpad.net/~xorg-edgers/+archive/ubuntu/ppa

Revision history for this message
In , I544cman2000 (i544cman2000) wrote :

(In reply to Ryan Underwood from comment #96)
> I'm not sure if this means that you have also not actually upgraded to the
> fixed packages as Jamie Jackson indicated. Please dpkg -l libgl1-mesa-dri
> and check that the installed version is equivalent to the mesa version
> listed on xorg-edgers: https://launchpad.net/~xorg-edgers/+archive/ubuntu/ppa

This is what I have

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                   Version                                            Architecture  Description
ii  libgl1-mesa-dri:amd64  10.5.0~git20150127.5c83a0d2-0ubuntu0ricotz~trusty  amd64         free implementation of the OpenGL API -- DRI modules

Revision history for this message
In , Janus (ysangkok+launchpad) wrote :

Regarding the GPU reset, it is in Linux since 3.19-rc1. So if your GPU is not resetting after crashing, upgrade your kernel. Ubuntu has mainline Linux kernel dpkg packages that work well.

This is the patching commit: https://github.com/torvalds/linux/commit/656bfa3afc14e45e2d9e1624bf60d79b3beb12f2

It sounds like everyone's GPUs are resetting, so I'm wondering if maybe the Ubuntu guys backported this.

Revision history for this message
In , Jamie Jackson (jamiejackson) wrote :

(In reply to 1544c from comment #94)
> (In reply to Ryan Underwood from comment #93)
> > You didn't actually install the packages that include the fix. :-) Try
> > dist-upgrade.
>
> Newbie here. I tried the same thing as Jamie Jackson but it keeps crashing.
> I have a freshly installed Ubuntu 14.04
> This happens everytime when I try to visit "google.com/chrome" with Firefox
> and when I try to watch a YouTube video with Google Chrome.

Thanks, Ryan. I had high hopes for dist-upgrade; alas, it didn't seem to work.

Let me know if I did something wrong, or if I'm barking up the wrong tree, but here's what I got:

# show currently installed version (mine showed 10.1.3-0ubuntu0.3 for amd64 and i386)
dpkg -l libgl1-mesa-dri
# add edgers repo
sudo apt-add-repository ppa:xorg-edgers/ppa
# get the new package lists
sudo apt-get update
# install the edgers packages
sudo apt-get dist-upgrade
# show currently installed version (my packages now show 10.5.0~git20150127)
dpkg -l libgl1-mesa-dri
# reboot
sudo reboot

# try firefox test case
firefox http://www.google.com/chrome/

============= Yields BSOD, with... ================
(process:3075): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed

(firefox:3075): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::sm-connect after class was initialised

(firefox:3075): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::show-crash-dialog after class was initialised

(firefox:3075): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::display after class was initialised

(firefox:3075): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::default-icon after class was initialised
ATTENTION: default value of option force_s3tc_enable overridden by environment.
intel_do_flush_locked failed: Invalid argument
===================================================

BTW:
jamie@minty ~ $ uname -a
Linux minty 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
jamie@minty ~ $ lsb_release -d
Description: Linux Mint 17 Qiana

Revision history for this message
In , Alexsecret (alexsecret) wrote :

Jamie Jackson, having read your issue with www.google.com/chrome, I've visited the site at least 10 times using Chrome after yesterday's updates and all went just fine. Are you facing the same issue when you're using chrome or is it a firefox issue only?

Revision history for this message
In , Ryan C. Underwood (nemesis-icequake) wrote :

Jamie, your uname -a output indicates that you haven't installed a kernel that contains the other part of the fix.
Go here: http://kernel.ubuntu.com/~kernel-ppa/mainline/
Pick a kernel >= v3.19-rc1 and install the proper debs. Then make sure you choose that kernel at your bootloader.
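
Going by the reports in this thread (Mesa 10.4.3 is said to include the commit, and the GPU reset fix is said to be in Linux since 3.19-rc1), both halves can be checked with a small version comparison. This is a rough sketch that assumes GNU sort with the -V option. The helper name version_at_least is hypothetical, and the thresholds are the ones quoted by commenters, not verified independently.

```shell
#!/bin/sh
# version_at_least HAVE WANT
# Hypothetical helper: succeeds when HAVE sorts at or after WANT under
# GNU "sort -V" version ordering.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

For example, version_at_least "$(uname -r)" 3.19 checks the running kernel, and version_at_least "$(dpkg-query -W -f='${Version}' libgl1-mesa-dri:amd64)" 10.4.3 checks the installed Mesa DRI package. On Debian-based systems, dpkg --compare-versions follows the packaging rules more exactly than sort -V does. A distribution kernel with a lower version number may still carry a backported fix, so a failed check is a hint and not proof.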

Revision history for this message
In , Jamie Jackson (jamiejackson) wrote :

(In reply to Alex from comment #100)
> Jamie Jackson, having read your issue with www.google.com/chrome, I've
> visited the site at least 10 times using Chrome after yesterday's updates
> and all went just fine. Are you facing the same issue when you're using
> chrome or is it a firefox issue only?

Not sure if the Chrome/YouTube issue still persisted after that, because it was intermittent. I never had a 100% reliable Chrome test case, which is why I was using the reliable Firefox case.

(In reply to Ryan Underwood from comment #101)
> Jamie, your uname -a output indicates that you haven't installed a kernel
> that contains the other part of the fix.
> Go here: http://kernel.ubuntu.com/~kernel-ppa/mainline/
> Pick a kernel >= v3.19-rc1 and install the proper debs. Then make sure you
> choose that kernel at your bootloader.

Thanks for putting the pieces together for me, Ryan. I seem to have success now. (My package management is probably pretty weird as a result, but I'm going to ignore that.)

For the other n00bs, here was the final piece:

mkdir -p /tmp/kernel && cd /tmp/kernel
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc5-vivid/linux-image-3.19.0-031900rc5-generic_3.19.0-031900rc5.201501180935_amd64.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc5-vivid/linux-headers-3.19.0-031900rc5_3.19.0-031900rc5.201501180935_all.deb
sudo dpkg -i linux-image*
# In truth, I did this next one in GDebi, after I ran
# into a problem that "sudo apt-get install -f" fixed
sudo dpkg -i linux-headers*
sudo reboot

uname -a # yields:Linux minty 3.19.0-031900rc5-generic #201501180935 SMP Sun Jan 18 09:36:49 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
# try firefox test case
firefox http://www.google.com/chrome/

=== Yields a black blip, but it recovers! and the following output ===
(process:3038): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed

(firefox:3038): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::sm-connect after class was initialised

(firefox:3038): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::show-crash-dialog after class was initialised

(firefox:3038): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::display after class was initialised

(firefox:3038): GLib-GObject-WARNING **: Attempt to add property GnomeProgram::default-icon after class was initialised
ATTENTION: default value of option force_s3tc_enable overridden by environment.
======================================================================

Revision history for this message
In , shacharr (shacharr) wrote :

I can verify that using updated linux kernel (3.19-rc6) and libmesa (10.5.0~git20150127.5c83a0d2-0ubuntu0ricotz~trusty ), chrome and firefox do not crash the system. I tried Google Inbox in chrome, where scrolling for few minutes was crashing the system, and no crash. I tried going to youtube in google chrome, fiddled around with the videos there, and no crash.

Going in firefox to google.com/chrome/ causes a GPU hang (will attach GPU state dump soon), however the new GPU reset code works well and the system is still functional afterwards. Should I file a new bug on this crash?

--Shachar

Revision history for this message
In , shacharr (shacharr) wrote :

Created attachment 112956
/sys/class/drm/card0/error when running firefox http://www.google.com/chrome/

Relevant dmesg print:
[ 151.816215] [drm] stuck on render ring
[ 151.817277] [drm] GPU HANG: ecode 4:0:0xf41b8c79, in firefox [2491], reason: Ring hung, action: reset
[ 151.817279] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 151.817281] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 151.817282] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 151.817284] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 151.817286] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 152.168294] drm/i915: Resetting chip after gpu hang
[ 152.201301] ------------[ cut here ]------------
[ 152.201383] WARNING: CPU: 0 PID: 86 at /home/kernel/COD/linux/drivers/gpu/drm/i915/intel_sdvo.c:1424 intel_sdvo_get_config+0x201/0x220 [i915]()
[ 152.201388] SDVO pixel multiplier mismatch, port: 0, encoder: 1
[ 152.201391] Modules linked in: bnep rfcomm dm_crypt snd_hda_codec_idt snd_hda_codec_hdmi snd_hda_codec_generic wl(POE) snd_hda_intel gpio_ich snd_hda_controller uvcvideo snd_hda_codec videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common dell_wmi snd_hwdep snd_pcm videodev sparse_keymap dell_laptop dcdbas i8k snd_seq_midi snd_seq_midi_event snd_rawmidi joydev media serio_raw coretemp r852 snd_seq sm_common nand snd_seq_device nand_ecc btusb nand_bch bch snd_timer nand_ids cfg80211 bluetooth snd mtd soundcore r592 memstick lpc_ich mac_hid binfmt_misc parport_pc ppdev lp parport btrfs xor raid6_pq hid_generic usbhid hid psmouse sdhci_pci sdhci firewire_ohci firewire_core ahci crc_itu_t libahci i915 sky2 i2c_algo_bit drm_kms_helper video wmi drm
[ 152.201502] CPU: 0 PID: 86 Comm: kworker/0:2 Tainted: P U W OE 3.19.0-031900rc6-generic #201501261152
[ 152.201507] Hardware name: Dell Inc. Inspiron 1525 /0U990C, BIOS A13 06/27/2008
[ 152.201552] Workqueue: events i915_error_work_func [i915]
[ 152.201557] 0000000000000590 ffff8800b5fabb98 ffffffff817c4584 0000000000000007
[ 152.201565] ffff8800b5fabbe8 ffff8800b5fabbd8 ffffffff81076df7 ffff8800b5fabc08
[ 152.201571] ffff880036368710 ffff88003596a000 0000000000000000 0000000000000001
[ 152.201578] Call Trace:
[ 152.201591] [<ffffffff817c4584>] dump_stack+0x45/0x57
[ 152.201600] [<ffffffff81076df7>] warn_slowpath_common+0x97/0xe0
[ 152.201607] [<ffffffff81076ef6>] warn_slowpath_fmt+0x46/0x50
[ 152.201665] [<ffffffffc02195cf>] ? intel_sdvo_get_value+0x3f/0x60 [i915]
[ 152.201723] [<ffffffffc021ab21>] intel_sdvo_get_config+0x201/0x220 [i915]
[ 152.201776] [<ffffffffc01d4d9e>] intel_modeset_readout_hw_state+0x2ae/0x450 [i915]
[ 152.201830] [<ffffffffc01eeabe>] intel_modeset_setup_hw_state+0x2e/0x3c0 [i915]
[ 152.201883] [<ffffffffc01ef320>] intel_finish_reset+0x160/0x1b0 [i915]
[ 152.201931] [<ffffffffc01b482f>] i915_error_work_func+0xdf/0x150 [i915]
[ 152.201945] [<ffffffff8108f6dd>] process_one_work+0x14d/0x460
[ 152.201952] [<ffffffff810900bb>] worker_thread+0x11b/0...


Revision history for this message
In , I544cman2000 (i544cman2000) wrote :

Thanks Jamie Jackson, I haven't experienced any crashes since I applied the update.

Revision history for this message
In , txtsd (thexerothermicsclerodermoid) wrote :

(In reply to Kenneth Graunke from comment #86)
> Hi all. I believe this should be fixed with Mesa master - specifically,
> this commit:
>
> commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b
> Author: Kenneth Graunke <email address hidden>
> Date: Sat Jan 17 23:21:15 2015 -0800
>
> i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
>
> If you're able to test with Mesa master, I'd appreciate any reports of
> whether this solved the problem for you. It seems to have helped for me.

Hi.
On Archlinux 3.18.4-1-ARCH, with mesa 10.4.3-1, the crashes caused by chrome have gone away. However, visiting google.com/chrome on firefox now causes the same crash that chrome used to cause.

[179870.322075] [drm] stuck on render ring
[179870.323089] [drm] GPU HANG: ecode 0:0x7f64fafd, in firefox [14619], reason: Ring hung, action: reset
[179870.323092] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[179870.323093] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[179870.323095] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[179870.323097] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[179870.323098] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[179870.323548] [drm:i915_reset] *ERROR* Failed to reset chip: -19
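
When comparing reports like these, it can help to pull out which process the kernel blamed for each hang. A minimal sketch, assuming the "GPU HANG: ..., in NAME [PID], reason: ..." message format shown in the logs in this thread. The helper name hang_processes is hypothetical.

```shell
#!/bin/sh
# hang_processes
# Hypothetical helper: reads kernel log lines on stdin and prints the
# process name from each "GPU HANG" line, one name per line.
hang_processes() {
    sed -n 's/.*GPU HANG: .*, in \([^ ]*\) \[[0-9]*\],.*/\1/p'
}
```

For example, dmesg | hang_processes | sort | uniq -c counts hangs per process. As the kernel message itself says, the crash dump to attach to a new report is saved at /sys/class/drm/card0/error.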

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 88881 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 87088 has been marked as a duplicate of this bug. ***

Revision history for this message
In , GSJ (gsj5) wrote :

Hi,

After searching half a day for a solution to the black screen of death issue, I finally ended up at this thread. I am happy to report the problem on Lenovo T61 with Intel 965GM graphic card on a Ubuntu 14.04 LTS is fixed by following Jamie's steps listed in comment #99. It does need an updated kernel, mine is 3.13.0-46-generic #79-ubuntu.

I would appreciate it if someone with more knowledge than I have would propagate this solution to the 50+ sites which discuss the black screen problem but offer no viable solution!

Thank you very much for your dedication to help resolve this issue.

Revision history for this message
In , Alexsecret (alexsecret) wrote :

After all these days I've been using the latest drivers and stuff offered by xorg-edgers and everything seemed to be running fine, tonight it happened again. It seems that if I visit www.google.com/chrome using the first tab on Chrome, everything is fine. Tonight I needed to download Chrome for another computer and I visited that page at a moment when several other apps and seven other tabs were already open on Chrome, thus creating an eighth tab. That caused an immediate GPU crash and the screen went off again.

This has become very annoying. I need to work without these issues guys. :(

Revision history for this message
In , Janus (ysangkok+launchpad) wrote :

Did you upgrade your kernel Alex? If you did, the GPU should be able to properly reset.

Revision history for this message
In , Alexsecret (alexsecret) wrote :

(In reply to Janus Troelsen from comment #111)
> Did you upgrade your kernel Alex? If you did, the GPU should be able to
> properly reset.

The last official kernel released for *ubuntus 14.04 is 3.16 which is no good for this issue. If I recall correctly the one needed is 3.19 and in this case it has to be downloaded from the mainline site and they do not recommend that.

I don't know whether I should trust it or not.

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 89249 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 88281 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 88195 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 87770 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 87723 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 89489 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 87550 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 86972 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 87089 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 86937 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 84803 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 86721 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Mattst88 (mattst88) wrote :

*** Bug 89341 has been marked as a duplicate of this bug. ***

Revision history for this message
penalvch (penalvch) wrote :

Andrea Bini, thank you for reporting this and helping make Ubuntu better.

As per https://wiki.ubuntu.com/Releases, Ubuntu 14.10 reached EOL on July 23, 2015.

Is this reproducible on a supported release?

tags: added: latest-bios-a15
Changed in mesa (Ubuntu):
importance: High → Medium
status: In Progress → Incomplete
Revision history for this message
Andrea Bini (andrea-bini) wrote :

Yes, I've upgraded the system to version 15.10 and installed all the updates. The glmark2 "ideas" benchmark now works, but FooBillard++ still triggers the lockup almost immediately. What really changed is that a reboot from another virtual console is no longer needed. After the lockup, the GPU restarts automatically in a few seconds, the graphical environment continues to work fine and the application terminates, reporting the error: "intel_do_flush_locked failed: Input/output error". I'd like to find other applications, officially distributed with Ubuntu, that cause this issue. I'll do it as soon as I can.

Thanks for your attention!

Revision history for this message
penalvch (penalvch) wrote :

Andrea Bini, to see if this is already resolved, could you please test http://cdimage.ubuntu.com/daily-live/current/ and advise on the results?

tags: added: wily
Revision history for this message
Andrea Bini (andrea-bini) wrote :

Done, same result with FooBillard++ except that the graphical environment becomes unusable; you can only move the mouse cursor. If you switch to another virtual console and restart the lightdm service from there, you come back to the login screen and you don't have to reboot. I've tried OpenArena, Warzone 2100, two WebGL tests found on the web and all glmark2 tests again on Ubuntu 15.10; they all work fine. It seems to be a very specific problem. Can you suggest an application that performs a comprehensive GPU test, or something else that would be interesting to try?

Revision history for this message
penalvch (penalvch) wrote :

Andrea Bini, when the graphical environment becomes unusable, is there a crash file in the /var/crash folder?

tags: added: xenial
Revision history for this message
Andrea Bini (andrea-bini) wrote :

No, it's empty.
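
The check asked for above can also be scripted, for example when collecting information over SSH while the desktop is unusable. This is a minimal sketch that assumes crash reports appear as `*.crash` files directly inside `/var/crash`; the function name is hypothetical.

```python
from pathlib import Path


def list_crash_reports(crash_dir="/var/crash"):
    """Return the sorted names of *.crash files in crash_dir.

    Returns an empty list if the directory does not exist.
    """
    directory = Path(crash_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.crash"))
```

An empty list matches the answer given above: nothing was written to the folder for this lockup.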

Revision history for this message
penalvch (penalvch) wrote :

Andrea Bini, it wouldn't hurt to file a net new report upstream for this (unless one of the many duplicates of #80568 was one you personally reported). Could you please advise?

Changed in mesa (Ubuntu):
status: Incomplete → Triaged
Revision history for this message
Andrea Bini (andrea-bini) wrote :

Christopher M. Penalver, do you mean filing a new bug report at freedesktop.org? I haven't reported any of the duplicates of #80568; the latter was linked to this bug by Felix Schwarz, and then many other bugs were marked as duplicates of it on freedesktop.org. Matt Turner already suggested filing a new bug report for FooBillard++ (https://bugs.freedesktop.org/show_bug.cgi?id=80568#c91), but it seems that no one has done it. I've found this (https://bugs.freedesktop.org/show_bug.cgi?id=72172) regarding FooBillard++ on GM45, which may be relevant. I own the same chip on another computer and I can confirm the same behaviour on Ubuntu 16.04. The graphical glitches that happen during the game are very similar to the ones shown on GM965, but the GPU doesn't hang and the game keeps running. Maybe before reporting it would be better to find other applications that reveal the bug. What do you think? Would it help or create more confusion?

Revision history for this message
penalvch (penalvch) wrote :

Andrea Bini, both upstream and downstream prefer that folks file separate reports, as it's cheap to mark duplicates after it's been confirmed, which helps the most and reduces confusion.

The other folks you mentioned, while it's great they have contributed in their way, they haven't reviewed your report personally, so any belief prior to this would be premature.

Revision history for this message
penalvch (penalvch) wrote :

Andrea Bini, in response to your email, please feel free to file the report following upstream's instructions via https://01.org/linuxgraphics/documentation/how-report-bugs .
