[regression] 3.5.0-26-generic and 3.2.0-39-generic GPU hangs on Sandybridge
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| DRI | Won't Fix | Medium | | |
| Mesa | Unknown | Unknown | | |
| linux (Debian) | New | Unknown | | |
| linux (Fedora) | Won't Fix | Undecided | | |
| linux (Ubuntu) | | Critical | Unassigned | |
| Precise | | Critical | Unassigned | |
| Quantal | | Critical | Unassigned | |
| Raring | | Critical | Unassigned | |
| linux-lts-quantal (Ubuntu) | | Critical | Unassigned | |
| Precise | | Critical | Unassigned | |
| Quantal | | Critical | Unassigned | |
| Raring | | Critical | Unassigned | |
| linux-lts-raring (Ubuntu) | | Critical | Unassigned | |
| Precise | | Critical | Unassigned | |
| mesa (Ubuntu) | | Critical | Unassigned | |
| Precise | | Critical | Unassigned | |
Bug Description
I'm getting errors about GPU hangs every minute or so (usually only when using Firefox and scrolling a web page). I also get an annoying Ubuntu dialog saying there is a "system error".
This didn't happen with 3.5.0-24-generic.
Here is the dmesg:
[15169.033709] [drm:i915_
[15169.034517] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[15628.480216] [drm:i915_
[15628.480570] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[15844.231372] [drm:i915_
[15844.231773] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[20173.232593] [drm:i915_
[20173.233211] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[26285.650393] [drm:i915_
[26285.650980] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[26285.658405] ------------[ cut here ]------------
[26285.658472] WARNING: at /build/
[26285.658474] Hardware name: SATELLITE Z830
[26285.658476] Modules linked in: sdhci_pci sdhci btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs reiserfs ext2 snd_hda_codec_hdmi snd_hda_
[26285.658537] Pid: 23433, comm: kworker/u:0 Not tainted 3.5.0-26-generic #40-Ubuntu
[26285.658539] Call Trace:
[26285.658549] [<ffffffff81051
[26285.658553] [<ffffffff81051
[26285.658569] [<ffffffffa02d3
[26285.658584] [<ffffffffa02bf
[26285.658595] [<ffffffffa0295
[26285.658601] [<ffffffff81012
[26285.658612] [<ffffffffa029a
[26285.658618] [<ffffffff81070
[26285.658629] [<ffffffffa029a
[26285.658632] [<ffffffff81071
[26285.658636] [<ffffffff81071
[26285.658640] [<ffffffff81076
[26285.658644] [<ffffffff8168a
[26285.658649] [<ffffffff81075
[26285.658652] [<ffffffff8168a
[26285.658654] ---[ end trace 59c6162fdfcbffee ]---
[26756.021167] [drm:i915_
[26756.021426] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[26766.014093] [drm:i915_
[26766.014397] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[26932.376233] [drm:i915_
[26932.376544] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[26932.384285] ------------[ cut here ]------------
[26932.384354] WARNING: at /build/
[26932.384356] Hardware name: SATELLITE Z830
[26932.384358] Modules linked in: sdhci_pci sdhci btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs reiserfs ext2 snd_hda_codec_hdmi snd_hda_
[26932.384421] Pid: 24262, comm: kworker/u:2 Tainted: G W 3.5.0-26-generic #40-Ubuntu
[26932.384422] Call Trace:
[26932.384431] [<ffffffff81051
[26932.384436] [<ffffffff81051
[26932.384451] [<ffffffffa02d3
[26932.384466] [<ffffffffa02bf
[26932.384476] [<ffffffffa0295
[26932.384482] [<ffffffff81012
[26932.384493] [<ffffffffa029a
[26932.384500] [<ffffffff81070
[26932.384511] [<ffffffffa029a
[26932.384514] [<ffffffff81071
[26932.384517] [<ffffffff81071
[26932.384521] [<ffffffff81076
[26932.384526] [<ffffffff8168a
[26932.384531] [<ffffffff81075
[26932.384534] [<ffffffff8168a
[26932.384536] ---[ end trace 59c6162fdfcbffef ]---
ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: linux-image-
ProcVersionSign
Uname: Linux 3.5.0-26-generic x86_64
ApportVersion: 2.6.1-0ubuntu10
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/
CheckboxSubmission: f8b82cd9bc23fe0
CheckboxSystem: b1865df84255b87
Date: Sat Mar 2 22:25:14 2013
HibernationDevice: RESUME=
InstallationDate: Installed on 2012-04-26 (310 days ago)
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release amd64 (20120425)
MachineType: TOSHIBA SATELLITE Z830
MarkForUpload: True
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=
RelatedPackageV
linux-
linux-
linux-firmware 1.95
SourcePackage: linux
UpgradeStatus: Upgraded to quantal on 2012-10-28 (125 days ago)
dmi.bios.date: 07/31/2012
dmi.bios.vendor: TOSHIBA
dmi.bios.version: Version 1.70
dmi.board.
dmi.board.name: Portable PC
dmi.board.vendor: TOSHIBA
dmi.board.version: Version A0
dmi.chassis.
dmi.chassis.type: 10
dmi.chassis.vendor: TOSHIBA
dmi.chassis.
dmi.modalias: dmi:bvnTOSHIBA:
dmi.product.name: SATELLITE Z830
dmi.product.
dmi.sys.vendor: TOSHIBA
| luca (llucax) wrote : | #1 |
| Changed in linux (Ubuntu): | |
| status: | New → Confirmed |
Another note: when these hangs happen, I get graphics corruption (usually in the fonts/text).
| luca (llucax) wrote : | #4 |
kernel 3.5.0-25-generic also seems to work fine.
| Changed in linux (Ubuntu): | |
| importance: | Undecided → Medium |
| tags: | added: regression-update |
| luca (llucax) wrote : | #5 |
After I resumed from suspend with kernel 3.5.0-25-generic, I again got the annoying dialogs saying that a GPU hang was detected and asking me to report a bug (I have no idea where that report goes), but looking at dmesg I can't see anything strange [1]. How can I find out why those dialogs are being opened, to see if there is something wrong?
I had the annoying dialog several times in a very short period of time, like 10 times in about 5 minutes, and then it stopped. After that I suspended and resumed my laptop a couple of times and it hasn't happened again so far.
[1] Except for messages like these, but I have been getting them since I bought this computer about a year ago and never had those annoying dialogs about any GPU hang:
[52682.020386] CPU1: Package power limit notification (total events = 5770)
[52682.020389] CPU3: Package power limit notification (total events = 5769)
[52682.020391] CPU2: Package power limit notification (total events = 5761)
[52682.020393] CPU0: Package power limit notification (total events = 5746)
[52682.021517] CPU3: Package power limit normal
[52682.021520] CPU1: Package power limit normal
[52682.021521] CPU2: Package power limit normal
[52682.021526] CPU0: Package power limit normal
| luca (llucax) wrote : | #6 |
Also, I couldn't see any graphics corruption this last time with kernel 3.5.0-25.
| Craig McQueen (cmcqueen1975) wrote : | #7 |
This affects me, but in my case I'm running Ubuntu 12.04, and the problem seems to be with kernel 3.2.0-39. Booting to kernel 3.2.0-38 seems to have fixed it.
| Hans (old-man999) wrote : | #9 |
Sorry, this is more related to my case: https:/
| luca (llucax) wrote : | #10 |
Seems to be fixed in linux-image-
| luca (llucax) wrote : | #11 |
Nope, still getting it with linux-image-
[32861.907463] [drm:i915_
[32861.907470] [drm] capturing error event; look for more information in /debug/
[32861.911988] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
...
[39199.903510] [drm:i915_
[39199.903846] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
Same here: the previous kernel, 3.5.0-25-generic, works without problems; 3.5.0-26.42 hung just now:
$ dmesg|grep i915
[ 0.000000] Command line: BOOT_IMAGE=
[ 0.000000] Kernel command line: BOOT_IMAGE=
[ 1.667363] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1.667367] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1.667652] i915 0000:00:02.0: setting latency timer to 64
[ 1.687950] i915 0000:00:02.0: irq 44 for MSI/MSI-X
[ 2.429882] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[ 330.684154] i915 0000:00:02.0: power state changed by ACPI to D3
[ 331.826825] i915 0000:00:02.0: power state changed by ACPI to D0
[ 331.826829] i915 0000:00:02.0: power state changed by ACPI to D0
[ 331.826830] i915 0000:00:02.0: setting latency timer to 64
[ 1677.075872] [drm:i915_
[ 1677.075876] [drm] capturing error event; look for more information in /debug/
Content of i915_error_state attached.
Will disable rc6 and test what happens then.
With rc6 off the hangup happened 2 minutes after booting:
$ dmesg|grep i915
[ 0.000000] Command line: BOOT_IMAGE=
[ 0.000000] Kernel command line: BOOT_IMAGE=
[ 0.857239] i915 0000:00:02.0: power state changed by ACPI to D0
[ 0.857242] i915 0000:00:02.0: power state changed by ACPI to D0
[ 0.857458] i915 0000:00:02.0: setting latency timer to 64
[ 0.877771] i915 0000:00:02.0: irq 44 for MSI/MSI-X
[ 1.619983] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[ 128.787009] [drm:i915_
[ 128.787013] [drm] capturing error event; look for more information in /debug/
[ 254.699283] [drm:i915_
Seems it's time to return to 3.5.0-25-generic.
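To compare how quickly the hangs recur between kernels or settings, the timestamps of the hang lines can be pulled out of saved dmesg output. The following is a minimal sketch, not part of the original reports. It assumes that every line whose message starts with `[drm:i915_` is a hang report, because those messages are truncated in this thread; the function names are hypothetical, and the sample is the rc6-off excerpt above.

```python
import re

# Matches the "[ seconds.micros]" prefix that dmesg puts on each line.
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def hang_times(dmesg_text, marker="[drm:i915_"):
    """Return seconds-since-boot for lines whose message starts with marker.

    The marker is an assumption: the full message text is cut off in the
    logs pasted in this thread, so only the visible prefix is matched.
    """
    times = []
    for line in dmesg_text.splitlines():
        m = TS_RE.match(line.strip())
        if m and m.group(2).startswith(marker):
            times.append(float(m.group(1)))
    return times


def intervals(times):
    """Return the gaps (in seconds) between consecutive events."""
    return [round(b - a, 6) for a, b in zip(times, times[1:])]


# Excerpt copied from the rc6-off boot reported above.
SAMPLE = """\
[ 1.619983] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[ 128.787009] [drm:i915_
[ 128.787013] [drm] capturing error event; look for more information in /debug/
[ 254.699283] [drm:i915_
"""

if __name__ == "__main__":
    t = hang_times(SAMPLE)
    print("hang timestamps:", t)
    print("seconds between hangs:", intervals(t))
```

Feeding it the output of `dmesg` saved to a file gives a rough hang frequency per boot, which makes reports such as "3 hangs in 5 hours" easier to compare.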
| Laurent (l-perlat) wrote : | #14 |
Same problem here :
Visual corruptions + "GPU hang" error when scrolling in Firefox with 3.5.0-26.
Everything back to normal on 3.5.0-25 (Linux 3.5.0-25-generic #39-Ubuntu SMP Mon Feb 25 18:26:58 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux)
| tobyS (tobias-schlitt) wrote : | #15 |
Besides visual corruption and hangs, I also experience complete system hang-ups (no reaction until a hard reboot) and occasional kernel panics. I therefore wonder why this report does not receive a higher priority.
| summary: |
- [regression] 3.5.0-26-generic CPU hangs
+ [regression] 3.5.0-26-generic GPU hangs |
Same issue here. Happens very often during Firefox usage, but also on other occasions.
[51278.392895] [drm:i915_
[51278.392901] [drm] capturing error event; look for more information in /debug/
[51278.397785] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
| shuerhaaken (shkn) wrote : | #17 |
This is really getting annoying, is anybody taking care of this?
| czigor (czigor) wrote : | #18 |
@shkn:
Using 3.5.0-25-generic made my PC usable again. I get an error message only at login.
| Timo Aaltonen (tjaalton) wrote : | #19 |
it's one of these commits (from the quantal kernel), likely the top one since it's happening on sandybridge:
817e8fdee14b05d drm/i915: Implement WaDisableHiZPla
4c443ec9afe7f6f drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits
f534135423c7028 drm/i915: Disable AsyncFlip performance optimisations
c0c1fd8a18479f0 drm/i915: Invalidate the relocation presumed_offsets along the slow path
| Changed in linux (Ubuntu): | |
| assignee: | nobody → Ubuntu Kernel Team (ubuntu-kernel-team) |
| importance: | Medium → Critical |
| Changed in linux (Ubuntu Quantal): | |
| importance: | Undecided → Critical |
| status: | New → Confirmed |
| Changed in linux (Ubuntu Precise): | |
| importance: | Undecided → Critical |
| status: | New → Confirmed |
| Timo Aaltonen (tjaalton) wrote : | #20 |
note that I'm not sure it's affecting raring, maybe not.
| Changed in linux (Ubuntu Precise): | |
| status: | Confirmed → Invalid |
| Changed in linux-lts-quantal (Ubuntu Precise): | |
| status: | New → Confirmed |
| Changed in linux-lts-quantal (Ubuntu Quantal): | |
| status: | New → Invalid |
| Changed in linux-lts-quantal (Ubuntu Raring): | |
| status: | New → Invalid |
| Changed in linux-lts-quantal (Ubuntu Precise): | |
| importance: | Undecided → Critical |
| Changed in linux (Ubuntu Precise): | |
| importance: | Critical → Undecided |
| tags: | added: performing-bisect |
| Joseph Salisbury (jsalisbury) wrote : | #21 |
I'd like to perform a kernel bisect to identify the exact commit that introduced this regression. However, it would be good to test the latest mainline and a test kernel with commit 817e8fdee14b05d reverted.
The latest mainline kernel can be downloaded from:
http://
Can folks affected by this bug test the v3.9-rc3 kernel?
One thing to note, you will need to install both the linux-image and linux-image-extra .deb packages.
I will also build a Quantal test kernel with commit 817e8fdee14b05d reverted and post a link shortly.
Thanks in advance!
| Joseph Salisbury (jsalisbury) wrote : | #22 |
I built a Quantal test kernel with commit 817e8fdee14b05d reverted. The kernel can be downloaded from:
http://
Can folks affected by this bug test this kernel and report back if it fixes the issue?
| Gard Spreemann (gspreemann) wrote : | #23 |
@jsalisbury: I could not successfully test the kernel you linked to in comment #22, as it rendered my system unusable. X started at 640x480, there was no working keyboard/mouse, and I could not SSH in.
@jsalisbury: Thanks for pointing to this kernel version. I have been able to successfully test kernel v3.9-rc3 on Precise 12.04.2 (I am running with quantal-lts xorg stack).
I have been running on it for ~ 3 hours so far with success. It usually took some time for me to hit this bug on my Dell V131, so some more testing might be required.
I will report back if I hit the bug again. So far so good.
| luca (llucax) wrote : | #25 |
Also initial success for now. Still getting the annoying dialog at startup though (but no signs of GPU hangs in dmesg).
@jsalisbury: I tested your kernel 3.5.0-27-generic #45~lp1140716v1 (from comment #22), it was no improvement for my system. I got two hangups within the first hour (one S3 cycle at 1985), the second one forced me to turn off the system:
[ 0.000000] Command line: BOOT_IMAGE=
[ 0.000000] Kernel command line: BOOT_IMAGE=
[ 0.804805] i915 0000:00:02.0: power state changed by ACPI to D0
[ 0.804809] i915 0000:00:02.0: power state changed by ACPI to D0
[ 0.805030] i915 0000:00:02.0: setting latency timer to 64
[ 0.824988] i915 0000:00:02.0: irq 43 for MSI/MSI-X
[ 1.563280] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[ 1894.853449] i915 0000:00:02.0: power state changed by ACPI to D3
[ 1896.202702] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1896.202708] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1896.202720] i915 0000:00:02.0: setting latency timer to 64
[ 1984.429241] i915 0000:00:02.0: power state changed by ACPI to D3
[ 1985.767157] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1985.767160] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1985.767168] i915 0000:00:02.0: setting latency timer to 64
[ 2132.278551] [drm:i915_
[ 2132.278555] [drm] capturing error event; look for more information in /debug/
[ 3504.895781] [drm:i915_
| luca (llucax) wrote : | #27 |
torsten, maybe you are having a different issue; note that your hang doesn't look related to the RC6 state.
[51278.397785] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
BTW, my system is still surviving without hangs with the patched 3.5 kernel.
| luca (llucax) wrote : | #28 |
I just had a burst of dialogs informing me of non-existent GPU hangs (with the patched 3.5 kernel). The GPU hangs are not reported in dmesg though, so I don't know where it is getting them from. Also no corruption or anything. It seems the dialog madness starts when an unrelated program crashes. Maybe it is just an apport bug? How should I proceed to see what's really going on?
@jsalisbury: I've been running your 3.5.0-27-generic #45~lp1140716v1 for 5 hours and I've already had 3 hangs. No improvement here.
[ 5733.121323] [drm:i915_
[ 5733.121330] [drm] capturing error event; look for more information in /debug/
[ 5733.124957] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
| luca (llucax) wrote : | #30 |
OK, it took a while but I got the GPU hang finally with kernel3.
[22344.085044] [drm:i915_
[22344.085051] [drm] capturing error event; look for more information in /debug/
[22344.090106] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
| luca (llucax) wrote : | #31 |
Always happens with Firefox, and only with certain sites (consistently).
| luca (llucax) wrote : | #32 |
I got this with a second hang:
[22344.085044] [drm:i915_
[22344.085051] [drm] capturing error event; look for more information in /debug/
[22344.090106] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[23652.138382] [drm:i915_
[23652.138898] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[23652.146420] ------------[ cut here ]------------
[23652.146491] WARNING: at /home/jsalisbur
[23652.146495] Hardware name: SATELLITE Z830
[23652.146497] Modules linked in: sdhci_pci sdhci snd_hda_codec_hdmi snd_hda_
[23652.146578] Pid: 3451, comm: kworker/u:0 Not tainted 3.5.0-27-generic #45~lp1140716v1
[23652.146581] Call Trace:
[23652.146592] [<ffffffff81051
[23652.146599] [<ffffffff81051
[23652.146621] [<ffffffffa03f6
[23652.146640] [<ffffffffa03e2
[23652.146655] [<ffffffffa03b8
[23652.146663] [<ffffffff81012
[23652.146679] [<ffffffffa03bd
[23652.146688] [<ffffffff81070
[23652.146701] [<ffffffffa03bd
[23652.146707] [<ffffffff81071
[23652.146712] [<ffffffff81071
[23652.146719] [<ffffffff81076
[23652.146726] [<ffffffff8168a
[23652.146732] [<ffffffff81075
[23652.146737] [<ffffffff8168a
[23652.146740] ---[ end trace 2153106cc632835c ]---
| Gard Spreemann (gspreemann) wrote : | #33 |
I'm confused as to where the commits referenced by tjaalton in comment #19 live, but for what it's worth, I seem to have a stable system after applying reverse diffs of the following commits from the linux-3.5.y branch of git://kernel.
2964148 - drm/i915: Implement WaDisableHiZPla
899b550 - drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits
Just reverting the first, or using jsalisbury's kernel from comment #22 (ignore my comment #23, I was being an idiot and forgot the modules) gives me a GPU hang and/or graphics corruption within minutes, especially quickly if opening Firefox. After reverting both of the above, I haven't been able to hang the system yet.
Kernel 3.9.0-030900rc3
@jsalisbury: After a few days of use and many suspend-resume cycles, I am yet to encounter a problem with kernel 3.9-rc3 (as indicated in comment #21). No problems whatsoever on my Dell v131 (i5 Sandybridge)...
| Peter Saunderson (peteasa) wrote : | #36 |
I got this a lot with kernel 3.5.0-26-generic and used a quick workaround to avoid the problem: http://
SNA does not seem to have the same issue, just UXA. If I have time I can try a new kernel, but I have spent so much time on this already that it may be a few days before I get to it.
| Max Rameau (afrimax-e) wrote : | #37 |
I had the problem right after updating (not upgrading) on Saturday using 12.04.
I was able to control it by logging into 2D and immediately opening the System Monitor and shutting down the three instances of Ubuntu One (login, synch and launch), because the machine would freeze upon login to Ubuntu One. I then had to shut down zeitgeist-fts, because that would start eating up resources (up to 300 MB of memory at one point).
At that point, I just decided to reinstall with 12.10. I did that and it worked fine for an hour, so I started transferring my backed-up files and logged into Ubuntu One while running the updates. The problems began immediately, including resource use going up to 100% for long periods of time, mainly through the multiplication of the gkts (?) service. It only used 3.8 MB at a time, but at one point there were 20 instances of it open. I concluded it was Ubuntu One causing the problem, so I reinstalled again, this time not logging into Ubuntu One. No problems for 4 hours, even as I installed software. Then I ran the automatic software update, and the problems began again immediately.
Constant crashing, crazy graphic corruption and other issues. Ran the system log and got the similar error:
kernel [224.243459] [drm: Enable RC6 States: RC6 off, RC6p off, RC6p off]
kernel [246.465377] [drm: i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed GPU hung
etc., etc.
This is a nightmare. Need a fix.
| Matthew Eaton (powder) wrote : | #38 |
Test kernel did not fix the issue for me.
Linux matt-work 3.5.0-27-generic #45~lp1140716v1 SMP Fri Mar 22 15:50:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Mar 25 08:12:15 matt-work kernel: [ 158.302349] [drm:i915_
Mar 25 08:12:15 matt-work kernel: [ 158.302353] [drm] capturing error event; look for more information in /debug/
Mar 25 08:12:15 matt-work kernel: [ 158.305230] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
Mar 25 08:12:36 matt-work kernel: [ 179.663557] [drm:i915_
Mar 25 08:12:36 matt-work kernel: [ 179.663780] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
| Joseph Salisbury (jsalisbury) wrote : | #39 |
Thanks, everyone for testing. So it sounds like my test kernel did not fix this bug. However, it sounds like this bug is fixed in the v3.9 mainline kernel, at least in rc3.
I can perform a "Reverse" kernel bisect to identify the commit that fixes this bug. It will first require us to identify the first v3.9 release candidate that does not exhibit this bug.
We know that it is fixed in rc3, so it would be good to test rc1 and rc2. Can folks affected by this bug test those two release candidates:
v3.9-rc1: http://
v3.9-rc2: http://
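The "reverse" bisect described above is an ordinary binary search with the roles swapped: instead of looking for the first bad build, it looks for the first build in which the bug no longer appears. The sketch below illustrates the idea only and is not the procedure the kernel team used. It assumes the history is monotonic (once fixed, it stays fixed); the commit labels and the `is_fixed` callback are hypothetical stand-ins for building and testing a kernel.

```python
def first_fixed(versions, is_fixed):
    """Return the earliest entry of versions for which is_fixed() is true.

    versions must be ordered oldest to newest, and the search assumes the
    bug never comes back after the fix. Returns None if even the newest
    entry still shows the bug.
    """
    lo, hi = 0, len(versions) - 1
    if hi < 0 or not is_fixed(versions[hi]):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if is_fixed(versions[mid]):
            hi = mid          # fix is at mid or earlier
        else:
            lo = mid + 1      # fix is after mid
    return versions[lo]


if __name__ == "__main__":
    # Hypothetical history: eight commits, with the fix landing in commit-5.
    commits = [f"commit-{i}" for i in range(8)]
    tested = []

    def is_fixed(commit):
        # Stand-in for "boot this kernel and watch for GPU hangs".
        tested.append(commit)
        return int(commit.split("-")[1]) >= 5

    print(first_fixed(commits, is_fixed))
    print("kernels tested:", tested)
```

Each step halves the candidate range, which is why narrowing the range to a single release candidate first saves a number of test builds. With an intermittent hang, a run with no hang is only weak evidence of a fix, so each test needs to run long enough to be trusted.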
| Matthew Eaton (powder) wrote : | #40 |
I've been on the rc1 kernel for about 3 hours with no problem.
Linux matt-work 3.9.0-030900rc1
| tags: | added: kernel-key |
| Changed in linux (Ubuntu Precise): | |
| status: | Invalid → Confirmed |
| importance: | Undecided → Critical |
| Changed in linux (Ubuntu Raring): | |
| assignee: | Ubuntu Kernel Team (ubuntu-kernel-team) → nobody |
| assignee: | nobody → Canonical Kernel Team (canonical-kernel-team) |
| summary: |
- [regression] 3.5.0-26-generic GPU hangs
+ [regression] 3.5.0-26-generic and 3.2.0-39-generic GPU hangs |
| summary: |
- [regression] 3.5.0-26-generic and 3.2.0-39-generic GPU hangs
+ [regression] 3.5.0-26-generic and 3.2.0-39-generic GPU hangs on Sandybridge |
| Changed in linux (Ubuntu Raring): | |
| status: | Confirmed → Invalid |
| Changed in linux (Ubuntu Quantal): | |
| assignee: | nobody → Canonical Kernel Team (canonical-kernel-team) |
| Changed in linux (Ubuntu Precise): | |
| assignee: | nobody → Canonical Kernel Team (canonical-kernel-team) |
| Changed in linux (Ubuntu Raring): | |
| assignee: | Canonical Kernel Team (canonical-kernel-team) → nobody |
| tags: | added: apport-collected |
| tags: | added: precise |
| tags: | removed: kernel-key |
| tags: | removed: performing-bisect |
| Changed in linux (Ubuntu Quantal): | |
| status: | Confirmed → Fix Committed |
| Changed in linux (Ubuntu Precise): | |
| status: | Confirmed → Fix Committed |
| tags: | added: verification-needed-precise |
| tags: | added: verification-needed-quantal |
| tags: |
added: verification-done-precise removed: verification-needed-precise |
| tags: |
added: verification-done-quantal removed: verification-needed-quantal |
| tags: |
added: verification-failed-precise removed: verification-done-precise |
| tags: |
added: verification-done-precise removed: verification-failed-precise |
| Changed in linux (Ubuntu Precise): | |
| status: | Fix Committed → Fix Released |
| Changed in linux (Ubuntu Precise): | |
| status: | Fix Released → Fix Committed |
| Changed in linux (Ubuntu Raring): | |
| status: | Invalid → New |
| Changed in linux (Ubuntu Raring): | |
| status: | New → Confirmed |
| tags: | added: kernel-da-key |
| Changed in linux (Ubuntu Precise): | |
| status: | Fix Committed → Fix Released |
| Changed in linux-lts-quantal (Ubuntu Precise): | |
| status: | Confirmed → Fix Released |
| Changed in linux (Ubuntu Quantal): | |
| status: | Fix Committed → Fix Released |
| Changed in linux (Ubuntu Raring): | |
| assignee: | nobody → Canonical Kernel Team (canonical-kernel-team) |
| tags: | added: kernel-stable-key |
| Changed in linux (Debian): | |
| status: | Unknown → New |
| Changed in linux: | |
| importance: | Unknown → Medium |
| status: | Unknown → Invalid |
| Changed in dri: | |
| importance: | Unknown → Medium |
| status: | Unknown → Confirmed |
| no longer affects: | linux-lts-raring (Ubuntu Quantal) |
| no longer affects: | linux-lts-raring (Ubuntu Raring) |
| tags: | added: patch |
| Changed in linux-lts-raring (Ubuntu Precise): | |
| status: | New → Confirmed |
| Changed in linux-lts-raring (Ubuntu): | |
| status: | New → Confirmed |
| information type: | Public → Public Security |
| information type: | Public Security → Public |
| tags: | added: saucy |
| Changed in linux-lts-quantal (Ubuntu): | |
| importance: | Undecided → Critical |
| Changed in linux-lts-quantal (Ubuntu Quantal): | |
| importance: | Undecided → Critical |
| Changed in linux-lts-quantal (Ubuntu Raring): | |
| importance: | Undecided → Critical |
| Changed in linux-lts-raring (Ubuntu): | |
| importance: | Undecided → Critical |
| Changed in linux-lts-raring (Ubuntu Precise): | |
| importance: | Undecided → Critical |
| Changed in linux-lts-quantal (Ubuntu Precise): | |
| status: | Fix Released → Invalid |
| Changed in linux (Ubuntu Raring): | |
| status: | Confirmed → Invalid |
| Changed in linux (Ubuntu Quantal): | |
| status: | Fix Released → Invalid |
| Changed in linux (Ubuntu Precise): | |
| status: | Fix Released → Invalid |
| Changed in linux-lts-raring (Ubuntu): | |
| status: | Confirmed → Triaged |
| Changed in linux-lts-raring (Ubuntu): | |
| status: | Triaged → Invalid |
| Changed in linux-lts-raring (Ubuntu Precise): | |
| status: | Confirmed → Invalid |
| Changed in mesa (Ubuntu): | |
| importance: | Undecided → Critical |
| status: | New → Triaged |
| Changed in linux (Ubuntu Precise): | |
| assignee: | Canonical Kernel Team (canonical-kernel-team) → nobody |
| Changed in linux (Ubuntu Quantal): | |
| assignee: | Canonical Kernel Team (canonical-kernel-team) → nobody |
| Changed in linux (Ubuntu Raring): | |
| assignee: | Canonical Kernel Team (canonical-kernel-team) → nobody |
| Changed in mesa (Ubuntu Precise): | |
| status: | New → Triaged |
| importance: | Undecided → Critical |
| Changed in dri: | |
| status: | Confirmed → In Progress |
|
|
#500 |
*** Bug 89078 has been marked as a duplicate of this bug. ***
| Changed in linux: | |
| importance: | Medium → Unknown |
| status: | Invalid → Unknown |
| affects: | linux → mesa |
| tags: | removed: saucy |
| tags: | added: metabug |
|
|
#501 |
*** Bug 89299 has been marked as a duplicate of this bug. ***
|
|
#502 |
*** Bug 89570 has been marked as a duplicate of this bug. ***
|
|
#503 |
*** Bug 89671 has been marked as a duplicate of this bug. ***
|
|
#504 |
*** Bug 89774 has been marked as a duplicate of this bug. ***
|
|
#505 |
*** Bug 89771 has been marked as a duplicate of this bug. ***
|
|
#506 |
*** Bug 89981 has been marked as a duplicate of this bug. ***
|
|
#507 |
*** Bug 90106 has been marked as a duplicate of this bug. ***
|
|
#508 |
*** Bug 90146 has been marked as a duplicate of this bug. ***
|
|
#509 |
*** Bug 90271 has been marked as a duplicate of this bug. ***
|
|
#510 |
*** Bug 90473 has been marked as a duplicate of this bug. ***
| Changed in linux-lts-quantal (Ubuntu Precise): | |
| status: | Invalid → Fix Committed |
| Changed in linux (Ubuntu Precise): | |
| status: | Invalid → Fix Committed |
|
|
#511 |
*** Bug 90835 has been marked as a duplicate of this bug. ***
Chris, you referred me to this bug as I reported
Bug 90835 - [4.1-rc6] gpu hang: ecode 6:-1:0x00000000, Kicking stuck semaphore on render ring
I skimmed through it and it appears that there are some patches to test? But I am not sure which ones these are. Can you or someone else enlighten me?
Also I note that I still use
Option "AccelMethod" "uxa"
and I have
martin@merkaba:~> cat /etc/modprobe.
options i915 modeset=1 i915_enable_rc6=7
thus maximum energy saving. But according to powertop it never enters the highest sleep state anyway.
I will remove the AccelMethod setting now and see whether it helps. If not, I downgrade to 4.1-rc4 for now, as issues have been at least much less frequent with it.
And its really that for me 4.1-rc6 makes things much *worse*. I am typing this after a clean reboot and already got the GPU hang again. It happens about every few minutes. Are you really sure this is the same GPU hang? I didn´t have this before 4.1 kernel?
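For readers unfamiliar with the `i915_enable_rc6=7` setting quoted above: on kernels that honor the parameter, it is a bitmask. The sketch below decodes it under the assumption that the bits mean what the module parameter description of that era states (1 = RC6, 2 = RC6p, 4 = RC6pp, negative = per-chip default). The helper names are hypothetical, and the output format simply imitates the "Enabling RC6 states" dmesg line seen earlier in this report.

```python
# Assumed bit layout of the i915_enable_rc6 module parameter.
RC6_BITS = {1: "RC6", 2: "RC6p", 4: "RC6pp"}


def decode_rc6(value):
    """Describe which RC6 states a given i915_enable_rc6 value requests."""
    if value < 0:
        return "per-chip default"
    return ", ".join(
        f"{name} {'on' if value & bit else 'off'}"
        for bit, name in RC6_BITS.items()
    )


def parse_modprobe_options(line):
    """Split an 'options <module> key=value ...' line into (module, dict)."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "options":
        raise ValueError("not a modprobe 'options' line")
    return parts[1], dict(p.split("=", 1) for p in parts[2:])


if __name__ == "__main__":
    module, opts = parse_modprobe_options(
        "options i915 modeset=1 i915_enable_rc6=7"
    )
    print(module, decode_rc6(int(opts["i915_enable_rc6"])))
    print("rc6 only:", decode_rc6(1))
```

Under that assumption a value of 7 requests all three states, which matches the "maximum energy saving" intent described in the comment, while the logs earlier in this report ("RC6 on, RC6p off, RC6pp off") correspond to a value of 1.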
|
|
#513 |
(In reply to Martin Steigerwald from comment #225)
> Chris, you referred me to this bug as I reported
>
> Bug 90835 - [4.1-rc6] gpu hang: ecode 6:-1:0x00000000, Kicking stuck
> semaphore on render ring
>
> I skimmed through it and it appears that there are some patches to test? But
> I am not sure which ones these are. Can you or someone else enlighten me?
There's likely a modest improvement in 4.2.
> Also I note that I still use
>
> Option "AccelMethod" "uxa"
>
> and I have
>
> martin@merkaba:~> cat /etc/modprobe.
> options i915 modeset=1 i915_enable_rc6=7
Fortuitously that dangerous option doesn't do anything for your kernel.
> ffffffff813a4b0e
> thus maximum energy saving. But according to powertop it never enters the
> highest sleep state anyway.
>
> I will remove the AccelMethod setting now and see whether it helps. If not,
> I downgrade to 4.1-rc4 for now, as issues have been at least much less
> frequent with it.
Purely circumstantial.
> And its really that for me 4.1-rc6 makes things much *worse*. I am typing
> this after a clean reboot and already got the GPU hang again. It happens
> about every few minutes. Are you really sure this is the same GPU hang? I
> didn´t have this before 4.1 kernel?
Yes.
(In reply to Chris Wilson from comment #226)
> (In reply to Martin Steigerwald from comment #225)
> > Chris, you referred me to this bug as I reported
> >
> > Bug 90835 - [4.1-rc6] gpu hang: ecode 6:-1:0x00000000, Kicking stuck
> > semaphore on render ring
> >
> > I skimmed through it and it appears that there are some patches to test? But
> > I am not sure which ones these are. Can you or someone else enlighten me?
>
> There's likely a modest improvement in 4.2.
Nice.
> > Also I note that I still use
> >
> > Option "AccelMethod" "uxa"
> >
> > and I have
> >
> > martin@merkaba:~> cat /etc/modprobe.
> > options i915 modeset=1 i915_enable_rc6=7
>
> Fortuitously that dangerous option doesn't do anything for your kernel.
Well I found out why, I compiled i915 into the kernel it seems, at least I don´t have an i915 module in lsmod. But also i915.i915_
> > ffffffff813a4b0e
> > thus maximum energy saving. But according to powertop it never enters the
> > highest sleep state anyway.
> >
> > I will remove the AccelMethod setting now and see whether it helps. If not,
> > I downgrade to 4.1-rc4 for now, as issues have been at least much less
> > frequent with it.
>
> Purely circumstantial.
Since using SNA I didn´t see a GPU hang so far. Too early to say for sure, but it seems something in UXA may have triggered it more easily.
|
|
#515 |
*** Bug 91212 has been marked as a duplicate of this bug. ***
|
|
#516 |
*** Bug 91662 has been marked as a duplicate of this bug. ***
|
|
#517 |
*** Bug 91810 has been marked as a duplicate of this bug. ***
|
|
#518 |
*** Bug 91832 has been marked as a duplicate of this bug. ***
(In reply to Chris Wilson from comment #192)
> (In reply to comment #191)
> > What information is most useful for these repeating issues, as it just
> > happened again:
> >
> > Sep 16 08:32:59 arrowsmithlap1 kernel: [1182242.139690] [drm] stuck on
> > render ring
> > Sep 16 08:32:59 arrowsmithlap1 kernel: [1182242.139699] [drm] stuck on
> > blitter ring
>
> So long as it is the same event, there is no more information we need other
> than testing feedback for an eventual workaround.
Is this the same bug?
$ journalctl -p 3 -b -1
Ruj 25 02:13:01 crnigrom kernel: [drm:fw_domains_get [i915]] *ERROR* render: timed out waiting for forcewake ack request.
Ruj 25 02:13:01 crnigrom kernel: [drm:__
... [ repeated messages ] ...
Ruj 25 02:13:33 crnigrom kernel: [drm:fw_domains_get [i915]] *ERROR* render: timed out waiting for forcewake ack request.
Ruj 25 02:13:33 crnigrom kernel: [drm:__
Ruj 25 02:13:34 crnigrom kernel: [drm:stop_ring [i915]] *ERROR* render ring : timed out trying to stop ring
Ruj 25 02:13:34 crnigrom kernel: [drm:init_
Ruj 25 02:13:34 crnigrom kernel: [drm:i915_reset [i915]] *ERROR* Failed hw init on reset -5
Ruj 25 02:13:34 crnigrom gnome-session[
After that, GNOME crashes with the "Oh No Something Is Wrong" screen.
$ uname -r
4.1.7-200.
Hardware i3-2100 CPU/GPU
This bug has been going on for a long, long time. At least the computer no longer hard-freezes, although GNOME still crashes, so any running GTK application that is in the middle of something stalls.
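When asking whether a new report is "the same bug", it can help to compare the kernel message signatures side by side. The sketch below is only an illustration of that kind of triage: the labels and regular expressions are hypothetical, derived from the complete log lines quoted in this thread, and a matching or differing signature is a hint, not proof of a shared root cause.

```python
import re
from collections import Counter

# Hypothetical labels mapped to message fragments quoted in this thread.
SIGNATURES = {
    "stuck_ring": re.compile(r"\[drm\] stuck on \w+ ring"),
    "forcewake_timeout": re.compile(r"timed out waiting for forcewake ack request"),
    "stop_ring_timeout": re.compile(r"timed out trying to stop ring"),
    "reset_failed": re.compile(r"Failed hw init on reset"),
}


def classify(line):
    """Return the first matching signature label, or None if nothing matches."""
    for label, pattern in SIGNATURES.items():
        if pattern.search(line):
            return label
    return None


def summarize(lines):
    """Count how often each known signature occurs in an iterable of log lines."""
    return Counter(label for label in map(classify, lines) if label)


if __name__ == "__main__":
    import sys

    # Example: journalctl -p 3 -b -1 | python3 this_script.py
    for label, count in summarize(sys.stdin).most_common():
        print(label, count)
```

Run against the two reports above, the earlier one would show only "stuck_ring" entries while the later one shows forcewake and reset failures, which is the kind of difference worth mentioning when filing or marking a duplicate.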
|
|
#520 |
*** Bug 92118 has been marked as a duplicate of this bug. ***
|
|
#521 |
*** Bug 92739 has been marked as a duplicate of this bug. ***
FWIW, my issue (https:/
HTH someone.
|
|
#523 |
Created attachment 119432
attachment-
I reported this bug from a system without an SSD. Recently, I have not
seen the kernel messages appear however--currently on linux 4.2.5.
On Sun, Nov 1, 2015 at 10:04 PM, <email address hidden> wrote:
> *Comment # 235 <https:/
> on bug 54226 <https:/
> <email address hidden> <email address hidden> *
>
> FWIW, my issue (https:/
> resolved by uninstalling various components, re-installing and updating them. I
> have a hunch (completely unproven) that it was a transparent bit-fail issue
> from the SSD. By un-installing and re-installing, the files were likely
> installed to a different location on the drive. It wasn't configuration, as I
> tried erasing, and even rolling back to defaults, with the problem still
> persisting. As it was almost daily, prior to uninstall, and hasn't happened
> since the install, this is all I can attribute it to.
>
> HTH someone.
(In reply to Jeffrey E. Bedard from comment #236)
> Created attachment 119432 [details]
> attachment-
>
> I reported this bug from a system without an SSD. Recently, I have not
> seen the kernel messages appear however--currently on linux 4.2.5.
Ah, let me clarify that earlier comment: I dd'd a failing spinning drive to an SSD. There was lots of clicking. Upgraded packages as they came in, but no change. Only the uninstall and re-install cleared the repeat button. :)
|
|
#525 |
Created attachment 119433
attachment-
I think this bug can be marked as closed with the latest linux/mesa/xorg
versions :)
On Fri, Nov 6, 2015 at 1:47 AM, <email address hidden> wrote:
> *Comment # 237 <https:/
> on bug 54226 <https:/
> <email address hidden> <email address hidden> *
>
> (In reply to Jeffrey E. Bedard from comment #236 <https:/
> > attachment-
> >
> > I reported this bug from a system without an SSD. Recently, I have not
> > seen the kernel messages appear however--currently on linux 4.2.5.
>
> Ah, let me clarify that earlier comment: I dd'd a failing spinning drive to an
> SSD. There was lots of clicking. Upgraded packages as they came in, but no
> change. Only the uninstall and re-install cleared the repeat button. :)
|
|
#526 |
*** Bug 92927 has been marked as a duplicate of this bug. ***
|
|
#527 |
*** Bug 93057 has been marked as a duplicate of this bug. ***
Created attachment 120189
error state with 4.2 kernel
|
|
#529 |
*** Bug 93331 has been marked as a duplicate of this bug. ***
|
|
#530 |
*** Bug 93482 has been marked as a duplicate of this bug. ***
|
|
#531 |
*** Bug 93493 has been marked as a duplicate of this bug. ***
|
|
#532 |
*** Bug 89524 has been marked as a duplicate of this bug. ***
|
|
#533 |
*** Bug 93595 has been marked as a duplicate of this bug. ***
|
|
#534 |
*** Bug 93876 has been marked as a duplicate of this bug. ***
|
|
#535 |
*** Bug 93824 has been marked as a duplicate of this bug. ***
|
|
#536 |
*** Bug 94057 has been marked as a duplicate of this bug. ***
Tuesday, March 1, 2016, 9:43:23 PM, you wrote:
> Chris Wilson changed bug 54226
> What | Removed | Added
> CC | | <email address hidden>
>
> Comment # 249 on bug 54226 from Chris Wilson
> *** Bug 94057 has been marked as a duplicate of this bug. ***
Sorry to say, but:
Is there a way to get off the CC-list of this slightly depressing kind of "catch-all" bug ?
It unfortunately doesn't seem to have been going anywhere for the last 3 to 4 years, except
for an endless stream of duplicates being appended.
--
Sander
(In reply to Sander Eikelenboom from comment #250)
> Is there a way to get off the CC-list of this slightly depressing kind of
> "catch-all" bug ?
CC list is at the top right corner. Choose the address, tick "Remove selected CCs", and hit Save Changes.
I've done this for you now.
|
|
#539 |
*** Bug 95238 has been marked as a duplicate of this bug. ***
| Changed in dri: | |
| status: | In Progress → Fix Released |
| Changed in linux (Fedora): | |
| importance: | Unknown → Undecided |
| status: | Unknown → Won't Fix |
| Changed in dri: | |
| status: | Fix Released → Confirmed |
| Changed in dri: | |
| status: | Confirmed → Won't Fix |


This change was made by a bot.