Ubuntu
linux package

[regression] 3.5.0-26-generic and 3.2.0-39-generic GPU hangs on Sandybridge

Bug #1140716 reported by luca on 2013-03-02

This bug affects 393 people

Affects	Status	Importance	Assigned to
DRI	Won't Fix	Medium	freedesktop-bugs #54226
Mesa	Unknown	Unknown	freedesktop-bugs #70151
linux (Debian)	Fix Released	Unknown	debbugs #705635
linux (Fedora)	Won't Fix	Undecided	redhat-bugs #879823
linux (Ubuntu)	Invalid	Critical	Unassigned
Precise	Fix Released	Critical	Unassigned
Quantal	Invalid	Critical	Unassigned
Raring	Invalid	Critical	Unassigned
linux-lts-quantal (Ubuntu)	Invalid	Critical	Unassigned
Precise	Fix Released	Critical	Unassigned
Quantal	Invalid	Critical	Unassigned
Raring	Invalid	Critical	Unassigned
linux-lts-raring (Ubuntu)	Invalid	Critical	Unassigned
Precise	Invalid	Critical	Unassigned
mesa (Ubuntu)	Fix Released	Critical	Unassigned
Precise	Fix Released	Critical	Unassigned

Bug Description

I'm getting errors about GPU hangs every minute or so (usually only when using Firefox and scrolling a web page or something similar). I also get an annoying Ubuntu dialog saying there is a "system error".

This didn't happen with 3.5.0-24-generic.

Here is the dmesg:
[15169.033709] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[15169.034517] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[15628.480216] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[15628.480570] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[15844.231372] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[15844.231773] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[20173.232593] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[20173.233211] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[26285.650393] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[26285.650980] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[26285.658405] ------------[ cut here ]------------
[26285.658472] WARNING: at /build/buildd/linux-3.5.0/drivers/gpu/drm/i915/intel_pm.c:2505 gen6_enable_rps+0x706/0x710 [i915]()
[26285.658474] Hardware name: SATELLITE Z830
[26285.658476] Modules linked in: sdhci_pci sdhci btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs reiserfs ext2 snd_hda_codec_hdmi snd_hda_codec_realtek joydev btusb coretemp kvm_intel kvm arc4 ghash_clmulni_intel aesni_intel cryptd aes_x86_64 snd_hda_intel snd_hda_codec snd_hwdep uvcvideo snd_pcm videobuf2_core microcode videodev bnep iwlwifi videobuf2_vmalloc snd_seq_midi psmouse videobuf2_memops snd_rawmidi rfcomm pcspkr snd_seq_midi_event serio_raw snd_seq bluetooth mac80211 snd_timer snd_seq_device i915 drm_kms_helper cfg80211 drm toshiba_acpi snd sparse_keymap soundcore wmi i2c_algo_bit toshiba_bluetooth snd_page_alloc parport_pc mei video mac_hid lpc_ich ppdev nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc lp parport e1000e ahci libahci [last unloaded: sdhci]
[26285.658537] Pid: 23433, comm: kworker/u:0 Not tainted 3.5.0-26-generic #40-Ubuntu
[26285.658539] Call Trace:
[26285.658549] [<ffffffff81051bef>] warn_slowpath_common+0x7f/0xc0
[26285.658553] [<ffffffff81051c4a>] warn_slowpath_null+0x1a/0x20
[26285.658569] [<ffffffffa02d32e6>] gen6_enable_rps+0x706/0x710 [i915]
[26285.658584] [<ffffffffa02bf3f6>] intel_modeset_init_hw+0x66/0xa0 [i915]
[26285.658595] [<ffffffffa02954b4>] i915_reset+0x1a4/0x6e0 [i915]
[26285.658601] [<ffffffff8101257b>] ? __switch_to+0x12b/0x420
[26285.658612] [<ffffffffa029a943>] i915_error_work_func+0xc3/0x110 [i915]
[26285.658618] [<ffffffff8107097a>] process_one_work+0x12a/0x420
[26285.658629] [<ffffffffa029a880>] ? gen6_pm_rps_work+0xe0/0xe0 [i915]
[26285.658632] [<ffffffff8107152e>] worker_thread+0x12e/0x2f0
[26285.658636] [<ffffffff81071400>] ? manage_workers.isra.26+0x200/0x200
[26285.658640] [<ffffffff81076023>] kthread+0x93/0xa0
[26285.658644] [<ffffffff8168a3e4>] kernel_thread_helper+0x4/0x10
[26285.658649] [<ffffffff81075f90>] ? kthread_freezable_should_stop+0x70/0x70
[26285.658652] [<ffffffff8168a3e0>] ? gs_change+0x13/0x13
[26285.658654] ---[ end trace 59c6162fdfcbffee ]---
[26756.021167] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[26756.021426] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[26766.014093] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[26766.014397] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[26932.376233] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[26932.376544] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[26932.384285] ------------[ cut here ]------------
[26932.384354] WARNING: at /build/buildd/linux-3.5.0/drivers/gpu/drm/i915/intel_pm.c:2505 gen6_enable_rps+0x706/0x710 [i915]()
[26932.384356] Hardware name: SATELLITE Z830
[26932.384358] Modules linked in: sdhci_pci sdhci btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs reiserfs ext2 snd_hda_codec_hdmi snd_hda_codec_realtek joydev btusb coretemp kvm_intel kvm arc4 ghash_clmulni_intel aesni_intel cryptd aes_x86_64 snd_hda_intel snd_hda_codec snd_hwdep uvcvideo snd_pcm videobuf2_core microcode videodev bnep iwlwifi videobuf2_vmalloc snd_seq_midi psmouse videobuf2_memops snd_rawmidi rfcomm pcspkr snd_seq_midi_event serio_raw snd_seq bluetooth mac80211 snd_timer snd_seq_device i915 drm_kms_helper cfg80211 drm toshiba_acpi snd sparse_keymap soundcore wmi i2c_algo_bit toshiba_bluetooth snd_page_alloc parport_pc mei video mac_hid lpc_ich ppdev nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc lp parport e1000e ahci libahci [last unloaded: sdhci]
[26932.384421] Pid: 24262, comm: kworker/u:2 Tainted: G W 3.5.0-26-generic #40-Ubuntu
[26932.384422] Call Trace:
[26932.384431] [<ffffffff81051bef>] warn_slowpath_common+0x7f/0xc0
[26932.384436] [<ffffffff81051c4a>] warn_slowpath_null+0x1a/0x20
[26932.384451] [<ffffffffa02d32e6>] gen6_enable_rps+0x706/0x710 [i915]
[26932.384466] [<ffffffffa02bf3f6>] intel_modeset_init_hw+0x66/0xa0 [i915]
[26932.384476] [<ffffffffa02954b4>] i915_reset+0x1a4/0x6e0 [i915]
[26932.384482] [<ffffffff8101257b>] ? __switch_to+0x12b/0x420
[26932.384493] [<ffffffffa029a943>] i915_error_work_func+0xc3/0x110 [i915]
[26932.384500] [<ffffffff8107097a>] process_one_work+0x12a/0x420
[26932.384511] [<ffffffffa029a880>] ? gen6_pm_rps_work+0xe0/0xe0 [i915]
[26932.384514] [<ffffffff8107152e>] worker_thread+0x12e/0x2f0
[26932.384517] [<ffffffff81071400>] ? manage_workers.isra.26+0x200/0x200
[26932.384521] [<ffffffff81076023>] kthread+0x93/0xa0
[26932.384526] [<ffffffff8168a3e4>] kernel_thread_helper+0x4/0x10
[26932.384531] [<ffffffff81075f90>] ? kthread_freezable_should_stop+0x70/0x70
[26932.384534] [<ffffffff8168a3e0>] ? gs_change+0x13/0x13
[26932.384536] ---[ end trace 59c6162fdfcbffef ]---

ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: linux-image-3.5.0-26-generic 3.5.0-26.40
ProcVersionSignature: Ubuntu 3.5.0-26.40-generic 3.5.7.6
Uname: Linux 3.5.0-26-generic x86_64
ApportVersion: 2.6.1-0ubuntu10
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/controlC0: luca 2084 F.... pulseaudio
CheckboxSubmission: f8b82cd9bc23fe075e5068a9824afda5
CheckboxSystem: b1865df84255b8716d3bcc269ff410d1
Date: Sat Mar 2 22:25:14 2013
HibernationDevice: RESUME=UUID=20fe6da8-7d68-4660-953f-6e4ae1d348a7
InstallationDate: Installed on 2012-04-26 (310 days ago)
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release amd64 (20120425)
MachineType: TOSHIBA SATELLITE Z830
MarkForUpload: True
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.5.0-26-generic root=UUID=36929bf3-a158-44d9-a80d-3adac2840fa8 ro quiet splash acpi_backlight=vendor i915.i915_enable_rc6=1 i915.lvds_downclock=1 vt.handoff=7
RelatedPackageVersions:
linux-restricted-modules-3.5.0-26-generic N/A
linux-backports-modules-3.5.0-26-generic N/A
linux-firmware 1.95
SourcePackage: linux
UpgradeStatus: Upgraded to quantal on 2012-10-28 (125 days ago)
dmi.bios.date: 07/31/2012
dmi.bios.vendor: TOSHIBA
dmi.bios.version: Version 1.70
dmi.board.asset.tag: 0000000000
dmi.board.name: Portable PC
dmi.board.vendor: TOSHIBA
dmi.board.version: Version A0
dmi.chassis.asset.tag: 0000000000
dmi.chassis.type: 10
dmi.chassis.vendor: TOSHIBA
dmi.chassis.version: Version 1.0
dmi.modalias: dmi:bvnTOSHIBA:bvrVersion1.70:bd07/31/2012:svnTOSHIBA:pnSATELLITEZ830:pvrPT22LE-00300GGR:rvnTOSHIBA:rnPortablePC:rvrVersionA0:cvnTOSHIBA:ct10:cvrVersion1.0:
dmi.product.name: SATELLITE Z830
dmi.product.version: PT22LE-00300GGR
dmi.sys.vendor: TOSHIBA
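
For anyone who wants to quantify how often the hangs in the dmesg output above occur, the following is a minimal sketch, not part of the original report. It assumes the hangcheck lines keep the format shown in the description; the function names `hang_timestamps` and `hang_intervals` are made up for illustration.

```python
import re

# Matches the hangcheck error lines as they appear in the dmesg output above.
# Assumption: the bracketed number is the kernel uptime in seconds.
HANG_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\[drm:i915_hangcheck_hung\]")


def hang_timestamps(dmesg_text):
    """Return the uptime (in seconds) of every i915 hangcheck error found."""
    return [float(m.group(1)) for m in HANG_RE.finditer(dmesg_text)]


def hang_intervals(timestamps):
    """Return the gaps (in seconds) between consecutive hangs."""
    return [round(b - a, 6) for a, b in zip(timestamps, timestamps[1:])]


# First three lines of the dmesg excerpt from the description.
sample = """\
[15169.033709] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[15169.034517] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[15628.480216] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
"""

stamps = hang_timestamps(sample)
print(len(stamps), "hangs, gaps in seconds:", hang_intervals(stamps))
```

In practice the text would come from a saved `dmesg` capture rather than an inline string; the pattern also matches syslog-prefixed lines because it searches anywhere in the line.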

Related branches

lp:ubuntu/precise-proposed/linux-lts-quantal

lp:ubuntu/saucy/linux-ti-omap4

lp:ubuntu/precise-security/linux-ti-omap4

lp:ubuntu/precise-proposed/linux-ti-omap4

lp:ubuntu/quantal-proposed/linux-ti-omap4

luca (llucax) wrote on 2013-03-02:

AlsaInfo.txt (30.5 KiB, text/plain; charset="utf-8")
BootDmesg.txt (53.9 KiB, text/plain; charset="utf-8")
CRDA.txt (257 bytes, text/plain; charset="utf-8")
CurrentDmesg.txt (50.2 KiB, text/plain; charset="utf-8")
Dependencies.txt (2.9 KiB, text/plain; charset="utf-8")
IwConfig.txt (518 bytes, text/plain; charset="utf-8")
Lspci.txt (9.6 KiB, text/plain; charset="utf-8")
Lsusb.txt (467 bytes, text/plain; charset="utf-8")
ProcCpuinfo.txt (3.6 KiB, text/plain; charset="utf-8")
ProcEnviron.txt (301 bytes, text/plain; charset="utf-8")
ProcInterrupts.txt (2.4 KiB, text/plain; charset="utf-8")
ProcModules.txt (4.1 KiB, text/plain; charset="utf-8")
PulseList.txt (20.2 KiB, text/plain; charset="utf-8")
RfKill.txt (126 bytes, text/plain; charset="utf-8")
UdevDb.txt (113.1 KiB, text/plain; charset="utf-8")
UdevLog.txt (289.0 KiB, text/plain; charset="utf-8")
WifiSyslog.txt (199.9 KiB, text/plain; charset="utf-8")

Brad Figg (brad-figg) wrote on 2013-03-02: Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status:	New → Confirmed

luca (llucax) wrote on 2013-03-02: Re: [regression] 3.5.0-26-generic CPU hangs

Another note: when these hangs happen, I get graphics corruption (usually in the fonts/text).

luca (llucax) wrote on 2013-03-03:

kernel 3.5.0-25-generic also seems to work fine.

Joseph Salisbury (jsalisbury) on 2013-03-05

Changed in linux (Ubuntu):
importance:	Undecided → Medium
tags:	added: regression-update

luca (llucax) wrote on 2013-03-05:

After I resumed from suspend with kernel 3.5.0-25-generic, I again got the annoying dialogs saying that a GPU hang was detected and asking me to report a bug (I have no idea where that report goes), but looking at dmesg I can't see anything strange [1]. How can I find out why those dialogs are being opened, to see if something is wrong?

I got the annoying dialog several times in a very short period, about 10 times in about 5 minutes, and then it stopped. After that I suspended and resumed my laptop a couple of times and it hasn't happened again so far.

[1] Except for messages like these, but I have been getting them since I bought this computer about a year ago and never had those annoying dialogs about any GPU hang:
[52682.020386] CPU1: Package power limit notification (total events = 5770)
[52682.020389] CPU3: Package power limit notification (total events = 5769)
[52682.020391] CPU2: Package power limit notification (total events = 5761)
[52682.020393] CPU0: Package power limit notification (total events = 5746)
[52682.021517] CPU3: Package power limit normal
[52682.021520] CPU1: Package power limit normal
[52682.021521] CPU2: Package power limit normal
[52682.021526] CPU0: Package power limit normal

luca (llucax) wrote on 2013-03-05:

Also, I couldn't see any graphic corruption this last time with kernel 3.5.0-25

Craig McQueen (cmcqueen1975) wrote on 2013-03-05:

This affects me, but in my case I'm running Ubuntu 12.04, and the problem seems to be with kernel 3.2.0-39. Booting to kernel 3.2.0-38 seems to have fixed it.

Hans (old-man999) wrote on 2013-03-12:

Sorry, this is more closely related to my case: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1135759

luca (llucax) wrote on 2013-03-14:

#10

Seems to be fixed in linux-image-3.5.0-26-generic 3.5.0-26.42

luca (llucax) wrote on 2013-03-15:

#11

Nope, still getting it with linux-image-3.5.0-26-generic 3.5.0-26.42

[32861.907463] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[32861.907470] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[32861.911988] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
...
[39199.903510] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[39199.903846] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off

Torsten Hilbrich (torsten-hilbrich) wrote on 2013-03-19:

#12

i915_error_state (2.2 MiB, text/plain)

The same here: the previous kernel 3.5.0-25-generic works without problems, but 3.5.0-26.42 hung just now:

$ dmesg|grep i915
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.5.0-26-generic root=/dev/mapper/System-root ro quiet splash i915.i915_enable_rc6=1 vt.handoff=7
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.5.0-26-generic root=/dev/mapper/System-root ro quiet splash i915.i915_enable_rc6=1 vt.handoff=7
[ 1.667363] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1.667367] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1.667652] i915 0000:00:02.0: setting latency timer to 64
[ 1.687950] i915 0000:00:02.0: irq 44 for MSI/MSI-X
[ 2.429882] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[ 330.684154] i915 0000:00:02.0: power state changed by ACPI to D3
[ 331.826825] i915 0000:00:02.0: power state changed by ACPI to D0
[ 331.826829] i915 0000:00:02.0: power state changed by ACPI to D0
[ 331.826830] i915 0000:00:02.0: setting latency timer to 64
[ 1677.075872] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 1677.075876] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state

Content of i915_error_state attached.

Will disable rc6 and test what happens then.

Torsten Hilbrich (torsten-hilbrich) wrote on 2013-03-19:

#13

i915_error_state (2.2 MiB, text/plain)

With rc6 off the hangup happened 2 minutes after booting:

$ dmesg|grep i915
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.5.0-26-generic root=/dev/mapper/System-root ro quiet splash i915.i915_enable_rc6=0 vt.handoff=7
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.5.0-26-generic root=/dev/mapper/System-root ro quiet splash i915.i915_enable_rc6=0 vt.handoff=7
[ 0.857239] i915 0000:00:02.0: power state changed by ACPI to D0
[ 0.857242] i915 0000:00:02.0: power state changed by ACPI to D0
[ 0.857458] i915 0000:00:02.0: setting latency timer to 64
[ 0.877771] i915 0000:00:02.0: irq 44 for MSI/MSI-X
[ 1.619983] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[ 128.787009] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 128.787013] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 254.699283] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung

Seems it's time to return to 3.5.0-25-generic.
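
The two boots above differ only in the `i915.i915_enable_rc6` value on the kernel command line. When comparing reports from different machines it can help to pull out just the i915 options. The following is a rough sketch under that assumption; the helper name `i915_params` is hypothetical, and on a running Linux system the string would normally be read from `/proc/cmdline`.

```python
def i915_params(cmdline):
    """Collect the i915.* options from a kernel command line into a dict."""
    params = {}
    for token in cmdline.split():
        # Module options on the command line look like "i915.<name>=<value>".
        if token.startswith("i915.") and "=" in token:
            key, value = token[len("i915."):].split("=", 1)
            params[key] = value
    return params


# Command line copied from the dmesg output in this comment.
cmdline = (
    "BOOT_IMAGE=/vmlinuz-3.5.0-26-generic root=/dev/mapper/System-root "
    "ro quiet splash i915.i915_enable_rc6=0 vt.handoff=7"
)

print(i915_params(cmdline))
```

Values are kept as strings on purpose, so nothing is assumed about how the driver interprets them.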

Laurent (l-perlat) wrote on 2013-03-19:

#14

Same problem here :

Visual corruptions + "GPU hang" error when scrolling in Firefox with 3.5.0-26.

Everything back to normal on 3.5.0-25 (Linux 3.5.0-25-generic #39-Ubuntu SMP Mon Feb 25 18:26:58 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux)

tobyS (tobias-schlitt) wrote on 2013-03-20:

#15

Besides visual corruption and GPU hangs, I also experience complete system hangs (no reaction until a hard reboot) and occasional kernel panics. I therefore wonder why this report does not receive a higher priority.

czigor (czigor) on 2013-03-20

summary:

- [regression] 3.5.0-26-generic CPU hangs
+ [regression] 3.5.0-26-generic GPU hangs

shuerhaaken (shkn) wrote on 2013-03-20: Re: [regression] 3.5.0-26-generic GPU hangs

#16

Same issue here. It happens very often while using Firefox, but also on other occasions.

[51278.392895] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[51278.392901] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[51278.397785] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off

shuerhaaken (shkn) wrote on 2013-03-22:

#17

This is really getting annoying, is anybody taking care of this?

czigor (czigor) wrote on 2013-03-22:

#18

@shkn:
Using 3.5.0-25-generic made my PC usable again. I get an error message only at login.

Timo Aaltonen (tjaalton) wrote on 2013-03-22:

#19

It's one of these commits (from the Quantal kernel), most likely the top one, since this is happening on Sandybridge:

817e8fdee14b05d drm/i915: Implement WaDisableHiZPlanesWhenMSAAEnabled
4c443ec9afe7f6f drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits
f534135423c7028 drm/i915: Disable AsyncFlip performance optimisations
c0c1fd8a18479f0 drm/i915: Invalidate the relocation presumed_offsets along the slow path

Changed in linux (Ubuntu):
assignee:	nobody → Ubuntu Kernel Team (ubuntu-kernel-team)
importance:	Medium → Critical
Changed in linux (Ubuntu Quantal):
importance:	Undecided → Critical
status:	New → Confirmed
Changed in linux (Ubuntu Precise):
importance:	Undecided → Critical
status:	New → Confirmed

Timo Aaltonen (tjaalton) wrote on 2013-03-22:

#20

Note that I'm not sure whether this affects Raring; maybe not.

Adam Conrad (adconrad) on 2013-03-22

Changed in linux (Ubuntu Precise):
status:	Confirmed → Invalid
Changed in linux-lts-quantal (Ubuntu Precise):
status:	New → Confirmed
Changed in linux-lts-quantal (Ubuntu Quantal):
status:	New → Invalid
Changed in linux-lts-quantal (Ubuntu Raring):
status:	New → Invalid
Changed in linux-lts-quantal (Ubuntu Precise):
importance:	Undecided → Critical
Changed in linux (Ubuntu Precise):
importance:	Critical → Undecided

Joseph Salisbury (jsalisbury) on 2013-03-22

tags:	added: performing-bisect

Joseph Salisbury (jsalisbury) wrote on 2013-03-22:

#21

I'd like to perform a kernel bisect to identify the exact commit that introduced this regression. However, it would be good to test the latest mainline and a test kernel with commit 817e8fdee14b05d reverted.

The latest mainline kernel can be downloaded from:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.9-rc3-raring/

Can folks affected by this bug test the v3.9-rc3 kernel?

One thing to note, you will need to install both the linux-image and linux-image-extra .deb packages.

I will also build a Quantal test kernel with commit 817e8fdee14b05d reverted and post a link shortly.

Thanks in advance!

Joseph Salisbury (jsalisbury) wrote on 2013-03-22:

#22

I built a Quantal test kernel with commit 817e8fdee14b05d reverted. The kernel can be downloaded from:
http://people.canonical.com/~jsalisbury/lp1140716/

Can folks affected by this bug test this kernel and report back if it fixes the issue?

Gard Spreemann (gspreemann) wrote on 2013-03-22:

#23

@jsalisbury: I could not successfully test the kernel you linked to in comment #22, as it rendered my system unusable. X started at 640x480, there was no working keyboard/mouse, and I could not SSH in.

franglais.125 (franglais.125-deactivatedaccount) wrote on 2013-03-22:

#24

@jsalisbury: Thanks for pointing to this kernel version. I have been able to successfully test kernel v3.9-rc3 on Precise 12.04.2 (I am running with quantal-lts xorg stack).
I have been running on it for ~ 3 hours so far with success. It usually took some time for me to hit this bug on my Dell V131, so some more testing might be required.
I will report back if I hit the bug again. So far so good.

luca (llucax) wrote on 2013-03-23:

#25

Also initial success for now. Still getting the annoying dialog at startup though (but no signs of GPU hangs in dmesg).

Torsten Hilbrich (torsten-hilbrich) wrote on 2013-03-23:

#26

@jsalisbury: I tested your kernel 3.5.0-27-generic #45~lp1140716v1 (from comment #22), but it was no improvement for my system. I got two hangs within the first hour (one S3 cycle at 1985); the second one forced me to turn off the system:

[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.5.0-27-generic root=/dev/mapper/System-root ro quiet splash i915.i915_enable_rc6=1 vt.handoff=7
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.5.0-27-generic root=/dev/mapper/System-root ro quiet splash i915.i915_enable_rc6=1 vt.handoff=7
[ 0.804805] i915 0000:00:02.0: power state changed by ACPI to D0
[ 0.804809] i915 0000:00:02.0: power state changed by ACPI to D0
[ 0.805030] i915 0000:00:02.0: setting latency timer to 64
[ 0.824988] i915 0000:00:02.0: irq 43 for MSI/MSI-X
[ 1.563280] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[ 1894.853449] i915 0000:00:02.0: power state changed by ACPI to D3
[ 1896.202702] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1896.202708] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1896.202720] i915 0000:00:02.0: setting latency timer to 64
[ 1984.429241] i915 0000:00:02.0: power state changed by ACPI to D3
[ 1985.767157] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1985.767160] i915 0000:00:02.0: power state changed by ACPI to D0
[ 1985.767168] i915 0000:00:02.0: setting latency timer to 64
[ 2132.278551] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 2132.278555] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 3504.895781] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung

luca (llucax) wrote on 2013-03-23:

#27

Torsten, maybe you are having a different issue; note that your hang doesn't look related to the RC6 state.

[51278.397785] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off

BTW, my system is still surviving without hangs with the patched 3.5 kernel.

luca (llucax) wrote on 2013-03-23:

#28

I just had a burst of dialogs informing me of non-existent GPU hangs (with the patched 3.5 kernel). The GPU hangs are not reported in dmesg though, so I don't know where it is getting them from. There is also no corruption or anything. It seems like the dialog madness starts when an unrelated program crashes. Maybe it is just an apport bug? How should I proceed to see what's really going on?

Alexis Lauthier (alx7539-launchpad) wrote on 2013-03-23:

#29

@jsalisbury: I've been running your 3.5.0-27-generic #45~lp1140716v1 for 5 hours and I've already had 3 hangs. No improvement here.

[ 5733.121323] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 5733.121330] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 5733.124957] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off

luca (llucax) wrote on 2013-03-23:

#30

OK, it took a while, but I finally got the GPU hang with kernel 3.5.0-27-generic #45~lp1140716v1:

luca (llucax) wrote on 2013-03-23:

#31

It always happens with Firefox, and only with certain sites (consistently).

luca (llucax) wrote on 2013-03-23:

#32

I got this with a second hang:

[22344.085044] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[22344.085051] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[22344.090106] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[23652.138382] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[23652.138898] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[23652.146420] ------------[ cut here ]------------
[23652.146491] WARNING: at /home/jsalisbury/bugs/lp1140716/ubuntu-quantal/drivers/gpu/drm/i915/intel_pm.c:2505 gen6_enable_rps+0x706/0x710 [i915]()
[23652.146495] Hardware name: SATELLITE Z830
[23652.146497] Modules linked in: sdhci_pci sdhci snd_hda_codec_hdmi snd_hda_codec_realtek joydev btusb coretemp kvm_intel kvm ghash_clmulni_intel aesni_intel arc4 cryptd aes_x86_64 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm uvcvideo videobuf2_core videodev snd_seq_midi videobuf2_vmalloc videobuf2_memops snd_rawmidi microcode snd_seq_midi_event iwlwifi snd_seq snd_timer snd_seq_device i915 bnep rfcomm mac80211 toshiba_acpi sparse_keymap drm_kms_helper wmi toshiba_bluetooth snd pcspkr bluetooth drm i2c_algo_bit cfg80211 soundcore psmouse mac_hid snd_page_alloc video serio_raw mei lpc_ich parport_pc ppdev nfsd nfs lockd fscache auth_rpcgss nfs_acl lp sunrpc parport ahci libahci e1000e [last unloaded: sdhci]
[23652.146578] Pid: 3451, comm: kworker/u:0 Not tainted 3.5.0-27-generic #45~lp1140716v1
[23652.146581] Call Trace:
[23652.146592] [<ffffffff81051bef>] warn_slowpath_common+0x7f/0xc0
[23652.146599] [<ffffffff81051c4a>] warn_slowpath_null+0x1a/0x20
[23652.146621] [<ffffffffa03f6316>] gen6_enable_rps+0x706/0x710 [i915]
[23652.146640] [<ffffffffa03e2446>] intel_modeset_init_hw+0x66/0xa0 [i915]
[23652.146655] [<ffffffffa03b84b4>] i915_reset+0x1a4/0x6e0 [i915]
[23652.146663] [<ffffffff8101257b>] ? __switch_to+0x12b/0x420
[23652.146679] [<ffffffffa03bd943>] i915_error_work_func+0xc3/0x110 [i915]
[23652.146688] [<ffffffff8107098a>] process_one_work+0x12a/0x420
[23652.146701] [<ffffffffa03bd880>] ? gen6_pm_rps_work+0xe0/0xe0 [i915]
[23652.146707] [<ffffffff8107153e>] worker_thread+0x12e/0x2f0
[23652.146712] [<ffffffff81071410>] ? manage_workers.isra.26+0x200/0x200
[23652.146719] [<ffffffff81076033>] kthread+0x93/0xa0
[23652.146726] [<ffffffff8168ab24>] kernel_thread_helper+0x4/0x10
[23652.146732] [<ffffffff81075fa0>] ? kthread_freezable_should_stop+0x70/0x70
[23652.146737] [<ffffffff8168ab20>] ? gs_change+0x13/0x13
[23652.146740] ---[ end trace 2153106cc632835c ]---

Gard Spreemann (gspreemann) wrote on 2013-03-24:

#33

I'm confused as to where the commits referenced by tjaalton in comment #19 live, but for what it's worth, I seem to have a stable system after applying reverse diffs of the following commits from the linux-3.5.y branch of git://kernel.ubuntu.com/ubuntu/linux.git to the 3.5.0-27.45 sources:

2964148 - drm/i915: Implement WaDisableHiZPlanesWhenMSAAEnabled
899b550 - drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits

Just reverting the first, or using jsalisbury's kernel from comment #22 (ignore my comment #23, I was being an idiot and forgot the modules) gives me a GPU hang and/or graphics corruption within minutes, especially quickly if opening Firefox. After reverting both of the above, I haven't been able to hang the system yet.

Torsten Hilbrich (torsten-hilbrich) wrote on 2013-03-24:

#34

Kernel 3.9.0-030900rc3-generic from comment #21 is much more stable for me, no problems so far after 4h of operation.

franglais.125 (franglais.125-deactivatedaccount) wrote on 2013-03-24:

#35

@jsalisbury: After a few days of use and many suspend-resume cycles, I am yet to encounter a problem with kernel 3.9-rc3 (as indicated in comment #21). No problems whatsoever on my Dell v131 (i5 Sandybridge)...

Peter Saunderson (peteasa) wrote on 2013-03-25:

#36

I got this a lot with kernel 3.5.0-26-generic and used a quick workaround to avoid the problem: http://askubuntu.com/questions/225356/how-can-i-enable-the-sna-acceleration-method-for-intel-cards-under-ubuntu-12-04

SNA does not seem to have the same issue, only UXA. If I have time I can try a new kernel, but I have spent so much time on this already that it may be a few days before I get to it.

Max Rameau (afrimax-e) wrote on 2013-03-25:

#37

I had the problem right after updating (not upgrading) on Saturday using 12.04.

I was able to control it by logging into 2D and immediately opening the System Monitor and shutting down the three instances of Ubuntu One (login, synch and launch), because the machine would freeze upon login to Ubuntu One. I then had to shut down zeitgeist-fts, because that would start eating up resourced (upto 300mb of memory at one point).

At that point, I just decided to reinstall into 12.10. I did that and it worked fine for an hour, so I started transfering over my backed up files and logged into Ubuntu One while running the updates. The problems began immediately, including resource use going up to 100% for long periods of time, mainly through the multiplication of the gkts (?) service. It only used 3.8MB at a time, but at one point there were 20 instances of it open. I concluded it was Ubuntu One causing the problem, so I reinstalled again, this time not logging into Ubuntu One. No problems for 4 hours, even as I installed software. Then I ran the automatic software update, and the problems began again immediately.

Constant crashing, crazy graphic corruption and other issues. I checked the system log and found a similar error:

kernel [224.243459] [drm: Enable RC6 States: RC6 off, RC6p off, RC6pp off]
kernel [246.465377] [drm: i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed GPU hung

etc., etc.

This is a nightmare. Need a fix.

Matthew Eaton (meaton) wrote on 2013-03-25:

#38

Test kernel did not fix the issue for me.

Linux matt-work 3.5.0-27-generic #45~lp1140716v1 SMP Fri Mar 22 15:50:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Mar 25 08:12:15 matt-work kernel: [ 158.302349] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 25 08:12:15 matt-work kernel: [ 158.302353] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
Mar 25 08:12:15 matt-work kernel: [ 158.305230] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
Mar 25 08:12:36 matt-work kernel: [ 179.663557] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 25 08:12:36 matt-work kernel: [ 179.663780] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
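
A log excerpt like the one above can be summarised with a small script. What follows is a minimal Python sketch, not part of the original report: it pulls the kernel timestamps of the hangcheck errors out of syslog lines and prints the gap between consecutive hangs. The helper name `hang_timestamps` and the regular expression are illustrative assumptions based only on the line format quoted above.

```python
import re

# Matches the kernel timestamp in "[ 158.302349] [drm:i915_hangcheck_hung] *ERROR* ..."
# The pattern is an assumption derived from the log lines quoted in this comment.
HANG_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\[drm:i915_hangcheck_hung\] \*ERROR\* Hangcheck timer elapsed"
)


def hang_timestamps(lines):
    """Return the kernel timestamps (seconds since boot) of hangcheck errors."""
    stamps = []
    for line in lines:
        match = HANG_RE.search(line)
        if match:
            stamps.append(float(match.group(1)))
    return stamps


# Lines copied from the log excerpt in this comment.
SAMPLE = [
    "Mar 25 08:12:15 matt-work kernel: [ 158.302349] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung",
    "Mar 25 08:12:15 matt-work kernel: [ 158.305230] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off",
    "Mar 25 08:12:36 matt-work kernel: [ 179.663557] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung",
    "Mar 25 08:12:36 matt-work kernel: [ 179.663780] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off",
]

if __name__ == "__main__":
    stamps = hang_timestamps(SAMPLE)
    print(f"{len(stamps)} hangs found")
    for earlier, later in zip(stamps, stamps[1:]):
        print(f"gap between hangs: {later - earlier:.1f} s")
```

Feeding it the lines of a real syslog file (for example via `open(path)`) instead of `SAMPLE` should work the same way, as long as the messages keep this format.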

Joseph Salisbury (jsalisbury) wrote on 2013-03-25:

#39

Thanks, everyone, for testing. So it sounds like my test kernel did not fix this bug. However, it sounds like this bug is fixed in the v3.9 mainline kernel, at least in rc3.

I can perform a "Reverse" kernel bisect to identify the commit that fixes this bug. It will first require us to identify the first v3.9 release candidate that does not exhibit this bug.

We know that it is fixed in rc3, so it would be good to test rc1 and rc2. Can folks affected by this bug test those two release candidates:

v3.9-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.9-rc1-raring/
v3.9-rc2: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.9-rc2-raring/
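
For readers unfamiliar with the idea, a "reverse" bisect is an ordinary binary search with the roles swapped: instead of hunting for the first broken build, it hunts for the first build in which the bug is gone. The Python sketch below only illustrates that search logic, under the assumption that builds are ordered chronologically and that the bug stays fixed once it is fixed. The function name `first_fixed`, the labels `c0` to `c7` and the position of the fix are hypothetical and are not taken from the actual kernel history.

```python
def first_fixed(candidates, is_fixed):
    """Return the index of the first candidate for which is_fixed() is true.

    Assumes `candidates` is in chronological order and that every broken
    build precedes every fixed build. If even the oldest candidate is
    fixed, index 0 is returned, meaning the fix lies at or before it.
    """
    if not candidates:
        raise ValueError("no candidates to bisect")
    lo, hi = 0, len(candidates) - 1
    if not is_fixed(candidates[hi]):
        raise ValueError("newest candidate is still broken; nothing to bisect")
    while lo < hi:
        mid = (lo + hi) // 2
        if is_fixed(candidates[mid]):
            hi = mid          # fix is at mid or earlier
        else:
            lo = mid + 1      # fix is strictly after mid
    return lo


if __name__ == "__main__":
    # Hypothetical build labels; pretend testers report the bug gone from c5 on.
    builds = [f"c{i}" for i in range(8)]
    tested_fixed = lambda build: int(build[1:]) >= 5
    print("first fixed build:", builds[first_fixed(builds, tested_fixed)])
```

In practice `is_fixed` is a human booting the build and reporting back, which is why narrowing the range with rc1 and rc2 first saves a lot of test cycles.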

Matthew Eaton (meaton) wrote on 2013-03-25:

#40

I've been on the rc1 kernel for about 3 hours with no problem.

Linux matt-work 3.9.0-030900rc1-generic #201303060659 SMP Wed Mar 6 12:00:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Joseph Salisbury (jsalisbury) on 2013-03-29

tags:

added: kernel-key

Robert Hooker (sarvatt) on 2013-04-02

Changed in linux (Ubuntu Precise):
status:	Invalid → Confirmed
importance:	Undecided → Critical

Joseph Salisbury (jsalisbury) on 2013-04-02

Changed in linux (Ubuntu Raring):
assignee:	Ubuntu Kernel Team (ubuntu-kernel-team) → nobody
assignee:	nobody → Canonical Kernel Team (canonical-kernel-team)

Robert Hooker (sarvatt) on 2013-04-02

summary:	- [regression] 3.5.0-26-generic GPU hangs + [regression] 3.5.0-26-generic and 3.2.0-39-generic GPU hangs
summary:	- [regression] 3.5.0-26-generic and 3.2.0-39-generic GPU hangs + [regression] 3.5.0-26-generic and 3.2.0-39-generic GPU hangs on + Sandybridge

Robert Hooker (sarvatt) on 2013-04-02

Changed in linux (Ubuntu Raring):
status:	Confirmed → Invalid
Changed in linux (Ubuntu Quantal):
assignee:	nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Precise):
assignee:	nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Raring):
assignee:	Canonical Kernel Team (canonical-kernel-team) → nobody

Kamil (lampshade-t) on 2013-04-06

tags:

added: apport-collected

Aymeric (mulx) on 2013-04-09

tags:

added: precise

Joseph Salisbury (jsalisbury) on 2013-04-09

tags:	removed: kernel-key
tags:	removed: performing-bisect

Tim Gardner (timg-tpi) on 2013-04-09

Changed in linux (Ubuntu Quantal):
status:	Confirmed → Fix Committed
Changed in linux (Ubuntu Precise):
status:	Confirmed → Fix Committed

Steve Conklin (sconklin) on 2013-04-15

tags:	added: verification-needed-precise
tags:	added: verification-needed-quantal

Roman Shipovskij (roman-shipovskij) on 2013-04-16

tags:

added: verification-done-precise
removed: verification-needed-precise

Pat McGowan (pat-mcgowan) on 2013-04-17

tags:

added: verification-done-quantal
removed: verification-needed-quantal

Roman Shipovskij (roman-shipovskij) on 2013-04-18

tags:

added: verification-failed-precise
removed: verification-done-precise

Stephan Springer (geryon) on 2013-04-18

tags:

added: verification-done-precise
removed: verification-failed-precise

Sergio (sergio-otero) on 2013-04-23

Changed in linux (Ubuntu Precise):
status:	Fix Committed → Fix Released

Brad Figg (brad-figg) on 2013-04-23

Changed in linux (Ubuntu Precise):
status:	Fix Released → Fix Committed

Manaf Yusuf (manaf-yousif) on 2013-04-29

Changed in linux (Ubuntu Raring):
status:	Invalid → New

Launchpad Janitor (janitor) on 2013-04-30

Changed in linux (Ubuntu Raring):
status:	New → Confirmed

Joseph Salisbury (jsalisbury) on 2013-05-01

tags:

added: kernel-da-key

Launchpad Janitor (janitor) on 2013-05-01

Changed in linux (Ubuntu Precise):
status:	Fix Committed → Fix Released
Changed in linux-lts-quantal (Ubuntu Precise):
status:	Confirmed → Fix Released

Launchpad Janitor (janitor) on 2013-05-01

Changed in linux (Ubuntu Quantal):
status:	Fix Committed → Fix Released

Joseph Salisbury (jsalisbury) on 2013-05-06

Changed in linux (Ubuntu Raring):
assignee:	nobody → Canonical Kernel Team (canonical-kernel-team)

Joseph Salisbury (jsalisbury) on 2013-05-29

tags:

added: kernel-stable-key

Bug Watch Updater (bug-watch-updater) on 2013-06-26

Changed in linux (Debian):
status:	Unknown → New

Bug Watch Updater (bug-watch-updater) on 2013-06-26

Changed in linux:
importance:	Unknown → Medium
status:	Unknown → Invalid

Bug Watch Updater (bug-watch-updater) on 2013-09-06

Changed in dri:
importance:	Unknown → Medium
status:	Unknown → Confirmed

franglais.125 (franglais.125-deactivatedaccount) on 2013-09-20

no longer affects:	linux-lts-raring (Ubuntu Quantal)
no longer affects:	linux-lts-raring (Ubuntu Raring)

Ubuntu Foundations Team Bug Bot (crichton) on 2013-09-21

tags:

added: patch

Launchpad Janitor (janitor) on 2013-10-04

Changed in linux-lts-raring (Ubuntu Precise):
status:	New → Confirmed
Changed in linux-lts-raring (Ubuntu):
status:	New → Confirmed

gokul (gokulnathonline) on 2013-10-08

information type:	Public → Public Security
information type:	Public Security → Public

theghost (theghost) on 2013-10-15

tags:

added: saucy

Alberto Salvia Novella (es20490446e) on 2013-10-21

Changed in linux-lts-quantal (Ubuntu):
importance:	Undecided → Critical
Changed in linux-lts-quantal (Ubuntu Quantal):
importance:	Undecided → Critical
Changed in linux-lts-quantal (Ubuntu Raring):
importance:	Undecided → Critical
Changed in linux-lts-raring (Ubuntu):
importance:	Undecided → Critical
Changed in linux-lts-raring (Ubuntu Precise):
importance:	Undecided → Critical
Changed in linux-lts-quantal (Ubuntu Precise):
status:	Fix Released → Invalid
Changed in linux (Ubuntu Raring):
status:	Confirmed → Invalid
Changed in linux (Ubuntu Quantal):
status:	Fix Released → Invalid
Changed in linux (Ubuntu Precise):
status:	Fix Released → Invalid

Alberto Salvia Novella (es20490446e) on 2013-10-21

Changed in linux-lts-raring (Ubuntu):
status:	Confirmed → Triaged

Alberto Salvia Novella (es20490446e) on 2013-10-21

Changed in linux-lts-raring (Ubuntu):
status:	Triaged → Invalid
Changed in linux-lts-raring (Ubuntu Precise):
status:	Confirmed → Invalid
Changed in mesa (Ubuntu):
importance:	Undecided → Critical
status:	New → Triaged
Changed in linux (Ubuntu Precise):
assignee:	Canonical Kernel Team (canonical-kernel-team) → nobody
Changed in linux (Ubuntu Quantal):
assignee:	Canonical Kernel Team (canonical-kernel-team) → nobody
Changed in linux (Ubuntu Raring):
assignee:	Canonical Kernel Team (canonical-kernel-team) → nobody
Changed in mesa (Ubuntu Precise):
status:	New → Triaged
importance:	Undecided → Critical

Bug Watch Updater (bug-watch-updater) on 2014-03-20

Changed in dri:
status:	Confirmed → In Progress

Mathew Hodson (mhodson) on 2015-02-18

Changed in linux:
importance:	Medium → Unknown
status:	Invalid → Unknown
affects:	linux → mesa

Mathew Hodson (mhodson) on 2015-02-18

tags:

removed: saucy

Mathew Hodson (mhodson) on 2015-02-18

tags:

added: metabug

Andy Whitcroft (apw) on 2015-05-21

Changed in linux-lts-quantal (Ubuntu Precise):
status:	Invalid → Fix Committed
Changed in linux (Ubuntu Precise):
status:	Invalid → Fix Committed

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2015-09-02:

#518

*** Bug 91832 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Samuel Rakitničan (semirocket) wrote on 2015-09-25:

#519

(In reply to Chris Wilson from comment #192)
> (In reply to comment #191)
> > What information is most useful for these repeating issues, as it just
> > happened again:
> >
> > Sep 16 08:32:59 arrowsmithlap1 kernel: [1182242.139690] [drm] stuck on
> > render ring
> > Sep 16 08:32:59 arrowsmithlap1 kernel: [1182242.139699] [drm] stuck on
> > blitter ring
>
> So long as it is the same event, there is no more information we need other
> than testing feedback for an eventual workaround.

Is this the same bug?

$ journalctl -p 3 -b -1
Ruj 25 02:13:01 crnigrom kernel: [drm:fw_domains_get [i915]] *ERROR* render: timed out waiting for forcewake ack request.
Ruj 25 02:13:01 crnigrom kernel: [drm:__gen6_gt_wait_for_thread_c0.isra.16 [i915]] *ERROR* GT thread status wait timed out
... [ repeated messages ] ...
Ruj 25 02:13:33 crnigrom kernel: [drm:fw_domains_get [i915]] *ERROR* render: timed out waiting for forcewake ack request.
Ruj 25 02:13:33 crnigrom kernel: [drm:__gen6_gt_wait_for_thread_c0.isra.16 [i915]] *ERROR* GT thread status wait timed out
Ruj 25 02:13:34 crnigrom kernel: [drm:stop_ring [i915]] *ERROR* render ring : timed out trying to stop ring
Ruj 25 02:13:34 crnigrom kernel: [drm:init_ring_common [i915]] *ERROR* render ring initialization failed ctl 00000000 (valid? 0) head 00000000 tail 00000000 start 00000000 [expected 00000000]
Ruj 25 02:13:34 crnigrom kernel: [drm:i915_reset [i915]] *ERROR* Failed hw init on reset -5
Ruj 25 02:13:34 crnigrom gnome-session[1823]: Unrecoverable failure in required component gnome-shell.desktop

After which GNOME crashes with the "Oh No Something Is Wrong" screen.

$ uname -r
4.1.7-200.fc22.x86_64

Hardware i3-2100 CPU/GPU

This bug has been going on for a long, long time, but at least the computer is not hard freezing anymore. GNOME still crashes, though, so any running GTK applications that are in the middle of something stall.

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2015-09-28:

#520

*** Bug 92118 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2015-10-31:

#521

*** Bug 92739 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Arrowsmith (arrowsmith) wrote on 2015-11-02:

#522

FWIW, my issue (https://bugs.freedesktop.org/show_bug.cgi?id=54226#c191), was resolved by uninstalling various components, re-installing and updating them. I have a hunch (completely unproven) that it was a transparent bit-fail issue from the SSD. By un-installing and re-installing, the files were likely installed to a different location on the drive. It wasn't configuration, as I tried erasing, and even rolling back to defaults, with the problem still persisting. As it was almost daily, prior to uninstall, and hasn't happened since the install, this is all I can attribute it to.

HTH someone.

In freedesktop.org Bugzilla #54226, Jefbed (jefbed) wrote on 2015-11-06:

#523

Created attachment 119432
attachment-28908-0.html

I reported this bug from a system without an SSD. Recently, I have not
seen the kernel messages appear however--currently on linux 4.2.5.

On Sun, Nov 1, 2015 at 10:04 PM, <email address hidden> wrote:

> *Comment # 235 <https://bugs.freedesktop.org/show_bug.cgi?id=54226#c235>
> on bug 54226 <https://bugs.freedesktop.org/show_bug.cgi?id=54226> from
> <email address hidden> <email address hidden> *
>
> FWIW, my issue (https://bugs.freedesktop.org/show_bug.cgi?id=54226#c191), was
> resolved by uninstalling various components, re-installing and updating them. I
> have a hunch (completely unproven) that it was a transparent bit-fail issue
> from the SSD. By un-installing and re-installing, the files were likely
> installed to a different location on the drive. It wasn't configuration, as I
> tried erasing, and even rolling back to defaults, with the problem still
> persisting. As it was almost daily, prior to uninstall, and hasn't happened
> since the install, this is all I can attribute it to.
>
> HTH someone.

In freedesktop.org Bugzilla #54226, Arrowsmith (arrowsmith) wrote on 2015-11-06:

#524

(In reply to Jeffrey E. Bedard from comment #236)
> Created attachment 119432 [details]
> attachment-28908-0.html
>
> I reported this bug from a system without an SSD. Recently, I have not
> seen the kernel messages appear however--currently on linux 4.2.5.

Ah, let me clarify that earlier comment: I dd'd a failing spinning drive to an SSD. There was lots of clicking. Upgraded packages as they came in, but no change. Only the uninstall and re-install cleared the repeat button. :)

In freedesktop.org Bugzilla #54226, Jefbed (jefbed) wrote on 2015-11-06:

#525

Created attachment 119433
attachment-32271-0.html

I think this bug can be marked as closed with the latest linux/mesa/xorg
versions :)

On Fri, Nov 6, 2015 at 1:47 AM, <email address hidden> wrote:

> *Comment # 237 <https://bugs.freedesktop.org/show_bug.cgi?id=54226#c237>
> on bug 54226 <https://bugs.freedesktop.org/show_bug.cgi?id=54226> from
> <email address hidden> <email address hidden> *
>
> (In reply to Jeffrey E. Bedard from comment #236 <https://bugs.freedesktop.org/show_bug.cgi?id=54226#c236>)> Created attachment 119432 <https://bugs.freedesktop.org/attachment.cgi?id=119432> [details] <https://bugs.freedesktop.org/attachment.cgi?id=119432&action=edit>
> > attachment-28908-0.html
> >
> > I reported this bug from a system without an SSD. Recently, I have not
> > seen the kernel messages appear however--currently on linux 4.2.5.
>
> Ah, let me clarify that earlier comment: I dd'd a failing spinning drive to an
> SSD. There was lots of clicking. Upgraded packages as they came in, but no
> change. Only the uninstall and re-install cleared the repeat button. :)

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2015-11-12:

#526

*** Bug 92927 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2015-11-21:

#527

*** Bug 93057 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Kurt Roeckx (kurt-roeckx) wrote on 2015-11-28:

#528

Created attachment 120189
error state with 4.2 kernel

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2015-12-10:

#529

*** Bug 93331 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2015-12-23:

#530

*** Bug 93482 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2015-12-24:

#531

*** Bug 93493 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2015-12-30:

#532

*** Bug 89524 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2016-01-05:

#533

*** Bug 93595 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2016-01-26:

#534

*** Bug 93876 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2016-02-19:

#535

*** Bug 93824 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2016-03-01:

#536

*** Bug 94057 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, sander eikelenboom (b-linux) wrote on 2016-03-01:

#537

Tuesday, March 1, 2016, 9:43:23 PM, you wrote:

> Comment # 249 on bug 54226 from Chris Wilson
> *** Bug 94057 has been marked as a duplicate of this bug. ***

Sorry to say, but:
Is there a way to get off the CC list of this slightly depressing kind of "catch-all" bug?
It unfortunately doesn't seem to have been going anywhere for the last 3 to 4 years, except
for an endless stream of duplicates being appended.

--
Sander

In freedesktop.org Bugzilla #54226, Jani-nikula (jani-nikula) wrote on 2016-03-02:

#538

(In reply to Sander Eikelenboom from comment #250)
> Is there a way to get off the CC-list of this slightly depressing kind of
> "catch-all" bug ?

CC list is at the top right corner. Choose the address, tick "Remove selected CCs", and hit Save Changes.

I've done this for you now.

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2016-05-02:

#539

*** Bug 95238 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Samantham (samantham) wrote on 2016-06-24:

#540

Chris, I seem to be experiencing this bug in Linux 4.7-rc3 on an X220 ThinkPad with the Intel HD 3000 chipset. I was getting random full system freezes, with the machine unresponsive over the network.

The main messages before the crash were:
Jun 23 19:11:18 athena kernel: [drm:fw_domains_get [i915]] *ERROR* render: timed out waiting for forcewake ack request.
Jun 23 19:11:18 athena kernel: [drm:__gen6_gt_wait_for_thread_c0.isra.7 [i915]] *ERROR* GT thread status wait timed out.

I haven't been able to reproduce the original crash easily, but I CAN reproduce a full system lockup every time by running the following intel-gpu-tools tests (I have not come close to running all the tests, though) [**This may or may not be related to the original crash**]

gem_sync, subtest: bsd2-hang
drv_hangman, subtest: error-state-capture-bit

I do not know if these tests are helpful or related (maybe some are known to fail? not sure).
I have drm debugging turned on for when I ran those tests. (drm.debug=0x1e log_buf_len=1M)
I can post logs of the hangs associated with the two tests/subtests and run any other tests if you desire (with kernel drm debug on), I will wait for the issue to reappear with the drm debug on before posting that log though. By the number of similar bugs you may already have the CALL TRACE and non-debug level logs.
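
As an aside on the `drm.debug=0x1e` setting mentioned above: the value is a bitmask of DRM debug categories. The short Python sketch below decodes such a mask. The bit-to-name table is an assumption based on the category bits as commonly documented for the kernel's DRM core (core, driver, kms, prime, atomic, vbl), so it should be checked against the kernel version actually in use; the function name `decode_drm_debug` is purely illustrative.

```python
# Assumed mapping of drm.debug bits to category names; verify against the
# DRM documentation of the kernel being tested before relying on it.
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
    0x10: "atomic",
    0x20: "vbl",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled in `mask`."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


if __name__ == "__main__":
    print(decode_drm_debug(0x1E))
```

Under that assumed table, 0x1e enables everything from "driver" to "atomic" while leaving the very chatty "core" category off.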

I know how to patch and am able to compile kernels to test. The bug affects me maybe once every 1 or 2 days. I use Xorg with Glamor. I have been seeing these crashes since 4.6 (maybe 4.5 or earlier, not sure).

I know how to apply patches and am able to compile drm-next or any patches you have to see if this issue can be isolated. Thanks, sorry for the long response.

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2016-08-11:

#541

*** Bug 97304 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2016-08-23:

#542

*** Bug 97451 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Yann-argotti (yann-argotti) wrote on 2016-10-17:

#543

*** Bug 98294 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2016-11-21:

#544

*** Bug 98807 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2017-03-17:

#545

*** Bug 100245 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Ricardo-vega-u (ricardo-vega-u) wrote on 2017-05-09:

#546

Adding tag into "Whiteboard" field - ReadyForDev
The bug is still active
*Status is correct
*Platform is included
*Feature is included
*Priority and Severity correctly set
*Logs included

In freedesktop.org Bugzilla #54226, Samuel Rakitničan (semirocket) wrote on 2017-07-14:

#547

I don't seem to be getting the mentioned GNOME crashes on my Sandybridge anymore with mainline kernels. That is currently 4.11, and I think I was not getting any issues even with 4.10. With mainline longterm 4.4.61 and the default CentOS 7 kernels, I am definitely getting very frequent GPU crashes that bring down GNOME.

So it is either fixed for good, or it has become much rarer. The issue I am/was experiencing happens when GNOME is running; it does not happen when only GDM is loaded. System load does not seem to affect whether the bug triggers: it can happen at any time, whether the machine is idle or loaded.

In freedesktop.org Bugzilla #54226, Elizabethx-de-la-torre-mena (elizabethx-de-la-torre-mena) wrote on 2017-07-31:

#548

(In reply to samuel.rakitnican from comment #260)
> I doesn't seem to be getting mentioned Gnome crashes on my sandybridge
> anymore with mainline kernels, that is currently 4.11 and I think even with
> 4.10 I was not getting any issues, with mainline longterm 4.4.61 and default
> centos 7 kernels I am definitely getting very frequent GPU crashes that
> brings down Gnome.
>
> So it is either fixed for good, or it become much rarer. The issue I am/was
> experiencing happens when Gnome is running, it does not happen when only GDM
> is loaded. System load seems to not have effect on the bug triggering, seems
> to happen any time, on idle, or when machine is loaded.
Hopefully it is fixed for good. I'm closing this bug. If problems arise with the latest kernel versions (https://www.kernel.org/), please open a NEW bug with HW and SW information, steps to reproduce and relevant logs. Thank you.

Bug Watch Updater (bug-watch-updater) on 2017-07-31

Changed in dri:
status:	In Progress → Fix Released

Bug Watch Updater (bug-watch-updater) on 2017-10-28

Changed in linux (Fedora):
importance:	Unknown → Undecided
status:	Unknown → Won't Fix

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2017-10-28:

#549

(In reply to Elizabeth from comment #261)
> (In reply to samuel.rakitnican from comment #260)
> > I doesn't seem to be getting mentioned Gnome crashes on my sandybridge
> > anymore with mainline kernels, that is currently 4.11 and I think even with
> > 4.10 I was not getting any issues, with mainline longterm 4.4.61 and default
> > centos 7 kernels I am definitely getting very frequent GPU crashes that
> > brings down Gnome.
> >
> > So it is either fixed for good, or it become much rarer. The issue I am/was
> > experiencing happens when Gnome is running, it does not happen when only GDM
> > is loaded. System load seems to not have effect on the bug triggering, seems
> > to happen any time, on idle, or when machine is loaded.
> Hopefully, is fixed for good. I'm closing this bug, if problem arise with
> latest kernel versions https://www.kernel.org/ please open a NEW bug with HW
> and SW information, steps to reproduce and relevant logs.Thank you.

There was no fix for this HW issue.

In freedesktop.org Bugzilla #54226, Aaron-lu-a (aaron-lu-a) wrote on 2017-10-31:

#550

Created attachment 135173
gpu error file on 4.13.5-200.fc26.x86_64

This problem reappeared on 4.13.5-200.fc26.x86_64 last Friday.

[774249.632109] [drm] GPU HANG: ecode 6:0:0x85fffff8, in Xorg [696], reason: Hang on rcs0, action: reset
[774249.632110] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[774249.632111] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[774249.632111] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[774249.632111] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[774249.632112] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[774249.632172] drm/i915: Resetting chip after gpu hang
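
Since the kernel message asks for the crash dump to be attached, it can help to copy it out soon after a hang, before it is lost on reboot. The following is a minimal Python sketch of that step, assuming the dump lives at the path named in the log above (`/sys/class/drm/card0/error`) and that the script runs with enough privileges to read it, which typically means root. The function name `save_error_state` and the destination file name are hypothetical.

```python
from pathlib import Path


def save_error_state(dest, src="/sys/class/drm/card0/error"):
    """Copy the GPU error state from `src` to `dest`.

    Returns True if a non-empty dump was written, False if the source
    file is missing or contains only whitespace.
    """
    src_path = Path(src)
    if not src_path.exists():
        return False
    data = src_path.read_bytes()
    if not data.strip():
        return False
    Path(dest).write_bytes(data)
    return True


if __name__ == "__main__":
    # Hypothetical destination name; choose whatever suits the bug report.
    if save_error_state("i915_error_state.txt"):
        print("error state saved")
    else:
        print("no error state found")
```

The saved file can then be compressed and attached to a new report, as the kernel message requests.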

Bug Watch Updater (bug-watch-updater) on 2017-10-31

Changed in dri:
status:	Fix Released → Confirmed

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2017-11-20:

#551

commit 0da715ee60774401bea00dc71fca6fd1096c734a
Author: Chris Wilson <email address hidden>
Date: Mon Nov 20 20:55:02 2017 +0000

drm/i915: Disable semaphores on Sandybridge

Bug Watch Updater (bug-watch-updater) on 2017-11-21

Changed in dri:
status:	Confirmed → Won't Fix

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2017-12-13:

#552

*** Bug 104243 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2017-12-17:

#553

*** Bug 104304 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2018-01-24:

#554

*** Bug 104772 has been marked as a duplicate of this bug. ***

In freedesktop.org Bugzilla #54226, Jani-saarinen-g (jani-saarinen-g) wrote on 2018-03-28:

#555

I will close this now.

In freedesktop.org Bugzilla #54226, Chris Wilson (ickle) wrote on 2018-04-18:

#556

*** Bug 106119 has been marked as a duplicate of this bug. ***

Timo Aaltonen (tjaalton) wrote on 2019-04-05:

#557

closing mesa as fixed according to upstream years ago

Changed in mesa (Ubuntu):
status:	Triaged → Fix Released
Changed in mesa (Ubuntu Precise):
status:	Triaged → Fix Released
Changed in linux-lts-quantal (Ubuntu Precise):
status:	Fix Committed → Fix Released
Changed in linux (Ubuntu Precise):
status:	Fix Committed → Fix Released