GPU Hang in Xorg rcs0 drm/i915

Bug #1723680 reported by Aun
This bug affects 7 people
Affects          Status     Importance  Assigned to  Milestone
linux (Ubuntu)   Triaged    High        Unassigned
  Zesty          Triaged    High        Unassigned
  Artful         Won't Fix  High        Unassigned

Bug Description

Distributor ID: Ubuntu
Description: Ubuntu Artful Aardvark (development branch)
Release: 17.10
Codename: artful

With kernels linux-image-4.13.0-12-generic through linux-image-4.13.0-16-generic, Xorg fails to start and the kernel logs the error messages below. The kernel boots, displays its messages, and shows the GUI prompt for the disk encryption key, then tries to start the Xorg login. The screen goes blank, or sometimes shows part of the login dialog. It then hangs and goes blank again, usually ending at Xorg's low-graphics-mode error screen, and after a while nothing responds. I can still switch to a console with Ctrl+Alt+F1. With linux-image-4.10.0-35-generic everything works as expected. The system worked without issue on Ubuntu 17.04.

Kernel error messages:
[ 43.836641] [drm] GPU HANG: ecode 8:0:0x00dfffff, in Xorg [1027], reason: Hang on rcs0, action: reset
[ 43.836645] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 43.836646] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 43.836647] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 43.836648] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 43.836649] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 43.836718] drm/i915: Resetting chip after gpu hang
[ 55.812149] drm/i915: Resetting chip after gpu hang
[ 63.840645] drm/i915: Resetting chip after gpu hang
[ 71.836614] drm/i915: Resetting chip after gpu hang
[ 79.836882] drm/i915: Resetting chip after gpu hang
[ 88.828508] drm/i915: Resetting chip after gpu hang
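
For anyone comparing logs across kernels, the hang report and the repeated resets above can be pulled out of a saved dmesg file with a few lines of Python. This is only a minimal sketch: the regular expressions are written against the exact message wording shown in this report, and the names `HANG_RE`, `RESET_RE` and `summarize_hangs` are made up for illustration. Other kernel versions may word these messages differently.

```python
import re

# Patterns written against the i915 messages quoted in this report
# (an assumption: other kernel versions may phrase them differently).
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<proc>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)
RESET_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+drm/i915: Resetting chip after gpu hang"
)


def summarize_hangs(log_text):
    """Return (first hang report as a dict or None, list of reset timestamps)."""
    hang = None
    resets = []
    for line in log_text.splitlines():
        m = HANG_RE.search(line)
        if m:
            if hang is None:
                hang = m.groupdict()
            continue
        m = RESET_RE.search(line)
        if m:
            resets.append(float(m.group("ts")))
    return hang, resets


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
[ 43.836641] [drm] GPU HANG: ecode 8:0:0x00dfffff, in Xorg [1027], reason: Hang on rcs0, action: reset
[ 43.836718] drm/i915: Resetting chip after gpu hang
[ 55.812149] drm/i915: Resetting chip after gpu hang
[ 63.840645] drm/i915: Resetting chip after gpu hang
"""

if __name__ == "__main__":
    hang, resets = summarize_hangs(SAMPLE)
    print(hang)
    print(len(resets), "resets")
```

Counting the resets and the gaps between them makes it easier to tell a single recoverable hang from the reset loop seen here.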

I have tried the kernel parameters intel_iommu=off, intel_iommu=igfx_off, and i915.enable_rc6=0, with no improvement.
The lshw output, dmesg output, and card0/error file are attached.
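
When testing boot parameters like the ones above, it helps to confirm that they actually reached the running kernel. The sketch below reads `/proc/cmdline` and splits it into name/value pairs. The helper names are hypothetical, and the parsing is deliberately naive: it splits on whitespace and does not handle quoted values that contain spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}.

    Bare flags such as "quiet" map to None. Naive sketch: quoted values
    containing spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def read_running_cmdline(path="/proc/cmdline"):
    """Parse the command line of the currently running kernel (Linux only)."""
    with open(path) as f:
        return parse_cmdline(f.read())


if __name__ == "__main__":
    # Illustrative command line, not taken from the reporter's machine.
    example = "ro quiet splash intel_iommu=igfx_off i915.enable_rc6=0"
    params = parse_cmdline(example)
    print(params.get("intel_iommu"), params.get("i915.enable_rc6"))
```

If a parameter set in the boot loader does not show up here, the test of that parameter says nothing about the hang.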

Aun (aun-sswick) wrote :
description: updated
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1723680

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: artful
Aun (aun-sswick) wrote :

Unable to run apport-collect without a graphical environment for the browser.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.14 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc5

tags: added: kernel-da-key
Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Incomplete
Aun (aun-sswick) wrote :

Did this issue start happening after an update/upgrade?
     It started after upgrading to 17.10 artful aardvark.

Was there a prior kernel version where you were not having this particular problem?
     It worked/works with 4.10.0-35 (still in grub boot list)

Would it be possible for you to test the latest upstream kernel?
     Installed and booted v4.14-rc5. The issue got worse. Completely wedged at what should be the lightdm login. Just a non-blinking cursor in the top left of the screen, no

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-bug-exists-upstream
Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report[0]? That will allow the upstream Developers to examine the issue, and may provide a quicker resolution to the bug.

Please follow the instructions on the wiki page[0]. The first step is to email the appropriate mailing list. If no response is received, then a bug may be opened on bugzilla.kernel.org.

Once this bug is reported upstream, please add the tag: 'kernel-bug-reported-upstream'.

[0] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Changed in linux (Ubuntu Zesty):
status: New → Triaged
importance: Undecided → High
tags: added: needs-bisect
Aun (aun-sswick) wrote :

I am unable to get useful debug info from the kernel for a bug report. It hangs at a black screen and will not respond to any of the magic SysRq commands. I tried the Intel drm-tip kernel too, with the same result: fully hung, can't ssh in, no console, nothing, completely wedged. I don't want to send a bug report without details.

I'm building some tags between the working and non-working versions and will try a bisect after I narrow it down. A build of Linus's v4.10 tag works fine.

Any suggestions?

Aun (aun-sswick) wrote :

v4.11 good
v4.12 bad

I am starting bisect testing.

dad@octo:~/src/linux$ git bisect start -- drivers/gpu/drm/i915
dad@octo:~/src/linux$ git bisect bad v4.12
dad@octo:~/src/linux$ git bisect good
Bisecting: 397 revisions left to test after this (roughly 9 steps)
[3dc38eea665f383c84cc8d858b9a7645c0b29c54] drm/i915: Remove direct usages of intel_crtc->config from DDI code

Aun (aun-sswick) wrote :

bisect build result
3dc38eea665f383c84cc8d858b9a7645c0b29c54 bad
72affdf good
309663ab7b4f0de1540aff212fd067e3dd92acf3 bad
8ee7c6e23bb1b3c37ef27e81395db056bd7eac53 bad
ec151f31cd81cc99b957d6b528709d6ecfb25801 bad
1188bc66eb33e64ac7452b5acd62ce0395204148 good
f0a22974acbdd17b03cff4bdee880e4f08cccf6d bad
9231da70b338b336b982c74fad4afab5b55e6534 bad
8448661d65f6f5dbcdb9c5cba185b284f2464b65 bad
cbc4e9e6a6d31fcc44921d2be41104425be8ab01 good

dad@octo:~/src/linux$ git bisect good
8448661d65f6f5dbcdb9c5cba185b284f2464b65 is the first bad commit
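
For readers unfamiliar with bisecting, the idea behind the run above can be sketched as a binary search. This is a simplified illustration, not how git implements it: git bisect works on the full commit graph (here limited to drivers/gpu/drm/i915), while the sketch assumes a plain linear list ordered oldest to newest, with the first entry known good and the last known bad. The function name `first_bad` and the integer "commits" are made up for the example.

```python
def first_bad(commits, is_bad):
    """Binary-search a linear, oldest-to-newest commit list.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    Returns (first_bad_commit, number_of_builds_tested).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi], tests


if __name__ == "__main__":
    # Stand-in history: integers instead of real commit hashes.
    history = list(range(400))
    culprit, tests = first_bad(history, lambda c: c >= 250)
    print(culprit, tests)
```

Each test halves the remaining range, which is why a few hundred candidate revisions need only around nine builds: log2(397) lies between 8 and 9, in line with git's "roughly 9 steps" estimate.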

Aun (aun-sswick) wrote :

Following bug reporting instructions for DRM i915 reported here:

https://bugs.freedesktop.org/show_bug.cgi?id=103494

Aun (aun-sswick)
tags: added: kernel-bug-reported-upstream
Horst Schirmeier (horst) wrote :

Possibly this bug? https://bugs.freedesktop.org/show_bug.cgi?id=102435
-> Ubuntu 17.10's Mesa 17.2.2 seems to be broken

tags: added: bionic
Aun (aun-sswick) wrote :

Reply to Horst Schirmeier:

>Possibly this bug? https://bugs.freedesktop.org/show_bug.cgi?id=102435
>-> Ubuntu 17.10's Mesa 17.2.2 seems to be broken
No. The 64-bit kernel works; it is the 32-bit build that hangs, and that issue seems to be reported against 64-bit. It could be a hardware-specific issue that is somehow related, but with only one system that hangs, this is all I can say. See https://bugs.freedesktop.org/show_bug.cgi?id=103494; perhaps you will find something there, specific to i915 hanging only on 32-bit kernels after v4.11, that is related.
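
Since the hang here depends on the kernel being a 32-bit build, anyone trying to reproduce it may want to check which kind of kernel they are running. The sketch below is one rough way to do that, assuming an x86 machine where the machine string reported by the kernel looks like `i686` for 32-bit and `x86_64` for 64-bit. The helper name `is_32bit_x86` is hypothetical.

```python
import platform


def is_32bit_x86(machine):
    """True for 32-bit x86 machine strings such as i386, i486, i586, i686."""
    m = machine.lower()
    return len(m) == 4 and m[0] == "i" and m[1] in "3456" and m.endswith("86")


if __name__ == "__main__":
    # platform.machine() returns the machine type, e.g. "i686" or "x86_64".
    machine = platform.machine()
    kind = "32-bit x86" if is_32bit_x86(machine) else "not 32-bit x86"
    print(machine, "->", kind)
```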

Andy Whitcroft (apw) wrote : Closing unsupported series nomination.

This bug was nominated against a series that is no longer supported, i.e. artful. The bug task representing the artful nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Artful):
status: Triaged → Won't Fix
themusicgod1 (themusicgod1) wrote :

I would have marked it for bionic only if it affected bionic too. This should have a bionic task in addition to artful.
