i915 0000:00:02.0: Resetting rcs0 for stuck wait on rcs0

Bug #1862865 reported by Haw Loeung on 2020-02-12
This bug affects 5 people
Affects          Status                   Importance  Assigned to  Milestone
linux (Ubuntu)   Status tracked in Focal
  Focal                                   Undecided   Unassigned
mesa (Ubuntu)    Status tracked in Focal
  Focal                                   Undecided   Unassigned

Bug Description

Hi,

Every now and again, my session locks up with the following logged:

| i915 0000:00:02.0: Resetting rcs0 for hang on rcs0

Most of the time, it recovers on its own, but sometimes it locks up completely, requiring a reboot. When that happens, this is logged:

| Feb 2 10:37:23 dharkan kernel: [71018.593184] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:23 dharkan kernel: [71018.593993] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
| Feb 2 10:37:23 dharkan kernel: [71018.594130] i915 0000:00:02.0: Resetting chip for hang on rcs0
| Feb 2 10:37:23 dharkan kernel: [71018.595937] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
| Feb 2 10:37:23 dharkan kernel: [71018.596728] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
| Feb 2 10:37:28 dharkan kernel: [71022.816232] Asynchronous wait on fence i915:compiz[1495]:105ece timed out (hint:intel_atomic_commit_ready+0x0/0x54 [i915])
| Feb 2 10:37:31 dharkan kernel: [71026.593149] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:31 dharkan kernel: [71026.593958] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
| Feb 2 10:37:31 dharkan kernel: [71026.596916] i915 0000:00:02.0: Resetting chip for hang on rcs0
| Feb 2 10:37:31 dharkan kernel: [71026.598661] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
| Feb 2 10:37:31 dharkan kernel: [71026.599391] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
| Feb 2 10:37:35 dharkan kernel: [71030.592141] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:39 dharkan kernel: [71034.323140] GpuWatchdog[1860]: segfault at 0 ip 000055689a4eb33d sp 00007f69b674d700 error 6 in chrome[5568968fd000+6abe000]
| Feb 2 10:37:39 dharkan kernel: [71034.323157] Code: 00 79 09 48 8b 7d a0 e8 31 2c 72 fc 41 8b 85 00 01 00 00 85 c0 0f 84 ab 00 00 00 49 8b 45 00 4c 89 ef be 01 00 00 00 ff 50 58 <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 61 3c 59 03 01 80 bd 7f ff
| Feb 2 10:37:43 dharkan kernel: [71038.592101] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:45 dharkan kernel: [71040.576093] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:47 dharkan kernel: [71042.592066] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:49 dharkan kernel: [71044.576075] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:51 dharkan kernel: [71046.592045] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:53 dharkan kernel: [71048.576098] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:55 dharkan kernel: [71050.592029] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:57 dharkan kernel: [71052.576037] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:37:59 dharkan kernel: [71054.592034] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:01 dharkan kernel: [71056.576021] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:03 dharkan kernel: [71058.592006] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:04 dharkan kernel: [71059.461584] GpuWatchdog[43507]: segfault at 0 ip 000055ac6731833d sp 00007fe71901e700 error 6 in chrome[55ac6372a000+6abe000]
| Feb 2 10:38:04 dharkan kernel: [71059.461600] Code: 00 79 09 48 8b 7d a0 e8 31 2c 72 fc 41 8b 85 00 01 00 00 85 c0 0f 84 ab 00 00 00 49 8b 45 00 4c 89 ef be 01 00 00 00 ff 50 58 <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 61 3c 59 03 01 80 bd 7f ff
| Feb 2 10:38:05 dharkan kernel: [71060.575986] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:07 dharkan kernel: [71062.591982] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:09 dharkan kernel: [71064.575976] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:11 dharkan kernel: [71066.591956] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:13 dharkan kernel: [71068.575937] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:15 dharkan kernel: [71070.591932] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:17 dharkan kernel: [71072.575930] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:19 dharkan kernel: [71074.591922] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:21 dharkan kernel: [71076.575911] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:23 dharkan kernel: [71078.591901] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:25 dharkan kernel: [71080.575883] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:27 dharkan kernel: [71082.591879] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:29 dharkan kernel: [71084.575886] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:31 dharkan kernel: [71086.591860] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:33 dharkan kernel: [71088.575858] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:35 dharkan kernel: [71090.591846] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:37 dharkan kernel: [71092.575821] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| Feb 2 10:38:39 dharkan kernel: [71094.591814] i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
| Feb 2 10:38:39 dharkan kernel: [71094.593429] i915 0000:00:02.0: Resetting chip for hang on rcs0
| Feb 2 10:38:51 dharkan kernel: [71105.759798] Asynchronous wait on fence i915:compiz[1495]:105ed8 timed out (hint:intel_atomic_commit_ready+0x0/0x54 [i915])
| Feb 2 10:38:55 dharkan kernel: [71110.591731] i915 0000:00:02.0: Resetting rcs0 for no progress on rcs0
| Feb 2 10:39:05 dharkan kernel: [71119.765965] GpuWatchdog[43528]: segfault at 0 ip 0000556a07be533d sp 00007fa45c34d700 error 6 in chrome[556a03ff7000+6abe000]
| Feb 2 10:39:05 dharkan kernel: [71119.766015] Code: 00 79 09 48 8b 7d a0 e8 31 2c 72 fc 41 8b 85 00 01 00 00 85 c0 0f 84 ab 00 00 00 49 8b 45 00 4c 89 ef be 01 00 00 00 ff 50 58 <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 61 3c 59 03 01 80 bd 7f ff
| Feb 2 10:39:11 dharkan kernel: [71126.591644] i915 0000:00:02.0: Resetting rcs0 for no progress on rcs0
| Feb 2 14:47:04 dharkan kernel: [ 0.000000] microcode: microcode updated early to revision 0xca, date = 2019-09-26
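To see at a glance how often each kind of reset shows up in an excerpt like the one above, the lines can be tallied by reset target and reason. This is only a minimal sketch: the regular expression is inferred from the messages quoted in this report rather than from the i915 source, and `count_resets` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the lines quoted in this report, e.g.
# "i915 0000:00:02.0: Resetting rcs0 for hang on rcs0"
RESET_RE = re.compile(r"i915 \S+: Resetting (\S+) for (.+?) on (\S+)\s*$")


def count_resets(log_text):
    """Count reset messages keyed by (target, reason), e.g. ('rcs0', 'hang')."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RESET_RE.search(line)
        if match:
            target, reason, _engine = match.groups()
            counts[(target, reason)] += 1
    return counts


# A few lines copied from the log above (syslog prefix trimmed).
sample = """\
[71018.593184] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[71018.594130] i915 0000:00:02.0: Resetting chip for hang on rcs0
[71026.593149] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[71110.591731] i915 0000:00:02.0: Resetting rcs0 for no progress on rcs0
"""

for (target, reason), n in sorted(count_resets(sample).items()):
    print(f"{target:5} {reason:12} {n}")
```

Feeding it the full kernel log instead of `sample` should make the long run of per-engine resets, and the occasional escalation to a full chip reset, easier to spot.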

I'm running Focal, and this seems to have started only after upgrading to Focal (from Disco).

| $ apt-cache policy libgl1-mesa-dri
| libgl1-mesa-dri:
| Installed: 19.3.3-1ubuntu1
| Candidate: 19.3.3-1ubuntu1
| Version table:
| *** 19.3.3-1ubuntu1 500
| 500 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages
| 100 /var/lib/dpkg/status
---
ProblemType: Bug
AlsaVersion: Advanced Linux Sound Architecture Driver Version k5.4.0-14-generic.
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu16
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: hloeung 1216 F.... pulseaudio
Card0.Amixer.info: Error: [Errno 2] No such file or directory: 'amixer'
Card0.Amixer.values: Error: [Errno 2] No such file or directory: 'amixer'
CurrentDesktop: Unity:Unity7:ubuntu
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2017-07-23 (935 days ago)
InstallationMedia: Ubuntu 17.04 "Zesty Zapus" - Release amd64 (20170412)
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Lsusb-t:
 /: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
 /: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/12p, 480M
MachineType: LENOVO 20HRCTO1WW
Package: linux (not installed)
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-14-generic root=UUID=a0c29d11-7825-4e4b-a12c-b4cec62819a6 ro nosmt=force quiet splash drm.vblankoffdelay=0 pcie_aspm=force intel_pstate=enable loop.max_loop=1 i915.enable_psr=0 vt.handoff=7
ProcVersionSignature: Ubuntu 5.4.0-14.17-generic 5.4.18
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-14-generic N/A
 linux-backports-modules-5.4.0-14-generic N/A
 linux-firmware 1.186
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
Tags: focal
Uname: Linux 5.4.0-14-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 12/12/2017
dmi.bios.vendor: LENOVO
dmi.bios.version: N1MET42W (1.27 )
dmi.board.asset.tag: Not Available
dmi.board.name: 20HRCTO1WW
dmi.board.vendor: LENOVO
dmi.board.version: SDK0J40709 WIN
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: None
dmi.modalias: dmi:bvnLENOVO:bvrN1MET42W(1.27):bd12/12/2017:svnLENOVO:pn20HRCTO1WW:pvrThinkPadX1Carbon5th:rvnLENOVO:rn20HRCTO1WW:rvrSDK0J40709WIN:cvnLENOVO:ct10:cvrNone:
dmi.product.family: ThinkPad X1 Carbon 5th
dmi.product.name: 20HRCTO1WW
dmi.product.sku: LENOVO_MT_20HR_BU_Think_FM_ThinkPad X1 Carbon 5th
dmi.product.version: ThinkPad X1 Carbon 5th
dmi.sys.vendor: LENOVO

Haw Loeung (hloeung) wrote :

| 00:02.0 VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02) (prog-if 00 [VGA controller])
| Subsystem: Lenovo ThinkPad X1 Carbon 5th Gen

Linux 5.4.0-9.12

Changed in mesa (Ubuntu):
status: New → Invalid
affects: linux-meta-5.4 (Ubuntu Focal) → linux (Ubuntu Focal)

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1862865

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

apport information

tags: added: apport-collected focal
description: updated
Haw Loeung (hloeung) wrote : CRDA.txt

apport information

Changed in linux (Ubuntu Focal):
status: Incomplete → New
status: New → Confirmed
Haw Loeung (hloeung) wrote :

| [Fri Feb 14 18:55:37 2020] i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
| [Fri Feb 14 18:55:37 2020] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
| [Fri Feb 14 18:55:37 2020] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
| [Fri Feb 14 18:55:37 2020] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
| [Fri Feb 14 18:55:37 2020] The GPU crash dump is required to analyze GPU hangs, so please always attach it.
| [Fri Feb 14 18:55:37 2020] GPU crash dump saved to /sys/class/drm/card0/error
| [Fri Feb 14 18:55:37 2020] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
| [Fri Feb 14 18:55:45 2020] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0

So I've attached the saved crash dump from /sys/class/drm/card0/error.
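For anyone else hitting this, one way to grab the dump before a reboot loses it is to copy the file named in the kernel message to a regular file. A minimal sketch, assuming the path from the log above exists on the affected machine and that the script has permission to read it (this may require root); `save_crash_dump` is a hypothetical helper name.

```python
from pathlib import Path

# Path taken from the kernel message quoted above; reading it may need root.
ERROR_STATE = Path("/sys/class/drm/card0/error")


def save_crash_dump(dest, src=ERROR_STATE):
    """Copy the GPU error state to dest.

    Returns the number of bytes written, or None if src does not exist.
    """
    src = Path(src)
    if not src.exists():
        return None
    data = src.read_bytes()
    Path(dest).write_bytes(data)
    return len(data)
```

For example, `save_crash_dump("i915-error.txt")` would write a copy to the current directory that can then be attached to the bug.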

Haw Loeung (hloeung) wrote :

Also, I even tried disabling the power saving feature (i915.enable_psr=0) but it still seems to lock up or hang.
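To double-check that an option such as i915.enable_psr=0 was actually passed to the running kernel, the boot command line can be split into key/value pairs. A minimal sketch: the string below is copied from the ProcKernelCmdLine field in the description, `parse_cmdline` is a hypothetical helper, and on a live system the same text would normally be read from /proc/cmdline. It does not handle quoted values containing spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict.

    "key=value" tokens map key to value (split at the first "=");
    bare flags such as "quiet" map to True.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


# Copied from ProcKernelCmdLine in the bug description.
CMDLINE = (
    "BOOT_IMAGE=/vmlinuz-5.4.0-14-generic "
    "root=UUID=a0c29d11-7825-4e4b-a12c-b4cec62819a6 ro nosmt=force quiet splash "
    "drm.vblankoffdelay=0 pcie_aspm=force intel_pstate=enable loop.max_loop=1 "
    "i915.enable_psr=0 vt.handoff=7"
)

params = parse_cmdline(CMDLINE)
print("i915.enable_psr =", params.get("i915.enable_psr", "(not set)"))
```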

meskaya (meskaya) wrote :

I am on Focal Fossa and have just installed the drm-tip kernel.
Let's see how it goes :)

meskaya (meskaya) wrote :

New kernel with a fix?

Changelog
linux (5.4.0-16.19) focal; urgency=medium

  * focal/linux: 5.4.0-16.19 -proposed tracker (LP: #1864889)

  * system hang: i915 Resetting rcs0 for hang on rcs0 (LP: #1861395)
    - drm/i915/execlists: Always force a context reload when rewinding RING_TAIL

I am not having any issues with the drm-tip kernel.

I will try 5.4.0-16 later today.

Haw Loeung (hloeung) wrote :

Duplicate of LP:1861395

Per LP:1861395, both 5.4.0-17 (from the Kernel Team's Unstable PPA) and 5.4.0-18 (from -proposed) seem to fix it for me.
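A quick way to tell whether a given machine is already on a kernel that should carry the fix is to compare its release string against the first fixed version. A minimal sketch, assuming the Ubuntu Focal 5.4 kernel series only: the 5.4.0-16 threshold comes from the changelog quoted earlier in this thread (only 5.4.0-17 and 5.4.0-18 were confirmed above), and `abi_tuple` and `has_fix` are hypothetical helper names.

```python
import re

# First Focal kernel whose changelog (quoted earlier in this thread) lists the fix.
FIRST_FIXED = (5, 4, 0, 16)


def abi_tuple(release):
    """Turn '5.4.0-14-generic' or '5.4.0-16.19' into (5, 4, 0, 14) / (5, 4, 0, 16)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(release, first_fixed=FIRST_FIXED):
    """True if the release is at or after the first fixed version."""
    return abi_tuple(release) >= first_fixed


for release in ("5.4.0-14-generic", "5.4.0-18-generic"):
    print(release, "->", "has fix" if has_fix(release) else "predates fix")
```

On a live system the release string would typically come from `platform.release()` or `uname -r`.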
