Reproducible i915 GPU hang on Intel Iris Plus Graphics (Ice Lake 8x8 GT2)

Bug #1878670 reported by Kurt Aaholst
This bug affects 3 people
Affects                  Status   Importance  Assigned to  Milestone
OEM Priority Project     Invalid  Medium      Leon Liao
linux (Ubuntu)           Invalid  Undecided   Unassigned
linux-oem-osp1 (Ubuntu)  Invalid  Undecided   Unassigned
mesa (Ubuntu)            Invalid  Undecided   Unassigned

Bug Description

The system frequently freezes and then logs me out.
It happens at apparently random times, with no particular app in use.

Extract of syslog is attached.

Requested info is below:

1) release
Ubuntu 18.04.2 LTS (beaver-osp1-melisa X30)

2) package version
gnome-session:
  Installed: (none)
  Candidate: 3.28.1-0ubuntu3
  Version table:
     3.28.1-0ubuntu3 500
        500 http://dk.archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages
     3.28.1-0ubuntu2 500
        500 http://dk.archive.ubuntu.com/ubuntu bionic/universe amd64 Packages

3) Gnome shouldn't crash

4) Gnome crashed

Kurt Aaholst (kaaholst) wrote :
Daniel van Vugt (vanvugt) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It sounds like some part of the system has crashed. To help us find the cause of the crash please follow these steps:

1. Look in /var/crash for crash files and if found run:
    ubuntu-bug YOURFILE.crash
Then tell us the ID of the newly-created bug.

2. If step 1 failed then look at https://errors.ubuntu.com/user/ID where ID is the content of file /var/lib/whoopsie/whoopsie-id on the machine. Do you find any links to recent problems on that page? If so then please send the links to us.

3. If step 2 also failed then apply the workaround from bug 994921, reboot, reproduce the crash, and retry step 1.

Please take care to avoid attaching .crash files to bugs as we are unable to process them as file attachments. It would also be a security risk for yourself.
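Steps 1 and 2 above can be sketched as two small shell helpers. This is only an illustration: the function names `list_crash_files` and `errors_url` are made up for this sketch, and the default paths are the ones named in the steps. The ID file may not be readable by a normal user, in which case the second helper needs to run as root.

```shell
# Hypothetical helpers; the names are not part of apport or whoopsie.

# Print every *.crash file in the crash directory, or a note when there is none.
list_crash_files() {
    crash_dir="${1:-/var/crash}"
    found=0
    for f in "$crash_dir"/*.crash; do
        [ -e "$f" ] || continue   # the glob did not match anything
        echo "$f"
        found=1
    done
    [ "$found" -eq 1 ] || echo "no crash files in $crash_dir"
}

# Build the errors.ubuntu.com URL from the machine's whoopsie ID.
errors_url() {
    id_file="${1:-/var/lib/whoopsie/whoopsie-id}"
    [ -r "$id_file" ] || { echo "cannot read $id_file" >&2; return 1; }
    echo "https://errors.ubuntu.com/user/$(cat "$id_file")"
}

# Usage on the affected machine:
#   list_crash_files
#   errors_url
```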

tags: added: bionic
affects: gnome-session (Ubuntu) → gnome-shell (Ubuntu)
Changed in gnome-shell (Ubuntu):
status: New → Incomplete
Daniel van Vugt (vanvugt) wrote :

Also it looks like at least one of your crashes originates in the kernel :(

Changed in linux (Ubuntu):
status: New → Incomplete
Daniel van Vugt (vanvugt) wrote :

Please also run:

  apport-collect 1878670

to send more info about the machine.

Anthony Wong (anthonywong) wrote :

Hi Kurt,

I see you have a machine with Ubuntu preloaded. We will try to reproduce your problem.
Meanwhile, please upload more details with apport-collect, as Daniel said.

Changed in linux (Ubuntu):
status: Incomplete → Triaged
Rex Tsai (chihchun)
Changed in oem-priority:
assignee: nobody → Leon Liao (lihow731)
importance: Undecided → Medium
status: New → Incomplete
You-Sheng Yang (vicamo) wrote :

Probably a duplicate of bug 1872001.

Kurt Aaholst (kaaholst) wrote :

Hi,

Thanks for the fast responses.

I have tried to collect the requested information but:

1) There are no files in /var/crash

2) The only item in https://errors.ubuntu.com/user/8dfc634ec... is:
2020-04-28 23:04 2020-04-28 21:04 UTC Crash google-chrome-stable
https://errors.ubuntu.com/oops/6c75c4c0-8996-11ea-a340-fa163e6cac46

3) I commented out line 23 ("'problem_types': ['Bug', 'Package'],") in /etc/apport/crashdb.conf, rebooted, and managed to reproduce the crash (it is not easily reproducible, or at least I haven't found a reliable way to reproduce it). But there are still no files in /var/crash.

I also tried "apport-collect 1878670" (with and without sudo):

kaa@kaa-XPS-13-9300:~$ apport-collect 1878670
ERROR: The python3-launchpadlib package is not installed. This functionality is not available.
kaa@kaa-XPS-13-9300:~$ sudo apt list python3-launchpadlib
Listing... Done
python3-launchpadlib/bionic,bionic,now 1.10.6-1 all [installed]
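The error above contradicts what apt reports, and the thread never establishes why. One possible explanation, offered purely as an assumption, is that the Python interpreter that actually runs apport cannot import the module (for example, a different python3 earlier on PATH, or a damaged install). A minimal sketch for checking this follows; `check_import` is a hypothetical helper name.

```shell
# Hypothetical helper: report whether an interpreter can import a module.
# $1 = interpreter path, $2 = module name (defaults to launchpadlib).
check_import() {
    interp="$1"
    module="${2:-launchpadlib}"
    if "$interp" -c "import $module" >/dev/null 2>&1; then
        echo "$module importable by $interp"
    else
        echo "$module NOT importable by $interp"
    fi
}

# Usage on the affected machine: compare the system interpreter with
# whatever "python3" resolves to on PATH.
#   check_import /usr/bin/python3
#   check_import "$(command -v python3)"
```

If the two calls disagree, the PATH interpreter is a likely suspect; if both succeed, the cause lies elsewhere.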

Kurt Aaholst (kaaholst) wrote :

As there are no files in /var/crash, I checked the apport-settings:

kaa@kaa-XPS-13-9300:~$ cat /etc/default/apport
# set this to 0 to disable apport, or to 1 to enable it
# you can temporarily override this with
# sudo service apport start force_start=1
enabled=1

Daniel van Vugt (vanvugt) wrote :

OK then. Let's ignore the Gnome part and make this bug just about the kernel crash (which is *probably* what you also see in Gnome).

summary: - Gnome crash frequently
+ i915 crashes hsw_power_well_enable
no longer affects: gnome-shell (Ubuntu)
Kurt Aaholst (kaaholst) wrote : Re: i915 crashes hsw_power_well_enable

Sounds reasonable.

Do you need any more info from me?

Gilbertf (gilbert-fernandes) wrote :

Hello.
I am having the same problem on a Dell XPS 2020 with Ubuntu 18.04.
I uploaded details in a duplicate bug report here:
https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/1879389

If there is anything you need, I can upload or provide it.

Gilbertf (gilbert-fernandes) wrote :

Received a new kernel :
linux-headers-5.0.0-1052-oem-osp1 linux-image-5.0.0-1052-oem-osp1 linux-modules-5.0.0-1052-oem-osp1 linux-oem-osp1-headers-5.0.0-1052

I am upgrading and will report whether this has any good impact on the problem.

summary: - i915 crashes hsw_power_well_enable
+ [Ice Lake] i915 crashes in hsw_power_well_enable() from
+ icl_tc_phy_aux_power_well_enable()
Gilbertf (gilbert-fernandes) wrote : Re: [Ice Lake] i915 crashes in hsw_power_well_enable() from icl_tc_phy_aux_power_well_enable()

I have found a way to reproduce it very easily.
I launch Software, then click on the Search icon.
I type "Visual" to look for Visual Studio Code.
The window seems to freeze while it is searching.
After 10 seconds or so, the line for Visual Studio Code does appear,
but the interface does not respond: I cannot click or do anything.

After 5 to 10 seconds, I am sent back to the login screen...

It seems that anything that puts a window under "high usage" makes Xorg die,
and I am then sent back to the login screen.

Gilbertf (gilbert-fernandes) wrote :

I am getting weird behavior from Software.
After 2 crashes in a row with Xorg dying, the window no longer shows the same content as before when I launch it. Searching for "Visual" shows no result.
I rebooted the machine, and now launching Software again seems to show the proper content.
Typing "Visual" gave me a window with corrupted content, followed by a freeze and, 10 seconds later, Xorg dying.
A picture I took with my phone is here:
https://ibb.co/XWJq6ZG

It happens every time I follow those steps.
Is something wrong in the Intel driver?

Leon Liao (lihow731) wrote :

@Gilbertf,

I got an XPS 13 9300 (not sure it is the same SKU).
I saw the i915 crash dump, but I could not reproduce the X window crash.
I also tried `apport-collect`, and it works well here.
We really need the information collected by apport. Could you try reinstalling apport and re-running `apport-collect`?

Kurt Aaholst (kaaholst) wrote :

I tried:
$ sudo apt-get install --reinstall apport
$ sudo apt-get install --reinstall python3-launchpadlib
which both claim to install and set up the requested packages,

but still:
kaa@kaa-XPS-13-9300:~$ apport-collect 1878670
ERROR: The python3-launchpadlib package is not installed. This functionality is not available.

If you have any more suggestions, I will be happy to try them and send the requested information.
In the meantime, I will see if I can get apport-collect to work.

Gilbertf (gilbert-fernandes) wrote :

@Leon Liao : I cannot use apport on this ticket as I am not the person that initiated it. Apport will not allow me to add anything to this ticket.

My SKU is CDRH533
Core i7, 16 GB RAM, 1 TB NVMe, Ubuntu 18.04
I can reproduce the crash by going to Software. It hangs, then Xorg dies.

I would love to use apport but how ? Apport does not let me send anything to this bug report, because I am not the one that initiated it.

I can use apport and send to the ticket I did open :
https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/1879389

I don't know what I am supposed to do with apport.

If I try to use it on my own ticket it does not work :

gilbert@asgard:~$ apport-collect 1879389
The authorization page:
 (https://launchpad.net/+authorize-token?oauth_token=cXBMtm9WRNlnBwPzdNwN&allow_permission=DESKTOP_INTEGRATION)
should be opening in your browser. Use your browser to authorize
this program to access Launchpad on your behalf.
Waiting to hear from Launchpad about your decision...
ERROR: connecting to Launchpad failed: [Errno 13] Permission denied: '/home/gilbert/.cache/apport/launchpad.credentials'
You can reset the credentials by removing the file "/home/gilbert/.cache/apport/launchpad.credentials"

the command opens a web page
I click on accept permanently
then the console says permission denied

I checked : there is no launchpad.credentials file in the indicated path
Because there is no file, it seems it is unable to work
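A "Permission denied" error on a file that does not exist usually points at a directory on the path rather than at the file itself. One guess, not confirmed anywhere in this thread, is that `~/.cache/apport` (or `~/.cache`) is owned by another user, for instance after an earlier run under sudo. The sketch below finds the nearest existing directory above the credentials path and reports whether the current user can write to it; both function names are hypothetical.

```shell
# Hypothetical helpers for locating the directory that blocks file creation.

# Print the closest ancestor of $1 that exists as a directory.
nearest_existing_dir() {
    dir=$(dirname "$1")
    while [ ! -d "$dir" ]; do
        dir=$(dirname "$dir")
    done
    echo "$dir"
}

# Report whether the current user may create files under that directory.
check_writable() {
    target_dir=$(nearest_existing_dir "$1")
    if [ -w "$target_dir" ]; then
        echo "writable: $target_dir"
    else
        echo "NOT writable: $target_dir"
    fi
}

# Usage on the affected machine:
#   check_writable "$HOME/.cache/apport/launchpad.credentials"
#   ls -ld "$HOME/.cache" "$HOME/.cache/apport"
```

If the directory turns out to be owned by root, changing its ownership back to the user would be the obvious next thing to try.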

I can reproduce the crash all the time. Going to Software and looking for Visual Studio Code makes it die. I would love to upload what you need, but this ticket is not mine, and trying to use apport on the one I opened doesn't seem to work.

Gilbertf (gilbert-fernandes) wrote :

I made a video of what's happening.
Here it is : https://youtu.be/OQ4FRHmqHIs

I did it several times so you can see.
I go to Software, type "visu" to look for visual studio code
first attempt window got corrupted, everything froze but mouse then I get sent to login screen
second attempt no graphical corruption, just complete freeze and back to login screen

kern.log :

May 19 22:10:34 asgard kernel: [ 1863.748095] WARN_ON(intel_wait_for_register(dev_priv, regs->driver, (0x1 << ((pw_idx) * 2)), (0x1 << ((pw_idx) * 2)), 1))
May 19 22:10:34 asgard kernel: [ 1863.748191] WARNING: CPU: 7 PID: 3611 at /build/linux-oem-osp1-uONeD1/linux-oem-osp1-5.0.0/drivers/gpu/drm/i915/intel_runtime_pm.c:308 hsw_wait_for_power_well_enable.isra.8+0x4c/0x50 [i915]
May 19 22:10:34 asgard kernel: [ 1863.748193] Modules linked in: rfcomm ccm cmac bnep uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev media btusb btrtl btbcm btintel bluetooth ecdh_generic hid_multitouch 8250_dw intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_hdmi dell_laptop nls_iso8859_1 sof_pci_dev snd_sof_intel_hda_common snd_soc_hdac_hda arc4 snd_sof_intel_hda snd_sof_xtensa_dsp snd_sof snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi crct10dif_pclmul snd_soc_core crc32_pclmul snd_hda_codec_realtek snd_compress snd_hda_codec_generic ghash_clmulni_intel ledtrig_audio ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_nhlt snd_hda_codec aesni_intel snd_hda_core snd_hwdep snd_pcm aes_x86_64 crypto_simd cryptd glue_helper snd_seq_midi snd_seq_midi_event intel_cstate intel_rapl_perf snd_rawmidi input_leds joydev serio_raw snd_seq iwlmvm dell_wmi dell_smbios snd_seq_device dcdbas mac80211 snd_timer dell_wmi_descriptor intel_wmi_thunderbolt idma64
May 19 22:10:34 asgard kernel: [ 1863.748227] wmi_bmof virt_dma snd iwlwifi soundcore rtsx_pci_ms hid_sensor_als hid_sensor_trigger industrialio_triggered_buffer memstick kfifo_buf hid_sensor_iio_common cfg80211 industrialio mei_me intel_lpss_pci mei intel_lpss ucsi_acpi typec_ucsi typec int3403_thermal int340x_thermal_zone mac_hid intel_hid int3400_thermal acpi_pad sparse_keymap acpi_thermal_rel acpi_tad sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor mmc_block raid6_pq libcrc32c raid1 raid0 multipath linear hid_sensor_hub hid_generic intel_ishtp_hid i915 kvmgt rtsx_pci_sdmmc vfio_mdev mdev vfio_iommu_type1 vfio kvm psmouse irqbypass i2c_algo_bit nvme drm_kms_helper rtsx_pci nvme_core syscopyarea intel_ish_ipc thunderbolt sysfillrect intel_ishtp sysimgblt fb_sys_fops drm wmi i2c_hid hid pinctrl_icelake video pinctrl_intel
May 19 22:10:34 asgard kernel: [ 1863.748262] CPU: 7 PID: 3611 Comm: kworker/7:1 Tainted: G W 5.0.0-1052-oem-osp1 #57-Ubuntu
May 19 22:10:34 asgard kernel: [ 1863.748264] Hardware name: Dell Inc. XPS 13 9300/077Y9N, BIOS 1.0.7 02/26/2020
May 19 22:10:34 asgard kernel: [ 1863.748310] Workqueue: events i915_hpd_poll_init_work [i915]
May 19 22:10:34 asgard kernel: [ 1863.748341] RIP: 0010:hsw_wait_for_power_well_enable.isra.8+0x4c/0x50 [i91...

Leon Liao (lihow731) wrote :

@Gilbertf,

Could you use apport to file a separate bug for tracking your issue?
Let us focus on Kurt's issue in this bug, thank you.

Leon Liao (lihow731) wrote :

@Kurt,

If you cannot use apport, would you mind uploading the output of sosreport?
Please be aware that sosreport collects personal information.

Please upload the following logs so that we can debug the issue, thank you:
1. /var/log/kern.log
2. /var/log/syslog
3. dpkg -l > dpkg-l.log
4. sosreport
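Items 1 to 3 of the list above could be gathered into one directory with something like the following sketch. The function name `collect_logs` and the default output directory are invented for illustration. sosreport is deliberately left out, since it needs root and, as noted, collects personal information that should be reviewed before uploading.

```shell
# Hypothetical helper: copy the requested logs into one directory.
# $1 = output directory (the default name is only an example).
collect_logs() {
    out_dir="${1:-./bug-1878670-logs}"
    mkdir -p "$out_dir"
    for f in /var/log/kern.log /var/log/syslog; do
        if [ -r "$f" ]; then
            cp "$f" "$out_dir/" || echo "copy failed: $f" >&2
        else
            echo "skipped (not readable, try as root): $f" >&2
        fi
    done
    # Package list, as requested in item 3.
    if command -v dpkg >/dev/null 2>&1; then
        dpkg -l > "$out_dir/dpkg-l.log" 2>/dev/null || echo "dpkg -l failed" >&2
    fi
    echo "$out_dir"
}

# Usage:
#   collect_logs
```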

Daniel van Vugt (vanvugt) wrote :

Gilbertf already opened bug 1879389 and it's a duplicate of this one.

Daniel van Vugt (vanvugt) wrote :

Actually I'm not sure this is a crash at all. Maybe it's just a warning leading to a crash elsewhere...

May 14 20:12:32 kaa-XPS-13-9300 kernel: [1283797.420191] WARN_ON(intel_wait_for_register(dev_priv, regs->driver, (0x1 << ((pw_idx) * 2)), (0x1 << ((pw_idx) * 2)), 1))
May 14 20:12:32 kaa-XPS-13-9300 kernel: [1283797.420243] WARNING: CPU: 1 PID: 1440 at /build/linux-oem-osp1-JJcJwF/linux-oem-osp1-5.0.0/drivers/gpu/drm/i915/intel_runtime_pm.c:308 hsw_wait_for_power_well_enable.isra.8+0x4c/0x50 [i915]
...
May 14 20:12:32 kaa-XPS-13-9300 kernel: [1283797.420296] Call Trace:
May 14 20:12:32 kaa-XPS-13-9300 kernel: [1283797.420309] hsw_power_well_enable+0xaf/0x1d0 [i915]
May 14 20:12:32 kaa-XPS-13-9300 kernel: [1283797.420320] icl_tc_phy_aux_power_well_enable+0x7f/0x90 [i915]
...

summary: - [Ice Lake] i915 crashes in hsw_power_well_enable() from
- icl_tc_phy_aux_power_well_enable()
+ [Ice Lake] WARN_ON(intel_wait_for_register(dev_priv, regs->driver, (0x1
+ << ((pw_idx) * 2)), (0x1 << ((pw_idx) * 2)), 1)) [from
+ i915/intel_runtime_pm.c:308]
Gilbertf (gilbert-fernandes) wrote : Re: [Ice Lake] WARN_ON(intel_wait_for_register(dev_priv, regs->driver, (0x1 << ((pw_idx) * 2)), (0x1 << ((pw_idx) * 2)), 1)) [from i915/intel_runtime_pm.c:308]

I have uploaded a lot of files and details in the :
https://bugs.launchpad.net/bugs/1879389

Do you need something else ? More files or information ?

Gilbertf (gilbert-fernandes) wrote :

I have reproduced the Xorg crash again and uploaded all files here :
https://bugs.launchpad.net/ubuntu/+source/linux-oem-osp1/+bug/1879652

My previous apport-bug targeted xorg
But since the crash or bug seems to happen in the kernel, I have done it again
but this time using "linux" as target

would giving you remote access to the machine help in some way ?

Gilbertf (gilbert-fernandes) wrote :

the crash also happens if i open a terminal
view the kern.log file
and then i use page down to scroll down

after a few seconds of scrolling down, everything becomes slow, erratic
then freezes, and xorg dies

anything that asks the i915 driver to do more than light work makes xorg die
I have a file that appeared in /var/crash related to that xterm
gilbert@asgard:/var/crash$ ls -l
total 20616
-rw-r----- 1 gdm whoopsie 21107149 May 20 11:24 _usr_bin_gnome-shell.121.crash

Do you guys want this file ?

Kurt Aaholst (kaaholst) wrote :

I have uploaded the requested information.

I reproduced the crash to make sure the logs contain relevant information. The crash was reproduced by running "vim /var/log/syslog" and holding down the down-arrow key.

The sosreport output may not be what you want however, as no plugins are enabled.

Kurt Aaholst (kaaholst) wrote :

There are now files in /var/crash
kaa@kaa-XPS-13-9300:~$ ll /var/crash/*.crash
-rw-r----- 1 gdm whoopsie 21332092 May 21 00:02 /var/crash/_usr_bin_gnome-shell.121.crash
-rw-r----- 1 gdm whoopsie 3269523 May 21 00:04 /var/crash/_usr_bin_Xwayland.121.crash

I uploaded these via ubuntu-bug as requested by Daniel.

Xwayland bug: 1879799
gnome bug: 1879800

AceLan Kao (acelankao) wrote :

The i915 messages are just warnings; they do not look like the cause of the GNOME desktop crashes.
Could you also upload the journal log? There is more information to check there.
   journalctl -b > journalctl_b.log

Gilbertf (gilbert-fernandes) wrote :

Xorg dies and sends me back to the login screen. So it does not seem to be Gnome that crashes, but rather Xorg. I can reproduce it 100 % of the time, either by scrolling through a big file in less/view or by searching for software in the "Software" application. In my kern.log I get a "[cut here]" marker with a kind of stack trace, so something is happening. Using IntelliJ on a big project also affects the graphical interface during indexing. Just trying to install Visual Studio Code through the search function of Software makes Xorg die on me. I'm going to make it crash again and report the journal.

Gilbertf (gilbert-fernandes) wrote :

Please find the journal file from after an Xorg crash (I went into Software, clicked the search button, typed "visua", and watched it die after 10 seconds, ending up back at the login screen).

Kurt Aaholst (kaaholst) wrote :

journalctl -b > journalctl_b.log

Note - the reason I originally targeted this to gnome is because of this (from syslog):
May 21 12:49:28 kaa-XPS-13-9300 gnome-session-binary[6054]: Unrecoverable failure in required component org.gnome.Shell.desktop
May 21 12:49:28 kaa-XPS-13-9300 gnome-session[6054]: gnome-session-binary[6054]: CRITICAL: We failed, but the fail whale is dead. Sorry....
May 21 12:49:28 kaa-XPS-13-9300 gnome-session-binary[6054]: WARNING: App 'org.gnome.Shell.desktop' respawning too quickly
May 21 12:49:28 kaa-XPS-13-9300 gnome-session-binary[6054]: CRITICAL: We failed, but the fail whale is dead. Sorry....

Martin Packman (gz) wrote :

I am experiencing similar issues with a brand new Dell Inc. XPS 13 9300/077Y9N.

Perhaps this bug should be renamed for the symptoms and user impact, as there are a bunch of linked bugs for specific crashes related to packages?

@vanvugt If it's helpful, I have also uploaded (with some struggle, as apport is failing) the following oopses, which are likely related to the symptoms experienced:

linux-image-5.0.0-1050-oem-osp1 https://errors.ubuntu.com/oops/bf9d4f57-9b4f-11ea-aa30-fa163ee63de6 https://errors.ubuntu.com/oops/b9fc3920-9b4f-11ea-aa30-fa163ee63de6

xwayland https://errors.ubuntu.com/oops/a1951eb0-9b4f-11ea-a4d2-fa163e6cac46
gnome-shell https://errors.ubuntu.com/oops/e502d63d-9b4b-11ea-9b64-fa163e102db1

@gilbert-fernandes As a workaround, or at least to see if it helps, I've disabled wayland, which is pretty pointless on 18.04 anyway, you may want to try as well:
https://askubuntu.com/a/975098

Martin Packman (gz) wrote :

Both posted journalctl logs feature sections like this:

May 21 11:03:29 asgard kernel: [drm] GPU HANG: ecode 11:0:0x85dffffb, in Xorg [2421], reason: hang on rcs0, action: reset
May 21 11:03:29 asgard kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
May 21 11:03:37 asgard kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
May 21 11:03:45 asgard kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
May 21 11:03:53 asgard kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
May 21 11:04:01 asgard kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
May 21 11:04:01 asgard /usr/lib/gdm3/gdm-x-session[2419]: i965: Failed to submit batchbuffer: Input/output error
May 21 11:04:01 asgard gnome-calendar[3680]: gnome-calendar: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
May 21 11:04:01 asgard seahorse[3679]: seahorse: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.

The GPU stopped responding, so the session falls over. Which makes sense.
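To check another machine's journal for the same pattern, a small filter such as the one below can help. It is only a sketch: `gpu_hang_lines` is a made-up name, and the three patterns are taken directly from the log excerpt above rather than from any complete list of i915 messages.

```shell
# Hypothetical filter: keep only the lines that indicate a GPU hang, an
# engine reset, or a failed batchbuffer submission. Reads from stdin.
gpu_hang_lines() {
    grep -E 'GPU HANG:|Resetting rcs0 for hang|Failed to submit batchbuffer'
}

# Usage on the affected machine (current boot only):
#   journalctl -b | gpu_hang_lines
```

An empty result for a boot in which the session died would suggest a different cause than the one seen here.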

So, this does look very much like bug 1872001. Did a bad graphics driver update get backported to the 5.0.0 OEM kernel?

@lihow731 Do you need anything else to progress on this?

Martin Packman (gz) wrote :

Judging from similar-but-maybe-not-the-same drm issues, it seems that capturing the GPU crash dump (if there is one) would be useful:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1861395/comments/2
https://gitlab.freedesktop.org/drm/intel/issues/673

I have a dmesg full of "WARNING ... [i915]" but no gpu failures since last restart yet.

Martin Packman (gz) wrote :

Captured one, context from journalctl:

May 21 14:13:51 xps2020 kernel: [drm] GPU HANG: ecode 11:0:0x85dffffb, in Xorg [1985], reason: hang on rcs0, action: reset
May 21 14:13:51 xps2020 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
May 21 14:13:54 xps2020 kernel: Asynchronous wait on fence i915:gnome-shell[2119]/1:27478 timed out (hint:intel_atomic_commit_ready+0x0/0x54 [i915])
May 21 14:13:59 xps2020 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
May 21 14:13:59 xps2020 /usr/lib/gdm3/gdm-x-session[1983]: (II) modeset(0): EDID vendor "SHP", prod id 5324
May 21 14:13:59 xps2020 /usr/lib/gdm3/gdm-x-session[1983]: (II) modeset(0): Printing DDC gathered Modelines:
May 21 14:13:59 xps2020 /usr/lib/gdm3/gdm-x-session[1983]: (II) modeset(0): Modeline "3840x2400"x0.0 592.50 3840 3888 3920 4000 2400 2403 2409 2469 -hsync -vsync (148.1 kHz eP)
May 21 14:13:59 xps2020 /usr/lib/gdm3/gdm-x-session[1983]: (II) modeset(0): Modeline "3840x2400"x0.0 474.00 3840 3888 3920 4000 2400 2403 2409 2469 -hsync -vsync (118.5 kHz e)

And head of /sys/class/drm/card0/error:

GPU HANG: ecode 11:0:0x85dffffb, in Xorg [1985], reason: hang on rcs0, action: reset
Kernel: 5.0.0-1052-oem-osp1
Time: 1590066831 s 42118 us
Boottime: 17262 s 908428 us
Uptime: 17259 s 317934 us
Epoch: 4299206488 jiffies (250 HZ)
Capture: 4299207993 jiffies; 121328 ms ago, 6020 ms after epoch
Active process (on ring rcs0): Xorg [1985], score 0
Reset count: 0
Suspend count: 0
Platform: ICELAKE

(The rest is attached: a dump of a bunch of things from the graphics card buffers.)
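For anyone else who needs to capture the same error state, a minimal sketch is shown here. The helper name `save_gpu_error` and the default output file name are assumptions made for illustration; the source path is the one quoted above, and reading it may require root.

```shell
# Hypothetical helper: copy the i915 error state to a file and print its
# first line (the "GPU HANG: ..." summary when a hang was recorded).
# $1 = source file, $2 = destination file.
save_gpu_error() {
    src="${1:-/sys/class/drm/card0/error}"
    dest="${2:-gpu-error-$(date +%Y%m%d-%H%M%S).txt}"
    [ -r "$src" ] || { echo "cannot read $src (try as root)" >&2; return 1; }
    cat "$src" > "$dest" && head -n 1 "$dest"
}

# Usage, ideally soon after the hang and before rebooting:
#   save_gpu_error
```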

Martin Packman (gz) wrote :

Okay, I'm pretty sure I have found the right upstream bug, and I have what looks like a good workaround.

$ gsettings set org.gnome.settings-daemon.plugins.xsettings antialiasing 'grayscale'

https://gitlab.freedesktop.org/mesa/mesa/issues/2183

The steps to reproduce from that issue do kill the session with a GPU hang on the text drawing benchmark. After changing the font antialiasing mode and logging in again, the benchmark completes. I will run with this for the next few days and see whether the daily crashes continue.
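For convenience, the workaround can be wrapped so that the previous value is easy to inspect and restore. This is a sketch, not an official fix: the function names are hypothetical, and the `GSETTINGS` variable exists only so that the commands can be previewed (for example with `GSETTINGS=echo`) before changing anything. The schema and key are the ones used in the command above.

```shell
# Hypothetical wrapper around the gsettings workaround.
SCHEMA=org.gnome.settings-daemon.plugins.xsettings
GSETTINGS="${GSETTINGS:-gsettings}"

# Show the current font antialiasing mode.
show_antialiasing() {
    "$GSETTINGS" get "$SCHEMA" antialiasing
}

# Set the mode; only the values this key accepts are allowed through.
set_antialiasing() {
    case "$1" in
        none|grayscale|rgba) ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    "$GSETTINGS" set "$SCHEMA" antialiasing "$1"
}

# Usage:
#   show_antialiasing            # note the old value first
#   set_antialiasing grayscale   # then log out and back in
```

To undo the change later, `gsettings reset org.gnome.settings-daemon.plugins.xsettings antialiasing` returns the key to its default.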

summary: - [Ice Lake] WARN_ON(intel_wait_for_register(dev_priv, regs->driver, (0x1
- << ((pw_idx) * 2)), (0x1 << ((pw_idx) * 2)), 1)) [from
- i915/intel_runtime_pm.c:308]
+ Reproduceable i915 gpu hang Intel Iris Plus Graphics (Ice Lake 8x8 GT2)
Gilbertf (gilbert-fernandes) wrote :

I have done the gsettings to grayscale
And my Xorg dying on me seems gone when I now try to reproduce the crashes :)

Kurt Aaholst (kaaholst) wrote :

I can confirm that the workaround:
$ gsettings set org.gnome.settings-daemon.plugins.xsettings antialiasing 'grayscale'
followed by logout and login, does stop the crashes.

Note:
To stop IntelliJ IDEA / Android Studio from crashing the system, it is necessary to set Antialiasing to Greyscale in Settings / Appearance & Behaviour / Appearance, as suggested in
https://gitlab.freedesktop.org/mesa/mesa/issues/2183

Thanks very much to everyone for helping sort this out. I look forward to receiving an official software update for this.

Gilbertf (gilbert-fernandes) wrote :

Thanks for the tip. I have set my IntelliJ and GoLand settings properly :)

Gilbertf (gilbert-fernandes) wrote :

Today, when booting the laptop from powered off, I got to the login screen.
I typed the password, then the screen went black. I could still dim the key backlight up and down, but the screen remained black. I had to hold the power button and reboot several times, still getting a black screen. The only thing that worked was to unplug the power source and then turn the laptop on. After the login, I reached the desktop.

If I check kern.log, it is filled with the same hsw_wait_for_power_well_enable messages in a loop.

Nothing in /var/crash

Martin Packman (gz) wrote :

Well, I had a similar experience today. I booted and logged in okay, but then the desktop session died, leaving the monitor backlight and keyboard lights on but no other sign of life. Shutting the lid to suspend and then resuming didn't help, so I had to do a hard power off. On the next boot, it hung after login, but recovered and was fine thereafter. More oddness, but the GPU does not seem to be to blame this time.

Log sequence extracts:

May 26 09:28:03 xps2020 systemd[1]: Startup finished in 5.461s (firmware) + 3.431s (loader) + 14.216s (kernel) + 20.511s (userspace) = 43.620s.

...

May 26 09:29:14 xps2020 update-notifier.desktop[4555]: /usr/lib/ubuntu-release-upgrader/check-new-release-gtk:30: PyGIWarning: Gtk was imported without specifying a version first. Use gi.require_version('Gtk', '3.0') before import to ensure that the right version gets loaded.
May 26 09:29:14 xps2020 update-notifier.desktop[4555]: from gi.repository import Gtk
May 26 09:29:14 xps2020 update-notifier.desktop[4555]: WARNING:root:timeout reached, exiting
May 26 09:29:14 xps2020 gnome-session[1422]: gnome-session-binary[1422]: WARNING: Application 'org.gnome.SettingsDaemon.Color.desktop' failed to register before timeout
May 26 09:29:14 xps2020 gnome-session-binary[1422]: WARNING: Application 'org.gnome.SettingsDaemon.Color.desktop' failed to register before timeout
May 26 09:29:14 xps2020 gnome-session-binary[1422]: Unrecoverable failure in required component org.gnome.SettingsDaemon.Color.desktop
May 26 09:29:14 xps2020 gnome-session[1422]: gnome-session-binary[1422]: CRITICAL: We failed, but the fail whale is dead. Sorry....
May 26 09:29:14 xps2020 gnome-session-binary[1422]: CRITICAL: We failed, but the fail whale is dead. Sorry....

...

May 26 09:29:15 xps2020 systemd[1]: Stopped User Manager for UID 121.
May 26 09:29:15 xps2020 systemd[1]: Removed slice User Slice of gdm.
May 26 09:29:54 xps2020 PackageKit[1605]: get-updates transaction /309_bcbbbbbe from uid 1001 finished with success after 398ms
May 26 09:29:57 xps2020 org.gnome.Shell.desktop[3032]: ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
May 26 09:29:57 xps2020 org.gnome.Shell.desktop[3032]: ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
May 26 09:30:09 xps2020 systemd-logind[1187]: Lid closed.
May 26 09:30:09 xps2020 systemd-logind[1187]: Suspending...

...

May 26 09:30:13 xps2020 systemd-logind[1187]: Lid opened.
May 26 09:30:23 xps2020 systemd-logind[1187]: Power key pressed.
May 26 09:30:31 xps2020 systemd-logind[1187]: Power key pressed.
May 26 09:30:39 xps2020 systemd-logind[1187]: Delay lock is active (UID 1001/martin, PID 3032/gnome-shell) but inhibitor timeout is reached.
May 26 09:30:39 xps2020 systemd[1]: Starting TLP suspend/resume...
May 26 09:30:39 xps2020 systemd[1]: Started TLP suspend/resume.
May 26 09:30:39 xps2020 systemd[1]: Reached target Sleep.
May 26 09:30:39 xps2020 systemd[1]: Starting Suspend...
May 26 09:30:39 xps2020 systemd-sleep[4909]: Suspending system...
May 26 09:30:39 xps2020 kernel: PM: suspend entry (s2idle)
May 26 09:30:41 xps2020 kernel: PM: Syncing filesystems ... done.
May 26 09:30:41 xps202...

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-oem-osp1 (Ubuntu):
status: New → Confirmed
Changed in mesa (Ubuntu):
status: New → Confirmed
Gilbertf (gilbert-fernandes) wrote :

I have opened a ticket at IntelliJ
Even with the grayscale fix in both Gnome and IntelliJ settings
Importing any moderate to big size project in IntelliJ makes IDEA freeze and I have to kill the binary.

Anything intensive on the machine dies.

Gilbertf (gilbert-fernandes) wrote :

A few days ago, Dell and Ubuntu certified the XPS 2020 9300 for 20.04.
I guess I am going to move from 18.04 to 20.04 in a few days, a week at most
I hope this problem will be fixed there :/

Rex Tsai (chihchun)
tags: added: beaver-osp1-melisa
tags: added: oem-priority
Kurt Aaholst (kaaholst) wrote :

There is now an official 20.04 for the XPS-13-9300 which does not have this issue.

You may close this bug if you like.

Thanks again.

Timo Aaltonen (tjaalton)
Changed in linux-oem-osp1 (Ubuntu):
status: Confirmed → Invalid
Changed in mesa (Ubuntu):
status: Confirmed → Invalid
Changed in linux (Ubuntu):
status: Triaged → Invalid
Changed in oem-priority:
status: Incomplete → Invalid