[bionic] drm/i915: softpin broken, needs to be fixed for 32bit mesa

Bug #1815172 reported by Alkis Georgopoulos on 2019-02-08
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Mesa
Fix Released
Medium
linux (Ubuntu)
Medium
Unassigned
Bionic
Medium
Timo Aaltonen
Cosmic
Medium
Unassigned
mesa (Ubuntu)
Medium
Unassigned
Bionic
High
Timo Aaltonen
Cosmic
Medium
Timo Aaltonen

Bug Description

[Impact]
Several schools reported black screens after normally updating their Ubuntu boxes from 18.0.5-0ubuntu0~18.04.1 to 18.2.2-0ubuntu1~18.04.1.

Downgrading mesa fixes the problem.

lspci:
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 530 [8086:1912] (rev 06)
        Subsystem: ASUSTeK Computer Inc. HD Graphics 530 [1043:8694]
        Kernel modules: i915

Unfortunately I can't find a lot of useful information, here are some bits:
 * systemctl --failed says "gpu-manager" and "lightdm" have failed
 * Xorg.log is clean: https://termbin.com/6l2b
 * dmesg too: https://termbin.com/ip4e
 * It happens on lightdm/MATE, I don't know about Ubuntu GNOME.
 * If one runs `xinit` from ssh, it fails with:
i965: Failed to submit batchbuffer: Invalid argument

This is caused by mesa assuming that soft-pinning on GEN8+ has worked since kernel 4.5, when in fact it was not fixed until 4.19.3. A proper fix would be to backport the commits from 4.19.3/4.20 that fix GTT sizes and pin flags, but that is left for the future.
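
As a rough illustration of the version boundary described above, here is a minimal Python sketch that checks whether a `uname -r` style string predates upstream 4.19.3. The function names are hypothetical, and the check is only a heuristic: a distribution kernel with an older version number may carry the fixes as backports.

```python
import re

# First upstream stable release that, according to this report, has the fixes.
FIXED_UPSTREAM = (4, 19, 3)

def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor, patch) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level is treated as 0.
    return tuple(int(g) if g is not None else 0 for g in match.groups())

def predates_upstream_fix(release: str) -> bool:
    """True if the upstream version number is older than 4.19.3."""
    return parse_kernel_release(release) < FIXED_UPSTREAM

if __name__ == "__main__":
    for rel in ("4.15.0-45-generic", "4.19.3-041903-generic"):
        print(rel, predates_upstream_fix(rel))
```

A result of True only says the upstream base is older than the fix; it does not prove the running kernel is affected.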

[Test case]
install fixed mesa or kernel, check that the regression is fixed

[Regression potential]
mesa: shouldn't be any, it just reverts the change to always soft-pin
(TODO kernel: adds commits from upstream stable, which have been well tested upstream by now)

Timo Aaltonen (tjaalton) on 2019-02-08
Changed in mesa (Ubuntu Bionic):
assignee: nobody → Timo Aaltonen (tjaalton)
Changed in mesa (Ubuntu Cosmic):
assignee: nobody → Timo Aaltonen (tjaalton)

Hello Alkis, thank you for the report.
Do you know what kernel version was used on those machines? That might also be helpful.

I have an HD 520 graphics card, quite close to the one you mentioned, so I can try to reproduce.

Thank you Denis,

I tried with the following Ubuntu-built kernels, all having the issue:
4.15.0-45, 4.15.0-44, 4.15.0-20.

It happens on both 32bit and 64bit installations.

The failed units logs are:
systemctl status gpu-manager: https://termbin.com/kkix
systemctl status lightdm: https://termbin.com/ybev
lightdm.log: https://termbin.com/ssrb

The weird part is that I don't see xorg segfaults or /var/crash/* reports or anything, the only error I got was that xinit line that I mentioned, "i965: Failed to submit batchbuffer: Invalid argument"

I have the `lspci` results from 4 schools so far, they're all 8086:1912 (HD Graphics 530), although from different vendors, ASUS, Dell etc.

The Ubuntu folks uploaded 18.3.3 in their ppa:ubuntu-x-swat/updates PPA for me to test with, and it has the exact same issue.

Can you please:

$ echo 0xf > /sys/modules/drm/parameters/debug
$ xinit
$ echo 0 > /sys/modules/drm/parameters/debug
$ dmesg > dmesg.txt # and upload

Hopefully we remembered to tag the EINVAL.

I used "/sys/module" instead of "/sys/modules":

$ echo 0xf /sys/module/drm/parameters/debug
$ xinit
i965: Failed to submit batchbuffer: Invalid argument
xinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: server error

$ echo 0 /sys/module/drm/parameters/debug
$ dmesg | nc termbin.com 9999

https://termbin.com/fb2m

I don't see any additional messages though...

Sorry, I forgot the >

Here's the correct log:
https://termbin.com/6n12

(In reply to Chris Wilson from comment #4)
> Hopefully we remembered to tag the EINVAL.

[ 1494.995482] [drm:drm_ioctl [drm]] pid=2746, dev=0xe200, auth=1, DRM_IOCTL_MODE_CREATE_DUMB
[ 1494.995493] [drm:drm_ioctl [drm]] pid=2746, dev=0xe200, auth=1, DRM_IOCTL_MODE_CREATE_DUMB
[ 1494.995584] [drm:drm_ioctl [drm]] pid=2746, dev=0xe200, auth=1, I915_GEM_EXECBUFFER2_WR
[ 1494.995595] [drm:drm_ioctl [drm]] ret = -22

That'll be no then.
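
For readers unfamiliar with the numeric code, the `ret = -22` in the log is the negated errno EINVAL, which matches the "Invalid argument" that xinit printed. A small Python sketch for decoding such lines follows; the regular expression is written against the log format shown in this thread and the function name is hypothetical.

```python
import errno
import re

# Matches the return-value line that drm.debug prints after an ioctl,
# in the format seen in the dmesg excerpt above.
RET_RE = re.compile(r"\[drm:drm_ioctl \[drm\]\] ret = (-?\d+)")

def decode_drm_ret(line: str):
    """Return (value, symbolic_name) for a drm_ioctl 'ret =' line, else None."""
    match = RET_RE.search(line)
    if match is None:
        return None
    value = int(match.group(1))
    if value >= 0:
        return value, "OK"
    # Kernel return values are negated errno numbers.
    return value, errno.errorcode.get(-value, "UNKNOWN")

if __name__ == "__main__":
    print(decode_drm_ret("[ 1494.995595] [drm:drm_ioctl [drm]] ret = -22"))
```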

Changed in mesa:
importance: Unknown → Medium
status: Unknown → Confirmed

I just tried with 4.18.0-14-generic, the same issue happens there as well.

And, another school reported the issue on HD Graphics 630:

root@pc02:~# lspci -nn -k | grep -A 2 VGA
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 630 [8086:5912] (rev 04)
        Subsystem: ASRock Incorporation HD Graphics 630 [1849:5912]
        Kernel driver in use: i915
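
When collecting reports from several machines, it can help to pull the PCI vendor:device pair out of the `lspci -nn` output automatically. The following is a minimal Python sketch under the assumption that the output looks like the samples in this thread; the function name is hypothetical.

```python
import re

# Matches the "[vendor:device]" pair that `lspci -nn` prints, e.g. "[8086:1912]".
# The class code "[0300]" has no colon, so it is not matched.
PCI_ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")

def vga_pci_ids(lspci_output: str) -> list:
    """Return (vendor, device) pairs for every VGA controller line."""
    ids = []
    for line in lspci_output.splitlines():
        if "VGA compatible controller" not in line:
            continue
        match = PCI_ID_RE.search(line)
        if match:
            ids.append((match.group(1).lower(), match.group(2).lower()))
    return ids

# Sample taken from the report above.
SAMPLE = (
    "00:02.0 VGA compatible controller [0300]: Intel Corporation "
    "HD Graphics 630 [8086:5912] (rev 04)\n"
    "\tSubsystem: ASRock Incorporation HD Graphics 630 [1849:5912]\n"
    "\tKernel driver in use: i915\n"
)

if __name__ == "__main__":
    print(vga_pci_ids(SAMPLE))
```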

one theory is that this is related to legacy bios boot

Hello, Alkis.
Could you please try a kernel no older than v4.19.3?

And could you provide the output of
/usr/bin/glxinfo -B
file /usr/bin/glxinfo
uname -a

Hello Sergii,

I tried with 4.20.7 and it appears to work fine! Thanks!

Output of the commands:

# uname -a
Linux srv-6gym-chalk 4.20.7-042007-generic #201902061234 SMP Wed Feb 6 17:49:39 UTC 2019 i686 i686 i686 GNU/Linux

# file /usr/bin/glxinfo
/usr/bin/glxinfo: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=6cef7eab38835376734bdc80c5ab1ee786a6157a, stripped

# glxinfo -B
name of display: :0
display: :0 screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Intel Open Source Technology Center (0x8086)
    Device: Mesa DRI Intel(R) HD Graphics 530 (Skylake GT2) x86/MMX/SSE2 (0x1912)
    Version: 18.3.3
    Accelerated: yes
    Video memory: 1536MB
    Unified memory: yes
    Preferred profile: core (0x1)
    Max core profile version: 4.5
    Max compat profile version: 3.0
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) HD Graphics 530 (Skylake GT2) x86/MMX/SSE2
OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.3.3
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 3.0 Mesa 18.3.3
OpenGL shading language version string: 1.30
OpenGL context flags: (none)

OpenGL ES profile version string: OpenGL ES 3.2 Mesa 18.3.3
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20

< I tried with 4.20.7 and it appears to work fine! Thanks!

You are welcome :)
So closing?

I don't know the software stacks involved:

If I understood it correctly,
mesa 18.3.3 doesn't work with older kernels while 18.0 did work,
and so I'll do a bisection to see which kernel commit fixes the issue,
and then distro kernel maintainers may cherrypick it for older kernels.

If there's no need for mesa to work with older kernels without the cherrypicked commit, then I guess yes, this issue should be closed, in which case please close it for me or tell me to close it.

Thanks again!

Most likely mesa was broken by this commit:
commit a363bb2cd0e2a141f2c60be005009703bffcbe4e
Author: Kenneth Graunke <email address hidden>
Date: Tue Apr 10 01:18:25 2018 -0700

    i965: Allocate VMA in userspace for full-PPGTT systems.

On the kernel side (especially 32-bit), it requires the following commits, all of which are present from kernel 4.19.3 onwards:
1. commit 0014868b9c3c1dda1de6711cf58c3486fb422d07
Author: Chris Wilson <email address hidden>
Date: Fri Nov 2 16:12:09 2018 +0000

    drm/i915: Mark pin flags as u64

2. commit 085603287452fc96376ed4888bf29f8c095d2b40
Author: Chris Wilson <email address hidden>
Date: Thu Oct 25 10:18:23 2018 +0100

    drm/i915: Compare user's 64b GTT offset even on 32b

3. commit c58281056a8b26d5d9dc15c19859a7880835ef44
Author: Chris Wilson <email address hidden>
Date: Thu Oct 25 10:18:22 2018 +0100

    drm/i915: Mark up GTT sizes as u64

4. commit 83b466b1dc5f0b4d33f0a901e8b00197a8f3582d
Author: Chris Wilson <email address hidden>
Date: Fri Nov 2 16:12:09 2018 +0000

    drm/i915: Mark pin flags as u64

5. commit 6fc4e48f9ed46e9adff236a0c350074aafa3b7fa
Author: Chris Wilson <email address hidden>
Date: Thu Oct 25 10:18:23 2018 +0100

    drm/i915: Compare user's 64b GTT offset even on 32b

So the question for Chris is whether these commits could be propagated to older kernels, because it looks like even 4.15 is still commonly used.
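
The commit titles above point at 64-bit quantities (GTT sizes, offsets, pin flags) that were apparently held in types which are only 32 bits wide on a 32-bit build. The following is a rough Python model of that class of bug, not the actual i915 code; the helper names and the example offsets are hypothetical.

```python
# Mask that models storing a value in a 32-bit unsigned integer.
U32_MASK = 0xFFFF_FFFF

def as_u32(value: int) -> int:
    """Model what is left of a 64-bit value after storage in 32 bits."""
    return value & U32_MASK

def offset_survives_32bit(offset: int) -> bool:
    """True if the offset is unchanged after passing through a 32-bit type."""
    return as_u32(offset) == offset

if __name__ == "__main__":
    low = 0x0000_1000        # hypothetical offset below 4 GiB
    high = 0x1_0000_1000     # hypothetical offset above 4 GiB
    for off in (low, high):
        print(hex(off), "->", hex(as_u32(off)), offset_survives_32bit(off))
```

If userspace asks for an address above 4 GiB and the kernel then compares or validates a truncated copy, the request no longer matches what the kernel sees, which is one plausible way to end up with an EINVAL.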

I tried a small kernel bisection using the ubuntu kernel binaries,
4.19.2-041902=fails,
4.19.3-041903=works
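
The same narrowing can be done mechanically when more candidate kernels are available. Below is a minimal Python sketch of a binary search for the first working version; the candidate list and the `works` predicate are stand-ins that only encode results reported in this thread, and the search assumes every version before the first working one fails.

```python
def first_working(versions, works):
    """Binary search for the first version for which works(version) is True.

    Assumes versions are sorted oldest to newest and that once a version
    works, all later ones work too.
    """
    if not versions:
        raise ValueError("no versions to test")
    lo, hi = 0, len(versions) - 1
    if not works(versions[hi]):
        raise ValueError("newest version does not work either")
    while lo < hi:
        mid = (lo + hi) // 2
        if works(versions[mid]):
            hi = mid          # first working version is at mid or earlier
        else:
            lo = mid + 1      # first working version is after mid
    return versions[lo]

# Stand-in data: outcomes as reported in this bug.
CANDIDATES = ["4.15.0", "4.18.0", "4.19.2", "4.19.3", "4.20.7"]
KNOWN_GOOD = {"4.19.3", "4.20.7"}

if __name__ == "__main__":
    print(first_working(CANDIDATES, lambda v: v in KNOWN_GOOD))
```

In practice `works` would boot the kernel and check whether X starts, so each probe is expensive and halving the range is worthwhile.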

This is caused by enabling softpin in mesa. Softpin is supported in kernel 4.5 and up, but is still somewhat buggy, as seen here. We need these commits backported to the kernel:

9125963a9494253fa5a29cc1b4169885d2be7042 drm/i915: Mark up GTT sizes as u64
6fc4e48f9ed46e9adff236a0c350074aafa3b7fa drm/i915: Compare user's 64b GTT offset even on 32b

but since 18.04.2 is about to be released, it's best to revert enabling softpin in mesa, and enable it again for 18.04.3.

Changed in linux (Ubuntu):
status: New → Confirmed

4.15 is EOL upstream, but Canonical does pull patches from stable into the distro kernel in 18.04. I'm not sure why these never got in; they probably didn't apply cleanly.

Anyway, I'll make sure they get applied eventually, but for now I will disable softpin in mesa so that the 32bit 18.04.2 image will work on gen8+.

closing again, thanks!

FWIW, I am not able to reproduce this issue on Arch Linux with Mesa 18.3.3 and kernel 4.18.16.

Hello Alkis, or anyone else affected,

Accepted mesa into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/mesa/18.2.2-0ubuntu1~18.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in mesa (Ubuntu Bionic):
status: New → Fix Committed
tags: added: verification-needed verification-needed-bionic
Changed in mesa (Ubuntu Cosmic):
status: New → Fix Committed
tags: added: verification-needed-cosmic
Adam Conrad (adconrad) wrote :

Hello Alkis, or anyone else affected,

Accepted mesa into cosmic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/mesa/18.2.8-0ubuntu0~18.10.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-cosmic to verification-done-cosmic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-cosmic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Brad Figg (brad-figg) on 2019-02-08
tags: added: bjf-tracking

I verify that the bionic-proposed package addresses the issue.

Tested in:
# lspci -nn -k | grep -A 2 VGA
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 630 [8086:5912] (rev 04)
        Subsystem: Gigabyte Technology Co., Ltd HD Graphics 630 [1458:d000]
        Kernel driver in use: i915

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mesa - 18.2.2-0ubuntu1~18.04.2

---------------
mesa (18.2.2-0ubuntu1~18.04.2) bionic; urgency=medium

  * i965-revert-enabling-softpin.diff: Don't enable softpin, causes
    issues on 32bit installs. (LP: #1815172)

 -- Timo Aaltonen <email address hidden> Fri, 08 Feb 2019 19:12:58 +0200

Changed in mesa (Ubuntu Bionic):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for mesa has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Changed in mesa:
status: Confirmed → Fix Released

Hello Alkis, or anyone else affected,

Accepted mesa into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/mesa/18.2.8-0ubuntu0~18.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in mesa (Ubuntu Bionic):
status: Fix Released → Fix Committed
tags: added: verification-needed-bionic
removed: verification-done-bionic

Just correcting a wrong comment (#2) I made:
> It happens on both 32bit and 64bit installations.

I asked the school that reported the issue on 64bit to check again, and they said they have a 32bit installation after all.

So the problem has only been reported in 32bit installations; I don't know if it happened on 64bit. And btw everything is fixed now, thanks again to all. :)

disco has 4.19.x

Changed in mesa (Ubuntu):
status: New → Invalid
Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Timo Aaltonen (tjaalton) wrote :

please test the new mesa on bionic/cosmic

Changed in mesa (Ubuntu):
status: Invalid → Fix Released
Changed in mesa (Ubuntu Cosmic):
importance: Undecided → Medium
Changed in mesa (Ubuntu Bionic):
importance: Undecided → High
Changed in mesa (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium
Changed in linux (Ubuntu Cosmic):
importance: Undecided → Medium
Timo Aaltonen (tjaalton) wrote :

Alkis, please test 18.2.8 for bionic and if possible, cosmic

Changed in linux (Ubuntu Bionic):
assignee: nobody → Timo Aaltonen (tjaalton)
Timo Aaltonen (tjaalton) on 2019-03-12
tags: added: verification-done verification-done-bionic verification-done-cosmic
removed: verification-needed verification-needed-bionic verification-needed-cosmic
Timo Aaltonen (tjaalton) wrote :

actually, bionic was already tested earlier, and the tested fix is a one-line change to mesa which is still the same on 18.2.8 so marking bionic verified.. and also cosmic since it's the exact same version there, and cosmic kernel was also tested to be buggy

Timo Aaltonen (tjaalton) wrote :

bionic was already fixed earlier by 18.2.2-0ubuntu1~18.04.2 in time for 18.04.2 release

cosmic got the exact same oneliner in 18.2.8-0ubuntu0~blah, and then that was backported to bionic, which reopened this bug. Requiring Alkis to test this on cosmic is too much IMO, as it would involve first setting up a school to use cosmic, then testing remotely or asking folks over there to report back... all for a oneliner.

Alkis, please correct me if I'm wrong..

description: updated
Alkis Georgopoulos (alkisg) wrote :

Exactly. I did the "verification-done-bionic" step for 18.2.2 in comment #12 above; and unfortunately I don't have an affected school nearby, where I could test 18.2.8 in cosmic, and installing cosmic in a remote school would be hard.

Thanks a lot Timo!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mesa - 18.2.8-0ubuntu0~18.04.2

---------------
mesa (18.2.8-0ubuntu0~18.04.2) bionic; urgency=medium

  * Backport to bionic.

mesa (18.2.8-0ubuntu0~18.10.2) cosmic; urgency=medium

  * i965-revert-enabling-softpin.diff: Don't enable softpin, causes
    issues on 32bit installs. (LP: #1815172)

mesa (18.2.8-0ubuntu0~18.04.1) bionic; urgency=medium

  * Backport to bionic.
  * intel-whl-aml-cfl-ids.diff: Dropped, upstream.

mesa (18.2.8-0ubuntu0~18.10.1) cosmic; urgency=medium

  * New upstream bugfix release. (LP: #1811225)
    - add missing gpu-id's. (LP: #1789924)
  * Cherry-picked from disco:
    Move KHR/khrplatform.h from libegl1-mesa-dev to mesa-common-dev
    because GL/glcorearb.h and GL/glext.h started to depend on this
    header too (Closes: #914167).

 -- Timo Aaltonen <email address hidden> Sat, 09 Feb 2019 00:02:44 +0200

Changed in mesa (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mesa - 18.2.8-0ubuntu0~18.10.2

---------------
mesa (18.2.8-0ubuntu0~18.10.2) cosmic; urgency=medium

  * i965-revert-enabling-softpin.diff: Don't enable softpin, causes
    issues on 32bit installs. (LP: #1815172)

mesa (18.2.8-0ubuntu0~18.10.1) cosmic; urgency=medium

  * New upstream bugfix release. (LP: #1811225)
    - add missing gpu-id's. (LP: #1789924)
  * Cherry-picked from disco:
    Move KHR/khrplatform.h from libegl1-mesa-dev to mesa-common-dev
    because GL/glcorearb.h and GL/glext.h started to depend on this
    header too (Closes: #914167).

 -- Timo Aaltonen <email address hidden> Fri, 08 Feb 2019 19:12:58 +0200

Changed in mesa (Ubuntu Cosmic):
status: Fix Committed → Fix Released
Timo Aaltonen (tjaalton) on 2019-07-18
Changed in linux (Ubuntu Cosmic):
status: New → Won't Fix
summary: - Black screen on skylake after 18.0 => 18.2 update
+ [bionic] drm/i915: softpin broken, needs to be fixed for 32bit mesa
Timo Aaltonen (tjaalton) wrote :

Reopening mesa for bionic, because the revert needs to be dropped once the kernel is fixed, otherwise Ice Lake is broken because it needs softpin for the DRI driver to work.

Changed in mesa (Ubuntu Bionic):
status: Fix Released → In Progress
Timo Aaltonen (tjaalton) wrote :

could you please test with ppa:canonical-hwe-team/ppa which has a new 4.15 kernel with a couple of backports that should fix this, and it also comes with mesa that dropped the revert

apparently I don't have hw which can install 32bit ubuntu and reproduce this

Timo Aaltonen (tjaalton) wrote :

Nevermind, I used a thinkpad t470s plus an external ssd which I could wipe and install ubuntu mate 18.04 32bit on. To reproduce, I enabled the HWE ppa and upgraded mesa, then installed linux-generic and booted 4.15.0-55. It ended up with a mostly blank screen, and the keyboard had no effect apart from the power button.

Then I installed 'linux-image-4.15.0-56-generic linux-modules-4.15.0-56-generic linux-modules-extra-4.15.0-56-generic' and rebooted, and now I got to the display manager normally.

Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in mesa (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Timo Aaltonen (tjaalton) on 2019-08-13
Changed in mesa (Ubuntu Bionic):
status: Fix Committed → In Progress