vulkan-smoketest segfaults; Steam Vulkan games segfault

Bug #1720890 reported by Lawrence A Fossi on 2017-10-02
This bug affects 1 person
Affects         Importance   Assigned to
mesa (Ubuntu)   Undecided    Timo Aaltonen
Xenial          Undecided    Unassigned
Artful          Undecided    Unassigned

Bug Description

$ lsb_release -rd
Description: Ubuntu Artful Aardvark (development branch)
Release: 17.10

$ apt-cache policy mesa-vulkan-drivers
mesa-vulkan-drivers:
  Installed: 17.2.1-0ubuntu1
  Candidate: 17.2.1-0ubuntu1
  Version table:
 *** 17.2.1-0ubuntu1 500
        500 http://us.archive.ubuntu.com/ubuntu artful/universe amd64 Packages
        100 /var/lib/dpkg/status

Easily reproducible:

$ apt-get install mesa-vulkan-drivers
$ apt-get install vulkan-utils

Open a terminal and run vulkan-smoketest. Instead of opening a glxgears-type window, it segfaults.
message from kernel.log:
Sep 29 16:23:53 Tardis-1 kernel: [17709.532915] vulkan-smoketes[11798]: segfault at 0 ip 00007fed61a17914 sp 00007ffedcb8f850 error 6 in libvulkan_radeon.so[7fed619ab000+18b000]
syslog:
Sep 28 10:38:13 Tardis-1 kernel: [18292.174313] vulkan-smoketes[13385]: segfault at 0 ip 00007f62705a7914 sp 00007ffd0edc6260 error 6 in libvulkan_radeon.so[7f627053b000+18b000]
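
The two kernel lines above can be compared by subtracting the library's load address (the first value in the square brackets) from the instruction pointer, which gives the crash offset inside libvulkan_radeon.so. The following is a minimal sketch and not part of the original report; the function name `crash_offset` and the regular expression are illustrative assumptions based only on the log format shown above.

```python
import re

# Matches the kernel's segfault report format seen in the log lines above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<error>\d+) in (?P<lib>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def crash_offset(line):
    """Return (library name, offset of the faulting instruction within it)."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("not a kernel segfault line")
    offset = int(match.group("ip"), 16) - int(match.group("base"), 16)
    return match.group("lib"), offset


# The two messages quoted in the report (timestamps and hostname omitted).
kernel_log = (
    "vulkan-smoketes[11798]: segfault at 0 ip 00007fed61a17914 "
    "sp 00007ffedcb8f850 error 6 in libvulkan_radeon.so[7fed619ab000+18b000]"
)
syslog = (
    "vulkan-smoketes[13385]: segfault at 0 ip 00007f62705a7914 "
    "sp 00007ffd0edc6260 error 6 in libvulkan_radeon.so[7f627053b000+18b000]"
)

for line in (kernel_log, syslog):
    lib, offset = crash_offset(line)
    print(lib, hex(offset))
```

Both messages resolve to the same offset (0x6c914), which suggests that the two runs crashed at the same instruction in the library even though it was loaded at different addresses.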

What should happen is a glxgears like window will open and render properly instead of segfaulting.
It appears that the culprit is an old or broken libvulkan_radeon.so provided by mesa-vulkan-drivers.

As a test I renamed the old lib and dropped in a newer version from Oibaf's PPA. vulkan-smoketest now passes, and Steam's The Talos Principle and Mad Max now work properly in Vulkan mode as well instead of segfaulting.

This issue is also present on Zesty and has been fixed in both Oibaf's and Padoka's PPAs since April, back when they were rolling Mesa 17.2.

I have no idea whether libvulkan_intel.so also needs updating; I have no hardware to check.

Lawrence A Fossi (darkfoss) wrote :

Original Talos log before swapping libdrm_radeon.so

Lawrence A Fossi (darkfoss) wrote :

System at a glance:

OS: Ubuntu 17.10 artful
Kernel: x86_64 Linux 4.13.0-12-generic
Uptime: 1d 5h 28m
Packages: 1846
Shell: bash 4.4.12
DE: GNOME
WM: GNOME Shell
WM Theme: Adwaita
GTK Theme: Ambiance [GTK2/3]
Icon Theme: Adwaita
Font: Ubuntu 11
CPU: AMD FX-8350 Eight-Core @ 8x 4GHz [27.0°C]
MB: CROSSHAIR V FORMULA-Z, BIOS 2201 03/23/2015
RAM: Crucial Ballistix Elite 16GB DDR3@1866, 2005MiB / 16021MiB
GPU: AMD Radeon (TM) R9 Fury Series (AMD FIJI / DRM 3.18.0 / 4.13.0-12-generic, LLVM 5.0.0)
Monitor: Samsung S23A950D (16:9) 119.98 Hz displayport
Resolution: 1920x1080

Timo Aaltonen (tjaalton) wrote :

works fine on intel. mesa 17.2.2 is uploaded to the queue

Lawrence A Fossi (darkfoss) wrote :

After seeing your comment last night I did an apt-get purge of mesa-vulkan-drivers and vulkan-utils, then ran apt-get autoremove to remove libvulkan1, verified that all Vulkan-related files were gone, then rebooted and reinstalled them.

Mesa upgraded fine to 17.2.2 this morning but the problem remains.
$ apt-cache policy mesa-vulkan-drivers
mesa-vulkan-drivers:
  Installed: 17.2.2-0ubuntu1
  Candidate: 17.2.2-0ubuntu1
  Version table:
 *** 17.2.2-0ubuntu1 500
        500 http://us.archive.ubuntu.com/ubuntu artful/universe amd64 Packages
        100 /var/lib/dpkg/status

vulkan-smoketest
WARNING: radv is not a conformant vulkan implementation, testing use only.
WARNING: radv is not a conformant vulkan implementation, testing use only.
Segmentation fault

from kernel log: Oct 5 10:22:17 Tardis-1 kernel: [ 4445.683796] vulkan-smoketes[4128]: segfault at 0 ip 00007f43321ef924 sp 00007ffdee3530a0 error 6 in libvulkan_radeon.so[7f4332183000+18b000]

from syslog: Oct 5 10:22:17 Tardis-1 kernel: [ 4445.683796] vulkan-smoketes[4128]: segfault at 0 ip 00007f43321ef924 sp 00007ffdee3530a0 error 6 in libvulkan_radeon.so[7f4332183000+18b000]
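
For readers unfamiliar with the kernel message, "segfault at 0 ... error 6" can be decoded from the low three bits of the x86 page-fault error code. The sketch below is an illustration added for clarity, not something from the original comment; the function name `decode_page_fault_error` is hypothetical.

```python
def decode_page_fault_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 4 else "kernel",
    }


print(decode_page_fault_error(6))

# Offset of the faulting instruction inside libvulkan_radeon.so for the
# 17.2.2 crash quoted above: instruction pointer minus load address.
print(hex(0x7F43321EF924 - 0x7F4332183000))
```

Error 6 therefore describes a user-mode write to an unmapped page, and the faulting address 0 points to a write through a NULL pointer. The second line prints the crash offset within the library (0x6c924).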

Timo Aaltonen (tjaalton) wrote :

ok, would you mind filing it upstream then? https://bugs.freedesktop.org -- Mesa, Drivers/Vulkan/radeon

Lawrence A Fossi (darkfoss) wrote :

I didn't realize there was a libvulkan-dev package available. I did another purge/clean install including the dev package and noticed there was extra information in vulkaninfo afterwards. My apologies for forgetting to include one initially.
It comes down to 4 extensions that are not available with the supplied libvulkan_radeon.so and become available once I drop in Oibaf's libvulkan_radeon.so. They are:

===============4 missing from default===========

 VK_KHR_image_format_list : extension revision 1
 VK_KHR_bind_memory2 : extension revision 1
 VK_AMD_rasterization_order : extension revision 1
 VK_KHX_multiview : extension revision 1

Hopefully the above might be helpful if a bit late
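
A comparison like the one described in this comment can be done mechanically by collecting the extension names from two saved vulkaninfo outputs and taking the set difference. This is a rough sketch under stated assumptions: the helper `extension_names` is hypothetical, and the two sample texts are illustrative stand-ins (only the four extension names listed above come from the report; the VK_KHR_swapchain line is a placeholder, not taken from the attachments).

```python
def extension_names(vulkaninfo_text):
    """Collect names from lines like 'VK_KHR_foo : extension revision 1'."""
    names = set()
    for line in vulkaninfo_text.splitlines():
        name, sep, rest = line.partition(":")
        name = name.strip()
        if sep and name.startswith("VK_") and "extension revision" in rest:
            names.add(name)
    return names


# Illustrative stand-ins for the two saved vulkaninfo outputs.
default_driver = """
 VK_KHR_swapchain : extension revision 1
"""
oibaf_driver = """
 VK_KHR_swapchain : extension revision 1
 VK_KHR_image_format_list : extension revision 1
 VK_KHR_bind_memory2 : extension revision 1
 VK_AMD_rasterization_order : extension revision 1
 VK_KHX_multiview : extension revision 1
"""

missing = extension_names(oibaf_driver) - extension_names(default_driver)
for name in sorted(missing):
    print(name)
```

With real data, the two strings would be replaced by the contents of the saved vulkaninfo output files.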

Lawrence A Fossi (darkfoss) wrote :

vulkaninfo output with Oibaf's lib

Lawrence A Fossi (darkfoss) wrote :

The 4 missing extensions, side by side.
If you still want me to file upstream I will.

Lawrence A Fossi (darkfoss) wrote :

After looking at the Khronos registry spec, VK_AMD_rasterization_order (extension revision 1) seems to be the key to having basic Vulkan support on Mesa 17.2.2 for AMD cards. The other 3 are too new.

I guess this isn't really a bug but more of a much-needed feature request to enable it. That would be Ubuntu's decision to make rather than an upstream one, wouldn't it?

Timo Aaltonen (tjaalton) wrote :

no, missing extensions shouldn't make it crash, and even so it's an upstream bug and there's nothing Ubuntu can do about it but pull new versions when they happen.

Lawrence A Fossi (darkfoss) wrote :

Fair enough. Thank you.

Steven Ragan (stevecragan) wrote :

Isn't this already fixed upstream? If I use the Debian package (https://packages.debian.org/sid/mesa-vulkan-drivers) which is Mesa 17.2.3 it works fine for me, but if I switch back to Ubuntu's package it segfaults.

Timo Aaltonen (tjaalton) wrote :

sooo, turns out this is caused by debian/patches/vulkan-mir.patch, which needs to be either dropped or updated to cover radv too

Changed in mesa (Ubuntu):
assignee: nobody → Timo Aaltonen (tjaalton)
status: New → Triaged

Lawrence A Fossi (darkfoss) wrote :

Thanks for the update. I'm glad to see I wasn't imagining things or a user/hardware issue on my end.
When Ubuntu updated the mesa packages to 17.2.2 I made sure apport had the extra packages it needed installed. I also manually installed the 4 Vulkan debug packages before running vulkan-smoketest. Apport kicked in and auto-reported it. Hopefully that generated report helped, since it does a far better job than my first attempt.

I rebuilt the Ubuntu vulkan-utils package and used quilt to restore the AMD Mesa source, but I was unable to rebuild Mesa for the mesa-vulkan-drivers, so I could not test them. I had observed what the patch was doing and concluded that radv was being disabled by design. That's why I didn't report upstream: I didn't want to waste dev time on what appeared to be a packaging choice. My apologies for that.

Timo Aaltonen (tjaalton) wrote :

a new version has been uploaded to xenial/artful/bionic, and for X/A you can find it also on the updates ppa:

https://launchpad.net/~ubuntu-x-swat/+archive/ubuntu/updates

(artful still building ATM)

Changed in mesa (Ubuntu):
status: Triaged → In Progress

Lawrence A Fossi (darkfoss) wrote :

Awesome, thank you. I'll revert from Oibaf's PPA later today. Looking forward to checking them out.

Lawrence A Fossi (darkfoss) wrote :

I added the PPA. vulkan-smoketest now runs correctly, as does The Talos Principle. Just one issue is left: vulkaninfo shows an ELF class error on both of the mesa-vulkan-drivers:i386 packages.

Timo Aaltonen (tjaalton) wrote :

thanks for testing!

32bit drivers need 32bit vulkaninfo, and I don't know how to provide that for 64bit.. so it's a non-issue right now
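
The ELF class mismatch mentioned here can be checked directly: byte 4 of an ELF file (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects, so a 64-bit vulkaninfo cannot load a 32-bit driver library. The sketch below is an illustration only; `elf_class` is a hypothetical helper, and the temporary file stands in for a real driver library, whose installed path is not given in this report.

```python
import os
import tempfile


def elf_class(path):
    """Return 'ELF32' or 'ELF64' based on the EI_CLASS byte of an ELF file."""
    with open(path, "rb") as handle:
        ident = handle.read(5)
    if len(ident) < 5 or ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return {1: "ELF32", 2: "ELF64"}.get(ident[4], "unknown")


# Stand-in file carrying only a 32-bit ELF identification prefix; on a real
# system you would pass the path of the installed i386 libvulkan_radeon.so.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x7fELF\x01")
print(elf_class(tmp.name))
os.unlink(tmp.name)
```

Running the helper on both the driver library and the vulkaninfo binary shows whether the two classes match.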

Lawrence A Fossi (darkfoss) wrote :

To be specific, I mean the i386 libvulkan_intel.so and libvulkan_radeon.so provided by mesa-vulkan-drivers:i386.

It also works great with this kernel:
$ uname -a
Linux 4.14.0-041400rc8-generic #201711052313 SMP Sun Nov 5 23:14:08 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Lawrence A Fossi (darkfoss) wrote :

I didn't see your last post. No worries, I'm not really too concerned about i386; I was just trying to be as complete as possible. I just ran The Talos Principle in its moddable (non-x64) version and it completed just fine despite the ELF class warning, with no loss of fps or any visible graphical errors, so it's really not a big issue. Thank you, I can't wait for 17.2.4 to hit the official update repos.
As far as I'm concerned the issue is fixed.

When should we expect the baseline mesa-vulkan-drivers to get that patch for all users of the regular repository? Just so our support agents know what to tell affected Ubuntu users.

Timo Aaltonen (tjaalton) wrote :

After the package has been reviewed and accepted to enter -proposed it'll usually take 7 days before an update is allowed to enter -updates, assuming it was verified to fix the issue during that period.

Hello Lawrence, or anyone else affected,

Accepted mesa into artful-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/mesa/17.2.4-0ubuntu1~17.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-artful to verification-done-artful. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-artful. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in mesa (Ubuntu Artful):
status: New → Fix Committed
tags: added: verification-needed verification-needed-artful

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mesa - 17.2.4-0ubuntu2

---------------
mesa (17.2.4-0ubuntu2) bionic; urgency=medium

  * Import changes from 17.2.2-0ubuntu2
  * Make mesa-va-drivers enhance libva2 rather than libva1.
  * vulkan-mir.patch: Dropped, breaks radeon vulkan driver. (LP: #1720890)

 -- Timo Aaltonen <email address hidden> Wed, 08 Nov 2017 16:29:58 +0200

Changed in mesa (Ubuntu):
status: In Progress → Fix Released

Adam Conrad (adconrad) wrote :

Hello Lawrence, or anyone else affected,

Accepted mesa into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/mesa/17.2.4-0ubuntu1~16.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in mesa (Ubuntu Xenial):
status: New → Fix Committed
tags: added: verification-needed-xenial

Lawrence A Fossi (darkfoss) wrote :

Sorry to get back so late with the holiday and a few days testing.
17.2.4-proposed is working fine on:
lsb_release -rd
Description: Ubuntu 17.10
Release: 17.10

vulkan-smoketest now runs without issues. radv works well with The Talos Principle, with no performance loss or visual artifacts across multiple benchmarks, run manually or with Phoronix Test Suite, compared to the Oibaf/Padoka PPAs during their Mesa 17.2.x runs.
OpenGL performance is unaffected.
DVD playback was fine, as was streaming such as Netflix.

Several packages, all :i386, were removed during the installation, possibly induced by myself since it was the first time I have ever used aptitude.
Removing libmirclient9:i386 (0.28.0+17.10.20171011.1-0ubuntu1) ...
Removing libmircommon7:i386 (0.28.0+17.10.20171011.1-0ubuntu1) ...
Removing libboost-filesystem1.62.0:i386 (1.62.0+dfsg-4build3) ...
Removing libmircore1:i386 (0.28.0+17.10.20171011.1-0ubuntu1) ...
Removing libboost-system1.62.0:i386 (1.62.0+dfsg-4build3) ...
Removing libcapnp-0.5.3:i386 (0.5.3-2ubuntu2) ...
Removing libmirprotobuf3:i386 (0.28.0+17.10.20171011.1-0ubuntu1) ...
Removing libprotobuf-lite10:i386 (3.0.0-9ubuntu5) ...
Removing libxkbcommon0:i386 (0.7.1-2) ...

Tested with kernels : 4.13.0-17
4.15.0-041500rc1, performance increase with rc1 is amazing.

Issue is fixed with no regression on my system.

tags: added: verification-done-artful