Mesa causes a segmentation fault on arm64 (wrong count of uniform locations)

Bug #1585942 reported by Alberto Mardegan
This bug affects 3 people
Affects                                          Status        Importance  Assigned to        Milestone
Canonical System Image                           Fix Released  Critical    Unassigned
mesa (Ubuntu)                                    Fix Released  Critical    Timo Aaltonen
  Xenial                                         Fix Released  Undecided   Unassigned
ubuntu-system-settings-online-accounts (Ubuntu)  In Progress   Critical    Alberto Mardegan
webbrowser-app (Ubuntu)                          Fix Released  Critical    Olivier Tilloy
  Xenial                                         Fix Released  Undecided   Unassigned

Bug Description

This error appeared when running unit tests for a QML app in our Jenkins/silo infrastructure, on arm64 only: https://launchpadlibrarian.net/261581280/buildlog_ubuntu-yakkety-arm64.ubuntu-system-settings-online-accounts_0.7+16.10.20160525.1-0ubuntu1_BUILDING.txt.gz

Pasting the relevant lines here in case the link above goes away:

=======================
QT_PLUGIN_PATH=/usr/lib/aarch64-linux-gnu/qt5/plugins LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} xvfb-run -s '-screen 0 640x480x24' -a dbus-test-runner -t ./tst_online_accounts_qml
DBus daemon: unix:abstract=/tmp/dbus-2tbhBHxLZq,guid=03f9df417d619b79067a68045745ad95
task-0: Started with PID: 16930
task-0: ********* Start testing of online_accounts_qml *********
task-0: Config: Using QtTest library 5.5.1, Qt 5.5.1 (arm64-little_endian-lp64 shared (dynamic) release build; by GCC 5.3.1 20160519)
task-0: PASS : online_accounts_qml::AccountCreationPage::initTestCase()
task-0: QWARN : online_accounts_qml::AccountCreationPage::test_fallback() file:///dummy/path/testPlugin/Main.qml: File not found
task-0: PASS : online_accounts_qml::AccountCreationPage::test_fallback()
task-0: QWARN : online_accounts_qml::AccountCreationPage::test_flickable() file:///dummy/path/testPlugin/Main.qml: File not found
task-0: PASS : online_accounts_qml::AccountCreationPage::test_flickable()
task-0: PASS : online_accounts_qml::AccountCreationPage::cleanupTestCase()
task-0: QWARN : online_accounts_qml::UnknownTestFunc() QEGLPlatformContext: Failed to make temporary surface current, format not updated
task-0: PASS : online_accounts_qml::AuthorizationPage::initTestCase()
task-0: QWARN : online_accounts_qml::AuthorizationPage::test_1_one_account() file:///«BUILDDIR»/ubuntu-system-settings-online-accounts-0.7+16.10.20160525.1/online-accounts-ui/qml/AuthorizationPage.qml:54: TypeError: Cannot call method 'indexOf' of undefined
task-0: PASS : online_accounts_qml::AuthorizationPage::test_1_one_account()
task-0: QWARN : online_accounts_qml::AuthorizationPage::test_2_add_another(with button) file:///«BUILDDIR»/ubuntu-system-settings-online-accounts-0.7+16.10.20160525.1/online-accounts-ui/qml/AuthorizationPage.qml:54: TypeError: Cannot call method 'indexOf' of undefined
task-0: QWARN : online_accounts_qml::AuthorizationPage::test_2_add_another(with button) file:///«BUILDDIR»/ubuntu-system-settings-online-accounts-0.7+16.10.20160525.1/online-accounts-ui/qml/AuthorizationPage.qml:79:23: Unable to assign [undefined] to QString
task-0: PASS : online_accounts_qml::AuthorizationPage::test_2_add_another(with button)
task-0: QWARN : online_accounts_qml::AuthorizationPage::test_2_add_another(with button) [PERFORMANCE]: Last frame took 254 ms to render.
task-0: QWARN : online_accounts_qml::AuthorizationPage::test_2_add_another(without button) file:///«BUILDDIR»/ubuntu-system-settings-online-accounts-0.7+16.10.20160525.1/online-accounts-ui/qml/AuthorizationPage.qml:54: TypeError: Cannot call method 'indexOf' of undefined
task-0: QWARN : online_accounts_qml::AuthorizationPage::test_2_add_another(without button) file:///«BUILDDIR»/ubuntu-system-settings-online-accounts-0.7+16.10.20160525.1/online-accounts-ui/qml/AuthorizationPage.qml:79:23: Unable to assign [undefined] to QString
task-0: PASS : online_accounts_qml::AuthorizationPage::test_2_add_another(without button)
task-0: QWARN : online_accounts_qml::AuthorizationPage::test_2_add_another(without button) [PERFORMANCE]: Last frame took 210 ms to render.
task-0: QWARN : online_accounts_qml::AuthorizationPage::test_3_many_accounts(first account) [PERFORMANCE]: Last frame took 146 ms to render.
task-0: QWARN : online_accounts_qml::AuthorizationPage::test_3_many_accounts(first account) file:///«BUILDDIR»/ubuntu-system-settings-online-accounts-0.7+16.10.20160525.1/online-accounts-ui/qml/AuthorizationPage.qml:54: TypeError: Cannot call method 'indexOf' of undefined
task-0: QWARN : online_accounts_qml::AuthorizationPage::test_3_many_accounts(first account) file:///«BUILDDIR»/ubuntu-system-settings-online-accounts-0.7+16.10.20160525.1/online-accounts-ui/qml/AuthorizationPage.qml:79:23: Unable to assign [undefined] to QString
task-0: PASS : online_accounts_qml::AuthorizationPage::test_3_many_accounts(first account)
Mesa 11.2.1 implementation error: Failed to link fixed function fragment shader: error: count of uniform locations > MAX_UNIFORM_LOCATIONS(4294967295 > 98304)

Please report at https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa
task-0: Exited with status 139
task-0: Shutting down
DBus daemon: Shutdown
=======================

This happened 2 out of 2 times, after which I disabled running the unit tests for arm64.
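
The "Exited with status 139" line in the log above is consistent with the segmentation fault named in the title. A minimal sketch, assuming the test runner reports the status using the common shell convention of 128 plus the signal number; the helper name `signal_from_exit_status` is hypothetical and not part of any tool mentioned here:

```python
import signal


def signal_from_exit_status(status):
    """Return the signal implied by a shell-style exit status, or None.

    Assumption: statuses above 128 encode "killed by signal (status - 128)".
    """
    if status <= 128:
        return None
    try:
        return signal.Signals(status - 128)
    except ValueError:
        # Not a signal number known on this platform.
        return None


sig = signal_from_exit_status(139)
# On Linux, 139 - 128 = 11, which is SIGSEGV.
print(sig.name if sig is not None else "no signal")
```

Under the same assumption, the "Received signal 11" message in the gammaray log later in this report describes the same failure.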

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mesa (Ubuntu):
status: New → Confirmed
Timo Jyrinki (timo-jyrinki) wrote :

And it started happening after Qt on arm64 switched to OpenGL ES, similar to armhf.

Gerry Boland (gerboland) wrote :

arm64? Should it not be using a hybris-based driver instead of mesa? What hardware is this?

Timo Jyrinki (timo-jyrinki) wrote :

These are our arm64 builders, which have no graphical output and run software rendering inside xvfb. Therefore Mesa is used. It's not a real test of any of our target hardware, which would use hardware-accelerated OpenGL ES.

description: updated
Alberto Mardegan (mardy)
Changed in ubuntu-system-settings-online-accounts (Ubuntu):
assignee: nobody → Alberto Mardegan (mardy)
importance: Undecided → Critical
status: New → In Progress
Steve Langasek (vorlon) wrote :

This bug also breaks gammaray on arm64:

21/21 Test #21: quickinspectortest ...............***Exception: Other 1.40 sec
********* Start testing of QuickInspectorTest *********
Config: Using QtTest library 5.5.1, Qt 5.5.1 (arm64-little_endian-lp64 shared (dynamic) release build; by GCC 5.3.1 20160519)
[...]
QWARN : QuickInspectorTest::testCustomRenderModes() QEGLPlatformContext: Failed to make temporary surface current, format not updated
Mesa 11.2.1 implementation error: Failed to link fixed function fragment shader: error: count of uniform locations > MAX_UNIFORM_LOCATIONS(4294967295 > 98304)

Please report at https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa
QFatal in quickinspectortest (/«PKGBUILDDIR»/obj-qt5/bin/quickinspectortest)
START BACKTRACE:
1 /«PKGBUILDDIR»/obj-qt5/lib/libgammaray_core-qt5.5-aarch64.so.2.4.0(+0xc2128) [0x7f9138f128]
2 /«PKGBUILDDIR»/obj-qt5/lib/libgammaray_core-qt5.5-aarch64.so.2.4.0(+0xa6368) [0x7f91373368]
3 /usr/lib/aarch64-linux-gnu/libQt5Core.so.5(+0x91620) [0x7f9067a620]
4 /usr/lib/aarch64-linux-gnu/libQt5Core.so.5(QMessageLogger::fatal(char const*, ...) const+0x9c) [0x7f9067c0fc]
5 /usr/lib/aarch64-linux-gnu/libQt5Test.so.5(+0xad20) [0x7f91268d20]
6 [0x7f91484510]
7 /usr/lib/aarch64-linux-gnu/dri/swrast_dri.so(+0x9631c) [0x7f8c5dc31c]
END BACKTRACE
QFATAL : QuickInspectorTest::testCustomRenderModes() Received signal 11
FAIL! : QuickInspectorTest::testCustomRenderModes() Received a fatal error.
   Loc: [Unknown file(0)]
Totals: 6 passed, 1 failed, 0 skipped, 0 blacklisted
********* Finished testing of QuickInspectorTest *********

https://launchpad.net/ubuntu/+source/gammaray/2.4.0-1build1/+build/9894589

This problem apparently does not occur on armhf even though GLES is also used on that architecture.

It also definitely looks like a mesa bug to me. ./src/compiler/glsl/linker.cpp:link_shaders() calls ./src/compiler/glsl/linker.cpp:check_explicit_uniform_locations(), which returns -1 if ctx->Extensions.ARB_explicit_uniform_location is not set (which it doesn't have to be AFAICS), then assigns this -1 to an unsigned int variable and carries on using the value instead of treating this as an error.
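
The pattern described above can be sketched as follows. This is only a model of the signed-to-unsigned conversion, assuming a 32-bit C `unsigned int`; `check_locations_stub` and `to_unsigned_int` are hypothetical names and this is not Mesa's actual code:

```python
UINT_RANGE = 2 ** 32            # assumption: 32-bit C unsigned int
MAX_UNIFORM_LOCATIONS = 98304   # limit quoted in the Mesa error message


def to_unsigned_int(value):
    """Model C's conversion of a signed int to a 32-bit unsigned int."""
    return value % UINT_RANGE


def check_locations_stub(extension_enabled):
    """Hypothetical stand-in for a check that returns -1 when the
    extension is not set, as described in the comment above."""
    return 0 if extension_enabled else -1


# The -1 "error" value is stored as unsigned and then used as a count.
count = to_unsigned_int(check_locations_stub(extension_enabled=False))
print(count)                          # 4294967295
print(count > MAX_UNIFORM_LOCATIONS)  # True
```

The resulting value matches the "4294967295 > 98304" comparison printed in the Mesa error message.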

Changed in mesa (Ubuntu):
importance: Undecided → Critical
Steve Langasek (vorlon) wrote :

Since this is a crash in the swrast driver, it's worth noting that mesa is built with llvmpipe on armhf but not on arm64. This may account for why the bug manifests on arm64+gles but not on armhf+gles.

Steve Langasek (vorlon) wrote :

Rebuilding mesa with llvmpipe enabled on arm64, I'm able to get this test to pass as:

LIBGL_DRIVERS_PATH=../mesa-11.2.1/debian/libgl1-mesa-dri/usr/lib/aarch64-linux-gnu/dri/ xvfb-run -a -s "-screen 0 640x480x24" ./obj-qt5/bin/quickinspectortest

Not sure what negative impact there might be from enabling llvmpipe on arm64... that it works at all seems like a good sign to me.
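
For anyone scripting the same check, here is a minimal sketch of how the invocation above could be assembled. The function name `build_test_invocation` is hypothetical. The `xvfb-run` options and the `LIBGL_DRIVERS_PATH` variable are taken from the command line quoted above:

```python
import os


def build_test_invocation(driver_dir, test_binary):
    """Return (argv, env) for running a test under xvfb with an
    alternative Mesa DRI driver directory."""
    env = dict(os.environ)
    # Point Mesa at the locally rebuilt drivers instead of the system ones.
    env["LIBGL_DRIVERS_PATH"] = driver_dir
    argv = ["xvfb-run", "-a", "-s", "-screen 0 640x480x24", test_binary]
    return argv, env


argv, env = build_test_invocation(
    "../mesa-11.2.1/debian/libgl1-mesa-dri/usr/lib/aarch64-linux-gnu/dri/",
    "./obj-qt5/bin/quickinspectortest",
)
print(argv)
```

The pair can then be passed to `subprocess.run(argv, env=env)` on a machine that has `xvfb-run` and the rebuilt drivers available.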

Steve Langasek (vorlon) wrote :

patch to enable llvmpipe (and opencl) on arm64.

Steve Langasek (vorlon) wrote :

Alternative patch. This doesn't quite work: it lets the quickinspectortest tests pass when called directly, but when running the full test suite, strangely, it fails as follows:

21/21 Test #21: quickinspectortest ...............***Exception: Other 0.46 sec
libEGL warning: DRI2: failed to open swrast (search paths ../mesa-11.2.1/debian/libgl1-mesa-dri/usr/lib/aarch64-linux-gnu/dri/)
libEGL warning: DRI2: failed to open swrast (search paths ../mesa-11.2.1/debian/libgl1-mesa-dri/usr/lib/aarch64-linux-gnu/dri/)
********* Start testing of QuickInspectorTest *********
Config: Using QtTest library 5.5.1, Qt 5.5.1 (arm64-little_endian-lp64 shared (dynamic) release build; by GCC 5.3.1 20160519)
[...]
QFatal in quickinspectortest (/home/vorlon/gammaray-2.4.0/obj-qt5/bin/quickinspectortest)
START BACKTRACE:
1 /home/vorlon/gammaray-2.4.0/obj-qt5/lib/libgammaray_core-qt5.5-aarch64.so.2.4.0(+0xc2128) [0x7fae5f9128]
2 /home/vorlon/gammaray-2.4.0/obj-qt5/lib/libgammaray_core-qt5.5-aarch64.so.2.4.0(+0xa6368) [0x7fae5dd368]
3 /usr/lib/aarch64-linux-gnu/libQt5Core.so.5(+0x91620) [0x7fad8e1620]
4 /usr/lib/aarch64-linux-gnu/libQt5Core.so.5(QMessageLogger::fatal(char const*, ...) const+0x9c) [0x7fad8e30fc]
5 /usr/lib/aarch64-linux-gnu/libQt5Quick.so.5(QSGRenderLoop::handleContextCreationFailure(QQuickWindow*, bool)+0x194) [0x7fae2dbfac]
END BACKTRACE
QFATAL : QuickInspectorTest::testModelsReparent() Failed to create OpenGL context for format QSurfaceFormat(version 2.0, options QFlags(), depthBufferSize 24, redBufferSize -1, greenBufferSize -1, blueBufferSize -1, alphaBufferSize -1, stencilBufferSize 8, samples -1, swapBehavior 2, swapInterval 1, profile 0)
FAIL! : QuickInspectorTest::testModelsReparent() Received a fatal error.
   Loc: [Unknown file(0)]
Totals: 1 passed, 1 failed, 0 skipped, 0 blacklisted
********* Finished testing of QuickInspectorTest *********

It's unclear to me why this shows up as a failure to load the driver when the driver loaded perfectly well when calling the test case directly.

Anyway, patch attached for consideration.

tags: added: patch
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ubuntu-system-settings-online-accounts - 0.7+16.10.20160610-0ubuntu1

---------------
ubuntu-system-settings-online-accounts (0.7+16.10.20160610-0ubuntu1) yakkety; urgency=medium

  [ Alberto Mardegan ]
  * Immediately accept degenerate requests from unconfined processes
    (LP: #1582824)
  * Replace incorrect usage of UbuntuColors (LP: #1581047)
  * Update pot file for translations (LP: #1533091)
  * Open popups in an overlaid webview (LP: #1428591)
  * Skip tests on arm64 (LP: #1585942)

  [ Alberto Mardegan, Timo Jyrinki ]
  * Stop depending on transitional packages. (LP: #1583079)

 -- Alberto Mardegan <email address hidden> Fri, 10 Jun 2016 09:46:37 +0000

Changed in ubuntu-system-settings-online-accounts (Ubuntu):
status: In Progress → Fix Released
Olivier Tilloy (osomon)
Changed in webbrowser-app (Ubuntu):
status: New → In Progress
assignee: nobody → Olivier Tilloy (osomon)
importance: Undecided → Critical
Changed in canonical-devices-system-image:
status: New → Confirmed
importance: Undecided → Critical
milestone: none → xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package webbrowser-app - 0.23+16.10.20160701.1-0ubuntu1

---------------
webbrowser-app (0.23+16.10.20160701.1-0ubuntu1) yakkety; urgency=medium

  * Generate UA override list files at build time to un-hardcode ubuntu
    and chromium version numbers. (LP: #1591220)
  * Fix one flaky autopilot test. (LP: #1591120)
  * Temporarily skip tests on arm64 to unblock package build without
    oxide. (LP: #1585942)
  * Update the target that lists non-compiled files.

 -- Olivier Tilloy <email address hidden> Fri, 01 Jul 2016 13:10:14 +0000

Changed in webbrowser-app (Ubuntu):
status: In Progress → Fix Released
Changed in canonical-devices-system-image:
status: Confirmed → Fix Committed
Timo Aaltonen (tjaalton) wrote :

I don't see why arm64 couldn't enable llvm like armhf. If it should be enabled on xenial too, then it would need to be dropped from the lts-xenial backport, because llvm-3.8 FTBFS on trusty arm64.

Steve Langasek (vorlon) wrote :

Yes, whatever fix is applied for mesa would certainly be needed for xenial, for the phone. Best to have this in an SRU, to avoid forking mesa to the overlay PPA.

Timo Aaltonen (tjaalton)
Changed in mesa (Ubuntu):
assignee: nobody → Timo Aaltonen (tjaalton)
tags: added: qt5.6
Adam Conrad (adconrad) wrote : Please test proposed package

Hello Alberto, or anyone else affected,

Accepted mesa into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/mesa/11.2.0-1ubuntu2.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in mesa (Ubuntu Xenial):
status: New → Fix Committed
tags: added: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mesa - 12.0.1-3ubuntu2

---------------
mesa (12.0.1-3ubuntu2) yakkety; urgency=medium

  * debian/rules: Work around gcc ICE on ppc64el by forcing -O2.

 -- Adam Conrad <email address hidden> Fri, 22 Jul 2016 16:46:48 -0600

Changed in mesa (Ubuntu):
status: Confirmed → Fix Released
Timo Aaltonen (tjaalton) wrote :

mesa xenial update is still waiting for verification

Łukasz Zemczak (sil2100) wrote :

We have tested this change using the xenial-overlay PPA for the phone, and the earlier issues with segmentation faults during tests are gone. Not sure if there's anything else we need to test; I didn't see an explicit test case here. But all in all, the ubuntu-touch problems caused by this bug are now gone.

So feel free to switch to verification-done if our testing is sufficient.

Timo Aaltonen (tjaalton) wrote :

good enough for me

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mesa - 11.2.0-1ubuntu2.1

---------------
mesa (11.2.0-1ubuntu2.1) xenial; urgency=medium

  * control, rules: Enable llvm/opencl on arm64. (LP: #1585942)

 -- Timo Aaltonen <email address hidden> Tue, 19 Jul 2016 10:53:33 +0300

Changed in mesa (Ubuntu Xenial):
status: Fix Committed → Fix Released
Timo Aaltonen (tjaalton) wrote : Update Released

The verification of the Stable Release Update for mesa has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Timo Jyrinki (timo-jyrinki) wrote :

The mesa update should fix the issue for phone sw. You may remove any workarounds now.

Changed in canonical-devices-system-image:
status: Fix Committed → Fix Released
Timo Jyrinki (timo-jyrinki) wrote :

Reopening ubuntu-system-settings-online-accounts so that its arm64 tests can be re-enabled.

Changed in webbrowser-app (Ubuntu Xenial):
status: New → Fix Released
no longer affects: ubuntu-system-settings-online-accounts (Ubuntu Xenial)
Changed in ubuntu-system-settings-online-accounts (Ubuntu):
status: Fix Released → Triaged
Alberto Mardegan (mardy)
Changed in ubuntu-system-settings-online-accounts (Ubuntu):
status: Triaged → In Progress