unity-system-compositor crashed with "Failed to schedule page flip", but only after nouveau crashed in glClear().

Bug #1623507 reported by Pablo Caviglia
This bug affects 1 person
Affects                  Status   Importance  Assigned to  Milestone
Canonical System Image   New      Undecided   Unassigned
Mir                      Triaged  Medium      Unassigned
libdrm (Ubuntu)          New      High        Unassigned
mesa (Ubuntu)            New      High        Unassigned
mir (Ubuntu)             Triaged  Medium      Unassigned

Bug Description

First Unity 8 run on Ubuntu 16.10.

ProblemType: Crash
DistroRelease: Ubuntu 16.10
Package: unity-system-compositor 0.7.1+16.10.20160824-0ubuntu1
ProcVersionSignature: Ubuntu 4.4.0-9136.55-generic 4.4.16
Uname: Linux 4.4.0-9136-generic x86_64
ApportVersion: 2.20.3-0ubuntu7
Architecture: amd64
Date: Wed Sep 14 10:34:45 2016
ExecutablePath: /usr/sbin/unity-system-compositor
GraphicsCard:
 Subsystem: Micro-Star International Co., Ltd. [MSI] Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller [1462:7817]
 NVIDIA Corporation G92 [GeForce 9800 GT] [10de:0614] (rev a2) (prog-if 00 [VGA controller])
   Subsystem: eVga.com. Corp. G92 [GeForce 9800 GT] [3842:c988]
InstallationDate: Installed on 2016-09-14 (0 days ago)
InstallationMedia: Ubuntu 16.10 "Yakkety Yak" - Alpha amd64 (20160913)
ProcCmdline: /usr/sbin/unity-system-compositor --disable-inactivity-policy=true --on-fatal-error-abort --file /run/mir_socket --from-dm-fd 11 --to-dm-fd 14 --vt 7
ProcEnviron:

Signal: 6
SourcePackage: unity-system-compositor
StacktraceTop:
 mir::fatal_error_abort(char const*, ...) () from /usr/lib/x86_64-linux-gnu/libmircommon.so.6
 ?? () from /usr/lib/x86_64-linux-gnu/mir/server-platform/graphics-mesa-kms.so.10
 ?? () from /usr/lib/x86_64-linux-gnu/libmirserver.so.41
 ?? () from /usr/lib/x86_64-linux-gnu/libmirserver.so.41
 ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
Title: unity-system-compositor crashed with SIGABRT in mir::fatal_error_abort()
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

version.libdrm: libdrm2 2.4.70-1
version.lightdm: lightdm 1.19.4-0ubuntu1
version.mesa: libegl1-mesa-dev N/A

Pablo Caviglia (pablo-caviglia) wrote :
Apport retracing service (apport) wrote :

StacktraceTop:
 mir::fatal_error_abort(char const*, ...) (reason=0x7fe1e46e998c "Failed to schedule page flip") at ./src/common/fatal/fatal.cpp:43
 mir::graphics::mesa::DisplayBuffer::post() (this=0x5608093664b0) at ./src/platforms/mesa/server/kms/display_buffer.cpp:287
 mir::compositor::CompositingFunctor::operator()() (this=0x56080947b900) at ./src/server/compositor/multi_threaded_compositor.cpp:143
 operator() () at /usr/include/c++/6/functional:2136
 execute (this=0x7fe1de20cc70) at ./src/server/thread/basic_thread_pool.cpp:40

Apport retracing service (apport) wrote : Stacktrace.txt
Apport retracing service (apport) wrote : StacktraceSource.txt
Apport retracing service (apport) wrote : ThreadStacktrace.txt
Changed in unity-system-compositor (Ubuntu):
importance: Undecided → Medium
tags: removed: need-amd64-retrace
information type: Private → Public
Daniel van Vugt (vanvugt) wrote : Re: unity-system-compositor crashed with SIGABRT in mir::fatal_error_abort()

This appears to be a secondary crash that followed an earlier one.

We have two compositor threads (two monitors connected). One of them crashed in the renderer during glClear, which is probably a nouveau bug.

Our crash handler then tried to clean up and switch VTs back to their original state, racing with the other compositor thread, which was still running. Once the active VT changes, that second compositor thread fails to schedule a page flip. That failure is expected because Mir no longer has DRM mastership.

So there are two bugs here:
  (1) nouveau crashed in glClear. That's the main problem because it was the trigger for the second.
  (2) Our DRM page flipping code can't deal with flip failures without crashing. But we already know about that thanks to bug 1489689 and bug 1584894.
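
For illustration only, here is a minimal sketch of what handling a flip failure without aborting might look like for bug (2). It is not Mir's code: `FlipResult`, `ScheduleFlip` and `post_frame` are hypothetical names, and the sketch assumes the failure surfaces as a negative errno-style value, with `-EACCES` or `-EPERM` taken to mean that DRM mastership was lost.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>
#include <string>

// Hypothetical result type for this sketch; not part of Mir.
enum class FlipResult { scheduled, skipped_no_master, failed };

// Hypothetical stand-in for the call that asks the kernel to schedule a flip.
// Assumed to return 0 on success or a negative errno value on failure.
using ScheduleFlip = std::function<int()>;

// Try to schedule a page flip and report the outcome instead of aborting.
// Diagnostic text is appended to `log` so the caller decides what to do next.
FlipResult post_frame(ScheduleFlip const& schedule_flip, std::string& log)
{
    assert(schedule_flip && "schedule_flip must be callable");

    int const rc = schedule_flip();
    if (rc == 0)
        return FlipResult::scheduled;

    // Assumption: losing DRM mastership (for example after a VT switch)
    // shows up as a permission error, which is expected and not fatal.
    if (rc == -EACCES || rc == -EPERM)
    {
        log += "page flip skipped: not DRM master\n";
        return FlipResult::skipped_no_master;
    }

    // Any other error is reported to the caller rather than aborting here.
    log += "page flip failed: errno " + std::to_string(-rc) + "\n";
    return FlipResult::failed;
}
```

A compositor loop could stop or pause on `skipped_no_master` and treat only `failed` as an error worth escalating, which would avoid a secondary abort while a crash handler is already switching VTs.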

affects: unity-system-compositor (Ubuntu) → mir (Ubuntu)
Changed in mir:
importance: Undecided → Medium
Daniel van Vugt (vanvugt) wrote :

I'm marking the nouveau/Mesa crash as high severity. The secondary Mir crash is only medium because it is not the root problem.

Here's the Mesa crash:

#9 <signal handler called>
No locals.
#10 0x00007fe1e4c4f06c in pushbuf_kref () from /tmp/apport_sandbox_hSYpvb/usr/lib/x86_64-linux-gnu/libdrm_nouveau.so.2
No symbol table info available.
#11 0x00007fe1e4c4f709 in pushbuf_validate () from /tmp/apport_sandbox_hSYpvb/usr/lib/x86_64-linux-gnu/libdrm_nouveau.so.2
No symbol table info available.
#12 0x00007fe1e2fb25ef in nv50_state_validate (nv50=nv50@entry=0x5608093d9c40, mask=mask@entry=4096, validate_list=validate_list@entry=0x7fe1e35ef660 <validate_list_3d>, size=size@entry=25, dirty=dirty@entry=0x5608093da018, bufctx=0x5608093dd6f0) at ../../../../../src/gallium/drivers/nouveau/nv50/nv50_state_validate.c:554
        state_mask = <optimized out>
        ret = <optimized out>
        i = <optimized out>
#13 0x00007fe1e2fb2797 in nv50_state_validate_3d (nv50=nv50@entry=0x5608093d9c40, mask=mask@entry=4096) at ../../../../../src/gallium/drivers/nouveau/nv50/nv50_state_validate.c:564
        ret = <optimized out>
#14 0x00007fe1e2fb4e01 in nv50_clear (pipe=0x5608093d9c40, buffers=4, color=0x7fe1ebbf5acc, depth=1, stencil=0) at ../../../../../src/gallium/drivers/nouveau/nv50/nv50_surface.c:528
        push = 0x56080930e500
        fb = 0x5608093dae28
        i = <optimized out>
        j = <optimized out>
        k = <optimized out>
        mode = 0
#15 0x00007fe1e2c7ea32 in st_Clear (ctx=0x7fe1ebbf4010, mask=2) at ../../../src/mesa/state_tracker/st_cb_clear.c:463
        depthRb = <optimized out>
        stencilRb = <optimized out>
        quad_buffers = <optimized out>
        clear_buffers = 4
        i = <optimized out>
#16 0x00007fe1eb9249f1 in mir::renderer::gl::Renderer::render(std::vector<std::shared_ptr<mir::graphics::Renderable>, std::allocator<std::shared_ptr<mir::graphics::Renderable> > > const&) const (this=0x7fe1d00008c0, renderables=...) at ./src/renderers/gl/renderer.cpp:246

Changed in mesa (Ubuntu):
importance: Undecided → High
summary: unity-system-compositor crashed with SIGABRT in mir::fatal_error_abort()
+ [nouveau crashed during glClear()]
tags: added: nouveau
summary: - unity-system-compositor crashed with SIGABRT in mir::fatal_error_abort()
- [nouveau crashed during glClear()]
+ unity-system-compositor crashed with "Failed to schedule page flip", but
+ only after nouveau crashed in glClear().
Changed in libdrm (Ubuntu):
importance: Undecided → High
tags: added: unity8-desktop
Changed in mir:
status: New → Triaged
Changed in mir (Ubuntu):
status: New → Triaged
Daniel van Vugt (vanvugt) wrote :

Aha!

Thank you for taking the time to report this bug and helping to make Ubuntu better. This particular bug has already been reported and is a duplicate of bug 1553328, so it is being marked as such. Please look at the other bug report to see if there is any missing information that you can provide, or to see if there is a workaround for the bug. Additionally, any further discussion regarding the bug should occur in the other report. Feel free to continue to report any other bugs you may find.
