Mir

[regression] SwitchingBundle in framedropping mode can hang

Bug #1306464 reported by Alexandros Frantzis
This bug affects 1 person
Affects         Status          Importance   Assigned to           Milestone
Mir             Fix Released    Critical     Alexandros Frantzis
mir (Ubuntu)    Fix Released    Critical     Unassigned

Bug Description

SwitchingBundle in framedropping mode can hang. To reproduce, run

bin/mir_integration_tests --gtest_filter=ApplicationSession.stress_test_take_snapshot --gtest_repeat=1000

with lp:mir/devel r1545. It seems that r1545 exposes the problem, but my understanding is that it does not cause it.

A sample run hangs with:

T1 = client thread
T2 = one of the compositing threads

T1 client_acquire: enter (0x8974d8,nfree=0,nclients=0,nready=0,ncompositors=3)
T1 entering framedropping

T2 compositor_release: enter (0x8974d8,nfree=0,nclients=0,nready=0,ncompositors=3)
T2 complete_client_acquire: enter (0x8974d8,nfree=3,nclients=0,nready=0,ncompositors=0)
T2 complete_client_acquire: leave (0x8974d8,nfree=2,nclients=1,nready=0,ncompositors=0)
T2 compositor_release: leave (0x8974d8,nfree=2,nclients=1,nready=0,ncompositors=0)

T2 compositor_acquire: enter (0x8974d8,nfree=2,nclients=1,nready=0,ncompositors=0)
T2 compositor_acquire: leave (0x8974d8,nfree=1,nclients=1,nready=0,ncompositors=1)

T2 compositor_release: enter (0x8974d8,nfree=1,nclients=1,nready=0,ncompositors=1)
T2 compositor_release: leave (0x8974d8,nfree=2,nclients=1,nready=0,ncompositors=0)

T1 enters the framedropping clause in client_acquire, blocks waiting for nready to become non-zero, and hangs: nready stays at 0 for the rest of the trace.

There are two problems here:

1. T1 is not notified when a free buffer becomes available, so it cannot unblock (nor does the while loop check for a free buffer after blocking).

2. Even if T1 managed to unblock, complete_client_acquire() could be called twice: once by compositor_release (or force_requests_to_complete, for that matter), as in the trace above, and a second time in client_acquire() when T1 gets unblocked.
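
The two problems above can be illustrated with a small toy model. This is only a sketch, not the fix that landed in r1550 and not Mir's actual SwitchingBundle code: the class ToyBundle, the request_pending flag, the ncompletions counter and demo_completions() are all hypothetical names made up for the example. It assumes that a single pending-request flag is enough to make complete_client_acquire() safe to call from both sides, and that the wait predicate should cover free buffers as well as ready ones.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical toy model of the counters in the trace.
// This is NOT Mir's SwitchingBundle; all names are assumptions.
class ToyBundle
{
public:
    ToyBundle(int free_buffers, int compositor_buffers)
        : nfree{free_buffers}, ncompositors{compositor_buffers} {}

    // Client side (T1): blocks until the request has been satisfied.
    void client_acquire()
    {
        std::unique_lock<std::mutex> lock{guard};
        request_pending = true;
        // Problem 1: wake up on a free buffer too, not only on nready != 0,
        // and also when another thread already completed the request.
        cond.wait(lock, [this]
            { return !request_pending || nfree > 0 || nready > 0; });
        complete_client_acquire();  // no-op if already completed elsewhere
    }

    // Compositor side (T2): returns one buffer and notifies waiters.
    void compositor_release()
    {
        std::lock_guard<std::mutex> lock{guard};
        assert(ncompositors > 0);
        --ncompositors;
        ++nfree;
        complete_client_acquire();
        cond.notify_all();
    }

    int completions() const
    {
        std::lock_guard<std::mutex> lock{guard};
        return ncompletions;
    }

private:
    // Problem 2: guarded by request_pending, so a second call does nothing.
    void complete_client_acquire()
    {
        if (!request_pending)
            return;
        if (nfree > 0)
            --nfree;
        else if (nready > 0)
            --nready;  // framedropping: reuse a ready, not yet composited buffer
        else
            return;    // nothing available yet; the request stays pending
        ++nclients;
        ++ncompletions;
        request_pending = false;
    }

    mutable std::mutex guard;
    std::condition_variable cond;
    int nfree;
    int nclients = 0;
    int nready = 0;
    int ncompositors;
    bool request_pending = false;
    int ncompletions = 0;
};

// Replays the shape of the hanging trace: no free buffers, the compositor
// holds one, and the client asks for a buffer while the compositor releases.
int demo_completions()
{
    ToyBundle bundle{0, 1};
    std::thread client{[&bundle] { bundle.client_acquire(); }};
    bundle.compositor_release();
    client.join();
    return bundle.completions();
}
```

Whichever thread gets the lock first, demo_completions() should return with exactly one completion counted: either the compositor completes the pending request and the client's own call becomes a no-op, or the client finds the freed buffer itself. The real code tracks more state (several compositors, force_requests_to_complete), so treat this only as a way to reason about the wait predicate and the double call.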

Alexandros Frantzis (afrantzis) wrote :

This seems to be the cause of https://jenkins.qa.ubuntu.com/job/mir-mediumtests-runner-mako/1055/console , and we will surely see more such failures.

Changed in mir:
assignee: nobody → Alexandros Frantzis (afrantzis)
PS Jenkins bot (ps-jenkins) wrote :

Fix committed into lp:mir/devel at revision 1550, scheduled for release in mir, milestone Unknown

Changed in mir:
status: New → Fix Committed
Changed in mir:
milestone: none → 0.1.9
Daniel van Vugt (vanvugt) wrote :

Looks like both issues were introduced in:

------------------------------------------------------------
revno: 1398 [merge]
author: Alan Griffiths <email address hidden>
committer: Tarmac
branch nick: development-branch
timestamp: Fri 2014-02-14 19:14:15 +0000
message:
  compositor: non-blocking implementation of SwitchingBundle::client_acquire. Fixes: https://bugs.launchpad.net/bugs/1267323.

  Approved by PS Jenkins bot, Alexandros Frantzis, Andreas Pokorny, Kevin DuBois.
------------------------------------------------------------

summary: - SwitchingBundle in framedropping mode can hang
+ [regression] SwitchingBundle in framedropping mode can hang
tags: added: regression-update
Changed in mir (Ubuntu):
importance: Undecided → Critical
status: New → Triaged
kevin gunn (kgunn72)
Changed in mir (Ubuntu):
status: Triaged → Fix Committed
Daniel van Vugt (vanvugt) wrote :

The fix is not committed to lp:mir yet.

Changed in mir (Ubuntu):
status: Fix Committed → Triaged
Changed in mir:
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.1.9+14.10.20140430.1-0ubuntu1

---------------
mir (0.1.9+14.10.20140430.1-0ubuntu1) utopic; urgency=medium

  [ Daniel van Vugt ]
  * New upstream release 0.1.9 (https://launchpad.net/mir/+milestone/0.1.9)
    - mirclient ABI unchanged, still at 7. Clients do not need rebuilding.
    - mirserver ABI bumped to 19. Shells need rebuilding.
    - More libmirserver class changes and reorganization, including;
      . Moving things from shell:: to scene::
      . Rewriting/refactoring surface factories.
    - Added an id() to Renderable.
    - Scene/Renderer interfaces:
      . Scene is no longer responsible for its own iteration (no for_each
        any more). Instead you should iterate over the list returned by
        Scene::generate_renderable_list().
    - Bugs fixed:
      . Stale socket issue. (LP: #1285215)
      . Qt render gets blocked on EGLSwapBuffers. (LP: #1292306)
      . Lock order violated found in helgrind (potential deadlock).
        (LP: #1296544)
      . [regression] SwitchingBundle in framedropping mode can hang.
        (LP: #1306464)
      . [DPMS] Display backlight turns back on almost immediately after
        being turned off. (LP: #1231857)
      . Wrong frame is seen on wake up/resume/unlock. (LP: #1233564)
      . Nested platform is not testable (LP: #1299101)
      . [regression] mir_demo_server_shell crashes on display resume.
        (LP: #1308941)
      . Multi-threaded composition is actually mostly serialized by
        SurfaceStack::guard. (LP: #1234018)
      . Mirscreencast slows down compositing and makes it very jerky.
        (LP: #1280938)
      . Mirscreencast can cause clients to render faster than the screen
        refresh rate. (LP: #1294361)
      . Screen turns on when a new session/surface appears. (LP: #1297876)
      . mir-doc package is >56MB in size, expands to >100MB of files.
        (LP: #1304998)
      . [regression] Clang: 'mir::test::doubles::MockSurface::visible'
        hides overloaded virtual function [-Woverloaded-virtual].
        (LP: #1301135)
      . [regression] GLRenderer* unit tests have recently become noisy.
        (LP: #1308905)
      . FocusController::set_focus_to() no longer seems to raise a session
        to the top. (LP: #1302689)

  [ Ubuntu daily release ]
  * New rebuild forced
 -- Ubuntu daily release <email address hidden> Wed, 30 Apr 2014 13:26:58 +0000

Changed in mir (Ubuntu):
status: Triaged → Fix Released