[regression] SwitchingBundle in framedropping mode can hang

Bug #1306464 reported by Alexandros Frantzis on 2014-04-11
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Fix Released
Alexandros Frantzis
mir (Ubuntu)

Bug Description

SwitchingBundle in framedropping mode can hang. To reproduce, run

bin/mir_integration_tests --gtest_filter=ApplicationSession.stress_test_take_snapshot --gtest_repeat=1000

with lp:mir/devel r1545. It seems that r1545 exposes the problem, but my understanding is that it does not itself cause it.

A sample run hangs with:

T1 = client thread
T2 = one of the compositing threads

T1 client_acquire: enter (0x8974d8,nfree=0,nclients=0,nready=0,ncompositors=3)
T1 entering framedropping

T2 compositor_release: enter (0x8974d8,nfree=0,nclients=0,nready=0,ncompositors=3)
T2 complete_client_acquire: enter (0x8974d8,nfree=3,nclients=0,nready=0,ncompositors=0)
T2 complete_client_acquire: leave (0x8974d8,nfree=2,nclients=1,nready=0,ncompositors=0)
T2 compositor_release: leave (0x8974d8,nfree=2,nclients=1,nready=0,ncompositors=0)

T2 compositor_acquire: enter (0x8974d8,nfree=2,nclients=1,nready=0,ncompositors=0)
T2 compositor_acquire: leave (0x8974d8,nfree=1,nclients=1,nready=0,ncompositors=1)

T2 compositor_release: enter (0x8974d8,nfree=1,nclients=1,nready=0,ncompositors=1)
T2 compositor_release: leave (0x8974d8,nfree=2,nclients=1,nready=0,ncompositors=0)

T1 enters the framedropping clause in client_acquire, blocks waiting for nready to become nonzero, and never wakes up, because nready stays at 0 for the rest of the trace.

There are two problems here:

1. T1 is not notified when a free buffer becomes available so it can unblock (nor does it check for it in the while loop after blocking).

2. Even if T1 managed to unblock, complete_client_acquire() could be called twice: once by compositor_release (or force_requests_to_complete, for that matter), as in the trace above, and a second time in client_acquire() when T1 gets unblocked.
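
To make the two problems concrete, here is a minimal sketch of a wait loop that avoids both. It is a toy model, not Mir's SwitchingBundle: the class ToyBundle, the request_pending flag, the ncompletions counter and run_demo() are all made up for illustration, and only the counter names (nfree, nready, nclients, ncompositors) are borrowed from the trace. The idea is that the waiter's predicate also checks for a free buffer and for "somebody already completed my request", and that a single flag decides who gets to complete.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Toy model only. Counter names mirror the trace; everything else is
// hypothetical and is not taken from Mir's SwitchingBundle.
class ToyBundle
{
public:
    // Start in the state of the trace: every buffer held by a compositor.
    explicit ToyBundle(int nbuffers) : ncompositors{nbuffers} {}

    // Client side (T1): blocks until a buffer has been handed over.
    void client_acquire()
    {
        std::unique_lock<std::mutex> lock{mutex};
        request_pending = true;
        // Problem 1: wake on a free buffer too, not only on nready != 0,
        // and also when another thread has completed the request for us.
        cond.wait(lock, [this]
                  { return !request_pending || nfree > 0 || nready > 0; });
        // Problem 2: complete only if nobody did so on our behalf.
        if (request_pending)
            complete_client_acquire();
    }

    // Compositor side (T2): returns a buffer and serves a waiting client.
    void compositor_release()
    {
        std::lock_guard<std::mutex> lock{mutex};
        assert(ncompositors > 0);
        --ncompositors;
        ++nfree;
        if (request_pending)
            complete_client_acquire();
        cond.notify_all();  // the waiter re-checks its predicate
    }

    int completions() const
    {
        std::lock_guard<std::mutex> lock{mutex};
        return ncompletions;
    }

private:
    // Caller must hold the mutex. Clearing request_pending here is what
    // guarantees at most one completion per request.
    void complete_client_acquire()
    {
        assert(request_pending);
        if (nfree > 0)
            --nfree;
        else
            --nready;  // framedropping: take over a ready frame
        ++nclients;
        ++ncompletions;
        request_pending = false;
    }

    mutable std::mutex mutex;
    std::condition_variable cond;
    int nfree = 0;
    int nready = 0;
    int nclients = 0;
    int ncompositors;
    int ncompletions = 0;
    bool request_pending = false;
};

// Replays the shape of the trace: one client acquire racing with three
// compositor releases. Returns how often the request was completed.
int run_demo()
{
    ToyBundle bundle{3};
    std::thread client{[&bundle] { bundle.client_acquire(); }};
    for (int i = 0; i < 3; ++i)
        bundle.compositor_release();
    client.join();
    return bundle.completions();
}
```

Whichever thread wins the race, run_demo() returns and reports exactly one completion: either the releasing thread completes the request and the client sees request_pending already cleared, or the client finds nfree > 0 and completes the request itself.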


Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

This seems to be the cause of https://jenkins.qa.ubuntu.com/job/mir-mediumtests-runner-mako/1055/console, and we will surely see more such failures.

Changed in mir:
assignee: nobody → Alexandros Frantzis (afrantzis)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

Fix committed into lp:mir/devel at revision 1550, scheduled for release in mir, milestone Unknown

Changed in mir:
status: New → Fix Committed
Changed in mir:
milestone: none → 0.1.9
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Looks like both issues were introduced in:

revno: 1398 [merge]
author: Alan Griffiths <email address hidden>
committer: Tarmac
branch nick: development-branch
timestamp: Fri 2014-02-14 19:14:15 +0000
  compositor: non-blocking implementation of SwitchingBundle::client_acquire. Fixes: https://bugs.launchpad.net/bugs/1267323.

  Approved by PS Jenkins bot, Alexandros Frantzis, Andreas Pokorny, Kevin DuBois.

summary: - SwitchingBundle in framedropping mode can hang
+ [regression] SwitchingBundle in framedropping mode can hang
tags: added: regression-update
Changed in mir (Ubuntu):
importance: Undecided → Critical
status: New → Triaged
kevin gunn (kgunn72) on 2014-04-21
Changed in mir (Ubuntu):
status: Triaged → Fix Committed
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

The fix is not committed to lp:mir yet.

Changed in mir (Ubuntu):
status: Fix Committed → Triaged
Changed in mir:
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.1.9+14.10.20140430.1-0ubuntu1

mir (0.1.9+14.10.20140430.1-0ubuntu1) utopic; urgency=medium

  [ Daniel van Vugt ]
  * New upstream release 0.1.9 (https://launchpad.net/mir/+milestone/0.1.9)
    - mirclient ABI unchanged, still at 7. Clients do not need rebuilding.
    - mirserver ABI bumped to 19. Shells need rebuilding.
    - More libmirserver class changes and reorganization, including:
      . Moving things from shell:: to scene::
      . Rewriting/refactoring surface factories.
    - Added an id() to Renderable.
    - Scene/Renderer interfaces:
      . Scene is no longer responsible for its own iteration (no for_each
        any more). Instead you should iterate over the list returned by
    - Bugs fixed:
      . Stale socket issue. (LP: #1285215)
      . Qt render gets blocked on EGLSwapBuffers. (LP: #1292306)
      . Lock order violated found in helgrind (potential deadlock).
        (LP: #1296544)
      . [regression] SwitchingBundle in framedropping mode can hang.
        (LP: #1306464)
      . [DPMS] Display backlight turns back on almost immediately after
        being turned off. (LP: #1231857)
      . Wrong frame is seen on wake up/resume/unlock. (LP: #1233564)
      . Nested platform is not testable (LP: #1299101)
      . [regression] mir_demo_server_shell crashes on display resume.
        (LP: #1308941)
      . Multi-threaded composition is actually mostly serialized by
        SurfaceStack::guard. (LP: #1234018)
      . Mirscreencast slows down compositing and makes it very jerky.
        (LP: #1280938)
      . Mirscreencast can cause clients to render faster than the screen
        refresh rate. (LP: #1294361)
      . Screen turns on when a new session/surface appears. (LP: #1297876)
      . mir-doc package is >56MB in size, expands to >100MB of files.
        (LP: #1304998)
      . [regression] Clang: 'mir::test::doubles::MockSurface::visible'
        hides overloaded virtual function [-Woverloaded-virtual].
        (LP: #1301135)
      . [regression] GLRenderer* unit tests have recently become noisy.
        (LP: #1308905)
      . FocusController::set_focus_to() no longer seems to raise a session
        to the top. (LP: #1302689)

  [ Ubuntu daily release ]
  * New rebuild forced
 -- Ubuntu daily release <email address hidden> Wed, 30 Apr 2014 13:26:58 +0000

Changed in mir (Ubuntu):
status: Triaged → Fix Released