[android] External monitor slows rendering

Bug #1532202 reported by Alan Griffiths
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Canonical Pocket Desktop | Fix Committed | High | kevin gunn |
Mir | Fix Released | High | Alan Griffiths |
mir (Ubuntu) | Fix Released | Undecided | Unassigned |

Bug Description

Seen on mako (mir-0.18) - Gerry reports also on Flo.

Setup:

$ sudo stop lightdm
$ mirbacklight
$ mir_demo_server --file host --window-manager system-compositor --display-config sidebyside&
$ MIR_CLIENT_PERF_REPORT=log bin/mir_demo_server --host host --display-config sidebyside --launch-client mir_demo_client_egltriangle

Observe: triangle spins evenly with FPS around 60.

Test: Plug in external monitor

Expect: triangle spins evenly with FPS around 60.
Actual: triangle motion slows and jerks with FPS around 50.


Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

Just for fun:

mir_demo_server --file host --window-manager system-compositor --custom-compositor adorning --background-color purple &
mir_demo_server --host host --display-config sidebyside --custom-compositor adorning --background-color blue&
mir_demo_client_egltriangle

Will also highlight when the two servers "own" the display

Alan Griffiths (alan-griffiths) wrote :

Note: "--display-config clone" doesn't show this problem

Alan Griffiths (alan-griffiths) wrote :

Using "mir_demo_standalone_render_surfaces --host $XDG_RUNTIME_DIR/mir_socket --no-file" as the client confirms the framerate drops from 60FPS to about 53FPS

Changed in mir:
status: New → Confirmed
importance: Undecided → High
kevin gunn (kgunn72)
Changed in canonical-pocket-desktop:
importance: Undecided → High
assignee: nobody → kevin gunn (kgunn72)
status: New → In Progress
Daniel van Vugt (vanvugt) wrote :

First thing to check: Does the device have sufficient power to keep up and is our logic just holding it up?

You can sometimes answer that by checking CPU usage of the server+client processes when it's failing to keep up. Is it high?

summary: - External monitor slows rendering
+ [android] External monitor slows rendering
tags: added: performance
Daniel van Vugt (vanvugt) wrote :

It's worth noting the Android code path is completely different for multi-monitor compared to desktop/mesa. kdub is the best person to ask about its design. I'm not sure how much parallelism there is built in allowing you to hit multiple frame deadlines per frame. Certainly we've put a lot of effort into getting it right on desktop/mesa already.

Kevin DuBois (kdub) wrote :

Android's set() function takes both displays at once and is guaranteed not to wait before returning from that call. Parallelism is achieved by setting fences for the incoming buffers.

IIRC, these devices will drive the primary display with overlays, and the secondary display will be done via GLES composition. That may shift around where we wait for fences, but it needs some more investigation.

Alan Griffiths (alan-griffiths) wrote :

Notes from experimentation:

1. I've been unable to reproduce this effect without a nested server.

2. The slowdown manifests with the client on either the internal or the external display (or with clients on both).

3. There's a further (more dramatic) slowdown on the primary display if a second client connects to the host requesting the external display. Viz.: bin/mir_demo_client_egltriangle -p -o2 -f

3. "--nbuffers = 0" doesn't help

4. "--disable-overlays on" reduces FPS further, but seems to cure the jerkiness

5. I sometimes see a crash on exiting the host server: *** Error in `mir_demo_server': corrupted double-linked list: 0x01e72d70 ***

Alan Griffiths (alan-griffiths) wrote :

Something weird is happening in the host mir server. With the external display:

[1452530027.692778] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
[1452530027.696227] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=213µs
[1452530027.721833] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
[1452530027.738162] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
[1452530027.746433] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
[1452530027.767614] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=152µs
[1452530027.788246] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=152µs
[1452530027.802712] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=3326µs
[1452530027.819254] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=1800µs
[1452530027.823222] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
[1452530027.832897] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
[1452530027.838909] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=2807µs
[1452530027.855970] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
[1452530027.870468] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=2289µs
[1452530027.889451] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=1709µs
[1452530027.907306] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=2868µs
[1452530027.913562] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=213µs
[1452530027.934255] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=152µs
[1452530027.947898] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=518µs
[1452530027.960167] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=1495µs
[1452530027.969903] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=183µs
[1452530027.989436] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
[1452530027.996852] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
[1452530028.006070] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=152µs
[1452530028.018766] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=152µs
[1452530028.034850] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=152µs

Note the intermittent long posts (over 10 times the usual ~122µs).
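
One way to pick the slow posts out of a log like the one above is to compare each elapsed value against the median. This is a minimal Python sketch, assuming the log format shown above; the `slow_posts` helper and the factor of 10 are illustrative choices, not part of Mir.

```python
import re
from statistics import median

# Matches lines such as:
# [1452530027.802712] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=3326µs
LINE_RE = re.compile(r"\[(?P<ts>\d+\.\d+)\].*submit_buffer\(\), elapsed=(?P<us>\d+)µs")


def slow_posts(log_text, factor=10):
    """Return (timestamp, elapsed_us) for posts slower than factor x the median."""
    samples = [(float(m["ts"]), int(m["us"])) for m in LINE_RE.finditer(log_text)]
    if not samples:
        return []
    baseline = median(us for _, us in samples)
    return [(ts, us) for ts, us in samples if us > factor * baseline]


# A few consecutive lines copied from the log above.
sample = """\
[1452530027.788246] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=152µs
[1452530027.802712] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=3326µs
[1452530027.819254] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=1800µs
[1452530027.823222] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
[1452530027.832897] frontend::MessageProcessor: mediator=0xb138413c: submit_buffer(), elapsed=122µs
"""

for ts, us in slow_posts(sample):
    print(f"{ts:.6f}: {us}µs")
```

Using the median as the baseline keeps the occasional multi-millisecond post from inflating the threshold, which a mean would do.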

Without the external display:

[1452530459.993648] frontend::MessageProcessor: mediator=0xaf8030d4: submit_buffer(), elapsed=122µs
[1452530460.010556] frontend::MessageProcessor: mediator=0xaf8030d4: submit_buffer(), elapsed=122µs
[1452530460.027098] frontend::MessageProcessor: mediator=0xaf8030d4: submit_buffer(), elapsed=122µs
[1452530460.044067] frontend::MessageProcessor: mediator=0xaf8030d4: submit_buffer(), elapsed=213µs
[1452530460.060823] frontend:...


Daniel van Vugt (vanvugt) wrote :

Yeah there are some hiccups there. Although the biggest one is 3326µs (3.3ms). That may make the difference between achieving 60Hz vs 50Hz.
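
As a rough check of that arithmetic: the frame period is 1000/60 ms at 60Hz and 1000/50 ms at 50Hz, so the gap between them is about 3.3ms, close to the largest hiccup in the log. The Python below is only a worked calculation; the helper name is illustrative.

```python
def frame_period_ms(hz):
    """Time available per frame at a given refresh rate, in milliseconds."""
    return 1000.0 / hz


budget_60 = frame_period_ms(60)   # about 16.67 ms
budget_50 = frame_period_ms(50)   # 20 ms
gap_ms = budget_50 - budget_60    # about 3.33 ms

hiccup_ms = 3326 / 1000.0         # largest elapsed value reported in the log

print(f"60Hz budget {budget_60:.2f}ms, 50Hz budget {budget_50:.2f}ms, gap {gap_ms:.2f}ms")
print(f"largest hiccup {hiccup_ms:.2f}ms")
```

Not every post shows a hiccup, so this only indicates that the delays are of the right order of magnitude to matter.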

Another thing to try is to disable the predictive bypass optimization:
   --composite-delay=0

I vaguely recall correct implementation of that optimization requires that it turn itself off when using multi-monitor. And it does so for mesa, but maybe we're missing the required multi-monitor detection on android:

    hwc_device.cpp: recommend_sleep = purely_overlays ? 10ms : 0ms;

That should be gated on both "purely_overlays" and only a single output being connected.
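
The suggested gating can be sketched as a small Python model of the one-line expression quoted above. This is not Mir's code: `connected_outputs` is a hypothetical parameter standing in for whatever multi-monitor detection the Android platform would need.

```python
def recommend_sleep_ms(purely_overlays, connected_outputs):
    """Model of the suggested gate for the predictive bypass delay.

    Returns the recommended sleep in milliseconds: 10 only when the frame
    is purely overlays AND exactly one output is connected, otherwise 0.
    """
    if purely_overlays and connected_outputs == 1:
        return 10
    return 0


print(recommend_sleep_ms(True, 1))   # single output, overlays only
print(recommend_sleep_ms(True, 2))   # external monitor plugged in
```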

Alan Griffiths (alan-griffiths) wrote :

> Another thing to try is to disable the predictive bypass optimization:
> --composite-delay=0

Tried, no improvement

Alan Griffiths (alan-griffiths) wrote :

The nested server sees a big increase in "render time" (from 1.5-2.5ms to 22.25ms for output #1). It also appears to be rendering to output #2 at a similar rate despite there being no updates visible on that output.

22.25ms is slow enough to be causing problems.

$ MIR_CLIENT_PERF_REPORT=log mir_demo_server --host host --display-config sidebyside&
$ MIR_CLIENT_PERF_REPORT=log mir_demo_client_egltriangle

[1452607758.598397] perf: mir_demo_client_egltriangle: 63.61 FPS, render time 1.58ms, buffer lag 46.22ms (3 buffers)
[1452607758.617656] perf: Mir nested display for output #1: 62.43 FPS, render time 4.52ms, buffer lag 44.11ms (3 buffers)
[1452607759.599679] perf: mir_demo_client_egltriangle: 59.94 FPS, render time 1.49ms, buffer lag 48.59ms (3 buffers)
[1452607759.619060] perf: Mir nested display for output #1: 59.94 FPS, render time 4.27ms, buffer lag 45.78ms (3 buffers)
[1452607760.600595] perf: mir_demo_client_egltriangle: 60.00 FPS, render time 1.71ms, buffer lag 48.28ms (3 buffers)
[1452607760.619975] perf: Mir nested display for output #1: 60.00 FPS, render time 4.38ms, buffer lag 45.66ms (3 buffers)
[1452607761.604715] perf: mir_demo_client_egltriangle: 59.76 FPS, render time 2.66ms, buffer lag 47.52ms (3 buffers)
[1452607761.621990] perf: Mir nested display for output #1: 59.94 FPS, render time 5.29ms, buffer lag 44.77ms (3 buffers)
[1452607762.605814] perf: mir_demo_client_egltriangle: 59.94 FPS, render time 2.59ms, buffer lag 47.42ms (3 buffers)
[1452607762.624675] perf: Mir nested display for output #1: 59.88 FPS, render time 5.57ms, buffer lag 44.52ms (3 buffers)
[1452607763.623363] perf: mir_demo_client_egltriangle: 59.98 FPS, render time 2.08ms, buffer lag 47.95ms (3 buffers)
[1452607763.640729] perf: Mir nested display for output #1: 60.03 FPS, render time 5.37ms, buffer lag 44.65ms (3 buffers)
[1452607764.639478] perf: mir_demo_client_egltriangle: 60.03 FPS, render time 2.33ms, buffer lag 47.68ms (3 buffers)
[1452607764.641645] perf: Mir nested display for output #1: 60.00 FPS, render time 5.36ms, buffer lag 44.73ms (3 buffers)
[1452607765.640454] perf: mir_demo_client_egltriangle: 60.00 FPS, render time 1.95ms, buffer lag 48.09ms (3 buffers)
[1452607765.644453] perf: Mir nested display for output #1: 59.88 FPS, render time 5.36ms, buffer lag 44.69ms (3 buffers)
[1452607766.658126] perf: mir_demo_client_egltriangle: 59.98 FPS, render time 1.92ms, buffer lag 48.15ms (3 buffers)
[1452607766.659530] perf: Mir nested display for output #1: 60.09 FPS, render time 5.28ms, buffer lag 44.69ms (3 buffers)
[1452607767.669327] perf: mir_demo_client_egltriangle: 59.34 FPS, render time 2.15ms, buffer lag 48.15ms (3 buffers)
[1452607767.675766] perf: Mir nested display for output #1: 59.05 FPS, render time 5.37ms, buffer lag 45.19ms (3 buffers)
[1452607768.676957] perf: mir_demo_client_egltriangle: 60.57 FPS, render time 1.53ms, buffer lag 48.18ms (3 buffers)
[1452607768.678330] perf: Mir nested display for output #1: 60.87 FPS, render time 4.59ms, buffer lag 44.94ms (3 buffers)
[1452607769.677109] perf: mir_demo_client_egltriangle: 60.00 FPS, render time 1.61ms, buffer lag 48.47ms (3 buffers)
[1452607769.679337] perf: Mir nested display f...
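
Perf report lines in the format above can be parsed and checked against the 60Hz frame budget. This is a minimal Python sketch, assuming the line format stays as logged here; `PerfSample`, `parse_perf` and `exceeds_budget` are illustrative names, not part of Mir.

```python
import re
from typing import NamedTuple


class PerfSample(NamedTuple):
    name: str
    fps: float
    render_ms: float
    lag_ms: float
    buffers: int


PERF_RE = re.compile(
    r"\] perf: (?P<name>.+?): (?P<fps>[\d.]+) FPS, "
    r"render time (?P<render>[\d.]+)ms, "
    r"buffer lag (?P<lag>[\d.]+)ms \((?P<nbuf>\d+) buffers\)"
)


def parse_perf(log_text):
    """Extract one PerfSample per perf report line found in log_text."""
    return [
        PerfSample(m["name"], float(m["fps"]), float(m["render"]),
                   float(m["lag"]), int(m["nbuf"]))
        for m in PERF_RE.finditer(log_text)
    ]


def exceeds_budget(render_ms, hz=60):
    """True if a render time alone is longer than one frame period at hz."""
    return render_ms > 1000.0 / hz


# The first two lines of the log above.
sample = """\
[1452607758.598397] perf: mir_demo_client_egltriangle: 63.61 FPS, render time 1.58ms, buffer lag 46.22ms (3 buffers)
[1452607758.617656] perf: Mir nested display for output #1: 62.43 FPS, render time 4.52ms, buffer lag 44.11ms (3 buffers)
"""

for s in parse_perf(sample):
    print(s.name, s.render_ms, exceeds_budget(s.render_ms))
```

With the 22.25ms figure quoted in the comment, `exceeds_budget(22.25)` is true, since one frame at 60Hz is roughly 16.67ms.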


Alan Griffiths (alan-griffiths) wrote :

OK, I understand at least part of what is going on:

If, in the nested server, I hack DefaultDisplayBufferCompositor::composite() to check for an empty renderable_list and not call the renderer in that case we stop posting to the external display and performance is maintained so long as the client surface remains on the internal output.

As soon as the client surface overlaps (or is entirely on) the external output performance drops. This is cured by applying the same hack in the host compositor.

So the question is: does this "don't render nothing" optimization belong in the renderer or the compositor?
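
The shape of that hack can be illustrated with a small Python model. It is not the actual DefaultDisplayBufferCompositor code: `CountingRenderer` and this `composite` function are hypothetical stand-ins that only show the early return on an empty list.

```python
class CountingRenderer:
    """Stand-in renderer that only counts how often it is asked to draw."""

    def __init__(self):
        self.render_calls = 0

    def render(self, renderables):
        self.render_calls += 1


def composite(renderer, renderable_list):
    """Render one frame; return True if something was rendered (and posted)."""
    if not renderable_list:
        # Nothing to draw on this output: skip the render and the post.
        return False
    renderer.render(renderable_list)
    return True


renderer = CountingRenderer()
composite(renderer, [])             # empty external output: skipped
composite(renderer, ["triangle"])   # internal output with the client: rendered
print(renderer.render_calls)
```

Placing the check in the compositor, as here, avoids the post as well as the draw; placing it in the renderer would only avoid the draw.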

Changed in mir:
status: Confirmed → In Progress
assignee: nobody → Alan Griffiths (alan-griffiths)
milestone: none → 0.19.0
Daniel van Vugt (vanvugt) wrote :

I just noticed in comment #7:
    4. "--disable-overlays on" reduces FPS further, but seems to cure the jerkiness
which reminds me that multi-monitor and bypass/overlays are both scenarios that may each require an extra buffer (triple buffers). And in some cases using both at once necessitates quad-buffers to keep up.

So does this help?: --nbuffers=4

I'm thinking only in a nested setup would you be sure that your multiple monitors are all being overlaid/bypassed, so quad buffers may be required there.

Daniel van Vugt (vanvugt) wrote :

If you do find quad buffers are required then that's where my dynamic buffer scaling (presently disabled) comes in handy, as we can allow quad buffers but in most cases not use all of them.

Daniel van Vugt (vanvugt) wrote :

BTW, --composite-delay=0 needs to be tested in the system compositor. It's only used by the system compositor.

Alan Griffiths (alan-griffiths) wrote :

> So does this help?: --nbuffers=4

Worth trying, but no.

AIUI on each post by the client the nested server is scheduling a composite of both "display" surfaces - and both composites result in a post to the host server. And on each post to a nested "display" surface the host server is scheduling a composite of both "display" surfaces - so, although the external display never has anything on it, it can be drawn four times for each client post. At best that's inefficient, at worst it causes the slow rendering reported here.

But the "hack" testing for an empty renderable_list only addresses this for the artificial case of there being nothing on the external monitor. What should really be tested is whether any of the listed renderables has posted since the last render.
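
The fan-out described above can be counted with a simplified Python model. It assumes every post schedules every output's compositor at both levels; how the "four times" figure is counted is not spelled out in the comment, so this only multiplies the fan-out at each level. The function name is illustrative.

```python
def composites_per_client_post(outputs, broadcast=True):
    """Count composites triggered by one client post in a nested setup.

    Returns (nested_composites, host_composites). With broadcast=True every
    change schedules all outputs; with broadcast=False only the output that
    actually changed is scheduled.
    """
    fan_out = outputs if broadcast else 1
    nested = fan_out            # composites in the nested server
    host = nested * fan_out     # each nested post triggers host composites
    return nested, host


print(composites_per_client_post(2))                   # broadcast scheduling
print(composites_per_client_post(2, broadcast=False))  # targeted scheduling
```

In this model the work grows with the square of the number of outputs at the host level, which is why plugging in a second monitor is enough to expose it.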

Alan Griffiths (alan-griffiths) wrote :

It would be interesting to know if Kevin's thought is correct:

"Still thinking what could be the root cause. If the swapping (ie, producing gpu traffic) is slowing us down, maybe its a bus traffic jam issue. (and if thats the case, maybe the n7 wouldn't have the problem)"

Anyone got an N7 to try?

In any case, the underlying inefficiency lies in LegacySceneChangeNotification not detailing updates to the scene, which leads to all updates scheduling changes to all compositors. If, for example, the system compositor knew which screen had been posted to, it would only need to schedule the corresponding display buffer compositor instead of scheduling all.
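
The difference between the two notification styles can be sketched in Python. `CompositorScheduler` and its methods are hypothetical names for illustration, not Mir's API; the model only counts how often each output's compositor would be scheduled.

```python
class CompositorScheduler:
    """Counts how many times each output's compositor gets scheduled."""

    def __init__(self, output_ids):
        self.scheduled = {output_id: 0 for output_id in output_ids}

    def notify_scene_changed(self):
        """Undetailed notification: every output's compositor is scheduled."""
        for output_id in self.scheduled:
            self.scheduled[output_id] += 1

    def notify_output_posted(self, output_id):
        """Detailed notification: only the affected output is scheduled."""
        self.scheduled[output_id] += 1


scheduler = CompositorScheduler([1, 2])
scheduler.notify_scene_changed()      # both outputs scheduled
scheduler.notify_output_posted(1)     # only output 1 scheduled
print(scheduler.scheduled)
```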

Daniel van Vugt (vanvugt) wrote :

Limiting screen redraws just means we're on the edge of what the hardware is capable of, so you need to get desperate and creative. That is assuming we're on the edge of the hardware's capabilities (please answer comment #4).

Regional (sub-screen) updates should be avoided because you need to do full screen buffer swaps to take advantage of the extra smoothness that triple/double buffering provides. Certainly in Compiz, avoiding regional updates and just doing fullscreen page flips actually improved frame rates in the majority of cases. However that's different to skipping rendering on specific monitors. You can do that without being penalized...

Alan Griffiths (alan-griffiths) wrote :

CPU usage doesn't seem to be the limiting factor:

Tasks: 215 total, 1 running, 214 sleeping, 0 stopped, 0 zombie
%Cpu(s): 15.9 us, 38.0 sy, 0.0 ni, 17.6 id, 28.4 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 1878412 total, 1055904 used, 822508 free, 46400 buffers
KiB Swap: 32764 total, 0 used, 32764 free. 799416 cached Mem

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 3676 phablet 20 0 189476 17368 11784 S 72.9 0.9 0:18.33 mir_demo_server
 3749 phablet 20 0 171896 14652 9532 S 15.8 0.8 0:05.08 mir_demo_server
...
 3762 phablet 20 0 60280 8676 6488 S 3.3 0.5 0:01.22 mir_demo_client
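
The summary line can be read programmatically. This small Python sketch parses the `%Cpu(s):` line pasted above and separates time spent executing (us + sy) from idle and I/O wait; `parse_cpu_summary` is an illustrative helper.

```python
import re


def parse_cpu_summary(line):
    """Parse a top '%Cpu(s):' summary line into {field: percent}."""
    fields = line.split(":", 1)[1]
    return {name: float(value) for value, name in re.findall(r"([\d.]+) (\w+)", fields)}


# The summary line from the output above.
line = "%Cpu(s): 15.9 us, 38.0 sy, 0.0 ni, 17.6 id, 28.4 wa, 0.0 hi, 0.0 si, 0.0 st"
cpu = parse_cpu_summary(line)
busy = cpu["us"] + cpu["sy"]

print(f"busy {busy:.1f}%, idle {cpu['id']:.1f}%, iowait {cpu['wa']:.1f}%")
```

Idle plus I/O wait accounts for roughly 46% of CPU time in that sample, which is consistent with the CPU not being saturated.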

description: updated
Alan Griffiths (alan-griffiths) wrote :

The linked fix addresses the scenario in the description - so this bug can close accordingly.

As there are still opportunities to improve when using a custom compositor I've raised lp:1535894 to cover that.

Changed in mir:
milestone: 0.19.0 → 0.20.0
PS Jenkins bot (ps-jenkins) wrote :

Fix committed into lp:mir at revision None, scheduled for release in mir, milestone 0.20.0

Changed in mir:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.20.0+16.04.20160219-0ubuntu1

---------------
mir (0.20.0+16.04.20160219-0ubuntu1) xenial; urgency=medium

  [ Alan Griffiths ]
  * New upstream release 0.20.0 (https://launchpad.net/mir/+milestone/0.20.0)
    - ABI summary: Only servers need rebuilding;
      . mirclient ABI unchanged at 9
      . mirserver ABI bumped to 38
      . mircommon ABI unchanged at 5
      . mirplatform ABI unchanged at 11
      . mirprotobuf ABI unchanged at 3
      . mirplatformgraphics ABI bumped to 8
      . mirclientplatform ABI unchanged at 4
      . mirinputplatform ABI unchanged at 5
    - Enhancements:
      . Allow screencasting to create a virtual output (for Miracast)
      . Separate the protocol version number from the client API version macros.
        They're not meant to be related concepts.
      . Add UBSanitizer to the list of build types.
      . logging: Human readable timestamps in DumbConsoleLogger.
      . examples: AdorningDisplayBufferCompositor::composite() no longer ignores
        output boundaries and occlusions.
      . examples: Add -a <app name> option to eglapps.
      . common, client: a more flexible way to probe modules: once we've found
        a good current platform we don't even try to load an older one.
      . Fix build and test run with CMAKE_BUILD_TYPE=ThreadSanitizer (missing
        locks).
      . Add MIR_USE_LD_GOLD build option.
    - Bug fixes:
      . unity-system-compositor crashed with std::runtime_error in
        mir::compositor::CompositingFunctor::wait_until_started() from
        usc::MirScreen::set_screen_power_mode (mir_power_mode_on)
        (LP: #1528384)
      . Phone not usable while a call comes in - followed by "restart"
        (LP: #1532607)
      . ui freezes when simultaneously moving mouse & plug/unplug hdmi
        (LP: #1538632)
      . Mir fails to build on xenial today: android_graphic_buffer_allocator.h
        fatal error: hardware/hardware.h: No such file or directory
        (LP: #1539338)
      . [mali] egl_demo_client_flicker has graphics corruption on android
        (LP: #1517205)
      . [testsfail] Intermittent failure in
        TestClientCursorAPI.cursor_passed_through_nested_server (LP: #1525003)
      . [android] External monitor slows rendering (LP: #1532202)
      . Display::create_gl_context may create context with incorrect attributes
        (LP: #1539268)
      . unity-system-compositor locked up in __libc_do_syscall() (LP: #1543594)
      . NestedServer.client_sees_set_scaling_factor intermittent failure
        (LP: #1537798)
      . [android] External monitor slows rendering - part 2 (LP: #1535894)
      . scene: make sure not to set the swapinterval to 0 when an independent
        stream is created. The default should be 1 (like the stream created as
        part of surface creation).
      . Track the displays plugged state to avoid reporting configurations in
        case they are unplugged (LP: #1531503). [Cherrypicked from 0.21]
      . mouse pointer support on emulator is broken (LP: #1517597).
        [Cherrypicked from 0.21]
      . move an android-only test that ended up in tests/unit-tests/graphics.
        (LP: ##154...


Changed in mir (Ubuntu):
status: New → Fix Released
Changed in mir:
status: Fix Committed → Fix Released
Changed in canonical-pocket-desktop:
status: In Progress → Fix Committed