[regression] acceptance tests fails in ServerDisconnect.causes_client_to_terminate_by_default

Bug #1364772 reported by Daniel van Vugt
Affects            Status         Importance   Assigned to            Milestone
Mir                Fix Released   High         Mir development team
mir (Ubuntu)       Fix Released   High         Unassigned
mir (Ubuntu RTM)   Fix Released   High         Unassigned

Bug Description

This is happening in CI fairly frequently now:

8: [ RUN ] ServerDisconnect.causes_client_to_terminate_by_default
8: /tmp/buildd/mir-0.7.0+14.10.20140829bzr1884pkg0utopic412/tests/acceptance-tests/test_server_disconnect.cpp:229: Failure
8: Value of: client_results[0].signal
8: Actual: 9
8: Expected: 1
8: [ FAILED ] ServerDisconnect.causes_client_to_terminate_by_default (501 ms)

The offending EXPECT_EQ is one I added recently:
  EXPECT_EQ(SIGHUP, client_results[0].signal);

However, the unreliability of the result shows that the root cause is a pre-existing problem in wait_for_shutdown_client_processes().

Daniel van Vugt (vanvugt) wrote :

Signal 9 suggests we're killing the client prematurely (9 = SIGKILL) instead of waiting for it to die naturally.
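
For reference, the `signal` value the test compares is the kind of number a parent process recovers from a child's wait status. The following is a minimal POSIX sketch of that decoding, not code from Mir's test framework; the helper name `signal_that_ended_child` is hypothetical.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of Mir): fork a child that sends itself
// `sig`, then report which signal terminated it, or 0 for a normal exit.
int signal_that_ended_child(int sig)
{
    pid_t const pid = fork();
    assert(pid >= 0);

    if (pid == 0)
    {
        signal(sig, SIG_DFL); // make sure the default (fatal) action applies
        raise(sig);
        _exit(0);             // reached only if the signal was not fatal
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    // WIFSIGNALED/WTERMSIG decode "killed by signal N" from the wait status
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux, `signal_that_ended_child(SIGHUP)` yields 1 and `signal_that_ended_child(SIGKILL)` yields 9, which correspond to the "Expected" and "Actual" values in the failure log.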

summary: - [regressopn] acceptance tests fails in
+ [regression] acceptance tests fails in
ServerDisconnect.causes_client_to_terminate_by_default
Changed in mir:
importance: Undecided → Medium
status: New → Triaged
Daniel van Vugt (vanvugt) wrote :

Weird. It looks like it should be using the default timeout of 60 seconds, yet it's receiving SIGKILL after only half a second.
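
To make the expectation concrete: a wait-with-timeout helper would normally send SIGKILL only once the deadline has passed. The sketch below illustrates that pattern under stated assumptions. It is not the actual wait_for_shutdown_client_processes(), whose code is not shown in this thread, and the names `wait_or_kill` and `spawn_child_that_runs_for` are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <thread>
#include <unistd.h>

// Hypothetical helper: start a child that exits normally after `lifetime`.
pid_t spawn_child_that_runs_for(std::chrono::milliseconds lifetime)
{
    pid_t const pid = fork();
    assert(pid >= 0);
    if (pid == 0)
    {
        std::this_thread::sleep_for(lifetime);
        _exit(0);
    }
    return pid;
}

// Hypothetical helper: poll for the child's exit until `timeout` elapses,
// and only then fall back to SIGKILL. Returns the terminating signal,
// 0 for a normal exit, or -1 on error.
int wait_or_kill(pid_t pid, std::chrono::milliseconds timeout)
{
    auto const deadline = std::chrono::steady_clock::now() + timeout;
    int status = 0;

    for (;;)
    {
        pid_t const result = waitpid(pid, &status, WNOHANG);
        if (result == pid) break;     // child has already terminated
        if (result < 0) return -1;    // waitpid failed

        if (std::chrono::steady_clock::now() >= deadline)
        {
            kill(pid, SIGKILL);       // deadline passed: force termination
            if (waitpid(pid, &status, 0) != pid) return -1;
            break;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With a helper shaped like this, a child that dies within the timeout is never reported as signal 9, so a SIGKILL after half a second against a 60 second timeout points at something other than the timeout path.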

Alexandros Frantzis (afrantzis) wrote :

Bumped priority to high as this problem is causing a significant number of CI failures.

Changed in mir:
importance: Medium → High
Kevin DuBois (kdub) wrote :

I poked around this bug today. I think the kill signal is something valgrind is doing to dislodge a thread that it perceives as being blocked.

I cannot reproduce on my desktop running without valgrind, but can easily reproduce under valgrind running:
/usr/bin/valgrind --error-exitcode=1 --trace-children=yes --leak-check=full --show-leak-kinds=definite --errors-for-leak-kinds=definite bin/mir_acceptance_tests --gtest_filter="ServerDisconnect.causes_client_to_terminate_by_default" --gtest_repeat=10

(similar to what the CI infrastructure is doing)

Kevin DuBois (kdub) wrote :

Also, if you turn on signal tracing in valgrind, all the failures have:

--13098-- sigvgkill for lwp 13099 tid 2

in common. (SIGVGKILL seems to be something valgrind has invented)

Kevin DuBois (kdub)
Changed in mir:
status: Triaged → Confirmed
assignee: nobody → Kevin DuBois (kdub)
status: Confirmed → In Progress
Changed in mir:
assignee: Kevin DuBois (kdub) → Mir development team (mir-team)
PS Jenkins bot (ps-jenkins) wrote :

Fix committed into lp:mir/devel at revision None, scheduled for release in mir, milestone Unknown

Changed in mir:
status: In Progress → Fix Committed
Daniel van Vugt (vanvugt) wrote :

P.S. the regression came from:

------------------------------------------------------------
revno: 1881 [merge]
author: Daniel van Vugt <email address hidden>
committer: Tarmac
branch nick: development-branch
timestamp: Mon 2014-09-01 04:53:25 +0000
message:
  When the server disconnects unexpectedly, give the client SIGHUP instead of
  the present SIGTERM. The former is more appropriate because it indicates:
    "Hangup detected on controlling terminal or death of controlling process"
    [signal(7)]
  whereas SIGTERM is reserved for more polite user-requested shutdowns.
  .

  Approved by PS Jenkins bot, Alexandros Frantzis, Kevin DuBois.
------------------------------------------------------------
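
One consequence of that change for client authors: a process that wants to clean up instead of dying on an unexpected disconnect would need to handle SIGHUP, not SIGTERM. The following is a minimal sketch using plain POSIX `sigaction`. It does not use any Mir API, and the names `install_hangup_handler` and `hangup_seen` are hypothetical.

```cpp
#include <cassert>
#include <signal.h>

// Written from the signal handler, so it must be a volatile sig_atomic_t.
static volatile sig_atomic_t hangup_received = 0;

static void note_hangup(int) { hangup_received = 1; }

// Hypothetical helper: replace SIGHUP's default action (terminate the
// process) with a handler that only records that the signal arrived.
void install_hangup_handler()
{
    struct sigaction action {};
    action.sa_handler = note_hangup;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;

    int const rc = sigaction(SIGHUP, &action, nullptr);
    assert(rc == 0); // fails only for invalid arguments
    (void)rc;
}

// Hypothetical helper: poll this from the main loop to start a clean shutdown.
bool hangup_seen() { return hangup_received != 0; }
```

The handler only sets a flag because very little is safe to do inside a signal handler; the main loop is expected to check `hangup_seen()` and shut down from there.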

Changed in mir (Ubuntu):
importance: Undecided → High
status: New → Triaged
Changed in mir:
status: Fix Committed → Fix Released
Changed in mir (Ubuntu RTM):
importance: Undecided → High
status: New → Triaged
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.8.0+14.10.20141010-0ubuntu1

---------------
mir (0.8.0+14.10.20141010-0ubuntu1) utopic; urgency=medium

  [ Daniel van Vugt ]
  * New upstream release 0.8.0 (https://launchpad.net/mir/+milestone/0.8.0)
    - Enhancements:
      . Less sensitivity to ABI breaks - many headers unused by external
        projects are now hidden and not installed by -dev packages. If you
        require any headers that are missing, just ask.
      . Touchspots: --enable-touchspots to servers; visually shows touch
        locations (warning: This affects performance LP: #1373692).
      . Client performance reporting: Any Mir client can now get accurate
        performance information (frame rate, render time, buffer lag etc)
        logged to stdout. Just set env MIR_CLIENT_PERF_REPORT=log
      . Further improved touch responsiveness, with less lag and smoother
        scrolling (so long as you don't enable touchspots).
      . Slightly faster builds using precompiled headers.
      . Turn hardware overlays on by default. When in use, this halves the
        CPU usage of a Mir server. Already enabled in unity-system-compositor.
      . More scripting to detect ABI breaks.
      . Improved src/ tree consistency (renamed "src/shared" to "src/common").
      . Improved fatal signal design: Changed from SIGTERM to SIGHUP delivered
        to clients on unexpected server disconnection.
      . Improved library/package design to allow concurrent installations
        of different Mir versions without conflicts.
      . Fd reception code is now common to client and server.
    - ABI summary: Servers need rebuilding, but clients do not;
      . Mirclient ABI unchanged at 8
      . Mircommon ABI bumped to 2
      . Mirplatform ABI bumped to 3
      . Mirserver ABI bumped to 26
    - API changes between Mir 0.7 and 0.8:
      . Lots of headers removed from the public SDK! We have only hidden
        headers not known to be used by any known projects. Please let us
        know if anything is missing - https://bugs.launchpad.net/mir/+filebug
      . graphics::Platform - interface changed significantly.
      . Lots of server API changes to support touchspots.
      . File descriptors now passed as type Fd instead of int32_t.
    - Bug fixes:
      . [regression] Mir deb packages with versioned names cannot be installed
        simultaneously any more (LP: #1293944)
      . A frozen client can hang the whole server (LP: #1350207)
      . QtMir FTBFS: fatal error: mir/input/input_channel.h: No such file or
        directory (LP: #1365934)
      . [regression] platform-api fails to build against Mir 0.8 (LP: #1368354)
      . Mir FTBFS with gcc 4.9.1-14 (utopic update):
        auto_unblock_thread.h:44:46: error: no matching function for call to
        ‘std::thread::thread(<brace-enclosed initializer list>)’ (LP: #1369389)
      . [regression] Compositing is jerky and stutters during touch events
        (LP: #1372850)
      . unit test fails: AndroidInputReceiverSetup.slow_raw_input_doesnt_cause_
        frameskipping (LP: #1373826)
      . intermittent hang in TestClientInput (LP: #1338612)
      . TestClientInput.scene_obscure_mo...


Changed in mir (Ubuntu):
status: Triaged → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.8.0+14.10.20141005-0ubuntu1

---------------
mir (0.8.0+14.10.20141005-0ubuntu1) 14.09; urgency=medium


Changed in mir (Ubuntu RTM):
status: Triaged → Fix Released