Mir

valgrind integration-tests: [ FAILED ] BespokeDisplayServerTestFixture.*

Bug #1212516 reported by Daniel van Vugt
This bug affects 2 people
Affects        Status         Importance   Assigned to           Milestone
Mir            Fix Released   Undecided    Alexandros Frantzis
mir (Ubuntu)   Fix Released   Undecided    Unassigned

Bug Description

valgrind bin/integration-tests
...
[ FAILED ] 3 tests, listed below:
[ FAILED ] BespokeDisplayServerTestFixture.display_change_notification_reaches_all_clients
[ FAILED ] BespokeDisplayServerTestFixture.starting_display_server_starts_input_manager
[ FAILED ] BespokeDisplayServerTestFixture.client_drm_auth_magic_calls_platform

Summarized details:

[ RUN ] BespokeDisplayServerTestFixture.display_change_notification_reaches_all_clients
unknown file: Failure
C++ exception with description "Poll on readfd for pipe timed out" thrown in the test body.
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::runtime_error> >'
  what(): Poll on readfd for pipe timed out
[ FAILED ] BespokeDisplayServerTestFixture.display_change_notification_reaches_all_clients (62353 ms)

[ RUN ] BespokeDisplayServerTestFixture.starting_display_server_starts_input_manager
.../home/dan/bzr/mir/trunk/tests/mir_test_framework/testing_process_manager.cpp:191: Failure
Value of: result.succeeded()
  Actual: false
Expected: true
process::Result(child_terminated_by_signal, signal(15), )
[ FAILED ] BespokeDisplayServerTestFixture.starting_display_server_starts_input_manager (156 ms)

[ RUN ] BespokeDisplayServerTestFixture.client_drm_auth_magic_calls_platform
==6057== Invalid read of size 8
==6057== at 0x675DCE1: mir::protobuf::DisplayServer_Stub::drm_auth_magic(google::protobuf::RpcController*, mir::protobuf::DRMMagic const*, mir::protobuf::DRMAuthMagicStatus*, google::protobuf::Closure*) (mir_protobuf.pb.cc:6987)
==6057== by 0x5AFB517: MirConnection::drm_auth_magic(unsigned int, void (*)(int, void*), void*) (mir_connection.cpp:238)
==6057== by 0x5AF7DBC: mir_connection_drm_auth_magic (mir_client_library.cpp:348)
==6057== by 0x74C945: BespokeDisplayServerTestFixture_client_drm_auth_magic_calls_platform_Test::TestBody()::Client::exec() (test_drm_auth_magic.cpp:125)
==6057== by 0x76A944: mir_test_framework::TestingProcessManager::launch_client_process(mir_test_framework::TestingClientConfiguration&) (testing_process_manager.cpp:111)
==6057== by 0x768300: mir_test_framework::BespokeDisplayServerTestFixture::launch_client_process(mir_test_framework::TestingClientConfiguration&) (display_server_test_fixture.cpp:62)
==6057== by 0x74CAB9: BespokeDisplayServerTestFixture_client_drm_auth_magic_calls_platform_Test::TestBody() (test_drm_auth_magic.cpp:132)
==6057== by 0x7AA53F: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in /home/dan/bzr/mir/trunk/build/bin/integration-tests)
==6057== by 0x7A5927: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in /home/dan/bzr/mir/trunk/build/bin/integration-tests)
==6057== by 0x78D8BE: testing::Test::Run() (in /home/dan/bzr/mir/trunk/build/bin/integration-tests)
==6057== by 0x78E07D: testing::TestInfo::Run() (in /home/dan/bzr/mir/trunk/build/bin/integration-tests)
==6057== by 0x78E6FE: testing::TestCase::Run() (in /home/dan/bzr/mir/trunk/build/bin/integration-tests)
==6057== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==6057==
==6057==
==6057== Process terminating with default action of signal 11 (SIGSEGV)
==6057== Access not within mapped region at address 0x0
==6057== at 0x675DCE1: mir::protobuf::DisplayServer_Stub::drm_auth_magic(google::protobuf::RpcController*, mir::protobuf::DRMMagic const*, mir::protobuf::DRMAuthMagicStatus*, google::protobuf::Closure*) (mir_protobuf.pb.cc:6987)
==6057== by 0x5AFB517: MirConnection::drm_auth_magic(unsigned int, void (*)(int, void*), void*) (mir_connection.cpp:238)
==6057== by 0x5AF7DBC: mir_connection_drm_auth_magic (mir_client_library.cpp:348)
==6057== by 0x74C945: BespokeDisplayServerTestFixture_client_drm_auth_magic_calls_platform_Test::TestBody()::Client::exec() (test_drm_auth_magic.cpp:125)
==6057== by 0x76A944: mir_test_framework::TestingProcessManager::launch_client_process(mir_test_framework::TestingClientConfiguration&) (testing_process_manager.cpp:111)
==6057== by 0x768300: mir_test_framework::BespokeDisplayServerTestFixture::launch_client_process(mir_test_framework::TestingClientConfiguration&) (display_server_test_fixture.cpp:62)
==6057== by 0x74CAB9: BespokeDisplayServerTestFixture_client_drm_auth_magic_calls_platform_Test::TestBody() (test_drm_auth_magic.cpp:132)
==6057== by 0x7AA53F: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in /home/dan/bzr/mir/trunk/build/bin/integration-tests)
==6057== by 0x7A5927: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in /home/dan/bzr/mir/trunk/build/bin/integration-tests)
==6057== by 0x78D8BE: testing::Test::Run() (in /home/dan/bzr/mir/trunk/build/bin/integration-tests)
==6057== by 0x78E07D: testing::TestInfo::Run() (in /home/dan/bzr/mir/trunk/build/bin/integration-tests)
==6057== by 0x78E6FE: testing::TestCase::Run() (in /home/dan/bzr/mir/trunk/build/bin/integration-tests)
==6057== If you believe this happened as a result of a stack
==6057== overflow in your program's main thread (unlikely but
==6057== possible), you can try to increase the size of the
==6057== main thread stack using the --main-stacksize= flag.
==6057== The main thread stack size used in this run was 8388608.
==6057==
==6057== HEAP SUMMARY:
==6057== in use at exit: 56,749 bytes in 829 blocks
==6057== total heap usage: 3,081 allocs, 2,252 frees, 229,994 bytes allocated
==6057==
==6057== LEAK SUMMARY:
==6057== definitely lost: 0 bytes in 0 blocks
==6057== indirectly lost: 0 bytes in 0 blocks
==6057== possibly lost: 14,595 bytes in 257 blocks
==6057== still reachable: 42,154 bytes in 572 blocks
==6057== suppressed: 0 bytes in 0 blocks
==6057== Rerun with --leak-check=full to see details of leaked memory
==6057==
==6057== For counts of detected and suppressed errors, rerun with: -v
==6057== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 2 from 2)
/home/dan/bzr/mir/trunk/tests/mir_test_framework/testing_process_manager.cpp:127: Failure
Value of: result.succeeded()
  Actual: false
Expected: true
client terminate error=process::Result(child_terminated_by_signal, signal(11), )
/home/dan/bzr/mir/trunk/tests/integration-tests/test_drm_auth_magic.cpp:100: Failure
Actual function call count doesn't match EXPECT_CALL(*platform, drm_auth_magic(magic))...
         Expected: to be called once
           Actual: never called - unsatisfied and active
==6056==
==6056== HEAP SUMMARY:
==6056== in use at exit: 440 bytes in 10 blocks
==6056== total heap usage: 3,439 allocs, 3,429 frees, 257,864 bytes allocated
==6056==
==6056== LEAK SUMMARY:
==6056== definitely lost: 0 bytes in 0 blocks
==6056== indirectly lost: 0 bytes in 0 blocks
==6056== possibly lost: 130 bytes in 4 blocks
==6056== still reachable: 310 bytes in 6 blocks
==6056== suppressed: 0 bytes in 0 blocks
==6056== Rerun with --leak-check=full to see details of leaked memory
==6056==
==6056== For counts of detected and suppressed errors, rerun with: -v
==6056== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
/home/dan/bzr/mir/trunk/tests/mir_test_framework/testing_process_manager.cpp:191: Failure
Value of: result.succeeded()
  Actual: false
Expected: true
process::Result(child_terminated_normally, failure(1))
[ FAILED ] BespokeDisplayServerTestFixture.client_drm_auth_magic_calls_platform (1922 ms)

summary: - integration-tests: BespokeDisplayServerTestFixture failing under valgrind
         + valgrind integration-tests: [ FAILED ] BespokeDisplayServerTestFixture.*
Daniel van Vugt (vanvugt) wrote :

Very frustrating. This bug has now vanished on *both* the machines I saw it on yesterday. I have tried bisecting and can't even reproduce it with older revisions.

Perhaps system updates resolved something...

Changed in mir:
status: New → Incomplete
Daniel van Vugt (vanvugt) wrote :

I take that back. I just got one failure on a much older revision than expected. Bisecting again...

Changed in mir:
status: Incomplete → In Progress
assignee: nobody → Daniel van Vugt (vanvugt)
Daniel van Vugt (vanvugt) wrote :

Update:
I only have one machine reproducing the bug now. And it's still not reliable. The test results are (+ pass, X fail):

979 +++++
978
977 ++++++++
976 XXX+++X
975 +XX+X+X
974 +
973
972 X
971
970 X++

So it looks like the issue was resolved in r977. The only problem is that, comparing the valgrind output with the diff of r977, I cannot tell whether the bug was actually fixed or whether we just changed the test cases enough to stop triggering a side effect by accident.

Alan Griffiths (alan-griffiths) wrote :

"exception with description "Poll on readfd for pipe timed out""

That looks like the unhelpful error message from CrossProcessSync that I was looking at yesterday. I think the message could be improved, and that the default timeout for synchronization is probably too short (for some machines under valgrind).

Cf. https://code.launchpad.net/~alan-griffiths/mir/fix-1212518/+merge/180313/comments/407686
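
To illustrate the point about the timeout and the error message, here is a minimal sketch of waiting on the read end of a pipe with a configurable timeout and a more descriptive exception. It is not Mir's CrossProcessSync code; the function name wait_for_readable and the message wording are assumptions made for the example.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cerrno>
#include <chrono>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Mir): block until `read_fd` is readable
// or `timeout` elapses. The exception text says how long we waited, so a
// slow run under valgrind is easier to tell apart from a real hang.
inline void wait_for_readable(int read_fd, std::chrono::milliseconds timeout)
{
    pollfd pfd{};
    pfd.fd = read_fd;
    pfd.events = POLLIN;

    int rc;
    do
    {
        // Note: a retry after EINTR restarts the full timeout.
        rc = ::poll(&pfd, 1, static_cast<int>(timeout.count()));
    } while (rc == -1 && errno == EINTR);

    if (rc == -1)
        throw std::runtime_error(
            std::string("poll() on pipe read fd failed: ") + std::strerror(errno));

    if (rc == 0)
        throw std::runtime_error(
            "Timed out after " + std::to_string(timeout.count()) +
            " ms waiting for the other process to signal");
}
```

Passing the timeout as a parameter lets a test harness choose a larger value when it knows it is running under valgrind, rather than relying on one hard-coded default.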

Daniel van Vugt (vanvugt) wrote :

I tested the timeout yesterday and found that increasing it (in r976) did indeed solve the problem. However, I'm still concerned about how such a benign thing could lead to the side-effect failure of client_drm_auth_magic_calls_platform. Needs more investigation.

Daniel van Vugt (vanvugt) wrote :

OK, I understand this issue well enough to no longer be concerned. I will, however, propose test case changes so that worrying valgrind errors like those in the bug description cannot happen in future.

Fix committed to lp:mir at revision 977.

Changed in mir:
assignee: Daniel van Vugt (vanvugt) → nobody
assignee: nobody → Alexandros Frantzis (afrantzis)
status: In Progress → Fix Committed
PS Jenkins bot (ps-jenkins) wrote :

Fix committed into lp:mir at revision None, scheduled for release in mir, milestone 0.0.10

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.0.9+13.10.20130821.1-0ubuntu1

---------------
mir (0.0.9+13.10.20130821.1-0ubuntu1) saucy; urgency=low

  [ Daniel van Vugt ]
  * Make compositor::Scene lockable. This allows us to do multiple
    operations on a scene atomically using a nice simple:
    std::lock_guard<Scene> lock(scene); In the short term, this is
    required by the bypass branch. In the longer term it will also be
    useful if/when Scene gets an iterator.
  * Check a connection is valid (not NULL) before trying to dereference
    it. Such a NULL dereference led to worrying valgrind errors seen in
    LP: #1212516. (LP: #1212516)

  [ Alan Griffiths ]
  * graphics: hard-wire nested Mir to create an output for every host
    output.

  [ Alexandros Frantzis ]
  * Allow clients to specify the output they want to place a surface in.
    Only fullscreen placements are supported for now, but the policy is
    easy to change. This MP breaks the client API/ABI, so I bumped the
    client ABI version. I took this opportunity to rename some fields in
    MirDisplayConfiguration to improve consistency.

  [ Ubuntu daily release ]
  * Automatic snapshot from revision 993
 -- Ubuntu daily release <email address hidden> Wed, 21 Aug 2013 14:05:07 +0000
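
The changelog entry about checking that a connection is not NULL can be sketched as follows. This is only an illustration of the guard pattern, not the actual Mir change; the Connection type and the authenticate function are hypothetical stand-ins.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a client connection; not Mir's MirConnection.
struct Connection
{
    int auth_calls = 0;
    void drm_auth_magic(unsigned int /*magic*/) { ++auth_calls; }
};

// Hypothetical helper: report a failed connection explicitly instead of
// dereferencing a null pointer and crashing with SIGSEGV at address 0x0.
inline void authenticate(Connection* connection, unsigned int magic)
{
    if (connection == nullptr)
        throw std::invalid_argument(
            "connection is null; did connecting to the server fail?");

    connection->drm_auth_magic(magic);
}
```

With a guard like this, a client that failed to connect (for example because server startup timed out) produces a clear test failure rather than an "Invalid read of size 8" report from valgrind.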

Changed in mir (Ubuntu):
status: New → Fix Released
Changed in mir:
milestone: none → 0.0.10
status: Fix Committed → Fix Released
Eleni Maria Stea (hikiko) wrote :

I got a failure of BespokeDisplayServerTestFixture.* again today: https://bugs.launchpad.net/mir/+bug/1231341
