Mir

[testfail] Intermittent "Invalid read" in MirSurfaceSwapBuffersTest.swap_buffers_does_not_block_when_surface_is_not_composited

Bug #1334287 reported by Alan Griffiths on 2014-06-25
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Mir
Fix Released
Medium
Alan Griffiths
mir (Ubuntu)
Medium
Unassigned

Bug Description

Looks like a race during shutdown

https://jenkins.qa.ubuntu.com/job/mir-team-mir-development-branch-utopic-amd64-ci/511/console

[ RUN ] MirSurfaceSwapBuffersTest.swap_buffers_does_not_block_when_surface_is_not_composited
==1490== Thread 2:
==1490== Invalid read of size 4
==1490== at 0x50B8F57: std::_Function_handler<void (), (anonymous namespace)::TimeoutFrameDroppingPolicy::TimeoutFrameDroppingPolicy(std::shared_ptr<mir::time::Timer> const&, std::chrono::duration<long, std::ratio<1l, 1000l> >, std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) (timeout_frame_dropping_policy_factory.cpp:58)
==1490== by 0x507ADE8: std::function<void ()>::operator()() const (functional:2471)
==1490== by 0x50880C4: boost::asio::detail::wait_handler<(anonymous namespace)::AlarmImpl::update_timer()::{lambda(boost::system::error_code const&)#1}>::do_complete(boost::asio::detail::task_io_service*, boost::asio::detail::task_io_service_operation*, boost::system::error_code const&, unsigned long) (asio_main_loop.cpp:357)
==1490== by 0x508D604: boost::asio::detail::task_io_service::do_run_one(boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>&, boost::asio::detail::task_io_service_thread_info&, boost::system::error_code const&) (task_io_service_operation.hpp:38)
==1490== by 0x5092020: boost::asio::detail::task_io_service::run(boost::system::error_code&) (task_io_service.ipp:153)
==1490== by 0x50876FD: mir::AsioMainLoop::run() (io_service.ipp:59)
==1490== by 0x50795A5: mir::DisplayServer::run() (display_server.cpp:212)
==1490== by 0x507498E: mir::run_mir(mir::ServerConfiguration&, std::function<void (mir::DisplayServer&)>) (run_mir.cpp:90)
==1490== by 0x71D974: std::thread::_Impl<std::_Bind_simple<mir_test_framework::ServerRunner::start_mir_server()::{lambda()#1} ()> >::_M_run() (server_runner.cpp:85)
==1490== by 0x683ADDF: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20)
==1490== by 0x5CDE164: start_thread (pthread_create.c:309)
==1490== by 0x6DA14DC: clone (clone.S:111)
==1490== Address 0xd08d538 is 24 bytes inside a block of size 32 free'd
==1490== at 0x4C2C2BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1490== by 0x50B8E1A: (anonymous namespace)::TimeoutFrameDroppingPolicy::~TimeoutFrameDroppingPolicy() (timeout_frame_dropping_policy_factory.cpp:32)
==1490== by 0x50BE92A: mir::compositor::BufferQueue::~BufferQueue() (unique_ptr.h:67)
==1490== by 0x50A9AE4: std::_Sp_counted_ptr_inplace<mir::compositor::BufferQueue, std::allocator<mir::compositor::BufferQueue>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (new_allocator.h:124)
==1490== by 0x50AA299: mir::compositor::BufferStreamSurfaces::~BufferStreamSurfaces() (shared_ptr_base.h:144)
==1490== by 0x50A9A94: std::_Sp_counted_ptr_inplace<mir::compositor::BufferStreamSurfaces, std::allocator<mir::compositor::BufferStreamSurfaces>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (new_allocator.h:124)
==1490== by 0x5116FB0: mir::scene::BasicSurface::~BasicSurface() (shared_ptr_base.h:144)
==1490== by 0x5130C95: std::_Sp_counted_ptr_inplace<mir::scene::BasicSurface, std::allocator<mir::scene::BasicSurface>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (new_allocator.h:124)
==1490== by 0x5114AE0: mir::scene::ApplicationSession::destroy_surface(mir::IntWrapper<mir::frontend::detail::SessionsSurfaceIdTag, int>) (shared_ptr_base.h:144)
==1490== by 0x50C6D4B: mir::frontend::SessionMediator::release_surface(google::protobuf::RpcController*, mir::protobuf::SurfaceId const*, mir::protobuf::Void*, google::protobuf::Closure*) (session_mediator.cpp:266)
==1490== by 0x50CE323: void mir::frontend::detail::invoke<mir::frontend::detail::ProtobufMessageProcessor, mir::frontend::detail::DisplayServer, mir::protobuf::DisplayServer, mir::protobuf::SurfaceId, mir::protobuf::Void>(mir::frontend::detail::ProtobufMessageProcessor*, mir::frontend::detail::DisplayServer*, void (mir::protobuf::DisplayServer::*)(google::protobuf::RpcController*, mir::protobuf::SurfaceId const*, mir::protobuf::Void*, google::protobuf::Closure*), mir::frontend::detail::Invocation const&) (template_protobuf_message_processor.h:73)
==1490== by 0x50CBC5A: mir::frontend::detail::ProtobufMessageProcessor::dispatch(mir::frontend::detail::Invocation const&) (protobuf_message_processor.cpp:152)
==1490==
[ OK ] MirSurfaceSwapBuffersTest.swap_buffers_does_not_block_when_surface_is_not_composited (986 ms)

Related branches

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

First a surface containing a TimeoutFrameDroppingPolicy is destroyed:

==1490== Address 0xd08d538 is 24 bytes inside a block of size 32 free'd
...
==1490== by 0x50C6D4B: mir::frontend::SessionMediator::release_surface(google::protobuf::RpcController*, mir::protobuf::SurfaceId const*, mir::protobuf::Void*, google::protobuf::Closure*) (session_mediator.cpp:266)

Then an alarm fires that tries to execute the lambda created in the TimeoutFrameDroppingPolicy constructor...

BANG!

Changed in mir:
assignee: nobody → Alan Griffiths (alan-griffiths)
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

Which is weird since:

    /**
     * \note Destruction of the Alarm guarantees that the callback will not subsequently be called
     */
    virtual ~Alarm() = default;

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

The problem is that AlarmImpl doesn't meet the above guarantee.

So, IIUC AlarmImpl::update_timer()'s lambda grabs a "data" pointer fiddles about and *then releases its lock*.

Then the TimeoutFrameDroppingPolicy is deleted which destroys "this" which calls cancel() and sets data->state to "cancelled".

Regardless of ~AlarmImpl() and AlarmImpl::cancel() having been called, data->callback() is invoked and chaos ensues.

Changed in mir:
status: New → In Progress
Changed in mir:
status: In Progress → Fix Committed
Changed in mir:
importance: Undecided → Medium
milestone: none → 0.4.0
Changed in mir:
status: Fix Committed → Fix Released
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

mir (0.4.0+14.10.20140701.1-0ubuntu1) utopic; urgency=medium

Changed in mir (Ubuntu):
importance: Undecided → Medium
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers