Mir

Mir client callbacks are not thread-safe

Bug #1194384 reported by Daniel van Vugt
18
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Mir
Triaged
Medium
Unassigned
mir (Ubuntu)
Triaged
Medium
Unassigned

Bug Description

Mir clients may not be aware, but the callbacks made by libmirclient are not thread safe in any way.

An apparently simple single-threaded Mir client has 3 threads, which can execute concurrently:

1. The "main" thread, where the initial connection was made.
2. The socket thread, where the protocol is handled (mir_connected_callback, mir_surface_callback, MirSurfaceEvent)
3. The input thread, where input events are received (MirMotionEvent, MirKeyEvent)

Yes, you could receive a MirSurfaceEvent and MirMotionEvent simultaneously in different threads. Neither of which are the main thread.

This is how a Mir client works. Unfortunately the existence of these threads is not yet documented. I will update the docs. But this bug is really about finding a way to make the client API more thread safe for simple toolkits and clients that normally don't care about threads. UPDATE: Client API docs improved in r789.

To make matters worse, the client API functions are generally not thread safe either: mir_surface_* mir_connection_*. So libmirclient forces you to run in multiple threads and yet doesn't protect you at all against races and corruption that could arise from calling libmirclient from varying threads. UPDATE: Fixed in r827

Related branches

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Another alternative solution would be to implement a mir_main_loop_wait() type function, which blocks and provides a mutual exclusion guarantee with input callbacks.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

If we had a kind of mir_main_loop_wait() function, which waits for events then that would be triply useful:
  1. We could use it to do event handling without callbacks, and with thread-safety guaranteed; and
  2. We could use it as a launching point to call the callback in the correct (main) thread; and
  3. We could eliminate the ugly sleep() calls in our demo clients.

summary: - Input event callbacks are made in a different thread
+ Event callbacks are made in a different thread
summary: - Event callbacks are made in a different thread
+ Event callbacks are not single-thread-safe
Changed in mir:
assignee: nobody → Daniel van Vugt (vanvugt)
Revision history for this message
Robert Carr (robertcarr) wrote : Re: Event callbacks are not single-thread-safe

I don't think this is bug. Toolkits have their own main loops and synchronization primitives. Which thread mirclient invokes the input handler callback on doesn't really have anything to do with which thread an application or toolkit ends up processing the event on.

I think 'Simple apps' should:
1. Use platform-api not libmirclient
2. Synchronize events themselves if required.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I'm talking about existing apps written for existing toolkits. GTK for example.

It is a very important issue. If you allow multiple threads to run concurrently without any locking, operating on the same data, they will race. Hence bugs.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Though the workaround is potentially simple:
Be aware that Mir is not thread safe and do your own locking in a toolkit port. It's a bit painful for apps and toolkits that don't have or yet need threading primitives though.

Changed in mir:
status: New → In Progress
Changed in mir:
importance: High → Medium
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

OK, I agree it's not ideal to reproduce a mutex API in Mir. So I'll hold off on that.

I will certainly improve our docs to highlight the issue, so that fewer people will get a rude shock in future. And if I can think of a more elegant solution than replicating chunks of PThread in Mir then I'll propose it.

Changed in mir:
milestone: 0.0.4 → 0.0.5
Changed in mir:
milestone: 0.0.5 → 0.0.6
summary: - Event callbacks are not single-thread-safe
+ Mir client callbacks are not thread-safe
description: updated
Changed in mir:
status: In Progress → Triaged
description: updated
description: updated
tags: added: clientapi
Changed in mir:
status: Triaged → In Progress
description: updated
description: updated
Changed in mir:
status: In Progress → Triaged
Changed in mir:
milestone: 0.0.6 → 0.0.7
Changed in mir:
milestone: 0.0.7 → none
milestone: none → 0.0.8
Changed in mir:
milestone: 0.0.8 → 0.0.9
description: updated
description: updated
Changed in mir:
assignee: Daniel van Vugt (vanvugt) → nobody
Changed in mir:
milestone: 0.0.9 → 0.0.10
Changed in mir:
milestone: 0.0.10 → 0.0.11
Revision history for this message
kevin gunn (kgunn72) wrote :

raising priority altho this will need to be a topic addressed via architecture followup sprint (tbd)

Changed in mir:
importance: Medium → High
Changed in mir:
milestone: 0.0.11 → 0.0.13
Revision history for this message
Chris Gagnon (chris.gagnon) wrote :

This is blocking automation in QA can this be given a higher priority?

tags: added: qa-touch
Changed in mir:
milestone: 0.0.13 → none
Changed in mir:
assignee: nobody → Daniel van Vugt (vanvugt)
Changed in mir:
status: Triaged → In Progress
milestone: none → 0.1.2
Changed in mir:
milestone: 0.1.2 → 0.1.3
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Linked an experimental client API enhancement which at least gets us half way there... working around the problem but not changing any existing logic: lp:~vanvugt/mir/event-queue

Changed in mir:
status: In Progress → Triaged
milestone: 0.1.3 → none
tags: added: input
Revision history for this message
kevin gunn (kgunn72) wrote :

daniel & I discussed this bug specifically, going to see if we can devote some attention to this.

Changed in mir:
assignee: Daniel van Vugt (vanvugt) → nobody
kevin gunn (kgunn72)
Changed in mir:
importance: High → Critical
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

As we already know, non-trivial toolkits can live with this issue. It's only native Mir apps and simple toolkits that need some help.

I have a solid plan in mind for solving this now, but we've lived with it so long it's clearly not critical.

Changed in mir:
importance: Critical → High
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

If I recall correctly, the aforementioned "plan" involved making the dequeuing of input events something the client API requests, rather than living in its own thread. I think I found this to be reasonably easy to do.

Don't remember anything else about the "solid plan" I had in July :(

Changed in xmir:
status: New → Confirmed
importance: Undecided → Wishlist
affects: xmir → xorg-server (Ubuntu)
tags: added: xmir
no longer affects: xorg-server (Ubuntu)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Dropped severity. Having been over two years now, we're obviously used to living with this.

Changed in mir:
importance: High → Medium
Revision history for this message
Alberto Aguirre (albaguirre) wrote :
Revision history for this message
Michał Sawicz (saviq) wrote :

Syncing task from Mir.

Changed in mir (Ubuntu):
importance: Undecided → Medium
status: New → Triaged
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.