Loading libmirclient.so twice leads to a segfault in libmirprotobuf.so
| Affects | Status | Importance | Assigned to | Milestone | |
|---|---|---|---|---|---|
| | Mir |
Fix Released
|
High
|
Daniel van Vugt | |
| | 0.14 |
Fix Released
|
High
|
Daniel van Vugt | |
| | mir (Ubuntu) |
High
|
Unassigned | ||
Bug Description
Can be reproduced with: load_twice libmircommon.so.1 (or .2)
For recent versions of mir use: load_twice libmirclient.so.X (currently .8)
load_twice.c:
#include <stdio.h>
#include <dlfcn.h>
int main(int argc, char** argv)
{
void *dl;
int i;
for (i = 0; i < 2; i++)
{
dl = dlopen (argv[1], RTLD_LAZY);
printf ("%d open dl: %p\n", i, dl);
if (dl)
dlclose (dl);
}
}
Related branches
- PS Jenkins bot: Approve (continuous-integration) on 2015-06-25
- Kevin DuBois (community): Approve on 2015-06-23
- Alexandros Frantzis (community): Approve on 2015-06-23
-
Diff: 117 lines (+85/-0)4 files modifieddebian/mir-test-tools.install (+1/-0)
tests/CMakeLists.txt (+1/-0)
tests/loader-tests/CMakeLists.txt (+34/-0)
tests/loader-tests/test_reload.c (+49/-0)
- PS Jenkins bot: Approve (continuous-integration) on 2015-06-25
- Robert Carr (community): Approve on 2015-06-24
- Kevin DuBois (community): Approve on 2015-06-23
- Alexandros Frantzis (community): Approve on 2015-06-23
-
Diff: 54 lines (+13/-5)3 files modifiedsrc/protobuf/CMakeLists.txt (+1/-1)
src/protobuf/google_protobuf_guard.cpp (+9/-0)
tests/loader-tests/CMakeLists.txt (+3/-4)
- Andreas Pokorny (community): Approve on 2015-06-30
- Kevin DuBois (community): Approve on 2015-06-30
-
Diff: 169 lines (+96/-1)7 files modifieddebian/changelog (+2/-0)
debian/mir-test-tools.install (+1/-0)
src/protobuf/CMakeLists.txt (+1/-1)
src/protobuf/google_protobuf_guard.cpp (+9/-0)
tests/CMakeLists.txt (+1/-0)
tests/loader-tests/CMakeLists.txt (+33/-0)
tests/loader-tests/test_reload.c (+49/-0)
| Daniel van Vugt (vanvugt) wrote : | #1 |
| Alexandros Frantzis (afrantzis) wrote : | #2 |
> Is there any more realistic use case that will trigger it?
Yes. The problem occurs even if libmircommon is indirectly loaded/
./load_twice libclutter-
because clutter-
| Changed in mir: | |
| assignee: | nobody → Alexandros Frantzis (afrantzis) |
| importance: | Undecided → High |
| Daniel van Vugt (vanvugt) wrote : | #3 |
Workaround(?):
Intentionally leak a handle to one of the libraries up the chain (even protobuf itself). That should ensure libprotobuf never gets re-inited and no crash.
| Daniel van Vugt (vanvugt) wrote : | #4 |
I mean if we simply: dlopen(
| Changed in mir: | |
| status: | New → Confirmed |
| Stu (stu-axon) wrote : | #5 |
Hi,
This is affecting me in a python program that doesn't directly use mir .. but calls into it via pypy -> pgi > mir *crash* !!
https:/
S
| summary: |
- Loading libmircommon.so twice leads to a segfault in protobuf code + Loading libmircommon.so twice leads to a segfault in libprotobuf.so |
| Changed in mir (Ubuntu): | |
| importance: | Undecided → High |
| status: | New → Confirmed |
| Alexandros Frantzis (afrantzis) wrote : Re: Loading libmircommon.so twice leads to a segfault in libprotobuf.so | #6 |
Analysis
--------
The core of the problem is that it's not safe to reuse libprotobuf after
calling google:
reloading the library first. Note that libraries/
(e.g. libmirprotobuf) use libprotobuf code also during static initialization,
that's why we see the issue when just loading/unloading these libraries.
libprotobuf uses flags to ensure initialization code is run once (see
GoogleInitOnce() and friends), but these flags are not reset during shutdown.
So, if the library is not reloaded and we try to call libprotobuf code
after shutdown, the code thinks that everything is already initialized and
ends up accessing data structures that have been deallocated.
In the scenario described in the bug, libprotobuf is not unloaded when
unloading the mir libraries because libstdc++ is bound to (i.e., needs a symbol from)
libprotobuf. libprotobuf contains an instantiation of a string template
function [1] from the standard c++ library and exports that instantiation as a
weak symbol (as expected). libstdc++ itself also needs the same template
function instantiation, and the linker resolves the symbol by choosing the
instantiation from libprotobuf (which is acceptable behavior).
Proposed solution
-----------------
Ensure libprotobuf can be reused after shutdown by properly resetting "once"
flags on ShutdownProtobu
flags are used extensively in the code base and also in generated code, so
great care needs to be taken. It's also not known if upstream is willing to
accept such patches.
[1] _ZNSs12_
| description: | updated |
| Alexandros Frantzis (afrantzis) wrote : | #7 |
Packages (for 2.6.1, vivid) with an _experimental_ patch that allows protobuf to re-initialize properly after google:
https:/
| Alexandros Frantzis (afrantzis) wrote : | #8 |
I have started experimenting with an alternative solution which is much simpler than what I originally proposed. The solution is to ensure libprotobuf exposes only the symbols it has to (i.e. google protobuf related), to avoid binding to libstdc++, and thus working around the problem.
I have pushed package protobuf-
https:/
This is an issue with SDL2. As with SDL2 it needs to load the OpenGL drivers to check if 3d acceleration is enabled. Once it checks it unloads the library. Then the app starts which needs to load the library again annnnd seg fault.
| Daniel van Vugt (vanvugt) wrote : | #10 |
The trivial one-line workaround proposed in comments #3/#4 might work. I think I've used it before as a workaround on other Unixes too.
| Changed in mir: | |
| status: | Confirmed → Triaged |
| Changed in mir (Ubuntu): | |
| status: | Confirmed → Triaged |
| Daniel van Vugt (vanvugt) wrote : | #11 |
Rather than running load_twice on libmirclient.so.* it should be run on libmirprotobuf.so.* directly. That's the source of the problem and does crash, but only if you change RTLD_LAZY to RTLD_NOW.
Loading libmirclient.so.* is just an indirect fudge to force it to try and resolve more symbols, but not necessary if you use RTLD_NOW on libmirprotobuf.so.* directly.
| summary: |
- Loading libmircommon.so twice leads to a segfault in libprotobuf.so + Loading libmirclient.so twice leads to a segfault in libmirprotobuf.so |
| Changed in mir: | |
| assignee: | Alexandros Frantzis (afrantzis) → Daniel van Vugt (vanvugt) |
| milestone: | none → 0.14.0 |
| status: | Triaged → In Progress |
| Changed in mir: | |
| milestone: | 0.14.0 → 0.15.0 |
| PS Jenkins bot (ps-jenkins) wrote : | #12 |
Fix committed into lp:mir at revision None, scheduled for release in mir, milestone 0.15.0
| Changed in mir: | |
| status: | In Progress → Fix Committed |
| Changed in mir: | |
| status: | Fix Committed → In Progress |
| Changed in mir: | |
| status: | In Progress → Fix Committed |
| Launchpad Janitor (janitor) wrote : | #14 |
This bug was fixed in the package mir - 0.14.0+
---------------
mir (0.14.0+
[ Andreas Pokorny ]
* Fix missing ABI renaming in Mirplatform
* Bump Mirserver platform graphics to 3
* Fix mirprotobuf ABI break
[ CI Train Bot ]
* New rebuild forced.
-- CI Train Bot <email address hidden> Wed, 22 Jul 2015 18:01:49 +0000
| Changed in mir (Ubuntu): | |
| status: | Triaged → Fix Released |
| Changed in mir: | |
| status: | Fix Committed → Fix Released |

Is it a use case we need to worry about right now? We already know the singleton init logic in protobuf is troublesome. That's why it got separated out of mircommon for Mir 0.9.
Obviously with a single driver loaded we have no issues (yet). But multi-driver support in future would probably hit this. Is there any more realistic use case that will trigger it?