Intermittent crash at startup

Bug #1088724 reported by Peter Clifton
This bug affects 1 person
Affects: gEDA
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff6891700 (LWP 12865)]
__GI_getenv (name=0x36451798c0 "NGUAGE") at getenv.c:90
90 getenv.c: No such file or directory.
(gdb) bt
#0 __GI_getenv (name=0x36451798c0 "NGUAGE") at getenv.c:90
#1 0x0000003645030b8c in guess_category_value (category=5, categoryname=<optimised out>)
    at dcigettext.c:1359
#2 __dcigettext (domainname=0x326808c892 "glib20", msgid1=0x3268d1062c "Exit on close",
    msgid2=0x0, plural=0, n=0, category=5) at dcigettext.c:575
#3 0x0000003268cbbfcf in g_dbus_connection_class_init (klass=0x7ffff0004040)
    at /build/buildd/glib2.0-2.34.1/./gio/gdbusconnection.c:969
#4 g_dbus_connection_class_intern_init (klass=0x7ffff0004040)
    at /build/buildd/glib2.0-2.34.1/./gio/gdbusconnection.c:523
#5 0x000000326842e926 in type_class_init_Wm (pclass=0x6b9e20, node=0x7ffff0003db0)
    at /build/buildd/glib2.0-2.34.1/./gobject/gtype.c:2217
#6 g_type_class_ref (type=type@entry=140737219935664)
    at /build/buildd/glib2.0-2.34.1/./gobject/gtype.c:2924
#7 0x0000003268416ecd in g_object_new_valist (
    object_type=object_type@entry=140737219935664,
    first_property_name=first_property_name@entry=0x3268cf81a0 "address",
    var_args=var_args@entry=0x7ffff6890978)
    at /build/buildd/glib2.0-2.34.1/./gobject/gobject.c:1796
#8 0x0000003268417374 in g_object_new (object_type=140737219935664,
    first_property_name=first_property_name@entry=0x3268cf81a0 "address")
    at /build/buildd/glib2.0-2.34.1/./gobject/gobject.c:1550
#9 0x0000003268cba067 in get_uninitialized_connection (bus_type=<optimised out>,
    cancellable=cancellable@entry=0x0, error=error@entry=0x7ffff6890b18)
    at /build/buildd/glib2.0-2.34.1/./gio/gdbusconnection.c:6805
#10 0x0000003268cc15ab in g_bus_get_sync (bus_type=<optimised out>, cancellable=0x0,
    error=0x7ffff6890b18) at /build/buildd/glib2.0-2.34.1/./gio/gdbusconnection.c:6878
#11 0x00007ffff6898555 in ?? ()
   from /usr/lib/x86_64-linux-gnu/gio/modules/libdconfsettings.so
#12 0x00007ffff689869d in ?? ()
   from /usr/lib/x86_64-linux-gnu/gio/modules/libdconfsettings.so
#13 0x0000003268047ab5 in g_main_dispatch (context=0x72aab0)
    at /build/buildd/glib2.0-2.34.1/./glib/gmain.c:2715
#14 g_main_context_dispatch (context=context@entry=0x72aab0)
    at /build/buildd/glib2.0-2.34.1/./glib/gmain.c:3219
#15 0x0000003268047de8 in g_main_context_iterate (context=context@entry=0x72aab0,
    block=block@entry=1, dispatch=dispatch@entry=1, self=<optimised out>)
    at /build/buildd/glib2.0-2.34.1/./glib/gmain.c:3290
#16 0x0000003268047ea4 in g_main_context_iteration (context=0x72aab0, may_block=1)
    at /build/buildd/glib2.0-2.34.1/./glib/gmain.c:3351
#17 0x00007ffff68984ad in ?? ()
   from /usr/lib/x86_64-linux-gnu/gio/modules/libdconfsettings.so
#18 0x000000326806b645 in g_thread_proxy (data=0x724450)
    at /build/buildd/glib2.0-2.34.1/./glib/gthread.c:797
#19 0x0000003645807e9a in start_thread (arg=0x7ffff6891700) at pthread_create.c:308
#20 0x00000036450f3cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#21 0x0000000000000000 in ?? ()

Tags: libgeda
Peter Clifton (pcjc2) wrote :

Having searched around for similar stack-traces, I see this one:

https://bugs.launchpad.net/ubuntu/+source/unity/+bug/817691

and:
https://bugs.launchpad.net/ubuntu/+source/epiphany-browser/+bug/1016923

The former suggests that there is a race between two threads, one calling setenv and one calling getenv.
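To illustrate why such a race can crash, here is a minimal C sketch. It is not gEDA or GLib code, and `environ_moved_after_setenv` is a made-up name. It assumes a POSIX libc in which adding a new variable with setenv may replace the `environ` array, as glibc of this era can do. If the array is replaced while another thread is walking the old one inside getenv, that thread may read freed memory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

extern char **environ;

/* Hypothetical helper: sets a variable and reports whether the
 * environ array itself was moved by the call.
 * Returns 1 if it moved, 0 if it did not, -1 if setenv failed. */
int environ_moved_after_setenv(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);

    /* Remember the address as an integer so that we never compare
     * a pointer that may have been freed. */
    uintptr_t before = (uintptr_t)environ;

    if (setenv(name, value, 1) != 0)
        return -1;

    return (uintptr_t)environ != before;
}
```

Whether the array actually moves depends on the libc and on earlier calls, so the return value is informative only. The point is that nothing stops a concurrent getenv from seeing the array mid-update.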

See notes on a gio bug here:

https://bugzilla.gnome.org/show_bug.cgi?id=659326

We do call setenv in libgeda, to set the GEDADATA and GEDADATARC variables.

(in libgeda/src/s_basic.c)

We should _probably_ not be doing that... It is not as if we are launching child processes which need the env-var, is it?

Peter Clifton (pcjc2) wrote :

Ok - removing our setting of those env-vars breaks config, as our scheme file libgeda/scheme/geda/os.scm retrieves them from the environment.

We should probably expose a scheme API (from libgeda's C code) to retrieve those directories using the s_path_sys_config() and s_path_sys_data() APIs.
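A rough sketch of the idea, in plain C: an accessor that works out the directory once and hands it back without ever writing to the environment. The names `example_sys_data_dir` and `EXAMPLE_DEFAULT_DATA_DIR` are invented for illustration. This is not how s_path_sys_data() is actually implemented, and the default path is a placeholder. A Scheme binding would then wrap an accessor like this instead of calling getenv from os.scm.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder default; a real build would take this from configure. */
#define EXAMPLE_DEFAULT_DATA_DIR "/usr/share/example"

static char *cached_data_dir; /* NULL until first use */

/* Hypothetical accessor: reads GEDADATA if the user set it, otherwise
 * falls back to the compiled-in default. It only reads the environment
 * and never modifies it. The first call is assumed to happen on the
 * main thread, since the cache itself is not protected by a lock. */
const char *example_sys_data_dir(void)
{
    if (cached_data_dir == NULL) {
        const char *from_env = getenv("GEDADATA");
        const char *chosen = (from_env != NULL && from_env[0] != '\0')
                                 ? from_env
                                 : EXAMPLE_DEFAULT_DATA_DIR;

        /* Copy the string so later environment changes cannot
         * invalidate what we return. */
        cached_data_dir = strdup(chosen);
        assert(cached_data_dir != NULL); /* sketch: abort on OOM */
    }
    return cached_data_dir;
}
```

Callers get the same pointer on every call, so nothing downstream needs the variable to exist in the environment.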

Something feels very chicken / egg about this situation :)

Still, I don't think we ought to be modifying the environment. I can confirm that removing our setenv calls, and fixing up my installed copy of os.scm with the correct paths seems so far to avoid the intermittent crash at startup.

Peter TB Brett (peter-b) wrote :

Oh eww. A lot of code depends on GEDADATA and GEDADATARC being set. :-(

Changed in geda:
importance: Undecided → Critical
milestone: none → 1.8.2
tags: added: libgeda
Changed in geda:
status: New → Confirmed
Peter Clifton (pcjc2) wrote :

An alternative might be to ensure we get those variables set before we initialise glib / GIO / ...

That is probably a fragile solution though, and we'd have to stop using g_* calls before this point in time.

OR.. we could wait and hope they mitigate the issue in glib with some locking. (But I wouldn't hold my breath). We probably ought to do something, as git HEAD is somewhat crashy on latest Ubuntu due to this.

Peter TB Brett (peter-b)
Changed in geda:
assignee: nobody → Peter TB Brett (peter-b)
status: Confirmed → In Progress
Peter TB Brett (peter-b) wrote :

Attaching patch that makes os.scm *not* use getenv.

Peter TB Brett (peter-b) wrote :

Adding a patch that stops default installations of gEDA apps from using the GEDADATA or GEDADATARC environment variables directly.

Peter Clifton (pcjc2) wrote :

The patches look good (they avoid our dependence on the environment variables we set), but they will not fix the race, which is between our setenv (g_setenv) call and getenv calls within GLib's threads.

(Reading our environment variables is not the problem, setting them is).

With the above patches, should config "just work" if we skip setting GEDADATA and GEDADATARC?

Peter TB Brett (peter-b) wrote :

Well, for better or for worse, ensuring that GEDADATA and GEDADATARC are both set is effectively part of the libgeda API at this point. Which is a massive PITA, because it implies that the fix would break stable ABI / API. Grrrr.

I'm pretty cross with the GNOME devs for unilaterally introducing multithreading unavoidably to apps that were always previously single-threaded and have been written based on the assumption of a single thread. And if this is broken, who knows what other stuff they've carelessly trampled on?

What is the Right Thing for the 1.8 branch? 'master' is not currently stabilisable for a 1.10.0 release. :-(

gpleda.org commit robot (gpleda-launchpad-robot) wrote :

A commit was made which affects this bug
git master commit 143c46d43b7b25a4008c888f9c3f614d4a885b6a
http://git.geda-project.org/geda-gaf/commit/?id=143c46d43b7b25a4008c888f9c3f614d4a885b6a

commit 143c46d43b7b25a4008c888f9c3f614d4a885b6a
Author: Peter TB Brett <email address hidden>
Commit: Peter TB Brett <email address hidden>

    Avoid using getenv for GEDADATA/GEDADATARC in rc files.

    Affects-bug: lp-1088724

gpleda.org commit robot (gpleda-launchpad-robot) wrote :

A commit was made which affects this bug
git master commit a70681bbe264f9cb90ec0a7447c9587583aefc98
http://git.geda-project.org/geda-gaf/commit/?id=a70681bbe264f9cb90ec0a7447c9587583aefc98

commit a70681bbe264f9cb90ec0a7447c9587583aefc98
Author: Peter TB Brett <email address hidden>
Commit: Peter TB Brett <email address hidden>

    libgeda: Add %sys-data-dirs and %sys-config-dirs Scheme functions.

    Reduces the chances of encountering a possible GLib race condition
    involving environment variables.

    Reported-by: Peter Clifton <email address hidden>
    Affects-bug: lp-1088724

gpleda.org commit robot (gpleda-launchpad-robot) wrote :

A commit was made which affects this bug
git master commit b79c7ed32c222ae0b462f1b044f6642eb63f2188
http://git.geda-project.org/geda-gaf/commit/?id=b79c7ed32c222ae0b462f1b044f6642eb63f2188

commit b79c7ed32c222ae0b462f1b044f6642eb63f2188
Author: Peter TB Brett <email address hidden>
Commit: Peter TB Brett <email address hidden>

    libgeda: Don't set GEDADATA/GEDADATARC environment variables.

    Workaround for a GLib regression race condition that has been causing
    random crashes on recent versions of Ubuntu.

    Reported-by: Peter Clifton <email address hidden>
    Affects-bug: lp-1088724

Peter TB Brett (peter-b)
Changed in geda:
milestone: 1.8.2 → 1.9.1
status: In Progress → Fix Committed
Peter Clifton (pcjc2) wrote :

Regarding the 1.8.0 branch, I'm not sure.. perhaps we will have to break that (unwritten) part of the ABI.

OTOH..

Can we try and figure out the variables and set them early? (Before we fire up GUI stuff which may wake GIO into action).

The main thing we could do to reduce exposure would be to ensure we don't set the variables every time s_path_sys_data() or s_path_sys_config() is called. They should only need setting once.
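A minimal sketch of the "set once" idea, assuming it is called from the main thread before anything starts GLib/GIO helper threads. `example_export_paths_once` and its counter are hypothetical names, not libgeda functions. The overwrite flag is 0 on the assumption that a value already set by the user should win.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>

static int example_env_exported; /* becomes 1 after a successful export */
static int example_setenv_calls; /* counts successful setenv() calls, for illustration */

/* Hypothetical helper: exports GEDADATA and GEDADATARC at most once.
 * Later calls return immediately without touching the environment.
 * Returns 0 on success, -1 if setenv failed.
 * The guard flag is a plain int, so this is only safe while the
 * process is still single-threaded. */
int example_export_paths_once(const char *data_dir, const char *rc_dir)
{
    assert(data_dir != NULL && rc_dir != NULL);

    if (example_env_exported)
        return 0;

    /* overwrite = 0: keep any value the user already provided. */
    if (setenv("GEDADATA", data_dir, 0) != 0)
        return -1;
    example_setenv_calls++;

    if (setenv("GEDADATARC", rc_dir, 0) != 0)
        return -1;
    example_setenv_calls++;

    example_env_exported = 1;
    return 0;
}
```

This only narrows the window. A setenv call after threads exist is still a hazard, which is why the calls were eventually removed altogether.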

Peter TB Brett (peter-b)
Changed in geda:
assignee: Peter TB Brett (peter-b) → nobody
Peter TB Brett (peter-b)
Changed in geda:
status: Fix Committed → Fix Released