Comment 23 for bug 967416

Mark Harmer (drivehappy) wrote :

I thought I might take a crack at this since it looked interesting. I apologize for the length; I've added debugging details so that hopefully it's clear. I'm new to the codebase and GTK, so I've included a lot of detail so it can be double-checked.

I don't think this is a threading problem. Rather, it looks like a reentrancy issue with running a script. I've attached two full stacktraces that give more detail on what happens prior to the actual crash; I reference them below.

The first, stacktrace_remove_desktop.txt, shows the backtrace at the point where the last desktop is destroyed and SP_ACTIVE_DESKTOP is effectively set to NULL.

For brevity I'm posting a snippet:

#0 Inkscape::Application::remove_desktop
#1 sp_desktop_widget_dispose
~snip~
#14 sp_ui_close_all
~snip~
#34 g_main_loop_run
~snip~
#35 Inkscape::Extension::Implementation::Script::execute
#36 Inkscape::Extension::Implementation::Script::save
#37 Inkscape::Extension::Output::save
#38 Inkscape::Extension::save
#39 file_save
     at file.cpp:666
#40 sp_file_save_dialog
~snip~
#60 g_main_loop_run
#61 IA__gtk_main
#62 sp_main_gui
#63 __libc_start_main

At this point it becomes clear that Extension::Script::execute has launched another main loop while it waits for the child Python script to complete execution, or at least until Script::cancelProcessing is called.

I believe the script has not completed, so the input handler runs in the "context" of this new main loop, and the UI saves and deletes the desktops (including SP_ACTIVE_DESKTOP). At some point the inner g_main_loop_run returns and the stack eventually unwinds back to file_save, which continues, hits the failure at the added debug messages, and segfaults when dereferencing SP_ACTIVE_DESKTOP in file_save.
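To make the sequence above concrete, here is a minimal, self-contained sketch of the same reentrancy pattern. It is not Inkscape or GLib code: App, Desktop, run_loop, execute_script, file_save and simulate_save are all hypothetical stand-ins I made up for illustration, and the toy event queue only imitates what g_main_loop_run does. The NULL check after the nested loop is just one defensive option for showing the hazard, not necessarily the right fix for Inkscape.

```cpp
#include <deque>
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-ins; none of these are real Inkscape or GLib types.
struct Desktop { std::string name; };

struct App {
    std::unique_ptr<Desktop> active_desktop;   // plays the role of SP_ACTIVE_DESKTOP
    std::deque<std::function<void()>> events;  // pending UI events

    // Toy main loop: dispatch queued events until none are left.
    void run_loop() {
        while (!events.empty()) {
            // Take the event off the queue before running it, so a handler
            // that re-enters run_loop() never sees the same event again.
            auto ev = std::move(events.front());
            events.pop_front();
            ev();
        }
    }
};

// Toy Script::execute: "waits" for the child process by spinning a nested
// loop, which lets unrelated UI events run before control returns.
void execute_script(App &app) {
    app.run_loop();
}

// Toy file_save: returns true if the desktop survived the nested loop.
bool file_save(App &app) {
    execute_script(app);
    Desktop *desktop = app.active_desktop.get();
    if (desktop == nullptr) {
        // Unguarded code would dereference the pointer here and segfault.
        return false;
    }
    return !desktop->name.empty();
}

// Drive the outer loop with a "save" event; optionally queue a "close all"
// event that can only be dispatched by the nested loop inside file_save.
bool simulate_save(bool close_during_save) {
    App app;
    app.active_desktop.reset(new Desktop{"drawing"});
    bool desktop_survived = false;
    app.events.push_back([&app, &desktop_survived, close_during_save] {
        if (close_during_save) {
            app.events.push_back([&app] { app.active_desktop.reset(); });
        }
        desktop_survived = file_save(app);
    });
    app.run_loop();
    return desktop_survived;
}
```

Calling simulate_save(false) from a main function models a save where nothing else happens while the script runs, and simulate_save(true) models closing the last window while the nested loop is still spinning. In the second case the desktop is gone by the time file_save resumes, even though everything ran on a single thread.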

I've also added stacktrace_after_initial_save_xcf.txt, where I simply saved to an XCF file, waited a second, then grabbed a backtrace to verify that a second g_main_loop was still executing. Again, I'm not familiar enough with GTK, but I suspect this second loop is handling the window events, so it appears that the UI thread has returned from the file save and is handling events in the "real" loop, when in fact the stack is still sitting deep within the file_save call.

I think this explains why the problem only occurs when initially saving with certain formats (those that run scripts). I believe it also explains what appeared to be a threading issue, since it was not 100% reproducible: it's really more of a process race on whether the script can complete before exiting.

I'm not really sure of the fix here; again, I'm new to the codebase and I don't know the Extension/Script requirements. I'm not entirely sure why there's a need for a Glib::MainLoop in the script, as I would think that any script work would be done in the spawn_async_with_pipes call. Or maybe this intentionally/cleverly hides blocking on the UI thread for script results.