segfault in g_io_add_watch on PPC

Bug #127424 reported by Alessandro Decina
Affects                    Status         Importance   Assigned to   Milestone
ndesk-dbus-glib (Ubuntu)   Fix Released   Undecided    Unassigned

Bug Description

All mono apps that use ndesk-dbus-glib in gutsy segfault like this:

$ tomboy
[DEBUG]: NoteManager created with note path "/home/dale/.tomboy".
Trying Plugin: Backlinks.dll ... BacklinksPlugin. [DEBUG]: Done.
Trying Plugin: Bugzilla.dll ... BugzillaPlugin. [DEBUG]: Done.
Trying Plugin: Evolution.dll ... EvolutionPlugin. [DEBUG]: Done.
Trying Plugin: ExportToHTML.dll ... ExportToHTMLPlugin. [DEBUG]: Done.
Trying Plugin: FixedWidth.dll ... FixedWidthPlugin. [DEBUG]: Done.
Trying Plugin: NoteOfTheDay.dll ... NoteOfTheDayPlugin. [DEBUG]: Done.
Trying Plugin: PrintNotes.dll ... PrintPlugin. [DEBUG]: Done.
Trying Plugin: StickyNoteImport.dll ... StickyNoteImporter. [DEBUG]: Done.
[DEBUG]: StickyNoteImporter: Sticky Notes XML file does not exist or is invalid!
Stacktrace:

  at (wrapper managed-to-native) NDesk.GLib.IO.g_io_add_watch (NDesk.GLib.IOChannel,NDesk.GLib.IOCondition,NDesk.GLib.IOFunc,intptr) <0xffffffff>
  at (wrapper managed-to-native) NDesk.GLib.IO.g_io_add_watch (NDesk.GLib.IOChannel,NDesk.GLib.IOCondition,NDesk.GLib.IOFunc,intptr) <0x000a4>
  at NDesk.GLib.IO.AddWatch (NDesk.GLib.IOChannel,NDesk.GLib.IOCondition,NDesk.GLib.IOFunc) <0x00064>
  at NDesk.DBus.BusG.Init (NDesk.DBus.Connection,NDesk.GLib.IOFunc) <0x00080>
  at NDesk.DBus.BusG.Init (NDesk.DBus.Connection) <0x000cc>
  at NDesk.DBus.BusG.Init () <0x00044>
  at Tomboy.RemoteControlProxy.Register (Tomboy.NoteManager) <0x00030>
  at Tomboy.Tomboy.RegisterRemoteControl (Tomboy.NoteManager) <0x0003c>
  at Tomboy.Tomboy.Main (string[]) <0x001c0>
  at (wrapper runtime-invoke) System.Object.runtime_invoke_void_string[] (object,intptr,intptr,intptr) <0x00080>

Native stacktrace:

        mono [0x10166c74]
        mono [0x101404b0]
        [0x100350]
        [(nil)]
        /usr/lib/libglib-2.0.so.0(g_io_add_watch_full+0x5c) [0xfeed20c]
        [0x313c11a4]
        [0x313c1048]
        [0x313c0d2c]
        [0x313c0bb8]
        [0x31369738]
        [0x313694d4]
        [0x31369100]
        [0x309cf814]
        [0x309cf0dc]
        mono [0x101402d8]
        mono(mono_runtime_invoke+0x1c) [0x10056488]
        mono(mono_runtime_exec_main+0x14c) [0x1005ba5c]
        mono(mono_runtime_run_main+0x2a4) [0x1005bd50]
        mono(mono_jit_exec+0xe0) [0x1001350c]
        mono [0x10013648]
        mono(mono_main+0x1714) [0x10014ff0]
        mono [0x100120f4]
        /lib/libc.so.6 [0xfc32fc0]
        /lib/libc.so.6 [0xfc33210]

Debug info from gdb:

(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 805445344 (LWP 10519)]
[New Thread 822379696 (LWP 10532)]
[New Thread 816796848 (LWP 10521)]
[New Thread 815551664 (LWP 10520)]
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
0x0fcee1f8 in select () from /lib/libc.so.6
  4 Thread 815551664 (LWP 10520) 0x0fe6cce0 in ?? () from /lib/libpthread.so.0
  3 Thread 816796848 (LWP 10521) 0x0fe682b4 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/libpthread.so.0
  2 Thread 822379696 (LWP 10532) 0x0fce4cc0 in read () from /lib/libc.so.6
  1 Thread 805445344 (LWP 10519) 0x0fcee1f8 in select () from /lib/libc.so.6

Thread 4 (Thread 815551664 (LWP 10520)):
#0 0x0fe6cce0 in ?? () from /lib/libpthread.so.0
#1 0x0fe6cccc in ?? () from /lib/libpthread.so.0
#2 0x100cb5e8 in ?? ()
#3 0x0fe62944 in start_thread () from /lib/libpthread.so.0
#4 0x0fcf6464 in clone () from /lib/libc.so.6
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 3 (Thread 816796848 (LWP 10521)):
#0 0x0fe682b4 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x100d1d24 in ?? ()
#2 0x100d215c in ?? ()
#3 0x100d1efc in ?? ()
#4 0x100e9778 in ?? ()
#5 0x10071e50 in ?? ()
#6 0x10090f70 in ?? ()
#7 0x100e71bc in ?? ()
#8 0x1010d7a8 in ?? ()
#9 0x0fe62944 in start_thread () from /lib/libpthread.so.0
#10 0x0fcf6464 in clone () from /lib/libc.so.6
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 2 (Thread 822379696 (LWP 10532)):
#0 0x0fce4cc0 in read () from /lib/libc.so.6
#1 0x30f41fd8 in ?? ()
#2 0x30f41d60 in ?? ()
#3 0x30f41c10 in ?? ()
#4 0x30f1cab0 in ?? ()
#5 0x101402d8 in ?? ()
#6 0x10056488 in mono_runtime_invoke ()
#7 0x1005679c in mono_runtime_delegate_invoke ()
#8 0x10090fb4 in ?? ()
#9 0x100e71bc in ?? ()
#10 0x1010d7a8 in ?? ()
#11 0x0fe62944 in start_thread () from /lib/libpthread.so.0
#12 0x0fcf6464 in clone () from /lib/libc.so.6
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 1 (Thread 805445344 (LWP 10519)):
#0 0x0fcee1f8 in select () from /lib/libc.so.6
#1 0x0ff357f0 in g_spawn_sync () from /usr/lib/libglib-2.0.so.0
#2 0x0ff35be0 in g_spawn_command_line_sync () from /usr/lib/libglib-2.0.so.0
#3 0x10166d58 in ?? ()
#4 0x101404b0 in ?? ()
#5 <signal handler called>
#6 0x0feea14c in g_io_create_watch () from /usr/lib/libglib-2.0.so.0
#7 0x0feed20c in g_io_add_watch_full () from /usr/lib/libglib-2.0.so.0
#8 0x313c11a4 in ?? ()
#9 0x313c1048 in ?? ()
#10 0x313c0d2c in ?? ()
#11 0x313c0bb8 in ?? ()
#12 0x31369738 in ?? ()
#13 0x313694d4 in ?? ()
#14 0x31369100 in ?? ()
#15 0x309cf814 in ?? ()
#16 0x309cf0dc in ?? ()
#17 0x101402d8 in ?? ()
#18 0x10056488 in mono_runtime_invoke ()
#19 0x1005ba5c in mono_runtime_exec_main ()
#20 0x1005bd50 in mono_runtime_run_main ()
#21 0x1001350c in mono_jit_exec ()
#22 0x10013648 in ?? ()
#23 0x10014ff0 in mono_main ()
#24 0x100120f4 in ?? ()
#25 0x0fc32fc0 in generic_start_main () from /lib/libc.so.6
#26 0x0fc33210 in __libc_start_main () from /lib/libc.so.6
#27 0x00000000 in ?? ()
#0 0x0fcee1f8 in select () from /lib/libc.so.6

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================

Aborted (core dumped)
$

Alessandro Decina (alessandro.decina) wrote :

This actually seems to be a bug in mono on PPC; see https://bugs.launchpad.net/ubuntu/+source/mono/+bug/122496/comments/5 for details.

Attached is a patch to work around the bug. With it, tomboy and f-spot start again here.
Note that this is only a workaround; a better fix, made directly in mono, is described in the comment linked above.

Alessandro Decina (alessandro.decina) wrote :

For the brave PPC users who want to try the patch:

apt-get source ndesk-dbus-glib
cd ndesk-dbus-glib-0.3
sudo apt-get build-dep ndesk-dbus-glib
wget http://launchpadlibrarian.net/8542716/ndesk_glib_workaround.diff
patch -p0 <ndesk_glib_workaround.diff
sudo debian/rules binary
sudo dpkg -i ../libndesk-dbus-glib1.0-cil_0.3-1_all.deb

Now tomboy and f-spot should start again.

Sebastian Dröge (slomo) wrote :

Hi,
I already fixed this in Debian: essentially your patch, plus a few other places with the same issue. It's really a bug in ndesk-dbus-glib. The GLib functions require a native GIOChannel struct (which is the IOChannel.Handle variable), not a managed IOChannel struct.

The version from Debian should be synced in the next few days...

Changed in ndesk-dbus-glib:
status: New → Fix Committed
Alessandro Decina (alessandro.decina) wrote :

Hi slomo,
thanks for looking at this.
AFAIU the IOChannel struct in GLib.IO.cs has a StructLayout (LayoutKind.Sequential) attribute, so passing IOChannel by value and passing IOChannel.Handle *should* have the same result. And indeed this works on x86. This might be a regression introduced by http://bugs.ximian.com/show_bug.cgi?id=77968.

Anyway, I don't really care as long as I can use tomboy ;)

Sebastian Dröge (slomo) wrote :

That's all correct. LayoutKind.Sequential ensures that the struct is passed to native functions with that layout, i.e. with its fields not reordered. At least that's my understanding of that attribute.

But the GLib functions don't want this managed struct or its native equivalent. They want a "native" GIOChannel struct, which is stored in IOChannel.Handle. On non-PPC architectures they get it by accident, because the first field in the struct is the Handle, but on PPC it fails because arguments are passed differently.

Alessandro Decina (alessandro.decina) wrote :

What I'm saying is that IMO it's not by accident. I think they used that attribute so that the IOChannel struct can be passed where a GIOChannel pointer is needed. IOChannel is a value type and its layout is sequential, so there should be no padding or reordering of its fields. Passing an IOChannel by value should therefore pass a copy of its fields, that is, a copy of IOChannel::Handle, which should be exactly the same as passing the Handle explicitly.
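
For readers following the two positions, here is a minimal C sketch of the distinction being argued. The types FakeChannel and ChannelWrapper and the three functions are hypothetical stand-ins, not GLib or ndesk-dbus-glib code. It shows that a struct holding one pointer has the same bytes as the pointer itself, while a function taking the struct by value still has a different signature from one taking the pointer. How a struct argument travels is decided by the platform ABI; as far as I know, the 32-bit PowerPC System V ABI passes it as a hidden pointer to a copy, which would explain why the two forms only agree on some architectures.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memcmp */

/* Hypothetical stand-in for the native GIOChannel; only its address matters. */
typedef struct FakeChannel { int fd; } FakeChannel;

/* Hypothetical C mirror of the managed IOChannel value type: one pointer field. */
typedef struct ChannelWrapper { FakeChannel *handle; } ChannelWrapper;

/* In memory the wrapper looks exactly like the bare pointer
   (holds on mainstream platforms; checked at compile time). */
static_assert(sizeof(ChannelWrapper) == sizeof(FakeChannel *), "same size");
static_assert(offsetof(ChannelWrapper, handle) == 0, "handle is first field");

/* What the native side expects: a pointer argument. */
int fd_from_pointer(FakeChannel *channel) { return channel->fd; }

/* A different signature: the calling convention decides whether the struct
   travels in a register, on the stack, or behind a hidden reference. */
int fd_from_wrapper(ChannelWrapper wrapper) { return wrapper.handle->fd; }

/* Returns 1 if the wrapper's bytes equal the bytes of the pointer it holds. */
int wrapper_bytes_match_pointer(ChannelWrapper wrapper) {
    FakeChannel *raw = wrapper.handle;
    return memcmp(&wrapper, &raw, sizeof raw) == 0;
}
```

Both functions work when the caller and callee agree on the signature. The crash scenario discussed above corresponds to a caller using the by-value struct convention against a callee compiled to expect the pointer, which C itself would reject at compile time but a P/Invoke declaration cannot check.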

Wouter Stomp (wouterstomp-deactivatedaccount) wrote :

This was fixed in 0.3-2

Changed in ndesk-dbus-glib:
status: Fix Committed → Fix Released