Comment 10 for bug 1303649

Serge Hallyn (serge-hallyn) wrote :

It turns out the write() is not a part of the dbus transaction with cgmanager, but actually a part of libnih's mainloop exiting code:

(gdb) where
#0 0x00007f70ea509700 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f70eae47377 in nih_main_loop_interrupt () at main.c:630
#2 0x00007f70eaa0d64b in _dbus_transport_queue_messages (transport=transport@entry=0x2329290) at ../../dbus/dbus-transport.c:1157
#3 0x00007f70eaa0df8e in do_reading (transport=0x2329290) at ../../dbus/dbus-transport-socket.c:851
#4 0x00007f70eaa0e626 in socket_do_iteration (transport=0x2329290, flags=6, timeout_milliseconds=<optimized out>) at ../../dbus/dbus-transport-socket.c:1162
#5 0x00007f70eaa0d3ff in _dbus_transport_do_iteration (transport=0x2329290, flags=3940863025, flags@entry=6, timeout_milliseconds=1,
    timeout_milliseconds@entry=25000) at ../../dbus/dbus-transport.c:976
#6 0x00007f70ea9f79dc in _dbus_connection_do_iteration_unlocked (connection=connection@entry=0x232d580, pending=pending@entry=0x232bf90, flags=flags@entry=6,
    timeout_milliseconds=timeout_milliseconds@entry=25000) at ../../dbus/dbus-connection.c:1234
#7 0x00007f70ea9f8389 in _dbus_connection_block_pending_call (pending=0x232bf90) at ../../dbus/dbus-connection.c:2415
#8 0x00007f70eaa0772a in dbus_pending_call_block (pending=<optimized out>) at ../../dbus/dbus-pending-call.c:748
#9 0x00007f70ea9f894d in dbus_connection_send_with_reply_and_block (connection=0x232d580, message=0x23056e0, timeout_milliseconds=-1, error=0x7fff22097b70)
    at ../../dbus/dbus-connection.c:3530
#10 0x00007f70eb05b4f5 in cgmanager_create_sync () from /lib/x86_64-linux-gnu/libcgmanager.so.0
#11 0x0000000000422520 in ?? ()
#12 0x00000000004164a4 in ?? ()
#13 0x000000000041655d in ?? ()
#14 0x000000000040f3ed in ?? ()
#15 0x00000000004107dc in ?? ()
#16 0x000000000040c4d3 in ?? ()
#17 0x00007f70eaa06e26 in _dbus_object_tree_dispatch_and_unlock (tree=0x22f92a0, message=message@entry=0x22fb9d0, found_object=found_object@entry=0x7fff22098114)
    at ../../dbus/dbus-object-tree.c:862
#18 0x00007f70ea9f9a01 in dbus_connection_dispatch (connection=0x22f8870) at ../../dbus/dbus-connection.c:4672
#19 0x0000000000409467 in ?? ()
#20 0x000000000040622c in ?? ()
#21 0x00007f70ea43eec5 in __libc_start_main (main=0x4060d0, argc=1, argv=0x7fff220982d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fff220982c8) at libc-start.c:287
#22 0x000000000040637c in ?? ()

This is happening in this part of nih/main.c:

/**
 * nih_main_loop_interrupt:
 *
 * Interrupts the current (or next) main loop iteration because of an
 * event that potentially needs immediate processing, or because some
 * condition of the main loop has been changed.
 **/
void
nih_main_loop_interrupt (void)
{
        nih_main_loop_init ();

        if (interrupt_pipe[1] != -1)
                while (write (interrupt_pipe[1], "", 1) < 0)
                        ;
}

Why interrupt_pipe[0] would be closed is beyond me.

A simple fix would be to check for errno == EAGAIN in the while loop and stop retrying in that case, so that the loop cannot spin forever. However, we should figure out why this is happening, and ideally prevent it from happening at all.