Ubuntu
systemd package

Bug #1303649
Comment #10

Comment 10 for bug 1303649

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2014-04-10:

#10

It turns out the write() is not a part of the dbus transaction with cgmanager, but actually a part of libnih's mainloop exiting code:

(gdb) where
#0 0x00007f70ea509700 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f70eae47377 in nih_main_loop_interrupt () at main.c:630
#2 0x00007f70eaa0d64b in _dbus_transport_queue_messages (transport=transport@entry=0x2329290) at ../../dbus/dbus-transport.c:1157
#3 0x00007f70eaa0df8e in do_reading (transport=0x2329290) at ../../dbus/dbus-transport-socket.c:851
#4 0x00007f70eaa0e626 in socket_do_iteration (transport=0x2329290, flags=6, timeout_milliseconds=<optimized out>) at ../../dbus/dbus-transport-socket.c:1162
#5 0x00007f70eaa0d3ff in _dbus_transport_do_iteration (transport=0x2329290, flags=3940863025, flags@entry=6, timeout_milliseconds=1,
    timeout_milliseconds@entry=25000) at ../../dbus/dbus-transport.c:976
#6 0x00007f70ea9f79dc in _dbus_connection_do_iteration_unlocked (connection=connection@entry=0x232d580, pending=pending@entry=0x232bf90, flags=flags@entry=6,
    timeout_milliseconds=timeout_milliseconds@entry=25000) at ../../dbus/dbus-connection.c:1234
#7 0x00007f70ea9f8389 in _dbus_connection_block_pending_call (pending=0x232bf90) at ../../dbus/dbus-connection.c:2415
#8 0x00007f70eaa0772a in dbus_pending_call_block (pending=<optimized out>) at ../../dbus/dbus-pending-call.c:748
#9 0x00007f70ea9f894d in dbus_connection_send_with_reply_and_block (connection=0x232d580, message=0x23056e0, timeout_milliseconds=-1, error=0x7fff22097b70)
    at ../../dbus/dbus-connection.c:3530
#10 0x00007f70eb05b4f5 in cgmanager_create_sync () from /lib/x86_64-linux-gnu/libcgmanager.so.0
#11 0x0000000000422520 in ?? ()
#12 0x00000000004164a4 in ?? ()
#13 0x000000000041655d in ?? ()
#14 0x000000000040f3ed in ?? ()
#15 0x00000000004107dc in ?? ()
#16 0x000000000040c4d3 in ?? ()
#17 0x00007f70eaa06e26 in _dbus_object_tree_dispatch_and_unlock (tree=0x22f92a0, message=message@entry=0x22fb9d0, found_object=found_object@entry=0x7fff22098114)
    at ../../dbus/dbus-object-tree.c:862
#18 0x00007f70ea9f9a01 in dbus_connection_dispatch (connection=0x22f8870) at ../../dbus/dbus-connection.c:4672
#19 0x0000000000409467 in ?? ()
#20 0x000000000040622c in ?? ()
#21 0x00007f70ea43eec5 in __libc_start_main (main=0x4060d0, argc=1, argv=0x7fff220982d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fff220982c8) at libc-start.c:287
#22 0x000000000040637c in ?? ()

This is happening in this part of nih/main.c:

/**
* nih_main_loop_interrupt:
*
* Interrupts the current (or next) main loop iteration because of an
* event that potentially needs immediate processing, or because some
* condition of the main loop has been changed.
**/
void
nih_main_loop_interrupt (void)
{
nih_main_loop_init ();

        if (interrupt_pipe[1] != -1)
                while (write (interrupt_pipe[1], "", 1) < 0)
                        ;
}

Why the interrupt_pipe[0] woudl be closed is beyond me.

A simple fix would be to add a check for errno == EAGAIN in the while loop to avoid this condition. However, we should figure out why this is happening and hopefully we can prevent it happening at all.

It turns out the write() is not a part of the dbus transaction with cgmanager, but actually a part of libnih's mainloop exiting code:

(gdb) where             
#0  0x00007f70ea509700 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f70eae47377 in nih_main_loop_interrupt () at main.c:630
#2  0x00007f70eaa0d64b in _dbus_transport_queue_messages (transport=transport@entry=0x2329290) at ../../dbus/dbus-transport.c:1157
#3  0x00007f70eaa0df8e in do_reading (transport=0x2329290) at ../../dbus/dbus-transport-socket.c:851
#4  0x00007f70eaa0e626 in socket_do_iteration (transport=0x2329290, flags=6, timeout_milliseconds=<optimized out>) at ../../dbus/dbus-transport-socket.c:1162
#5  0x00007f70eaa0d3ff in _dbus_transport_do_iteration (transport=0x2329290, flags=3940863025, flags@entry=6, timeout_milliseconds=1, 
    timeout_milliseconds@entry=25000) at ../../dbus/dbus-transport.c:976
#6  0x00007f70ea9f79dc in _dbus_connection_do_iteration_unlocked (connection=connection@entry=0x232d580, pending=pending@entry=0x232bf90, flags=flags@entry=6, 
    timeout_milliseconds=timeout_milliseconds@entry=25000) at ../../dbus/dbus-connection.c:1234
#7  0x00007f70ea9f8389 in _dbus_connection_block_pending_call (pending=0x232bf90) at ../../dbus/dbus-connection.c:2415
#8  0x00007f70eaa0772a in dbus_pending_call_block (pending=<optimized out>) at ../../dbus/dbus-pending-call.c:748
#9  0x00007f70ea9f894d in dbus_connection_send_with_reply_and_block (connection=0x232d580, message=0x23056e0, timeout_milliseconds=-1, error=0x7fff22097b70)
    at ../../dbus/dbus-connection.c:3530
#10 0x00007f70eb05b4f5 in cgmanager_create_sync () from /lib/x86_64-linux-gnu/libcgmanager.so.0
#11 0x0000000000422520 in ?? ()
#12 0x00000000004164a4 in ?? ()
#13 0x000000000041655d in ?? ()
#14 0x000000000040f3ed in ?? ()
#15 0x00000000004107dc in ?? ()
#16 0x000000000040c4d3 in ?? ()
#17 0x00007f70eaa06e26 in _dbus_object_tree_dispatch_and_unlock (tree=0x22f92a0, message=message@entry=0x22fb9d0, found_object=found_object@entry=0x7fff22098114)
    at ../../dbus/dbus-object-tree.c:862
#18 0x00007f70ea9f9a01 in dbus_connection_dispatch (connection=0x22f8870) at ../../dbus/dbus-connection.c:4672
#19 0x0000000000409467 in ?? ()
#20 0x000000000040622c in ?? ()
#21 0x00007f70ea43eec5 in __libc_start_main (main=0x4060d0, argc=1, argv=0x7fff220982d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, 
    stack_end=0x7fff220982c8) at libc-start.c:287
#22 0x000000000040637c in ?? ()

This is happening in this part of nih/main.c:

/**
 * nih_main_loop_interrupt:
 *
 * Interrupts the current (or next) main loop iteration because of an
 * event that potentially needs immediate processing, or because some
 * condition of the main loop has been changed.
 **/
void
nih_main_loop_interrupt (void)
{
        nih_main_loop_init ();

if (interrupt_pipe[1] != -1)
                while (write (interrupt_pipe[1], "", 1) < 0)
                        ;
}

Why the interrupt_pipe[0] woudl be closed is beyond me.

A simple fix would be to add a check for errno == EAGAIN in the while loop to avoid this condition.  However, we should figure out why this is happening and hopefully we can prevent it happening at all.

Ubuntusystemd package

Comment 10 for bug 1303649

Ubuntu
systemd package