I ran into the same issue on Ubuntu 13.10, and I have lots of evidence for you to look at, including an Apport report and a core dump (because I didn't trust apport), and a small clue about what's happening.

Apparently, init
- corrupts the memory allocation structures
- later, some code tries to allocate memory using malloc (#19 below)
- malloc takes a lock (I'm inferring this)
- then, it notices memory allocation structures are corrupted (#16 below; note "corrupted double-linked list")
- it tries to report an error about that
- the reporting code invokes malloc again, without even releasing the lock first (#2)
- malloc tries to acquire the lock (#1-#0) and gets stuck; if the lock had been released, probably malloc would fail because of the corruption.

While I'm quite rusty, I have quite a few patches in the Linux kernel, so I hope looking at my analysis won't be a waste of time.

Backtrace:

(gdb) bt
#0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1 0x00007fe58016ef1c in _L_lock_11850 () at malloc.c:5151
#2 0x00007fe58016c4c5 in __GI___libc_malloc (bytes=36) at malloc.c:2856
#3 0x00007fe580f32c37 in local_strdup (s=0x7fe5811264a5 "/lib/x86_64-linux-gnu/libgcc_s.so.1") at dl-load.c:162
#4 _dl_map_object (loader=loader@entry=0x7fe58112f000, name=name@entry=0x7fe58026bb26 "libgcc_s.so.1", type=type@entry=2, trace_mode=trace_mode@entry=0, mode=mode@entry=-1879048191, nsid=) at dl-load.c:2510
#5 0x00007fe580f3dd54 in dl_open_worker (a=a@entry=0x7fff350930c8) at dl-open.c:239
#6 0x00007fe580f396e6 in _dl_catch_error (objname=objname@entry=0x7fff350930b8, errstring=errstring@entry=0x7fff350930c0, mallocedp=mallocedp@entry=0x7fff350930b0, operate=operate@entry=0x7fe580f3dc00 , args=args@entry=0x7fff350930c8) at dl-error.c:177
#7 0x00007fe580f3d809 in _dl_open (file=0x7fe58026bb26 "libgcc_s.so.1", mode=-2147483647, caller_dlopen=, nsid=-2, argc=2, argv=0x7fff350952b8, env=0x7fff350952d0) at dl-open.c:667
#8 0x00007fe580220da2 in do_dlopen (ptr=ptr@entry=0x7fff350932d0) at dl-libc.c:87
#9 0x00007fe580f396e6 in _dl_catch_error (objname=0x7fff350932b0, errstring=0x7fff350932c0, mallocedp=0x7fff350932a0, operate=0x7fe580220d60 , args=0x7fff350932d0) at dl-error.c:177
#10 0x00007fe580220e62 in dlerror_run (args=0x7fff350932d0, operate=0x7fe580220d60 ) at dl-libc.c:46
#11 __GI___libc_dlopen_mode (name=name@entry=0x7fe58026bb26 "libgcc_s.so.1", mode=mode@entry=-2147483647) at dl-libc.c:163
#12 0x00007fe5801fb175 in init () at ../sysdeps/x86_64/backtrace.c:52
#13 0x00007fe57fed9370 in pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:103
#14 0x00007fe5801fb294 in __GI___backtrace (array=array@entry=0x7fff35093590, size=size@entry=64) at ../sysdeps/x86_64/backtrace.c:103
#15 0x00007fe58015d515 in __libc_message (do_abort=2, fmt=fmt@entry=0x7fe580271240 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:178
#16 0x00007fe580168e1d in malloc_printerr (ptr=0x7fe582600ec0, str=0x7fe58026d1e8 "corrupted double-linked list", action=) at malloc.c:4923
#17 malloc_consolidate (av=av@entry=0x7fe5804aa740 ) at malloc.c:4094
#18 0x00007fe58016a0e1 in _int_malloc (av=0x7fe5804aa740 , bytes=8240) at malloc.c:3379
#19 0x00007fe58016c4d0 in __GI___libc_malloc (bytes=8240) at malloc.c:2859
#20 0x00007fe580d16e6d in nih_alloc (parent=parent@entry=0x7fe5826426d0, size=size@entry=8192) at alloc.c:158
#21 0x00007fe580d170a2 in nih_realloc (ptr=, parent=parent@entry=0x7fe5826426d0, size=size@entry=8192) at alloc.c:202
#22 0x00007fe580d1b82d in nih_io_buffer_resize (buffer=0x7fe5826426d0, grow=grow@entry=80) at io.c:315
#23 0x00007fe580d1cd4d in nih_io_watcher_read (watch=0x7fe582642340, io=0x7fe582642570) at io.c:1079
#24 nih_io_watcher (io=0x7fe582642570, watch=0x7fe582642340, events=NIH_IO_READ) at io.c:933
#25 0x00007fe580d1b67a in nih_io_handle_fds (readfds=readfds@entry=0x7fff35093f50, writefds=writefds@entry=0x7fff35093fd0, exceptfds=exceptfds@entry=0x7fff35094050) at io.c:237
#26 0x00007fe580d1f64c in nih_main_loop () at main.c:586
#27 0x00007fe58115816a in main (argc=, argv=) at main.c:772

Scenario: while logged in at the console, I stopped dbus by accident, so pulseaudio started respawning and failing in an infinite loop for ~15 minutes until I restarted dbus. Nothing looked wrong on the console, so I went on for a while. I noticed something was wrong only when top was taking 20% of CPU time instead of 2%, apparently because it's not happy to deal with ~10000 processes. Apart from this, the host stayed completely functional, copying ~1TB of data between two USB 2 disks; I'm writing this bug report from the machine itself.

Analysis:

Those are (mostly) pulseaudio processes hanging off init --user, which seems to be deadlocked because the malloc implementation tries to allocate memory while trying to report a "corrupted double-linked list" through malloc_printerr. Relevant entry from the gdb backtrace below:

#16 0x00007fe580168e1d in malloc_printerr (ptr=0x7fe582600ec0, str=0x7fe58026d1e8 "corrupted double-linked list", action=) at malloc.c:4923

Hence, without looking at the sources indicated, I seem to see:
- a deadlock in an error path of glibc (using malloc to tell me that malloc is broken, without releasing the lock, doesn't sound good) (if this is indeed a deadlock; in any case, the process is hanging on a lock)
- malloc is trying to say the heap is corrupted, so there's probably some Valgrinding to do.
- why are all those processes getting started without the older ones being reaped *first*?
- at least, it's clear why they die right away: I killed dbus by mistake by running "restart networking", which dbus.conf aliases in fact to "break almost everything".

(It says:

start on local-filesystems
stop on deconfiguring-networking

That is, networking itself might be fine, but dbus won't be auto-restarted.
But certainly I made a mistake by trying to fix a networking problem by blindly restarting networking (though it doesn't sound *so* unreasonable, does it?), and whether the configuration is too error-prone is an interesting but separate issue.)

Below is all the supporting evidence for my analysis, with relevant excerpts of program output. In most cases the complete output is attached (including the answers to most or all of the questions you asked: /proc/meminfo, service lists, etc.). The commands include the redirection I used, so you also have the filename.

After capturing all the info I could think of (including what you asked for), I also captured a core and then forced an apport dump with kill -ILL (apparently, apport doesn't think I might want to include stacktraces for hangs, so I need to spoil the core with a spurious signal and then explain what happened).

However, I'm not very familiar with apport; saving a core from the running program was easier than getting apport to save it, and I found no way to attach apport information to this bug (I probably can't), or to find/edit the bug report which apport submitted for me (I see that seems to be by design).

# strace -p 1275
Process 1275 attached
futex(0x7fe5804aa740, FUTEX_WAIT_PRIVATE, 2, NULL^CProcess 1275 detached

$ sudo gdb $(which init) -p 1275 2>&1|tee gdb-1275-transcript-v3.txt
[...]
(gdb) bt
#0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1 0x00007fe58016ef1c in _L_lock_11850 () at malloc.c:5151
#2 0x00007fe58016c4c5 in __GI___libc_malloc (bytes=36) at malloc.c:2856
#3 0x00007fe580f32c37 in local_strdup (s=0x7fe5811264a5 "/lib/x86_64-linux-gnu/libgcc_s.so.1") at dl-load.c:162
#4 _dl_map_object (loader=loader@entry=0x7fe58112f000, name=name@entry=0x7fe58026bb26 "libgcc_s.so.1", type=type@entry=2, trace_mode=trace_mode@entry=0, mode=mode@entry=-1879048191, nsid=) at dl-load.c:2510
#5 0x00007fe580f3dd54 in dl_open_worker (a=a@entry=0x7fff350930c8) at dl-open.c:239
#6 0x00007fe580f396e6 in _dl_catch_error (objname=objname@entry=0x7fff350930b8, errstring=errstring@entry=0x7fff350930c0, mallocedp=mallocedp@entry=0x7fff350930b0, operate=operate@entry=0x7fe580f3dc00 , args=args@entry=0x7fff350930c8) at dl-error.c:177
#7 0x00007fe580f3d809 in _dl_open (file=0x7fe58026bb26 "libgcc_s.so.1", mode=-2147483647, caller_dlopen=, nsid=-2, argc=2, argv=0x7fff350952b8, env=0x7fff350952d0) at dl-open.c:667
#8 0x00007fe580220da2 in do_dlopen (ptr=ptr@entry=0x7fff350932d0) at dl-libc.c:87
#9 0x00007fe580f396e6 in _dl_catch_error (objname=0x7fff350932b0, errstring=0x7fff350932c0, mallocedp=0x7fff350932a0, operate=0x7fe580220d60 , args=0x7fff350932d0) at dl-error.c:177
#10 0x00007fe580220e62 in dlerror_run (args=0x7fff350932d0, operate=0x7fe580220d60 ) at dl-libc.c:46
#11 __GI___libc_dlopen_mode (name=name@entry=0x7fe58026bb26 "libgcc_s.so.1", mode=mode@entry=-2147483647) at dl-libc.c:163
#12 0x00007fe5801fb175 in init () at ../sysdeps/x86_64/backtrace.c:52
#13 0x00007fe57fed9370 in pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:103
#14 0x00007fe5801fb294 in __GI___backtrace (array=array@entry=0x7fff35093590, size=size@entry=64) at ../sysdeps/x86_64/backtrace.c:103
#15 0x00007fe58015d515 in __libc_message (do_abort=2, fmt=fmt@entry=0x7fe580271240 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:178
#16 0x00007fe580168e1d in malloc_printerr (ptr=0x7fe582600ec0, str=0x7fe58026d1e8 "corrupted double-linked list", action=) at malloc.c:4923
#17 malloc_consolidate (av=av@entry=0x7fe5804aa740 ) at malloc.c:4094
#18 0x00007fe58016a0e1 in _int_malloc (av=0x7fe5804aa740 , bytes=8240) at malloc.c:3379
#19 0x00007fe58016c4d0 in __GI___libc_malloc (bytes=8240) at malloc.c:2859
#20 0x00007fe580d16e6d in nih_alloc (parent=parent@entry=0x7fe5826426d0, size=size@entry=8192) at alloc.c:158
#21 0x00007fe580d170a2 in nih_realloc (ptr=, parent=parent@entry=0x7fe5826426d0, size=size@entry=8192) at alloc.c:202
#22 0x00007fe580d1b82d in nih_io_buffer_resize (buffer=0x7fe5826426d0, grow=grow@entry=80) at io.c:315
#23 0x00007fe580d1cd4d in nih_io_watcher_read (watch=0x7fe582642340, io=0x7fe582642570) at io.c:1079
#24 nih_io_watcher (io=0x7fe582642570, watch=0x7fe582642340, events=NIH_IO_READ) at io.c:933
#25 0x00007fe580d1b67a in nih_io_handle_fds (readfds=readfds@entry=0x7fff35093f50, writefds=writefds@entry=0x7fff35093fd0, exceptfds=exceptfds@entry=0x7fff35094050) at io.c:237
#26 0x00007fe580d1f64c in nih_main_loop () at main.c:586
#27 0x00007fe58115816a in main (argc=, argv=) at main.c:772

(This is after installing all needed debugging symbols).

I attached ps -efly's output:

$ ps -efly > ps-efly.txt
$ xz ps-efly.txt

Some stats from it on the zombie pulseaudios:

$ zcat ps-ely.txt.gz |fgrep 'pulseaudio '|wc -l
10227
$ zcat ps-ely.txt.gz |fgrep 'pulseaudio '|head -2
Z 1000 301 1275 0 80 0 0 0 exit ? 00:00:00 pulseaudio
Z 1000 302 1275 0 80 0 0 0 exit ? 00:00:00 pulseaudio

Some stats on the alive inits:

$ ps -efly|grep init
S root 1 0 0 80 0 1764 6806 poll_s 00:34 ? 00:00:34 /sbin/init
S paolo 1275 1081 0 80 0 432 9084 futex_ 00:35 ? 00:00:00 init --user
S paolo 10808 8289 0 80 0 980 4160 pipe_w 22:49 pts/12 00:00:00 grep --color=auto init
S paolo 28778 28673 0 80 0 1064 9091 poll_s 12:22 ? 00:00:00 init --user
S paolo 28941 28778 0 80 0 200 1110 wait 12:22 ? 00:00:00 /bin/sh /etc/xdg/xfce4/xinitrc -- /etc/X11/xinit/xserverrc

Notice the two init --user alive (I might have created the second one by restarting the desktop session, I'm not sure).

initctl list works fine (for both user and system), but I'm betting it attaches to the second lively one, PID 28778. To verify that, let's strace the lively server:

$ sudo strace -p 28778 2>&1| tee strace-28778.txt
Process 28778 attached
select(25, [3 5 6 7 8 9 10 11 14 19 20 24], [], [7 8 9 10 14 20], NULL

Then try running initctl list - note the 28778 in the socket name:

$ strace initctl list 2>&1|tee strace-initctl-list.txt
...
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0) = 3
connect(3, {sa_family=AF_LOCAL, sun_path=@"/com/ubuntu/upstart-session/1000/28778"}, 41) = 0
...

Meanwhile, PID 28778 gets busy answering the query:

select(25, [3 5 6 7 8 9 10 11 14 19 20 24], [], [7 8 9 10 14 20], NULL) = 1 (in [7])
accept4(7, {sa_family=AF_LOCAL, NULL}, [2], SOCK_CLOEXEC) = 13
fcntl(13, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(13, F_SETFL, O_RDWR|O_NONBLOCK) = 0
getsockname(13, {sa_family=AF_LOCAL, sun_path=@"/com/ubuntu/upstart-session/1000/28778"}, [41]) = 0
[...]
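
For anyone who wants the suspected deadlock from my analysis in miniature, here is a rough sketch in Python. It is not glibc code: arena_lock, allocate and report_error are names I made up, and a plain threading.Lock only stands in for whatever non-recursive lock malloc takes (which I'm still inferring). The error path uses a non-blocking acquire so the script reports the problem instead of hanging the way PID 1275 does.

```python
import threading

# Hypothetical stand-in for the allocator lock; threading.Lock is non-recursive,
# so the thread that holds it cannot take it a second time.
arena_lock = threading.Lock()


def report_error(message):
    """Error path that needs the allocator again (like frames #15 to #2)."""
    # A blocking acquire here would wait forever, as in frame #0.
    # The non-blocking attempt only lets us observe that it would.
    if not arena_lock.acquire(blocking=False):
        return "would deadlock while reporting: " + message
    try:
        return "reported: " + message
    finally:
        arena_lock.release()


def allocate(heap_is_corrupted):
    """Allocation path that detects corruption while holding the lock."""
    with arena_lock:
        if heap_is_corrupted:
            # The lock is still held when the error path runs.
            return report_error("corrupted double-linked list")
        return "allocated"


print(allocate(heap_is_corrupted=False))
print(allocate(heap_is_corrupted=True))
```

If the lock were released before the error is reported (or were recursive), the second call would get through to the reporting code; whether that would then fail on the corrupted heap is a separate question.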