upstart-udev-bridge at 100% CPU

Bug #850935 reported by Steve Magoun
62
This bug affects 14 people
Affects Status Importance Assigned to Milestone
upstart (Ubuntu)
Confirmed
Undecided
Unassigned

Bug Description

I noticed that upstart-udev-bridge was using 100% of one of my CPU cores. I noticed this after a compiz crash; I'm not sure if the problem was happening before the crash.

Attached is an strace log:
sudo strace -f -o upstart-udev-bridge.strace -v -p 291

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: upstart 1.3-0ubuntu8
ProcVersionSignature: Ubuntu 3.0.0-11.18-generic 3.0.4
Uname: Linux 3.0.0-11-generic x86_64
NonfreeKernelModules: nvidia
ApportVersion: 1.23-0ubuntu1
Architecture: amd64
Date: Thu Sep 15 09:30:34 2011
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Beta amd64 (20100901.1)
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: upstart
UpgradeStatus: Upgraded to oneiric on 2011-09-02 (12 days ago)

Revision history for this message
Steve Magoun (smagoun) wrote :
Revision history for this message
Steve Magoun (smagoun) wrote :
Revision history for this message
Steve Magoun (smagoun) wrote :
Download full text (3.2 KiB)

Backtrace via GDB:

steve@steve-laptop:~$ sudo gdb
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>.
(gdb) attach 291
Attaching to process 291
Reading symbols from /sbin/upstart-udev-bridge...(no debugging symbols found)...done.
Reading symbols from /lib/libnih.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libnih.so.1
Reading symbols from /lib/libnih-dbus.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libnih-dbus.so.1
Reading symbols from /lib/x86_64-linux-gnu/libdbus-1.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libdbus-1.so.3
Reading symbols from /lib/x86_64-linux-gnu/libudev.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libudev.so.0
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/librt.so.1
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib/x86_64-linux-gnu/libpthread.so.0
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x00007fee20388bf0 in write () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) thread apply all bt

Thread 1 (Thread 0x7fee210c0720 (LWP 291)):
#0 0x00007fee20388bf0 in write () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fee203285f3 in _IO_file_write () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fee203284ba in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fee20328f6e in _IO_file_xsputn () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007fee203015f6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x00007fee202fc6ee in vfprintf () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x00007fee203ac0db in __fprintf_chk () from /lib/x86_64-linux-gnu/libc.so.6
#7 0x00007fee20887e34 in ?? () from /lib/x86_64-linux-gnu/libdbus-1.so.3
#8 0x00007fee2087add2 in dbus_message_iter_append_basic () from /lib/x86_64-linux-gnu/libdbus-1.so.3
#9 0x00007fee210effb0 in ?? ()
#10 0x00007fee210ed195 in ?? ()
#11 0x00007fee20cb843b in nih_io_handle_fds () from /lib/libnih.so.1
#12 0x00007fee20cbc365 in nih_main_loop () from /lib/libnih.so.1
#13 0x00007fee202d630d in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#14 0x00007fee210eccb1 in ?? ()
#15 0x00007fffcb412578 in ?? ()
#16 0x000000000000001c in ?? ()
#17 0x0000000000000002 in ?? ()
#18 0x00007fffcb413f17 in ?? ()
---Type <return> to continue, or q <return> to quit---
#19 0x00007fffcb413f2b in ?? ()
#20...

Read more...

Revision history for this message
Steve Magoun (smagoun) wrote :

After rebooting the system, upstart-udev-bridge is at 100% CPU on one of my cores - so this is reproducible, and not necessarily related to compiz. Let me know if there's anything I can do to help debug.

The backtrace implies dbus is involved, so I tried shutting down all applications and watching both the system and session busses with dbus-monitor. There was no traffic that seemed related. I used 'sudo dbus-monitor --system' and 'dbus-monitor --session', I'm happy to try other invocations too.

Revision history for this message
James Hunt (jamesodhunt) wrote :

Could you add "-s 1024" to an strace as the strings are truncated in the attached log. I think we'll find the log will then show lines like this:

    arguments to dbus_message_iter_append_basic() were incorrect, assertion "%s" failed in file %s line %d.
    This is normally a bug in some application using the D-Bus library.

Revision history for this message
Steve Magoun (smagoun) wrote :

With -s 1024:

318 write(2, "process 318: ", 13) = 13
318 write(2, "arguments to dbus_message_iter_append_basic() were incorrect, assertion \"_dbus_check_is_valid_utf8 (*string_p)\" failed in file ../../dbus/dbus-message.c line 2534.\nThis is normally a bug in some application using the D-Bus library.\n", 232) = 232
318 write(2, "process 318: ", 13) = 13
318 write(2, "arguments to dbus_message_iter_append_basic() were incorrect, assertion \"_dbus_check_is_valid_utf8 (*string_p)\" failed in file ../../dbus/dbus-message.c line 2534.\nThis is normally a bug in some application using the D-Bus library.\n", 232) = 232

Revision history for this message
James Hunt (jamesodhunt) wrote :

Could you install the ddebs for upstart + libnih so we can get a proper stack trace? dbus_message_iter_append_basic() is called in a lot of places. I suspect that either upstart-udev-bridge (or more likely libnih-dbus*) is calling NIH_MUST() or NIH_SHOULD() which will spin forever until they get the answer they expect back from a call.

Revision history for this message
Steve Magoun (smagoun) wrote :

I don't see ddebs for libnih, but I did install upstart-dbgsym and nih-dbus-tool-dbgsym. Here's the resulting stack:

Thread 1 (Thread 0x7f9d72f09720 (LWP 318)):
#0 0x00007f9d72172be2 in _IO_do_write () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f9d72171d95 in _IO_file_xsputn () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007f9d7214a5f6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007f9d721456ee in vfprintf () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007f9d721f50db in __fprintf_chk () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x00007f9d726d0e34 in ?? () from /lib/x86_64-linux-gnu/libdbus-1.so.3
#6 0x00007f9d726c3dd2 in dbus_message_iter_append_basic () from /lib/x86_64-linux-gnu/libdbus-1.so.3
#7 0x00007f9d72f38fb0 in upstart_emit_event (proxy=0x7f9d74dae2f0, name=0x7f9d74db2aa0 "power_supply-device-added",
    env=<optimized out>, wait=0, handler=0, error_handler=0x7f9d72f35d7c <emit_event_error>, data=0x0, timeout=2147483647)
    at com.ubuntu.Upstart.c:1018
#8 0x00007f9d72f36195 in udev_monitor_watcher (udev_monitor=<optimized out>, watch=<optimized out>,
    events=<optimized out>) at upstart-udev-bridge.c:276
#9 0x00007f9d72b0143b in nih_io_handle_fds () from /lib/libnih.so.1
#10 0x00007f9d72b05365 in nih_main_loop () from /lib/libnih.so.1
#11 0x00007f9d7211f30d in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#12 0x00007f9d72f35cb1 in _start ()

Revision history for this message
Steve Magoun (smagoun) wrote :
Download full text (5.1 KiB)

Additional backtrace with more detail. Each bt I take (I've done about 10) shows that we're emitting a power_supply-device-added, is the kernel spamming us with events?

(gdb) attach 318
Attaching to process 318
Reading symbols from /sbin/upstart-udev-bridge...Reading symbols from /usr/lib/debug/sbin/upstart-udev-bridge...done.
done.
Reading symbols from /lib/libnih.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libnih.so.1
Reading symbols from /lib/libnih-dbus.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libnih-dbus.so.1
Reading symbols from /lib/x86_64-linux-gnu/libdbus-1.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libdbus-1.so.3
Reading symbols from /lib/x86_64-linux-gnu/libudev.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libudev.so.0
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/librt.so.1
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib/x86_64-linux-gnu/libpthread.so.0
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x00007f9d721d1bf0 in write () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt full
#0 0x00007f9d721d1bf0 in write () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1 0x00007f9d721715f3 in _IO_file_write () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#2 0x00007f9d721714ba in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#3 0x00007f9d72171f6e in _IO_file_xsputn () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#4 0x00007f9d7214a5f6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#5 0x00007f9d721456ee in vfprintf () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#6 0x00007f9d721f50db in __fprintf_chk () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#7 0x00007f9d726d0e34 in ?? () from /lib/x86_64-linux-gnu/libdbus-1.so.3
No symbol table info available.
#8 0x00007f9d726c3dd2 in dbus_message_iter_append_basic () from /lib/x86_64-linux-gnu/libdbus-1.so.3
No symbol table info available.
#9 0x00007f9d72f38fb0 in upstart_emit_event (proxy=0x7f9d74dae2f0, name=0x7f9d74db2aa0 "power_supply-device-added",
    env=<optimized out>, wait=0, handler=0, error_handler=0x7f9d72f35d7c <emit_event_error>, data=0x0, timeout=2147483647)
    at com.ubuntu.Upstart.c:1018
        env_element = 0x7f9d74db10b0 "POWER_SUPPLY_MODEL_NAME=ASMB012ch\024\001\364", <incomplete sequence \340>
        method_call = 0x7f9d74db09e0
---Type <return> to continue, or q <return> to quit---
        iter = {dummy1 = 0x7f9d74db09e0, dummy2 = 0x1e00000, dummy3 = 1917452396, dummy4 = 32669, dummy5 = 1960513824,
          dummy6 = 32669, dum...

Read more...

Revision history for this message
James Hunt (jamesodhunt) wrote :

I don't think it's spamming us. The code that is causing the cpu to max out I think is in upstart-udev-bridge.c:

276 pending_call = NIH_SHOULD (upstart_emit_event (upstart,
277 name, env, FALSE,
278 NULL, emit_event_error, NULL,
279 NIH_DBUS_TIMEOUT_NEVER));

Here, Upstart is expecting the auto-generated call upstart_emit_event() to succeed, but it clearly isn't so is spinning forever (but not actually emitting the "power_supply-device-added" event).

btw - libnih ddebs are here: http://ddebs.ubuntu.com/pool/main/libn/libnih/

Revision history for this message
James Hunt (jamesodhunt) wrote :

I wonder if the event name itself is the problem. I've never seen a udev event which contained underscores:

  power_supply-device-added

              ^
              |
      is this normal?

If this system is a laptop and you can boot it with the battery removed on ac power, that would be an interesting test.

Revision history for this message
James Hunt (jamesodhunt) wrote :

I love proportional fonts. Don't you? :)

Revision history for this message
Steve Magoun (smagoun) wrote :

I booted the system w/ no battery and AC power. upstart-udev-bridge was not using any CPU according to 'top'. CPU usage shot to 100% after inserting the battery.

I see this in dmesg when I add the battery. I assume it's not related based on comment #10:
[ 126.010934] ACPI: EC: GPE storm detected, transactions will use polling mode

Revision history for this message
James Hunt (jamesodhunt) wrote :

I'm not now convinced the kernel isn't spamming us. Can you do the following:

1) sudo stop upstart-udev-bridge
2) Plonk the attached bug-850935.conf into /etc/init/.
3) Reboot with the battery attached.

bug-850935.conf will create a file /tmp/bug-850935.log with 1 line for every "power_supply-device-added" event emitted, but will stop logging if it sees >10 such events. If that file contains >1 entry, the bug is with udev or the kernel.

There is a secondary bug which is triggering the dbus assert failure, but I'm not clear yet if its in Upstart/nih or in dbus itself.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in upstart (Ubuntu):
status: New → Confirmed
Revision history for this message
Steve Magoun (smagoun) wrote :

Since I can't reply to the private bug of which this is a duplicate...

I tried the test in comment 14, there is no tmp file:

steve@steve-laptop:~$ ls /etc/init/bug-850935.conf
/etc/init/bug-850935.conf
steve@steve-laptop:~$ ls /tmp/bu*
ls: cannot access /tmp/bu*: No such file or directory

Revision history for this message
James Hunt (jamesodhunt) wrote :

Hi Steve - please could you attach your /var/log/udev to bug 829980 so I can check my fix will work in all cases. Thanks.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.