dbus-daemon eats 100% cpu and is not responsive

Bug #680444 reported by Alexey Khoroshilov
This bug affects 33 people
Affects: dbus (Ubuntu); Status: Fix Released; Importance: Undecided; Assigned to: Unassigned

Bug Description

Binary package hint: dbus

strace shows the following:

sudo strace -p 1464
Process 1464 attached - interrupt to quit
clock_gettime(CLOCK_MONOTONIC, {349519, 662737789}) = 0
poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=15, events=POLLIN}, {fd=17, events=POLLIN}, {fd=16, events=POLLIN}, {fd=18, events=POLLIN}, {fd=19, events=POLLIN}, {fd=20, events=POLLIN}, {fd=14, events=POLLIN}, {fd=22, events=POLLIN}, {fd=21, events=POLLIN}, {fd=25, events=POLLIN}, {fd=24, events=POLLIN}, {fd=26, events=POLLIN}, {fd=27, events=POLLIN}, {fd=29, events=POLLIN}, {fd=23, events=POLLIN}, {fd=31, events=POLLIN}, {fd=32, events=POLLIN}, {fd=33, events=POLLIN}, {fd=34, events=POLLIN}, {fd=35, events=POLLIN}, {fd=36, events=POLLIN}, {fd=39, events=POLLIN}, {fd=38, events=POLLIN}, {fd=37, events=POLLIN}, {fd=42, events=POLLIN}, ...], 1018, -1) = 1 ([{fd=3, revents=POLLIN}])
clock_gettime(CLOCK_MONOTONIC, {349519, 664920388}) = 0
accept4(3, 0xbfcd9aac, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files)
fcntl64(-1, F_GETFD) = -1 EBADF (Bad file descriptor)
clock_gettime(CLOCK_MONOTONIC, {349519, 665068808}) = 0
poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=15, events=POLLIN}, {fd=17, events=POLLIN}, {fd=16, events=POLLIN}, {fd=18, events=POLLIN}, {fd=19, events=POLLIN}, {fd=20, events=POLLIN}, {fd=14, events=POLLIN}, {fd=22, events=POLLIN}, {fd=21, events=POLLIN}, {fd=25, events=POLLIN}, {fd=24, events=POLLIN}, {fd=26, events=POLLIN}, {fd=27, events=POLLIN}, {fd=29, events=POLLIN}, {fd=23, events=POLLIN}, {fd=31, events=POLLIN}, {fd=32, events=POLLIN}, {fd=33, events=POLLIN}, {fd=34, events=POLLIN}, {fd=35, events=POLLIN}, {fd=36, events=POLLIN}, {fd=39, events=POLLIN}, {fd=38, events=POLLIN}, {fd=37, events=POLLIN}, {fd=42, events=POLLIN}, ...], 1018, -1) = 1 ([{fd=3, revents=POLLIN}])
clock_gettime(CLOCK_MONOTONIC, {349519, 667222733}) = 0
accept4(3, 0xbfcd9aac, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files)
fcntl64(-1, F_GETFD) = -1 EBADF (Bad file descriptor)
.................

System info:
Kubuntu 10.10
By the way, ubuntu-bug does not work either, because it tries to connect to dbus.

cd311 (cd311) wrote :

Hi,

I had the same problem here.
I tried the solution mentioned in http://ubuntuforums.org/showthread.php?t=1101473 (comment #4),
and right after nmbd and smbd stopped, the dbus-daemon process went from 100% to less than 1%.

So apparently this really has something to do with Samba.

cd311

cd311 (cd311) wrote :

Hi,

Removing Samba was no real solution.
After starting kmail, my user's dbus-daemon process went back to 100%.
kmail works sometimes, but adding attachments to mails in particular takes quite some time,
and trying to open a PDF in Okular fails.

cd311

Arnau Alcázar (arnau) wrote :

I have the same problem, but with Ubuntu 10.10.

Michael Weishaar (michael-weishaar) wrote :

I've seen dbus-daemon run amok on several occasions.
I have Nepomuk disabled, but I've seen these running anyway:
/usr/bin/nepomukserver
/usr/bin/akonadi_nepomuk_contact_feeder --identifier akonadi_nepomuk_contact_feeder

On one occasion, killing these brought the CPU back down to normal.
On another occasion, it didn't.

Logging out and back in always corrects it.

I am running Kubuntu Maverick 64-bit on a quad-core CPU

Martin Pitt (pitti) wrote :

This should be fixed by 1.4.6 according to upstream NEWS.

dbus (1.4.6-1ubuntu1) natty; urgency=low

  * Merge with Debian unstable. Remaining Ubuntu changes:
    - Install into / rather than /usr.
    - debian/dbus.postinst: Use upstart call instead of invoking the init.d
      script for checking if we are already running.
    - Add debian/dbus.upstart.
    - 0001-activation-allow-for-more-variation-than-just-system.patch,
      0002-bus-change-systemd-activation-to-activation-systemd.patch,
      0003-upstart-add-upstart-as-a-possible-activation-type.patch,
      0004-upstart-add-UpstartJob-to-service-desktop-files.patch,
      0005-activation-implement-upstart-activation.patch: Patches from Scott
      James Remnant to implement Upstart service activation. Not upstream.
    - 20_system_conf_limit.patch: Increase max_match_rules_per_connection for
      the system bus to 5000 (LP #454093)
    - 81-session.conf-timeout.patch: Raise the service startup timeout from 25
      to 60 seconds. It may be too short on the live CD with slow machines.
  * debian/rules: Fix creation of /usr/lib/libdbus-1.so symlink.
  * debian/libdbus-1-dev.install: Put back .pc.

 -- Martin Pitt <email address hidden> Thu, 24 Feb 2011 16:41:36 +0100

Changed in dbus (Ubuntu):
status: New → Fix Released
S4qFBxkFFg (s4qfbxkffg) wrote :

On my system, killing the process for the kpackagekit icon (that appears in the taskbar when updates are available) will return dbus-daemon to its usual CPU usage.

This is still happening in natty as of a day or two ago.

Mitch Oliver (mitch-oliver) wrote :

I'm seeing this with greater frequency in Natty. I can usually return dbus-daemon to normal by killing a number of processes. I have, as yet, been unable to divine a pattern as to which process to kill. Sometimes it is gvim, other times it is firefox. This suggests (to me) that there is some sort of infinite loop within dbus itself.

Mitch Oliver (mitch-oliver) wrote :

I've run dbus-monitor trying to catch this defect in the act. There does not appear to be any process spamming dbus.

Adam Porter (alphapapa) wrote : Re: [Bug 680444] Re: dbus-daemon eats 100% cpu and is not responsive

Could this bug be related to dbus-daemon using over 300 MB of memory,
or would that be a separate bug?

On Mon, May 23, 2011 at 14:15, Mitch Oliver <email address hidden> wrote:
> I've run dbus-monitor trying to catch this defect in the act.  There
> does not appear to be any process spamming dbus.

Ariel Faigon (ariel.faigon) wrote :

I used to see this in maverick occasionally.
After upgrading to natty, it happens much more frequently, every 2 days or so.

AHagge (nospam-ah) wrote :

This bug is still present as of 06/11 and 1.4.6-1ubuntu6. I concur with Ariel that the runaway dbus is happening every few days (more than once a week).

S4qFBxkFFg (s4qfbxkffg) wrote :

As this bug is reproducible (at least once every few days for me), is there anything we can do to get some useful information for the developers so they can fix it?

If someone tells me what logs to post (or what I need to do to create useful logs), I should be able to help.

John Charonko (john-charonko) wrote :

I agree with LiSrt. This bug is very reproducible, though unfortunately not on cue. I have reported it as well in https://bugs.launchpad.net/ubuntu/+source/dbus/+bug/713157, though I just marked that as a duplicate of this one. If someone would let me know what to harvest, I'd be willing to help. The bug is VERY annoying, basically every couple of days I have to shut down all my open work, log out, log back in, then re-open everything I was working on. Quite a time killer.

Since it's basically impossible (or at least I can't figure out how) to view bugs marked as "Fix Released" unless you already know about it, I'm changing it back to "Incomplete." It definitely isn't fixed! Would a status of "New" be more appropriate? I'm not very familiar with this process.

John Charonko (john-charonko) wrote :

Okay, clearly I don't have permission to edit the status since it isn't my bug. Could someone update this to a status that isn't hidden? I'm unlinking my bug from this one until this changes (since duplicates are hidden on search too), but will point to it in the comments. Hopefully this will help get the issue some attention.

A couple of other thoughts:

I run a Windows installation in VirtualBox. Does anyone else?

Because of VirtualBox and because Firefox and Thunderbird are hogs, I frequently run very high memory usage. For example, right now I'm at 90% of 8GB. The swap usage stays low, though. After a restart, the numbers are lower, but creep back up with time. I'm not sure it's connected, but I don't think I've ever seen the dbus problems when the memory usage hasn't already climbed above 90% of system memory first. Anyone else notice this?

One of the first symptoms of impending doom (even before dbus locks up) seems to be the auto-update notification for KPackageKit disappearing from the system tray. Is this related, a separate bug, or just coincidence?

I have also tried dbus-monitor (see Mitch Oliver's comments here and in #737170), but I don't really know what to look for. Before the problem starts, dbus-monitor happily prints its messages, but once the problem starts no more messages appear.

Daniel Franke (franke-daniel) wrote :

Same for me. Every few days, dbus-daemon goes wild. Often it helps to close Kontact, at least for a while. Eventually only rebooting (or possibly logging out of X and back in) helps. The problem first appeared after the upgrade to natty.

Bug #779849 seems to be a dupe of this one. Most notable information there: the corresponding KDE report (http://bugs.kde.org/show_bug.cgi?id=261180) has been fixed.

Zach Welch (zwelch) wrote :

Contrary to the current status, this bug does not appear to have been fixed in Natty. It is readily reproducible, though apparently non-deterministic. Exiting some applications that use dbus will help temporarily, but the symptoms invariably reappear when given enough uptime.

Based on the strace log provided in the initial report (and my own investigation), it looks like dbus leaks file descriptors in some circumstances, leading to a livelock situation when calling accept.

I am running dbus-1.4.6ubuntu6, so the conclusion that this was fixed in 1.4.6ubuntu was incorrect. This bug needs to be reopened and reexamined.
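
To make the suspected livelock concrete, here is a minimal Python sketch of the loop visible in the original strace (poll reports the listening socket readable, accept fails with EMFILE, repeat). It is an illustration only and is not dbus code: the socket name, the lowered limit of 256, and the function name are all made up for the demo, and it assumes Linux. A failed accept leaves the pending connection in the queue, so the listener stays readable and the loop never sleeps.

```python
import errno
import os
import resource
import select
import socket


def demo_emfile_busy_loop(rounds=3, poll_timeout_ms=1000):
    """Return one (listener_readable, accept_errno) pair per loop iteration."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Temporarily lower the soft limit so the descriptor table fills up quickly.
    low = 256 if (soft == resource.RLIM_INFINITY or soft > 256) else soft
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    hoard = []    # descriptors "leaked" on purpose
    results = []
    try:
        # Abstract-namespace address (Linux only); the name is arbitrary.
        listener.bind(b"\0emfile-demo-" + str(os.getpid()).encode())
        listener.listen(8)
        listener.setblocking(False)
        client.connect(listener.getsockname())  # leaves one connection pending
        poller = select.poll()
        poller.register(listener.fileno(), select.POLLIN)
        resource.setrlimit(resource.RLIMIT_NOFILE, (low, hard))
        try:
            while True:  # simulate the leak until the process runs out
                hoard.append(os.dup(listener.fileno()))
        except OSError as exc:
            if exc.errno != errno.EMFILE:
                raise
        for _ in range(rounds):
            readable = bool(poller.poll(poll_timeout_ms))
            try:
                conn, _addr = listener.accept()
                conn.close()
                err = 0
            except OSError as exc:
                err = exc.errno  # the connection stays queued after a failure
            results.append((readable, err))
    finally:
        for fd in hoard:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
        client.close()
        listener.close()
    return results


if __name__ == "__main__":
    for readable, err in demo_emfile_busy_loop():
        print(readable, errno.errorcode.get(err, err))
```

Every iteration should report the listener as readable and accept failing with EMFILE, which matches the poll/accept4 pattern in the trace from the initial report.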

NJC (njc) wrote :

This has started for me too recently on Ubuntu 10.10 - dbus-daemon is currently using 44% CPU, and system monitor is using 46%. It is July 26th 2011 and it appears to have started in the last week. Hmmm, the only thing different is that I've been booting with a music CD in the tray.

Yep, taking the CD out of tray solved it! Putting the CD back in the tray did not spike CPU usage - but it happens when booting with a CD present.

jenna (giantpune) wrote :

I have this issue as well, on Kubuntu 11.04 64bit. On my dual-core, dbus-daemon is using 100% of one core. If I open the system monitor and kill the process for the kpackage update checker, the CPU usage drops down to normal levels. However, I am still unable to mount any samba shares until I log out and back in.

Dominik Holland (dominik-holland) wrote :

I have the same issue... After the PC had been running for 5 days it was really slow. After killing skype it got better, so my load is only 1.4. But dbus-daemon still uses 100% and is unresponsive.

Peter (jynyl) wrote :

I have similar symptoms ... running Kubuntu 11.04 64 bit on quad core.
Sometimes dbus maxes out a CPU when kmail is checking for mail, but other times it doesn't seem to be associated with kmail.

Sometimes, killing the dbus process that is using 100% of CPU causes screen to blank and keyboard to be unresponsive. In those cases, I could ssh in from another box, and shut down the machine.

Usually when dbus has maxed out a CPU, it is difficult to bring up a konsole terminal, and it times out instead of opening.

Justin (justinr) wrote :

I've been having this problem recently as well. Sometimes I come in to work to find dbus using 100% and the system tray area looking as if it was stopped halfway through a draw (corrupted-looking icons). As per above, I killed /usr/lib/kde4/libexec/kpackagekitsmarticon and everything went back to normal.
I haven't suffered the same problem with remounting samba shares that Jenna (#18) mentioned above.

Tom Cloyd (tomcloyd) wrote :

This problem occurs for me on KB 11.04 (fully updated) quite reliably whenever I insert a flash-drive. I run three different boxes, one an HP Mini 110 netbook, one a Dell desktop, and the last an HP workstation. I keep the OS setup as identical as possible across the boxes. The problem occurs on all 3 machines. I can only get out of it by rebooting.

I currently have no application programs running on my HP workstation, and the CPUs are running at 85%, with dbus-daemon leading the stampede. This is what typically happens. This is simply nuts. How many months does this have to keep messing us up before someone bothers to notice and fix the problem?

mr.goose (editor-garfnet) wrote :

@ Tom Cloyd. I have a lot of sympathy for your comment. I have been trying for months to find a fix for this. I have several 64bit KDE machines and they all do the same after a day or two - dbus snaffles large amounts of CPU and the machine grinds to a halt.

Seems the KDE team has actually found a fix for this problem, way back in June 2011. However, it would appear that the fix has not been incorporated into Ubuntu yet...
http://bugs.kde.org/show_bug.cgi?id=261180

Anyone any idea why this should be? Perhaps it's in a PPA somewhere?

Best wishes, G

Alexia Death (alexiade) wrote :

Apparently there is a very easy way to make it stop:

killall kpackagekitsmarticon

That's the real culprit. It's apparently leaving connections to dbus open. If you kill it, everything will return to normal.

mr.goose (editor-garfnet) wrote :

@Alexia, that's a very useful tip. I have tried it on two of my Kubuntu Natty boxes and it worked both times.

One just needs to remember to leave a terminal open because sometimes dbus locks up so badly that one cannot actually launch a new application.

As a matter of interest, where did you find out about kpackagekitsmarticon?

Best wishes, G.

mr.goose (editor-garfnet) wrote :

Further to my last, seems kpackagekitsmarticon is not always the culprit. I'm finding that killing kpackagekitsmarticon does not always solve the problem. It would appear that a number of other applications, including Firefox and Thunderbird can lead to dbus going bonkers too.

Ho hum. Back to the proverbial drawing board...

Best wishes, G.

Arthur Kevin McGrath (akm-u) wrote :

I believe I just experienced this bug on a SuSE Linux 11.4 system (kernel-desktop-2.6.37.6-0.7.1.x86_64). It is quite intermittent.

I believe you should check the number of open file handles ( lsof | wc -l ), and also compare the count you get as root with the count you get as an unprivileged user. On my system, that difference swells to 20,000 to 25,000 when the DBus process is consuming 100% CPU, versus a normal 3,000 to 5,000 when it is not.

My gut tells me that something attached to the DBus, probably running with root privileges, cannot get a file handle for whatever reason and goes into a mode where it keeps retrying to open another one. It may turn out that the actual problem is a component common to several packages. I cannot prove this, because I did not see it happen.

I would watch both the absolute number of file handles when DBus goes pathological (60,000 to 75,000 on my system) and the difference. I would also watch the number of running processes ( ps -ef | wc -l ). For whatever it is worth.
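
A system-wide lsof count does not show how close a single process is to its own limit, so a per-process check may be a useful complement. The sketch below is only an illustration (the function names are made up, and it assumes Linux with /proc mounted): it counts the entries in /proc/PID/fd and reads the "Max open files" row from /proc/PID/limits. The poll call in the original strace was watching 1018 descriptors, which is close to the common default soft limit of 1024.

```python
import os


def _limit_value(field):
    """Convert one column of /proc/PID/limits to an int (None for 'unlimited')."""
    return None if field == "unlimited" else int(field)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/PID/limits."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: 'Max' 'open' 'files' <soft> <hard> 'files'
            fields = line.split()
            return _limit_value(fields[3]), _limit_value(fields[4])
    return None, None


def fd_usage(pid):
    """Return (open_fd_count, soft_limit) for a process we may inspect."""
    count = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as f:
        soft, _hard = parse_max_open_files(f.read())
    return count, soft


if __name__ == "__main__":
    used, soft = fd_usage(os.getpid())
    print(f"{used} descriptors open, soft limit {soft}")
```

Pass the PID of the busy dbus-daemon (as root if it belongs to another user) instead of os.getpid() to see how much headroom it has left.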

John Charonko (john-charonko) wrote :

@mr. goose

Information about kpackagekitsmarticon is discussed in http://bugs.kde.org/show_bug.cgi?id=261180, the linked upstream bug.

Incidentally, if killing kpackagekitsmarticon doesn't help, you might want to try this command, suggested in the KDE bug report. It will report how many files are open on each process.

ls /proc/| awk '{if($1+0==0) print " "; else system("echo `ls /proc/"$1+0"/fd|wc -l` "$1" `cat /proc/"$1+0"/cmdline` ")}' | sort -nr | head

When I run it, for example, I get (after a bunch of permission denied messages):

598 1505 //bin/dbus-daemon--fork--print-pid5--print-address7--session
518 15435 /usr/lib/kde4/libexec/kpackagekitsmarticon
107 6094 /usr/lib/thunderbird-6.0.2/thunderbird-bin
89 1688 akonadiserver
78 11625 /usr/lib/firefox-6.0.2/firefox-bin
72 1690 /usr/sbin/mysqld--defaults-file=/home/xxxxxxx/.local/share/akonadi//mysql.conf--datadir=/home/jcharonk/.local/share/akonadi/db_data/--socket=/home/jcharonk/.local/share/akonadi/socket-xxxxxxx/mysql.socket
68 1647 kdeinit4: ksmserver [kdeinit]
65 1731 /usr/bin/krunner
51 1686 /usr/bin/akonadi_control
47 1728 /usr/bin/nepomukservicestubnepomukstorage

As you can see, dbus-daemon is already up to 598 files, and most of these belong to kpackagekitsmarticon (518). I'm well on my way to needing to kill it again. See if you can spot another program that's behaving badly this way - maybe there's more than one culprit and the fix suggested for KStatusNotifierItem in kdelibs needs to be adapted to that program as well.

Or, if you don't have too many open files, it could be that you've found a different problem entirely with dbus.
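
For anyone who finds the awk one-liner hard to read, here is a rough Python equivalent. It is a sketch written for this thread, not the command from the KDE report: the function names and the proc_root parameter are my own. It skips processes it is not allowed to inspect instead of printing permission errors, and it separates command-line arguments with spaces.

```python
import os


def fd_counts(proc_root="/proc"):
    """Return (open_fd_count, pid, command_line) tuples, busiest first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            count = len(os.listdir(os.path.join(proc_root, entry, "fd")))
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # permission denied, or the process has exited
        # Arguments in cmdline are NUL-separated; show them space-separated.
        cmd = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        rows.append((count, int(entry), cmd))
    rows.sort(reverse=True)
    return rows


def top_fd_holders(limit=10, proc_root="/proc"):
    """Return the `limit` processes holding the most descriptors."""
    return fd_counts(proc_root)[:limit]


if __name__ == "__main__":
    for count, pid, cmd in top_fd_holders():
        print(count, pid, cmd)
```

Run it as root to include processes owned by other users.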

kaloyan (kpenev) wrote :

dbus-daemon goes crazy for me as well (Ubuntu 10.11) and I don't see anything unreasonable in the number of open files:

115 2844 //bin/dbus-daemon--fork--print-pid5--print-address7--session
96 3566 gnome-terminal
71 3165 /usr/lib/firefox-6.0/firefox-bin
36 2901 /usr/bin/pulseaudio--start--log-target=syslog
32 2906 compiz
28 2792 gnome-session--session=ubuntu
27 2915 nautilus-n
27 2867 /usr/lib/gnome-settings-daemon/gnome-settings-daemon
24 3101 update-notifier
24 3084 /usr/bin/python/usr/lib/ubuntuone-client/ubuntuone-syncdaemon

It is making my computer near unusable (or at least extremely annoying to use)

kcstrom (kcstrom) wrote :

I'm seeing something very similar as well. Every time I try to open/close/delete a message in Thunderbird, dbus-daemon takes up ~100% of one CPU and Thunderbird takes up about 100% of my other CPU for about 10-15 seconds.

The number of open files, in total and per process, looks similar to when it isn't happening.

Restarting didn't seem to make any difference.

I see a massive number of the following with strace -p:

recvmsg(24, {msg_name(0)=NULL, msg_iov(1)=[{"a11y.atspi.Component\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 2048
recvmsg(24, {msg_name(0)=NULL, msg_iov(1)=[{"e\0\0\0\30\0\0\0org.a11y.atspi.Component"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 2048
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=9, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}, {fd=19, events=POLLIN}, {fd=20, events=POLLIN}, {fd=21, events=POLLIN}, {fd=14, events=POLLIN}, {fd=22, events=POLLIN}, {fd=23, events=POLLIN}, {fd=25, events=POLLIN}, {fd=26, events=POLLIN}, {fd=27, events=POLLIN}, {fd=24, events=POLLIN}, {fd=28, events=POLLIN}, {fd=29, events=POLLIN}], 26, 108294) = 1 ([{fd=24, revents=POLLIN}])
recvmsg(24, {msg_name(0)=NULL, msg_iov(1)=[{"52867\0\0\0\0\0\0\0=\0\0\0\31\0\0\0org.a11y.ats"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 2048
recvmsg(24, {msg_name(0)=NULL, msg_iov(1)=[{"7\0\0\0!\0\0\0/org/a11y/atspi/accessib"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 2048
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=9, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}, {fd=19, events=POLLIN}, {fd=20, events=POLLIN}, {fd=21, events=POLLIN}, {fd=14, events=POLLIN}, {fd=22, events=POLLIN}, {fd=23, events=POLLIN}, {fd=25, events=POLLIN}, {fd=26, events=POLLIN}, {fd=27, events=POLLIN}, {fd=24, events=POLLIN}, {fd=28, events=POLLIN}, {fd=29, events=POLLIN}], 26, 108293) = 1 ([{fd=24, revents=POLLIN}])
recvmsg(24, {msg_name(0)=NULL, msg_iov(1)=[{"\37\0\0\0/org/a11y/atspi/accessible/r"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 2048
recvmsg(24, {msg_name(0)=NULL, msg_iov(1)=[{"/a11y/atspi/accessible/672136\0\0\0"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 2048
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=9, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}, {fd=19, events=POLLIN}, {fd=20, events=POLLIN}, {fd=21, events=POLLIN}, {fd=14, events=POLLIN}...


John Charonko (john-charonko) wrote :

kaloyan and kcstrom -

It's possible you are experiencing a different dbus bug from those with the too many open files problem. For me, the update to Kubuntu 11.10 resolved the problem. It appears the kdelibs patch that was included in that release fixed it, as expected. I'm just frustrated it took so long, but at least it's done now.

What release are you on?

Have you tried to look at dbus-monitor? Maybe you can see if a particular program is behaving badly.

In the case of the kdelibs problem I was having, dbus-monitor was completely silent, like it was locked up or unable to respond, probably because it was waiting to open a new file.

kcstrom (kcstrom) wrote :

I'm running oneiric (Ubuntu 11.10). dbus-monitor is silent. I'm pretty sure that for me it has something to do with Thunderbird, since it happens on cue if I open/close/delete an email. Running strace -p on thunderbird, I see

stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3543, ...}) = 0

about 45000 times and then an increasing number of sendmsg calls that all look similar to

sendmsg(38, {msg_name(0)=NULL, msg_iov(2)=[{"l\4\1\1\24\1\0\0002\304\23\0x\0\0\0\1\1o\0\25\0\0\0/org/a11"..., 136}, {"\5\0\0\0:1.48\0\0\0!\0\0\0/org/a11y/atspi/"..., 276}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 412

That goes on until line 356014 and then some polls, reads, a ton of

madvise(0x7fa56ccbc000, 4096, MADV_DONTNEED) = 0

and I had killed strace at that point while thunderbird and dbus were still going nuts. Perhaps this should be filed with thunderbird.
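
Counting calls by hand in a trace of several hundred thousand lines is tedious, so a small summarizer can help when comparing traces. The following is a sketch under assumptions: it expects plain `strace -p` output saved to a file (the file name in the usage section is hypothetical), its regular expression only covers the simple one-line form of each call, and lines that strace truncated or split are skipped.

```python
import re
from collections import Counter

SYSCALL_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+)?"                 # optional "[pid N] " prefix
    r"(?P<name>\w+)\(.*\)"                    # syscall name and its arguments
    r"\s+=\s+(?P<ret>0x[0-9a-f]+|-?\d+|\?)"   # return value
    r"(?:\s+(?P<errno>E[A-Z0-9]+))?"          # symbolic errno on failure
)


def summarize_strace(lines):
    """Return (calls per syscall, failures per (syscall, errno))."""
    calls = Counter()
    errors = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line.strip())
        if match is None:
            continue  # truncated, unfinished, or non-syscall line
        calls[match.group("name")] += 1
        if match.group("errno"):
            errors[(match.group("name"), match.group("errno"))] += 1
    return calls, errors


if __name__ == "__main__":
    # "thunderbird.strace" is a placeholder for wherever the trace was saved.
    with open("thunderbird.strace") as trace:
        calls, errors = summarize_strace(trace)
    for name, count in calls.most_common(10):
        print(count, name)
    for (name, err), count in errors.most_common(10):
        print(count, name, err)
```

On the trace from the initial report this kind of summary would surface the repeated accept4 failures with EMFILE; on a trace like the one above it would show stat and sendmsg dominating instead.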

kcstrom (kcstrom) wrote :

Oh, and the files they are sending to and receiving from:

lrwx------ 1 ssss ssss 64 2011-12-19 16:52 /proc/1618/fd/24 -> socket:[113043]
ross@ross-HP-630-Notebook-PC:~$ ls -l /proc/10138/fd/38
lrwx------ 1 ssss ssss 64 2011-12-20 14:16 /proc/10138/fd/38 -> socket:[113042]
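
To collect the same information for every descriptor of a process at once, something like the sketch below could be used. It is an illustration with made-up function names, assuming Linux /proc: it reads each symlink under /proc/PID/fd and keeps the ones of the form socket:[inode]. Note that it only lists which socket inodes a process holds; the two ends of a connection have different inode numbers, so pairing them up needs additional tooling.

```python
import os
import re

SOCKET_LINK_RE = re.compile(r"^socket:\[(\d+)\]$")


def socket_inode(link_target):
    """Return the inode from a 'socket:[N]' link target, or None."""
    match = SOCKET_LINK_RE.match(link_target)
    return int(match.group(1)) if match else None


def socket_fds(pid, proc_root="/proc"):
    """Map descriptor number -> socket inode for one process."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    found = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor closed while we were looking
        inode = socket_inode(target)
        if inode is not None:
            found[int(name)] = inode
    return found


if __name__ == "__main__":
    for fd, inode in sorted(socket_fds(os.getpid()).items()):
        print(fd, inode)
```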

kcstrom (kcstrom) wrote :

I filed my bug over at mozilla: https://bugzilla.mozilla.org/show_bug.cgi?id=712471 since it seemed to be provoked by closing an email tab in Thunderbird.

BlessedGeek (blessedgeek) wrote :

Problem solved. Or is it? At first, I was baffled about what was consuming 60% of one of my four cpus, as shown by system monitor. System monitor itself was 10%.

So I increased the number of columns to display in system monitor. The d-bus daemon usage for that cpu went up to 80%. I tried to exit from system monitor. It won't exit. I had to kill system monitor with system monitor.

Being an ancient unix admin/programmer, I began to doubt the programming finesse of new-fangled programmers. I used top.

To my surprise, top showed my top process using only 2%. Something is wrong with top. Or with system monitor.
I fired up system monitor next to top. Immediately, d-bus daemon started to show up on top at 64% and system monitor at 2nd position of 8%. I killed system monitor and top again showed top process using only 2%.

Congratulations, system monitor and authors -- you managed to redefine effective programming -- from an old school programmer.

To be fair, dbus-monitor was showing gnome-system-monitor spewing globs of messages in text that appears to be scripts instantiating arrays. Using scripts and passing scripts around like a localised script-rpc in a windowing system? Just because you can use the desktop bus to pass scripts around, or because passing script-based data structures around appears elegant, does not mean you should do it. The desktop is not a REST service, for goodness' sake.

I have a strong impression that gnome-system-monitor is not the only guilty party in this erroneous attitude in the new fangled hi falutin script structure passing orgy party in the linux realm. In fact, the whole gnome desktop is passing these inefficient data structures around. The menu system, the email system, the Mozilla browser, etc. Holy Moly, please stop this json/xml madness. This is a desktop UI where endian issues are not encountered by using C structs, because they are compiled to the local machine. And even if you wish to use the d-bus across two architecturally incompatible machines, you should provide it as a different option, so as not to cripple the performance of 99.6% of all normal users. And even so, you should use some ancient-but-established ONC XDR technologies.

Perhaps, my opinions are too strong, and may be a tad technologically off-centred - so, I beg your pardon, but y'all should know what I mean here.

C'mon people, get back to using C and binary data structures. Before linux sinks beneath microsoft. There are no cross-architecture compatibility issues to overcome within a desktop bus. Or have programmers forgotten how to use a C struct? Maybe y'all should consider using Qt/C++.

Eloy Paris (peloy-chapus) wrote :

I have run into this issue, also when using Thunderbird. I thought it was the 14 to 15 Thunderbird upgrade in 12.04 since the problem started basically overnight, and coincidentally after the 14 to 15 Thunderbird upgrade. I was pulling my hair out since Thunderbird was basically unusable -- it would run normally but pause for a few seconds after opening a new message, deleting messages, or just clicking to select messages. It would also pause while writing a new message, so whatever you typed didn't show up right away but would instead be displayed all of a sudden a few seconds later. Totally annoying.

Another symptom was Unity dimming the Thunderbird window at some points because it thought that TB was unresponsive (hung), though it would undim it a few seconds later -- I would notice this behavior when TB initially ran and it was establishing the connection to my IMAP server. I would also notice this same behavior at some other times, like opening a new message, though these were rare.

I downgraded to TB 14 and it didn't help. I ran TB without extensions; in safe mode, etc. to no avail. It was completely puzzling because I had not been having this problem before.

Running strace on the Thunderbird executable revealed one of the same symptoms mentioned by people above, i.e. lots of these:

sendmsg(38, {msg_name(0)=NULL, msg_iov(2)=[{"l\4\1\1\24\1\0\0002\304\23\0x\0\0\0\1\1o\0\25\0\0\0/org/a11"..., 136}, {"\5\0\0\0:1.48\0\0\0!\0\0\0/org/a11y/atspi/"..., 276}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 412

Then, somehow I found this bug and confirmed in top that each time I had a pause, both thunderbird and dbus-daemon would be hogging the CPU. I did notice that the dbus-daemon that was hogging the CPU was not the system DBus daemon, but one associated with at-spi2 (accessibility).

Fortunately, I don't need accessibility features, so I looked for ways to prevent all the at-spi2 stuff from auto-starting upon logging into my Unity session to see if that fixed the problem I was having with Thunderbird. Turns out that in Ubuntu 12.04 you can disable at-spi2 by using gsettings like this:

shell$ gsettings set org.gnome.desktop.interface toolkit-accessibility false

After logging out and logging back in, the at-spi2 processes were not running. I then fired up Thunderbird and the problem was completely gone. In fact, Thunderbird feels now a lot snappier -- everything (opening messages, deleting messages, replying, etc.) seems faster. But, the most important thing is that the annoying pauses/temporary freezes seem to be completely gone.

What is very puzzling is that this problem happened overnight for me. It could have been an unrelated 12.04 package update, but if that is the case, I don't know what package it was. In any case, if you end up reading this bug, and your symptoms are similar to the ones described here, then try disabling at-spi2.
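
Since several dbus-daemon instances usually run at once, it helps to know which one is the busy one before changing any settings. The sketch below is only an illustration: the function names are made up, it assumes Linux /proc, and the rule that an argument containing "at-spi" (for example a --config-file path) marks the accessibility bus is my assumption, so check the real command line with ps on your own system.

```python
import os


def classify_dbus_daemon(cmdline_bytes):
    """Classify a raw /proc/PID/cmdline; return None if not a dbus-daemon."""
    args = [a.decode(errors="replace") for a in cmdline_bytes.split(b"\0") if a]
    if not args or os.path.basename(args[0]) != "dbus-daemon":
        return None
    if "--system" in args:
        return "system"
    # Assumption: the accessibility bus is started with an at-spi config path.
    if any("at-spi" in arg for arg in args):
        return "at-spi"
    if "--session" in args:
        return "session"
    return "other"


def list_dbus_daemons(proc_root="/proc"):
    """Return sorted (pid, kind) pairs for every visible dbus-daemon."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                kind = classify_dbus_daemon(f.read())
        except OSError:
            continue  # process exited or is not readable
        if kind is not None:
            found.append((int(entry), kind))
    return sorted(found)


if __name__ == "__main__":
    for pid, kind in list_dbus_daemons():
        print(pid, kind)
```

Compare the PIDs it prints with the PID that top shows at high CPU usage to see which bus is affected.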

Eloy Paris (peloy-chapus) wrote :

Sorry, I just realized that this bug is marked as fixed, and that the comments in the bug are probably referring to different issues. The issue of high CPU in dbus-daemon associated with at-spi2 and Thunderbird probably deserves a separate bug for tracking. If someone files that bug please provide here the bug number for easy reference.
