Runaway leak in composer

Bug #51662 reported by jmspeex
Affects             Status   Importance  Assigned to          Milestone
evolution (Ubuntu)  Invalid  Low         Ubuntu Desktop Bugs

Bug Description

Binary package hint: evolution

Every once in a while, the evolution composer goes out of control and leaks memory. This usually makes my system completely unresponsive within about 10-30 seconds. Evolution's memory footprint grows at a very fast rate until it gets killed by the OOM killer or I hit the power button (whichever comes first, depending on the amount of swap space I have). I have noticed that this tends to happen just after I paste some text into the window, but I have not found a way to reliably reproduce the problem. This is quite a severe problem because the leak happens so fast that I don't have time to react and kill evolution before the system becomes unresponsive. On any machine with more than 1 GB of swap, the result is a complete lockup -- there's no way out other than to reboot the machine, because the swapping goes on forever.

Sebastien Bacher (seb128) wrote :

Thanks for your bug. What version of Ubuntu do you use? Do you use some special plugin like the exchange connector? Could you run evolution under valgrind (apt-get install valgrind; valgrind evolution)? That makes it really slow, but it should detect any memory-management issue on your installation.

Changed in evolution:
assignee: nobody → desktop-bugs
status: Unconfirmed → Needs Info
jmspeex (jean-marc-valin) wrote :

I'm running Dapper, but I remember seeing the problem on Breezy (and IIRC earlier) as well. AFAIK, I don't use any special plugin and I don't use the exchange connector. I tried running valgrind, but not only is it incredibly slow, it also reports hundreds of errors and leaks. The summary is:
==17420== ERROR SUMMARY: 10757 errors from 583 contexts (suppressed: 211 from 1)
==17420== malloc/free: in use at exit: 19,248,789 bytes in 292,773 blocks.
==17420== malloc/free: 4,834,032 allocs, 4,541,259 frees, 280,995,543 bytes allocated.
...
==17420== LEAK SUMMARY:
==17420== definitely lost: 26,492 bytes in 541 blocks.
==17420== indirectly lost: 67,047 bytes in 1,625 blocks.
==17420== possibly lost: 375,893 bytes in 557 blocks.
==17420== still reachable: 18,779,357 bytes in 290,050 blocks.
==17420== suppressed: 0 bytes in 0 blocks.

As I mentioned earlier, the problem only happens once every few weeks (but crashes my machine 50% of the time), so it's hard to reproduce. Unfortunately, it's just not feasible to run evolution under valgrind all the time.
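The leak summary quoted above is the part of a valgrind log that is easiest to compare between runs. As a minimal sketch (nothing like this was posted in the thread), the Python below pulls the five categories out of a saved log. The name `parse_leak_summary` is made up for illustration, and the line format is assumed from the excerpt above.

```python
import re

# Matches lines such as:
#   ==17420== definitely lost: 26,492 bytes in 541 blocks.
# The format is assumed from the log excerpt quoted in this thread.
LEAK_LINE = re.compile(
    r"^==\d+==\s+(definitely lost|indirectly lost|possibly lost|"
    r"still reachable|suppressed):\s+([\d,]+) bytes in ([\d,]+) blocks"
)


def parse_leak_summary(log_text):
    """Return {category: (bytes, blocks)} for each leak-summary line found."""
    summary = {}
    for line in log_text.splitlines():
        match = LEAK_LINE.match(line.strip())
        if match:
            category, nbytes, nblocks = match.groups()
            summary[category] = (
                int(nbytes.replace(",", "")),
                int(nblocks.replace(",", "")),
            )
    return summary


# The summary lines from the comment above, used as sample input.
SAMPLE = """\
==17420== LEAK SUMMARY:
==17420== definitely lost: 26,492 bytes in 541 blocks.
==17420== indirectly lost: 67,047 bytes in 1,625 blocks.
==17420== possibly lost: 375,893 bytes in 557 blocks.
==17420== still reachable: 18,779,357 bytes in 290,050 blocks.
==17420== suppressed: 0 bytes in 0 blocks.
"""

for category, (nbytes, nblocks) in parse_leak_summary(SAMPLE).items():
    print(f"{category}: {nbytes} bytes in {nblocks} blocks")
```

The totals in this particular log are tens of kilobytes lost and about 19 MB still in use at exit, far below a leak that fills swap, which suggests this valgrind run did not hit the runaway case.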

Sebastien Bacher (seb128) wrote :

Without a debug log it's not easy to figure out what happens on your configuration; I don't see such an issue myself...

Leaving the bug as Needs Info for now. Feel free to reopen it if you can get debug information on your issue.

jmspeex (jean-marc-valin) wrote :

Well, any suggestion as to how to obtain debug info? It's not like using valgrind+evolution as my main mail client for weeks is a realistic option. Not to mention that even if that were possible, valgrind would have reported so many errors already that the real one would be buried in megabytes of data. I'm open to other methods, though.

Sebastien Bacher (seb128) wrote :

That's not an easy thing to figure out if it doesn't happen often. Upstream might know better, since they work on the code and are used to debugging it. Maybe you could consider opening an upstream bug rather than a distro one.

jmspeex (jean-marc-valin) wrote :

How do I do that? Sorry, not familiar with that bug tracking system.

Sebastien Bacher (seb128) wrote :

Run bug-buddy, for example, or use the http://bugzilla.gnome.org website.

Sitsofe Wheeler (sitsofe) wrote :

I have just had this happen to me in Edgy. I happen to run my system in "leave a core dump behind" mode, and in desperation I killed evolution with a SIGSEGV, leaving a 956M core dump behind and around 200M of free space on the disk. Apport then started up and proceeded to chew up the rest of the free disk space before I manually killed it off...

Sitsofe Wheeler (sitsofe) wrote : gdb backtrace

Program terminated with signal 11, Segmentation fault.
#0 0xb7595d72 in html_text_op_copy_helper ()
   from /usr/lib/libgtkhtml-3.8.so.15
(gdb) thread apply all bt

Thread 8 (process 4318):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7222321 in select () from /lib/tls/i686/cmov/libc.so.6
#2 0xb72ddf1c in e_msgport_wait () from /usr/lib/libedataserver-1.2.so.7
#3 0xb72de5e9 in e_msgport_reply () from /usr/lib/libedataserver-1.2.so.7
#4 0xb70c1504 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#5 0xb722951e in clone () from /lib/tls/i686/cmov/libc.so.6

Thread 7 (process 4319):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7222321 in select () from /lib/tls/i686/cmov/libc.so.6
#2 0xb72ddf1c in e_msgport_wait () from /usr/lib/libedataserver-1.2.so.7
#3 0xb72de5e9 in e_msgport_reply () from /usr/lib/libedataserver-1.2.so.7
#4 0xb70c1504 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#5 0xb722951e in clone () from /lib/tls/i686/cmov/libc.so.6

Thread 6 (process 4320):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7222321 in select () from /lib/tls/i686/cmov/libc.so.6
#2 0xb72ddf1c in e_msgport_wait () from /usr/lib/libedataserver-1.2.so.7
#3 0xb72de5e9 in e_msgport_reply () from /usr/lib/libedataserver-1.2.so.7
#4 0xb70c1504 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#5 0xb722951e in clone () from /lib/tls/i686/cmov/libc.so.6

Thread 5 (process 4330):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7222321 in select () from /lib/tls/i686/cmov/libc.so.6
#2 0xb72ddf1c in e_msgport_wait () from /usr/lib/libedataserver-1.2.so.7
#3 0xb72de5e9 in e_msgport_reply () from /usr/lib/libedataserver-1.2.so.7
#4 0xb70c1504 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#5 0xb722951e in clone () from /lib/tls/i686/cmov/libc.so.6

Thread 4 (process 4331):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7222321 in select () from /lib/tls/i686/cmov/libc.so.6
#2 0xb72ddf1c in e_msgport_wait () from /usr/lib/libedataserver-1.2.so.7
#3 0xb72de5e9 in e_msgport_reply () from /usr/lib/libedataserver-1.2.so.7
#4 0xb70c1504 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#5 0xb722951e in clone () from /lib/tls/i686/cmov/libc.so.6

Thread 3 (process 4861):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb721f803 in poll () from /lib/tls/i686/cmov/libc.so.6
#2 0xb75e8813 in g_main_context_check () from /usr/lib/libglib-2.0.so.0
#3 0xb75e8b89 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#4 0xb7300e62 in e_book_get_type () from /usr/lib/libebook-1.2.so.9
#5 0xb760338f in g_thread_create_full () from /usr/lib/libglib-2.0.so.0
#6 0xb70c1504 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#7 0xb722951e in clone () from /lib/tls/i686/cmov/libc.so.6

Thread 2 (process 4864):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb721f803 in poll () from /lib/tls/i686/cmov/libc.so.6
#2 0xb75e8813 in g_main_context_check () from /usr/lib/libglib-2.0.so.0
#3 0xb75e8b89 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#4 0xb7cf57e0 in link_set_io_thread () from /usr/lib/libORBi...


Sitsofe Wheeler (sitsofe) wrote :

I've removed the base64 core from the part of the crash that apport had started writing and attached it here.

Sitsofe Wheeler (sitsofe) wrote :

Setting back to confirmed.

Changed in evolution:
status: Needs Info → Confirmed
Sitsofe Wheeler (sitsofe) wrote :

I am going to have to delete the core soon because it is taking up so much space, so if there's anything more that needs to be done with it, let me know. This problem occurred while I was typing a mail in the evolution mail composer. I had used the right mouse button to correct some mistakes and was deleting and retyping some text when the telltale sign of slowness had me racing off to virtual terminal one so I could kill evolution off before it ate all the memory.

Sitsofe Wheeler (sitsofe) wrote :

Installed a few debug symbols and redid the backtrace.

Sebastien Bacher (seb128) wrote :

Sitsofe, your crasher looks like http://bugzilla.gnome.org/show_bug.cgi?id=347558. It looks like a different issue from the Launchpad bug you are commenting on, though. Why did you decide to comment on this bug instead of opening a new one?

Sitsofe Wheeler (sitsofe) wrote :

Sebastien:

Whenever I go to file a bug I always try to see whether it has been filed already, and add my voice (or CC) to that bug rather than generating new noise. The whole process can take 20 or so minutes as I slowly make my way through various search results (which coincidentally sometimes leads me to find duplicate bugs that are not yet in the same product, and to add a comment), whereupon I decide whether to file a new bug or append to an existing one (but I am biased towards not opening a new bug). My reasoning is that more people are likely to see an addition to an existing report, since there are people looking at it already.

I honestly thought this was the same issue, as I have had this sort of problem with the Dapper version of Evolution too. In that case I quickly learned how to manually call the OOM killer.

If it would help I can try and spin this off into a new issue. Apologies for any inconvenience caused.

Lachlan (lachlan) wrote :

My workaround for Evolution is to have a terminal window open with a kill command set up to kill it. As soon as the memory starts to run away, I kill the app as fast as I can. If I am not fast enough, the whole desktop freezes. I never should have upgraded to Edgy... ugh.
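The manual kill described above could in principle be automated. The sketch below is only an illustration under stated assumptions: it polls `VmRSS` in `/proc/<pid>/status` (Linux only) and sends a signal once a threshold is crossed. The function names, the one-second interval and the choice of SIGSTOP are all hypothetical; nobody in this thread ran such a script.

```python
import os
import signal
import time


def parse_vm_rss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # some processes (e.g. kernel threads) have no VmRSS line


def read_rss_kb(pid):
    """Read the current resident set size of a process in kB."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())


def watch(pid, limit_kb, interval_s=1.0, stop_signal=signal.SIGSTOP):
    """Poll the process and send stop_signal once its RSS exceeds limit_kb.

    Returns the RSS (kB) that tripped the limit, or None if the process
    went away first. The default SIGSTOP freezes the process, not kills it.
    """
    while True:
        try:
            rss_kb = read_rss_kb(pid)
            if rss_kb is not None and rss_kb > limit_kb:
                os.kill(pid, stop_signal)
                return rss_kb
        except (FileNotFoundError, ProcessLookupError):
            return None  # process exited while we were watching
        time.sleep(interval_s)
```

Calling something like `watch(evolution_pid, 800_000)` from a terminal would freeze the process at roughly 800 MB resident, leaving time to attach gdb or kill it. A stopped process still holds the memory it already allocated, so the limit needs to sit well below the point where the machine starts swapping.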

jmspeex (jean-marc-valin) wrote :

Actually, I originally filed this bug against Dapper... In any case, installing thunderbird solved the problem permanently for me and I'm not looking back (took about an hour to copy mail around and configure). It's not perfect, but it's much more stable and has never crashed my machine.

Sebastien Bacher (seb128) wrote :

No problem with the comment, Sitsofe. The bug is about a "runaway leak in composer", which looks different from a crasher, which is what your backtrace is about. Could anybody get a valgrind log for the bug?

Sitsofe Wheeler (sitsofe) wrote :

Sebastien:

Evolution didn't crash per se - I forcefully killed it off with a signal that would make it leave a core because it was chewing up memory and I didn't know what else to do (but hoped a stacktrace might give some insight). When this problem strikes it causes so much swap thrashing the machine becomes practically unusable as it is bogged down...
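One general way to keep a runaway process from dragging the whole machine into swap is to cap its address space before it starts, which is what `ulimit -v` does in a shell. The Python below is a hedged sketch of that idea using the standard `resource` module. The helper names and the 800 MiB figure in the usage note are assumptions for illustration, and this approach was not tried by anyone in the thread.

```python
import resource
import subprocess


def mib_to_bytes(mib):
    """Convert mebibytes to bytes."""
    return mib * 1024 * 1024


def make_address_space_limiter(limit_bytes):
    """Return a function that caps the calling process's address space."""
    def apply_limit():
        # Intended to run in the child between fork() and exec().
        # Lowers both the soft and the hard RLIMIT_AS; this raises ValueError
        # if limit_bytes is above the hard limit already in force.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
    return apply_limit


def run_with_memory_cap(argv, limit_mib):
    """Run argv with RLIMIT_AS set to limit_mib MiB and return its exit status."""
    limiter = make_address_space_limiter(mib_to_bytes(limit_mib))
    return subprocess.call(argv, preexec_fn=limiter)
```

A call such as `run_with_memory_cap(["evolution"], 800)` would make allocations fail once the process reaches the cap. Since GLib aborts the program when an allocation fails, Evolution would most likely abort (and could leave a core dump if those are enabled) rather than thrash. RLIMIT_AS counts virtual address space, not resident memory, so the cap needs generous headroom above normal usage.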

Sebastien Bacher (seb128) wrote :

OK, that is clearer now. Maybe a valgrind log (https://wiki.ubuntu.com/Valgrind) would be useful.

Sitsofe Wheeler (sitsofe) wrote :

Sebastien:
I don't have steps that can reproduce the problem on demand, and it is very intermittent. Further, my computer is too slow to run evolution under valgrind all the time, so I'm going to set this bug back to Needs Info for now...

Changed in evolution:
status: Confirmed → Needs Info
Sebastien Bacher (seb128) wrote :

That bug is probably not going to be easy to work on if there is no known way to trigger it and it doesn't happen often.

Daniel Holbach (dholbach) wrote :

Did the problem happen to you again?

Changed in evolution:
importance: Undecided → Low
jmspeex (jean-marc-valin) wrote :

I didn't observe the problem again because I'm no longer using evolution (mainly because of this bug). All I can remember is that the bug was often triggered by pasting some text into the compose window. Also, if I saved the message before killing evolution (it was sometimes possible), then trying to edit it again after restarting evolution would often trigger the bug again.

Sitsofe Wheeler (sitsofe) wrote :

Daniel:
Not so far in Feisty but I haven't been using evo as much as usual during this beta testing period.

Sitsofe Wheeler (sitsofe) wrote :

Setting back to confirmed following replies.

Changed in evolution:
status: Needs Info → Confirmed
Nicolas Fesselet (nicolas-fesselet) wrote :

Just a note to say that I got the same memory-eating thing on my Feisty. I copied a link about 5 seconds before it happened, but I'm not sure that is what triggered it.

Sitsofe Wheeler (sitsofe) wrote :

This happened to me again in Feisty.
evolution 2.10.1-0ubuntu2

I was editing some text in an email while copying and pasting into the email window from a few different programs. Occasionally I would use undo. I noticed the onset of the problem and managed to send evolution a SIGSTOP. Here's some output from ps and free:
x 21611 1.2 68.6 1379524 442952 ? Dl 05:58 1:24 evolution --com
:~$ free
             total       used       free     shared    buffers     cached
Mem:        645628     639928       5700          0        876      24036
-/+ buffers/cache:     615016      30612
Swap:      1566296    1168552     397744

I was eventually forced to kill evolution off because I had so little swap free...
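To put the ps and free figures above in proportion, here is a small worked calculation. It assumes the ps line uses the usual `ps aux` column order (USER, PID, %CPU, %MEM, VSZ, RSS, ...) with VSZ and RSS in KiB; the constant names are made up for the example.

```python
def percent(part_kib, total_kib):
    """Percentage of total, rounded to one decimal place."""
    return round(100.0 * part_kib / total_kib, 1)


# Figures copied from the ps and free output above (all in KiB).
EVOLUTION_RSS = 442952
EVOLUTION_VSZ = 1379524
MEM_TOTAL = 645628
SWAP_TOTAL = 1566296
SWAP_USED = 1168552

rss_share = percent(EVOLUTION_RSS, MEM_TOTAL)   # should agree with %MEM in ps
swap_share = percent(SWAP_USED, SWAP_TOTAL)
vsz_gib = round(EVOLUTION_VSZ / (1024 * 1024), 2)

print(f"RSS is {rss_share}% of RAM, swap is {swap_share}% full, VSZ is {vsz_gib} GiB")
```

The resident share matches the 68.6 in the %MEM column, and the virtual size is roughly twice the machine's physical RAM, so much of the process had to be sitting in swap.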

Backtraces while the problem was happening:
Thread 11 (Thread -1254192240 (LWP 21620)):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb70275c6 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb7110dbd in pthread_cond_wait () from /lib/tls/i686/cmov/libc.so.6
#3 0xb753dc42 in g_async_queue_pop_intern_unlocked (queue=0x8139810,
    try=<value optimized out>, end_time=0x0) at gasyncqueue.c:334
#4 0xb71c37e5 in e_msgport_wait (msgport=0x814ad18) at e-msgport.c:684
#5 0xb71c3ed8 in thread_dispatch (din=0x81397b0) at e-msgport.c:1048
#6 0xb702331b in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#7 0xb710457e in clone () from /lib/tls/i686/cmov/libc.so.6

Thread 10 (Thread -1262584944 (LWP 21621)):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb70275c6 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb7110dbd in pthread_cond_wait () from /lib/tls/i686/cmov/libc.so.6
#3 0xb753dc42 in g_async_queue_pop_intern_unlocked (queue=0x8139810,
    try=<value optimized out>, end_time=0x0) at gasyncqueue.c:334
#4 0xb71c37e5 in e_msgport_wait (msgport=0x814ad18) at e-msgport.c:684
#5 0xb71c3ed8 in thread_dispatch (din=0x81397b0) at e-msgport.c:1048
#6 0xb702331b in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#7 0xb710457e in clone () from /lib/tls/i686/cmov/libc.so.6

Thread 9 (Thread -1271014512 (LWP 21623)):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb70fa893 in poll () from /lib/tls/i686/cmov/libc.so.6
#2 0xb755de03 in g_main_context_iterate (context=0x8228af8, block=1,
    dispatch=1, self=0x8222b30) at gmain.c:2979
#3 0xb755e179 in IA__g_main_loop_run (loop=0x8228bf0) at gmain.c:2881
#4 0xb752a744 in libnm_glib_dbus_worker (user_data=0x8212170)
    at libnm_glib.c:423
#5 0xb7578b7f in g_thread_create_proxy (data=0x8222b30) at gthread.c:591
#6 0xb702331b in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#7 0xb710457e in clone () from /lib/tls/i686/cmov/libc.so.6

Thread 8 (Thread -1301726320 (LWP 21627)):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb70275c6 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb7110dbd in pthread_cond_wait () from /lib/tls/i686/cmov/libc.so.6
#3 0xb753dc42 in g_async_queue_pop_intern_unlocked (queue=0x81396e0,
    try=<value optimized out>, end_time=0x0) at gasyncqueue.c:334
#4 0xb71c37e5 in e_msgpo...

Ian Redfern (ian-redfern) wrote :

I've just had this in a current Gutsy with evolution_2.12.0-0ubuntu3_i386.deb - it happens about once a fortnight to me.

Patrick Koppenburg (patrick-koppenburg) wrote :

I see this bug quite often. And it's very annoying when I am on my laptop, as I really have to switch it off to get out. Ctrl-Alt-Backspace won't work. I just see the memory rising and rising.

A general pattern is that it only happens when
1) I have a paragraph containing names of C++ classes with capitals in the middle of the word and
2) I then try to change the text in the middle of the paragraph. I believe it's when it tries to do the line wrapping that it fails.

An example is what I was typing just now:

     Because it's an instance of FilterDesktop and not CombineParticles. In FilterDesktop there's only one filter applied while in CombineParticles there are three.

Then I wanted to come back to "In FilterDesktop" and change it to "FilterDesktop does only filter and therefore". I believe it's when I accidentally removed a space and then had "FilterDesktop does only filterthere's only one" (no space between filter and there) that evolution started to leak.

One additional hint may be that I have English, French and German spellcheck on all the time.

Patrick Koppenburg (patrick-koppenburg) wrote :

Just happened again! I was typing

It is always possible to find out as myFitterParticle->daughters() will return the phi and the psi, while myFitterParticle->outgoingParticle()

and wanted to change it to

It is always possible to find out as myFitterParticle->daughters() will return the phi and the psi, while myFitterParticle->vertex()->outgoingParticle()

... and it bombed :-(

Tom Funk (tdfunk) wrote :

I'm currently seeing this same behavior about every second or third day in Gutsy Kubuntu (Linux 2.6.22-14-386 #1 Tue Feb 12 07:12:19 UTC 2008 i686 GNU/Linux) with Evolution 2.12.1.

It does seem to be limited to the composer. It also seems to be triggered by some key stroke combination. Ctrl-C, maybe.

It does NOT seem to be specific to the text being edited.

It slows down my laptop considerably, but some things keep working, albeit slowly.

To recover, I've set KSysGuard to start when I click the System Monitor taskbar applet. Whenever this happens, I fire up KSysGuard, filter on 'evo' and kill it dead.

By the time I have KSysGuard running, the VmRss column reads over 1.5GB. The VmSize is usually well over 2GB. user% and system% are reasonable, though the disk seems maxed out.

Is there anything I can do to assist with data collection?

I'd prefer to keep using Evolution, as I do have to connect to an Exchange server.

Thanks.

-- tom

Pedro Villavicencio (pedro) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. You reported this bug a while ago and there hasn't been any activity in it recently. We were wondering whether this is still an issue for you. If it is, can you get a valgrind log and submit a bug at bugzilla.gnome.org with it? Thanks.

Changed in evolution:
status: Confirmed → Incomplete
Pedro Villavicencio (pedro) wrote :

Closing this bug report as no further information has been provided. Please feel free to reopen this bug if you can provide the information asked for. Thanks!

Changed in evolution:
status: Incomplete → Invalid
jmspeex (jean-marc-valin) wrote :

I think the main reason you're getting no more information is that:
1) No information has been provided by the developers
2) Most people really affected by this bug have probably switched to another client just like I've done.
You need to understand that the bug's been around for years now and that it makes evolution so much of a pain to use when you're affected by it that switching is pretty much the only viable option (I resisted for about 9 months).

Sebastien Bacher (seb128) wrote :

It's rather that almost nobody runs into those issues, and those who do have not been able to figure out what they do that triggers the bug. The bug could be due to some plugin, for example.

Ian Redfern (ian-redfern) wrote :

I used to get this problem on Ubuntu Gutsy every couple of weeks, but it vanished on Ubuntu Hardy, so from my point of view it's fixed. I never worked out how to reproduce it, and running Evolution under valgrind for weeks wasn't a practical option.

jmspeex (jean-marc-valin) wrote :

Sebastien, I *did* try valgrind. The problem was that evolution generates so many warnings that valgrind's useless. It's also so incredibly slow (evo+valgrind) that you can't run it all the time. Some people also gave stack traces, so don't say there was no info provided. Also, in my case, I wasn't using any plugin or anything like that when I had the problem. I'm glad the bug seems to be fixed with Hardy, though I'm staying with Thunderbird anyway.

Sitsofe Wheeler (sitsofe) wrote :

Sebastien:
Believe me, I tried to provide more information and work out what caused it (to me it seemed to be triggered by copy and paste). It's not a super common bug, but when it hits the results are brutal. No one followed up after my Feisty posting, and nowadays I doubt I use evolution enough to trigger it with any regularity. I did follow this one through several Ubuntu releases, though.

jmspeex (jean-marc-valin) wrote :

I also believe it's related to copy-paste, because it always triggers a bit after a paste.
