codebrowse rapidly eating memory and driving into swap

Bug #620207 reported by Steve McInerney
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Triaged
Medium
Unassigned
loggerhead
New
Undecided
Unassigned

Bug Description

Just got a swap alert on guava for codebrowse.
doing some back tracing of ps(1) dumps we see:

USER PID PPID NI PRI TIME %MEM RSS SZ VSZ STAT BLOCKED NLWP STARTED ELAPSED CMD
2010-08-19:01:16:17.ps:1000 27243 1 0 19 03:44:36 27.1 1101832 1217072 1502164 Sl 0000000000000000 13 Wed Aug 18 00:00:16 2010 1-01:16:01 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:17:29.ps:1000 27243 1 0 19 03:45:49 27.1 1101832 1217072 1502164 Sl 0000000000000000 13 Wed Aug 18 00:00:16 2010 1-01:17:13 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:18:24.ps:1000 27243 1 0 19 03:46:45 27.1 1101832 1217072 1502164 Sl 0000000000000000 13 Wed Aug 18 00:00:15 2010 1-01:18:09 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:19:07.ps:1000 27243 1 0 19 03:47:29 27.1 1101832 1217072 1502164 Sl 0000000000000000 13 Wed Aug 18 00:00:15 2010 1-01:18:52 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:20:19.ps:1000 27243 1 0 19 03:48:42 27.1 1101832 1217072 1502164 Sl 0000000000000000 13 Wed Aug 18 00:00:15 2010 1-01:20:04 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:21:06.ps:1000 27243 1 0 19 03:49:28 47.9 1942900 2095100 2380192 Rl 0000000000000000 13 Wed Aug 18 00:00:15 2010 1-01:20:51 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:22:13.ps:1000 27243 1 0 19 03:50:36 73.4 2979036 3158584 3443676 Sl 0000000000000000 13 Wed Aug 18 00:00:15 2010 1-01:21:58 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:23:02.ps:1000 27243 1 0 19 03:51:23 82.4 3342920 3798188 4083280 Sl 0000000000000000 13 Wed Aug 18 00:00:16 2010 1-01:22:46 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:24:10.ps:1000 27243 1 0 19 03:52:28 93.3 3784564 4575640 4860732 Sl 0000000000000000 13 Wed Aug 18 00:00:16 2010 1-01:23:54 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:25:07.ps:1000 27243 1 0 19 03:53:25 91.1 3695196 6090020 6375112 Sl 0000000000000000 14 Wed Aug 18 00:00:16 2010 1-01:24:51 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:26:12.ps:1000 27243 1 0 19 03:54:25 50.7 2057364 4110932 4396024 Sl 0000000000000000 12 Wed Aug 18 00:00:16 2010 1-01:25:56 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:27:31.ps:1000 27243 1 0 19 03:55:39 77.8 3155792 5152740 5437832 Sl 0000000000000000 17 Wed Aug 18 00:00:16 2010 1-01:27:15 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:28:09.ps:1000 27243 1 0 19 03:56:17 76.5 3104352 5359276 5644368 Sl 0000000000000000 17 Wed Aug 18 00:00:16 2010 1-01:27:53 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:29:28.ps:1000 27243 1 0 19 03:57:12 30.4 1232336 3332276 3617368 Tl 0000000000000000 18 Wed Aug 18 00:00:16 2010 1-01:29:12 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:30:19.ps:1000 27243 1 0 19 03:57:12 47.0 1905696 3332276 3617368 Tl 0000000000000000 18 Wed Aug 18 00:00:16 2010 1-01:30:03 /usr/bin/python scripts/start-loggerhead.py
2010-08-19:01:32:22.ps:1000 8709 1 0 19 00:00:02 2.0 84984 101880 387252 Sl 0000000000000000 11 Thu Aug 19 01:31:27 2010 00:55 /usr/bin/python scripts/start-loggerhead.py

Which'll probably wrap horribly.
1st column is the file name, date/time stamped.
summary is that we jumped about a 1Gb a minute in memory usage from ~ 1Gb RSS to ~3Gb RSS. And additional memory getting gobbled in swap.

-debug showed:
--01:28:51-- http://localhost:8080/thread-debug
           => `-'
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 200 Ok
Length: unspecified

    [<=> ] 0 --.--K/s url time running time since last activity
                                 http://bazaar.launchpad.net/%7Eltsp-docwriters/ltsp/ltsp-docs-trunk/files 376.642073154 304.382411003
                         http://bazaar.launchpad.net/%7Ezeitgeist-dataproviders/zeitgeist-dataproviders/trunk/files 376.62422514 304.337142944
                         http://bazaar.launchpad.net/%7Ezeitgeist-dataproviders/zeitgeist-dataproviders/trunk/files 376.654032946 304.404523134
                                      http://bazaar.launchpad.net/%7Evcs-imports/busybox/main/changes 376.613197088 299.557316065
                                      http://bazaar.launchpad.net/%7Evcs-imports/busybox/main/changes 261.015025139 222.790905952
                               https://bazaar.launchpad.net/%7Eexaminer-dev/examiner/bugfixes/revision/15866 223.562906981 223.563609123
 http://bazaar.launchpad.net/%7Ebreathe-dev/breathe-icon-set/trunk/download/head%3A/banner.png-20091222220929-jok8hkkxn07szdp4-1/banner.png 171.593261003 41.7034330368
               http://bazaar.launchpad.net/%7Eubuntu-branches/ubuntu/lucid/util-linux/lucid/annotate/head%3A/debian/changelog 139.243394136 0.90297293663
                                    http://bazaar.launchpad.net/%7Emysql/mysql-connector-cpp/trunk/files 12.6777670383 12.6783139706
                       http://bazaar.launchpad.net/%7Eubuntu-branches/ubuntu/maverick/indicator-sound/maverick/files 83.9491331577 12.7283239365
                                                     http://localhost:8080/thread-debug 8.10623168945e-06 8.10623168945e-06
                                  http://bazaar.launchpad.net/%7Eopenerp/openobject-client-web/trunk/atom 17.2097809315 17.210214138
                                     http://bazaar.launchpad.net/%7Eteamkarl/karl-docs/trunk/.bzr/smart 1.09047794342 1.09110808372
                       http://bazaar.launchpad.net/%7Eubuntu-branches/ubuntu/maverick/indicator-sound/maverick/files 19.7454020977 19.746639967
    [ <=> ] 2,693 --.--K/s

01:28:51 (21.85 MB/s) - `-' saved [2693]

the above dumps are both pasted here in their non-wrapped glory: https://pastebin.canonical.com/36018/

This *seems* to be a new failure mode for codebrowse.
I attempted to get a gcore, but that actually KO'd the service and was still running after a few minutes, so killed and restarted. :-(

Steve McInerney (spm)
tags: added: canonical-losa-lp
Tim Penhey (thumper)
Changed in launchpad-code:
status: New → Triaged
importance: Undecided → Medium
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.