bzr branch uses too much RAM when not using a shared repository

Bug #183156 reported by Данило Шеган
This bug affects 3 people
Affects: Bazaar
Status: Confirmed
Importance: High
Assigned to: Unassigned

Bug Description

"bzr branch bzr+ssh://devpad.canonical.com/code/rocketfuel/launchpad/devel blahblah" outside a shared repository (I sometimes make that mistake using the parent folder of the shared repository I actually want to be in) consumed all of my memory (I noticed when it started to make my system crawl), and then I killed it.

I was using the following version (I'll test with 1.0 in a few minutes):
$ bzr --version
Bazaar (bzr) 0.92.0.candidate.1
  Python interpreter: /usr/bin/python 2.5.1.final.0
  Python standard library: /usr/lib/python2.5
  bzrlib: /usr/lib/python2.5/site-packages/bzrlib
  Bazaar configuration: /home/danilo/.bazaar
  Bazaar log file: /home/danilo/.bzr.log

Relevant .bzr.log snippet:

bzr arguments: [u'branch', u'bzr+ssh://devpad.canonical.com/code/rocketfuel/launchpad/devel', u'import-bugs']
looking for plugins in /home/danilo/.bazaar/plugins
looking for plugins in /usr/lib/python2.5/site-packages/bzrlib/plugins
Plugin name __init__ already loaded
Plugin name __init__ already loaded
encoding stdout as sys.stdout encoding 'UTF-8'
ssh implementation is OpenSSH
created control directory in file:///home/danilo/2006/launchpad/repo/import-bugs/
creating repository in file:///home/danilo/2006/launchpad/repo/import-bugs/.bzr/.
Using fetch logic to copy between RemoteRepository(bzr+ssh://devpad.canonical.com/code/rocketfuel/launchpad/.bzr/)(remote) and KnitRepository('file:///home/danilo/2006/launchpad/repo/import-bugs/.bzr/repository/')(<RepositoryFormatKnit1>)
fetch up to rev {<email address hidden>}

Tags: memory
Revision history for this message
Данило Шеган (danilo) wrote :

Testing with bzr 1.0 doesn't replicate this behaviour: total allocated virtual memory is 479M, but RSS has gone from 29M to 38M and has stayed there for a while now (the branch hasn't finished yet; it started about 40 minutes ago).

Revision history for this message
Данило Шеган (danilo) wrote :

1.0 still hasn't finished branching over a good 1024/192 connection, RSS has grown to 291M, and it has used 1.55h of CPU time according to top (2h13m real-world time).

Revision history for this message
Данило Шеган (danilo) wrote :

And the same actually happened with 1.0 now: after ~3h of running it started eating up memory (I killed it at 1.6G of resident memory; I can't afford for my laptop to die at this moment), and CPU usage went up to a constant ~50%. (The 1.55h of CPU time in my previous comment should have been 1.55min, though.)

Revision history for this message
Andrew Bennetts (spiv) wrote :

I think the new smart streaming pull in bzr.dev should fix this. I'll test and confirm.

Changed in bzr:
importance: Undecided → High
status: New → Triaged
Revision history for this message
John A Meinel (jameinel) wrote :

Andrew, did you end up confirming this?

I'm doing some memory debugging today, and I've found that just reading the indexes and holding all of the keys in memory can use a rather significant amount of memory.

Specifically:

python -c "
from bzrlib import trace, branch
trace.enable_default_logging()
b = branch.Branch.open('launchpad-2a/devel')
b.lock_read()
maps = []
for vf_name in ['revisions', 'signatures', 'inventories', 'chk_bytes', 'texts']:
    vf = getattr(b.repository, vf_name)
    maps.append(vf.get_parent_map(vf.keys()))
    trace.debug_memory('after %s' % (vf_name,))
b.unlock()
trace.debug_memory('after unlock')
del maps
trace.debug_memory('after del')
list() # Make sure to trigger real garbage collection after big dicts are deleted
import gc
print gc.collect()
trace.debug_memory('after gc')
del b
trace.debug_memory('after del b')
print gc.collect()
trace.debug_memory('after gc')
"

after revisions
VmPeak: 54408 kB
VmSize: 46268 kB
VmRSS: 43332 kB
after signatures
VmPeak: 64664 kB
VmSize: 58500 kB
VmRSS: 55676 kB
after inventories
VmPeak: 90772 kB
VmSize: 82888 kB
VmRSS: 79892 kB
after chk_bytes
VmPeak: 280732 kB
VmSize: 224852 kB
VmRSS: 222164 kB
after texts
VmPeak: 306600 kB
VmSize: 269712 kB
VmRSS: 266688 kB
after unlock
VmPeak: 306600 kB
VmSize: 269712 kB
VmRSS: 266688 kB
after del
VmPeak: 306600 kB
VmSize: 237436 kB
VmRSS: 234412 kB
0
after gc
VmPeak: 306600 kB
VmSize: 237440 kB
VmRSS: 234416 kB
after del b
VmPeak: 306600 kB
VmSize: 237440 kB
VmRSS: 234416 kB
22
after gc
VmPeak: 306600 kB
VmSize: 237440 kB
VmRSS: 234416 kB

So... why isn't any of that memory getting reclaimed, even after deleting the branch and running gc.collect()?
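
One plausible contributing factor, independent of bzrlib and offered only as a guess: CPython's object allocator tends to keep freed small-object memory in its own free lists and arenas rather than returning it to the OS, so VmSize/VmRSS can stay high even after the Python objects themselves have been collected. A minimal, Linux-only sketch of that effect; the read_vmrss() helper and the sizes are made up for illustration and are not taken from bzr:

import gc

def read_vmrss():
    # Hypothetical helper: Linux-only, parses VmRSS out of /proc/self/status
    # (the same counter quoted in the debug_memory output above).
    for line in open('/proc/self/status'):
        if line.startswith('VmRSS:'):
            return line.split()[1] + ' kB'
    return 'unknown'

print('at start:        %s' % read_vmrss())

# Roughly mimic a parent map: a big dict keyed by tuples of strings.
big = dict((('file-%d' % i, 'rev-%d' % i), ()) for i in range(300000))
print('after building:  %s' % read_vmrss())

del big
gc.collect()
# VmRSS usually does not drop back to the starting value here: much of the
# freed memory sits in CPython's internal free lists/arenas instead of going
# back to the OS, which may be part of what the numbers above show.
print('after del + gc:  %s' % read_vmrss())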

Revision history for this message
John A Meinel (jameinel) wrote :

Note also that:
(for f in launchpad/.bzr/repository/indices/*; do bzr dump-btree --raw $f; done) > raw_content.txt
ends up with about 140MB of raw content. That doesn't account for the fact that:

1) Pretty much every key will be in there at least twice, if not significantly more often (each child references its parent, each file_key has a revision_id that is duplicated in the revision graph, and a file_id that is repeated as well).
2) In memory, we will be using tuples of strings, which adds a fair amount of overhead: roughly 24 bytes of memory for a string, plus 24 bytes for a tuple (12 for the tuple fields, 12 for the gc fields), plus 4 or 8 bytes per tuple pointer.
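
Those per-object estimates are easy to sanity-check with sys.getsizeof (available from Python 2.6 on). A small sketch, using a made-up key rather than anything from the launchpad indexes; the exact numbers depend on the interpreter version and on a 32- vs 64-bit build:

import sys  # sys.getsizeof needs Python 2.6 or newer

# A made-up key of the same shape as the index keys discussed above.
rev_id = 'some-revision-id-string'
key = (rev_id,)

# Fixed overhead of a str object beyond its characters.
print('str overhead:   %d bytes' % (sys.getsizeof(rev_id) - len(rev_id)))
# Fixed overhead of a tuple object (including its gc header).
print('tuple overhead: %d bytes' % sys.getsizeof(()))
# Incremental cost of one element slot in the tuple (a pointer: 4 or 8 bytes).
print('per-item slot:  %d bytes' % (sys.getsizeof(key) - sys.getsizeof(())))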

Revision history for this message
John A Meinel (jameinel) wrote :

Note that if we change the loop to:
python -c "
from bzrlib import trace, branch
trace.enable_default_logging()
b = branch.Branch.open('launchpad-2a/devel')
b.lock_read()
maps = []
for vf_name in ['revisions', 'signatures', 'inventories', 'chk_bytes', 'texts']:
    vf = getattr(b.repository, vf_name)
    maps.append(vf.keys())
    trace.debug_memory('after %s' % (vf_name,))
b.unlock()
trace.debug_memory('after unlock')
del maps
trace.debug_memory('after del maps')
"
Internally, .keys() calls iter_all_entries(), which does not cache the keys in the btree caches. At that point we have:
after revisions
VmPeak: 25900 kB
VmSize: 25900 kB
VmRSS: 23012 kB
after signatures
VmPeak: 28464 kB
VmSize: 28464 kB
VmRSS: 25560 kB
after inventories
VmPeak: 32080 kB
VmSize: 32080 kB
VmRSS: 29008 kB
after chk_bytes
VmPeak: 101212 kB
VmSize: 95252 kB
VmRSS: 92396 kB
after texts
VmPeak: 113188 kB
VmSize: 109976 kB
VmRSS: 107108 kB
after unlock
VmPeak: 113188 kB
VmSize: 109976 kB
VmRSS: 107108 kB
after del maps
VmPeak: 113188 kB
VmSize: 94596 kB
VmRSS: 91728 kB

If I add this debug loop (gc and pprint imported here for completeness):
import gc
import pprint
from memory_dump import scanner, _scanner
scanner.dump_all_referenced(open(',,maps_refs.txt', 'wb'), maps)
size = dict.fromkeys('0123', 0)
size['0'] = _scanner.size_of(maps)
for x in maps:
    size['1'] += _scanner.size_of(x)
    for y in gc.get_referents(x):
        size['2'] += _scanner.size_of(y)
        for z in gc.get_referents(y):
            size['3'] += _scanner.size_of(z)
pprint.pprint(size)

I get:

{'0': 64, '1': 15729200, '2': 26713160, '3': 92853917}

which sums to 135,296,341 bytes. Now, if I did it correctly, maps is a list of lists of keys [of strings].

So we have 64 bytes allocated to the overall list
15.7M bytes allocated to the lists of keys
26.7M bytes allocated to the tuples
92.9M bytes allocated to strings

Now, in theory the strings are duplicated in the indexes but deduplicated via intern(), and my size_of() loop does not check for any of that sharing.
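
For a concrete illustration of what that interning buys, here is a standalone sketch, not bzrlib code; it uses the plain intern() builtin (moved to sys.intern in Python 3), which is presumably the same machinery safe_interned_string_from_size relies on:

import sys

try:
    intern_ = intern          # Python 2 builtin (the era of the code above)
except NameError:
    intern_ = sys.intern      # Python 3 moved intern() into sys

# Two equal strings built independently at runtime, as if the same key had
# been parsed out of two different index pages.
prefix = 'file-id-'
a = prefix + str(1234)
b = prefix + str(1234)
print('%s %s' % (a == b, a is b))   # equal, but two separate objects

# After interning, both names refer to one shared string object, so the
# duplicated key text is only stored once.
a = intern_(a)
b = intern_(b)
print('%s %s' % (a == b, a is b))   # equal, and now the same object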

If I get rid of the intern() calls by editing _btree_serializer_pyx.pyx to change the safe_interned_string_from_size call into safe_string_from_size, I end up with:
VmPeak: 162088 kB
VmSize: 158876 kB
VmRSS: 155868 kB
{'0': 64, '1': 15729200, '2': 26713160, '3': 92853917}
135,296,341

However, you can see that VmPeak went from 113188 kB to 162088 kB (roughly 48 MB more), so the intern() does seem to be helping.

The most concerning thing to me is that we don't seem to be reclaiming the memory when we are done, which is strange.

Martin Pool (mbp)
Changed in bzr:
status: Triaged → Confirmed
Jelmer Vernooij (jelmer)
tags: added: check-for-breezy
Jelmer Vernooij (jelmer)
tags: removed: check-for-breezy