bzr branch on large projects requires vast amounts of memory

Bug #408531 reported by Jan Danielsson on 2009-08-03
This bug affects 2 people
Affects: Bazaar
Importance: Medium
Assigned to: John A Meinel

Bug Description

First, see bug 408526. This is the same system, and the following commands are run just after the successful completion of the commit:

$ cd ~/bazaar
$ ulimit -d
524288
$ bzr branch netbsd-5.0 mybranch
bzr: ERROR: exceptions.MemoryError:

Traceback (most recent call last):
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/commands.py", line 729, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/commands.py", line 924, in run_bzr
    ret = run(*run_argv)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/commands.py", line 560, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/builtins.py", line 1147, in run
    source_branch=br_from)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/bzrdir.py", line 1178, in sprout
    result_repo.fetch(source_repository, fetch_spec=fetch_spec)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/repository.py", line 1553, in fetch
    find_ghosts=find_ghosts, fetch_spec=fetch_spec)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/decorators.py", line 192, in write_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/repository.py", line 3139, in fetch
    pb=pb, find_ghosts=find_ghosts)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/fetch.py", line 82, in __init__
    self.__fetch()
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/fetch.py", line 108, in __fetch
    self._fetch_everything_for_search(search)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/fetch.py", line 136, in _fetch_everything_for_search
    stream, from_format, [])
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/repository.py", line 4047, in insert_stream
    return self._locked_insert_stream(stream, src_format, is_resume)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/repository.py", line 4089, in _locked_insert_stream
    self.target_repo.chk_bytes.insert_record_stream(substream)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/groupcompress.py", line 1369, in insert_record_stream
    for _ in self._insert_record_stream(stream, random_id=False):
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/groupcompress.py", line 1423, in _insert_record_stream
    for record in stream:
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/repofmt/groupcompress_repo.py", line 932, in _filter_id_to_entry
    self._chk_id_roots, uninteresting_root_keys):
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/chk_map.py", line 1440, in iter_interesting_nodes
    bytes = record.get_bytes_as('fulltext')
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/groupcompress.py", line 419, in get_bytes_as
    self._manager._prepare_for_extract()
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/groupcompress.py", line 512, in _prepare_for_extract
    self._block._ensure_content(self._last_byte)
  File "/usr/pkg/lib/python2.5/site-packages/bzrlib/groupcompress.py", line 156, in _ensure_content
    self._z_content, num_bytes + _ZLIB_DECOMP_WINDOW)
MemoryError

bzr 1.16.1 on python 2.5.4 (netbsd4)
arguments: ['/usr/pkg/bin/bzr', 'branch', 'netbsd-5.0', 'mybranch']
encoding: '646', fsenc: '646', lang: None
plugins:
  bzrtools /usr/pkg/lib/python2.5/site-packages/bzrlib/plugins/bzrtools [1.16]
  launchpad /usr/pkg/lib/python2.5/site-packages/bzrlib/plugins/launchpad [1.16.1]
  netrc_credential_store /usr/pkg/lib/python2.5/site-packages/bzrlib/plugins/netrc_credential_store [1.16.1]
*** Bazaar has encountered an internal error.
    Please report a bug at https://bugs.launchpad.net/bzr/+filebug
    including this traceback, and a description of what you
    were doing when the error occurred.
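
The last frame of the traceback passes the compressed content together with `num_bytes + _ZLIB_DECOMP_WINDOW`, which looks like a size-capped zlib decompression. The following is a minimal sketch of that general technique using only the standard `zlib` module. It is not the bzrlib code: `decompress_prefix` and the sample payload are hypothetical names and data chosen for illustration.

```python
import zlib


def decompress_prefix(z_content, num_bytes):
    """Return at most num_bytes of decompressed data from a zlib stream.

    Hypothetical helper for illustration; not part of bzrlib.
    """
    decomp = zlib.decompressobj()
    # max_length caps the size of the returned buffer. Any input that was
    # not needed to produce it stays in decomp.unconsumed_tail.
    prefix = decomp.decompress(z_content, num_bytes)
    return prefix, decomp


# Made-up, highly repetitive payload standing in for a compressed block.
payload = b"chk-node " * 100000
z_content = zlib.compress(payload)

prefix, decomp = decompress_prefix(z_content, 4096)
print(len(prefix), "bytes extracted of", len(payload))
```

Even with a cap, the decompressor still has to allocate an output buffer of up to the requested size, so a request for a very large prefix can fail under a tight `ulimit -d`.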

If it's running out of memory, it's using more than ~512MB RAM(!).

How many paths are in the tree?
How many commits?
What's the largest file size?

-Rob
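
For reference, the ~512MB figure follows from the `ulimit -d` value in the report, assuming the shell prints the limit in 1024-byte units (as most shells do). A small sketch of the arithmetic, plus a way to read the same limit from inside Python with the standard `resource` module (Unix only); the function names are made up for this example:

```python
import resource


def kib_to_mib(kib):
    """Convert a `ulimit -d` style value (KiB, assumed) to MiB."""
    return kib / 1024.0


def data_limit_mib():
    """Return the current soft data-segment limit in MiB, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_DATA)
    if soft == resource.RLIM_INFINITY:
        return None
    # getrlimit() reports bytes, unlike the shell builtin.
    return soft / (1024.0 * 1024.0)


print(kib_to_mib(524288))   # value from the report
print(data_limit_mib())     # limit of the current process
```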

John A Meinel (jameinel) wrote :

Robert Collins wrote:
> How many paths are in the tree?
> How many commits?
> What's the largest file size?
>
> -Rob
>

On IRC he was saying this was the initial commit of the NetBSD tree. So
after doing "cvs co ...", it was "bzr commit" and then push/pull.

It is related to a couple of other bugs where he was:

1) Unable to commit w/ less than 512MB of memory (ulimit 512MB)
2) Unable to branch w/ 512MB of memory.

So for whatever reason, 'bzr branch' was taking more memory than 'bzr
commit'.

John
=:->

Robert,

See bug 408526 -- everything I did is logged there. If you run the relevant commands to export the netbsd sources (as documented in that bug report), you'll get exactly what I was using.

But the quick answers are:
$ find . -type d | wc -l
    9336
One commit.
I don't believe there are any abnormally large files in the repository.

I tried updating to bzr 1.17, and there's no change.

I've noticed that there are quite a few bug reports about memory usage, and a few of them get marked as dupes, referring to a bug about large files being read into memory.

I'm fairly certain that's not a problem in my case. There are no abnormally large files involved. But there are many files and many subdirectories.

Andrew Bennetts (spiv) wrote :

If there are no large files, then I think 2.1.0b3 will help. Hopefully it will approximately halve the memory bzr uses for you. Can you try it out and report the results?

John A Meinel (jameinel) wrote :

If this is specifically about high memory consumption during "bzr branch", this has indeed been addressed in bzr-2.1.0b2 (and thus b3 as well).

*A* problem with a bug like "require lots of memory" is that there isn't a clear point when the bug can be considered closed. The old code didn't really grow without bounds, it just had a high bound (say 1GB to branch a Launchpad branch), and the new code has a lower bound (approx 512MB now).

That doesn't let us do the work in, say, 128MB, but it is *better*. It is a bit hard to give an explicit memory bound for an operation. Lower is always better, but I certainly think that if you are doing an operation on a large amount of data, it is reasonable to expect it to consume more resources (memory, CPU time, etc.).

I don't think there are many remaining "easy-to-trim" memory consumption changes to be made at this point. I'm sure more could be done, but we are certainly into the "effort-vs-benefit" level.

I'm tempted to mark this as fix released in bzr-2.1.0b2 and open a new bug if we want to continue the discussion.
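
As a rough illustration of putting a number on "how much memory does this operation need", here is a sketch that records the peak Python-level allocation of a single call using the standard `tracemalloc` module. It assumes a modern Python 3 interpreter (not the Python 2.5 shown in the traceback), and `peak_python_memory` and `build_buffer` are hypothetical names; this is not how bzr itself was profiled.

```python
import tracemalloc


def peak_python_memory(operation):
    """Run operation() and return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = operation()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def build_buffer():
    # Stand-in workload: allocate a 10 MiB buffer.
    return bytearray(10 * 1024 * 1024)


_, peak = peak_python_memory(build_buffer)
print("peak: %.1f MiB" % (peak / (1024.0 * 1024.0)))
```

Note that `tracemalloc` only sees allocations made through Python's own allocators, so memory obtained directly by C libraries is not counted; a process-level limit such as `ulimit -d` covers both.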

2009/11/18 John A Meinel <email address hidden>:
> *A* problem with a bug like "require lots of memory" is that there isn't
> a clear point when the bug can be considered closed. The old code didn't
> really grow without bounds, it just had a high bound (say 1GB to branch
> a Launchpad branch), and the new code has a lower bound (approx 512MB
> now).

Right.

> I'm tempted to mark this as fix released in bzr-2.1.0b2 and open a new
> bug if we want to continue the discussion.

OK with me.

That has another possibly beneficial effect: you can see if anyone
using >2.1b2 *actually* complains about memory usage, or how many
people do. If nobody, then though it may not be the smallest it could
be, it would seem a low priority.

--
Martin <http://launchpad.net/~mbp/>

John A Meinel (jameinel) wrote :

Technically fixed in 2.1.0b2, but it isn't worth re-opening the milestone for just this.

Changed in bzr:
assignee: nobody → John A Meinel (jameinel)
importance: Undecided → Medium
milestone: none → 2.1.0b4
status: New → Fix Released