Comment 4 for bug 256757

John A Meinel (jameinel) wrote :

So I did a bit more testing with the local repositories.

Branching pack => knit took ~5min and consumed 1GB of RAM (doing only the 'schooltool' branch). When it finished, the knit repo was only 77MB.

Branching knit => pack (from the small repo) consumed 700MB of RAM, and the resulting repo was 169MB.

Something really strange is happening here. Since 'du --apparent' on the knit repo is only 35MB, I would expect the pack repository to be on that order, not 2x the size of the knit repository.

On the same repo, doing "bzr upgrade" peaked at XXXXMB of RAM, took XXXX, and the result was

When I used ^| to see what was happening, it was doing:

            elif record.storage_kind == 'fulltext':
                self.add_lines(record.key, parents,
                    split_lines(record.get_bytes_as('fulltext')))

The specific issue is that the source repo was reading texts as lines, joining them into a single large string, and then splitting that string back into lines. Which is... unfortunate.
At a minimum, it causes a 3x bloat for any given text, since the original lines, the joined string, and the re-split lines can all be in memory at once. What worries me is that it seems to be caching all texts at the same time while doing this (hence the 700MB number).
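
To make the 3x figure concrete, here is a minimal standalone sketch of the lines => fulltext => lines round trip. It is not bzrlib code: the split_lines below is a hypothetical stand-in for the helper called in the snippet above (assumed to keep line endings), and source_lines, total_size and round_trip are made-up names for illustration only.

```python
import sys


def split_lines(text):
    """Hypothetical stand-in for the split_lines helper in the snippet above.

    Assumption: it splits a byte string into lines and keeps the line endings.
    """
    return text.splitlines(True)


def total_size(lines):
    """Approximate memory held by a list of byte strings (objects only)."""
    return sum(sys.getsizeof(line) for line in lines)


def round_trip(lines):
    """Join lines into one fulltext, then split it back into lines.

    While this runs, the caller still holds the original lines, so three
    copies of the same content exist at once.
    """
    fulltext = b"".join(lines)         # copy 2: one large string
    new_lines = split_lines(fulltext)  # copy 3: fresh line objects
    return fulltext, new_lines


# Copy 1: the lines as the source side already had them (made-up data).
source_lines = [b"line %d of some versioned text\n" % i for i in range(10000)]
fulltext, new_lines = round_trip(source_lines)

print("source lines :", total_size(source_lines), "bytes")
print("fulltext     :", sys.getsizeof(fulltext), "bytes")
print("re-split     :", total_size(new_lines), "bytes")
```

The content that comes out is identical to what went in, so the only effect of the round trip is the extra copies. If this is done for every text while earlier texts are still referenced, the total grows with the number of texts rather than with the size of the largest one.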

I'll try to dig a bit deeper to see where exactly the memory is being cached.