smart fetch for --2a does not opportunistically combine groups

Bug #402652 reported by John A Meinel on 2009-07-21
Affects: Bazaar
  Importance: Critical
  Assigned to: Robert Collins
  Milestone: 2.0
  Importance: Critical
  Assigned to: Unassigned

Bug Description

Summary
=======

Bzr versions 1.16/1.17/1.18/2.0rc1 will fragment 2a repositories as they fetch them.

Symptoms
========

When fetching from a 2a repository, the target repository is much bigger than the source - a factor of three has been observed.

Workarounds
===========

None

Rectification
=============

Bzr 2.0rc2 and 2.1 and above will repack content as needed during fetches, which prevents this happening.

Tags: 2a


John A Meinel (jameinel) on 2009-07-21
summary: - smart fetch for --2a does not opportunistically pack
+ smart fetch for --2a does not opportunistically combine groups
tags: added: 2a
Robert Collins (lifeless) wrote :

I think it's very important that 2a Just Work in 2.0, and having to run pack repeatedly won't do that. That said, this isn't necessarily a showstopper, so I'm marking it High, not Critical.

Changed in bzr:
importance: Medium → High
milestone: none → 2.0
Changed in bzr:
importance: High → Critical
Robert Collins (lifeless) wrote :

I am looking at this; it may be splittable into some separate bits that we can parallelise on. Please feel free to look and think about this in parallel with me; with only 4 bugs left until release, a little overlap won't hurt.

Changed in bzr:
assignee: nobody → Robert Collins (lifeless)
Robert Collins (lifeless) wrote :

We get a single stream for e.g. texts during fetch, so I think most of this bug will just be a matter of being smarter about when to reuse groups. For a first cut I'm going to discard reusing a group if we're well under the heuristic change-point.

Robert Collins (lifeless) wrote :

setting reuse_blocks=False makes the test pass...

John can you comment on how severe a performance impact this has?

Robert Collins (lifeless) wrote :

Thinking about this some more: turning block reuse off for fetch makes sense to me because:
 - we want to validate the sha1 too [eventually]
 - it will DTRT group wise
 - we should still compress faster than wire speed for many users
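The trade-off behind turning reuse off can be sketched as a simple waste heuristic: reuse a source group verbatim (fast, but fragments the target) or re-extract and recompress its texts (slow, but yields well-packed groups). This is an illustrative sketch only; `should_reuse_block` and its threshold are hypothetical names, not bzrlib's real API:

```python
def should_reuse_block(group_size, used_bytes, max_waste_fraction=0.25):
    """Reuse a compressed group only if most of its bytes are wanted.

    group_size: total compressed-group payload in bytes.
    used_bytes: bytes belonging to texts the fetch actually needs.
    If too much of the group is dead weight, rebuilding (recompressing)
    is preferred, which is what combines fragmented groups.
    """
    wasted = group_size - used_bytes
    return wasted / group_size <= max_waste_fraction

print(should_reuse_block(4096, 4000))  # True: nearly all bytes are wanted
print(should_reuse_block(4096, 1024))  # False: 75% waste, rebuild instead
```

The threshold value is invented for illustration; the real decision, as the thread discusses, also depends on text sizes that are only known after extraction.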

Martin Pool (mbp) on 2009-08-31
Changed in bzr:
status: Triaged → In Progress


Robert Collins wrote:
> setting reuse_blocks=False makes the test pass...
>
> John can you comment on how severe a performance impact this has?
>

It means we would rebuild *all* of the groups *all* of the time.

At the moment we have 2 states:

Pack: rebuild everything
Stream: rebuild nothing

Obviously the goal of this bug is to do something in between the two.

John
=:->

Robert Collins (lifeless) wrote :

On Mon, 2009-08-31 at 14:48 +0000, John A Meinel wrote:

> Pack: rebuild everything
> Stream: rebuild nothing
>
> Obviously the goal of this bug is to do something in between the two.

I'm questioning whether we need something in between the two. You did
most of the performance profiling and tuning of the current iteration of
groupcompress, so I'd depend heavily on your input here. How impractical
is rebuilding everything? pack() doesn't take that long really, and in
fact you suggested while we were talking on IRC about this that we could
just trigger a partial pack, which would have the same broad effect.

-Rob

John A Meinel (jameinel) wrote :

Robert Collins wrote:
> On Mon, 2009-08-31 at 14:48 +0000, John A Meinel wrote:
>
>
>> Pack: rebuild everything
>> Stream: rebuild nothing
>>
>> Obviously the goal of this bug is to do something in between the two.
>
> I'm questioning whether we need something in between the two. You did
> most of the performance profiling and tuning of the current iteration of
> groupcompress, so I'd depend heavily on your input here. How impractical
> is rebuilding everything? pack() doesn't take that long really, and in
> fact you suggested while we were talking on IRC about this that we could
> just trigger a partial pack, which would have the same broad effect.
>
> -Rob
>

I think we should explicitly test it again. IIRC it is non-trivial. I
don't remember how long 'bzr pack' takes in a Launchpad tree, but that
is exactly what you would be doing on the initial branch, which seems
very wasteful if the source is already mostly/completely packed.

I *would* probably like to extract all the texts and check their
sha1sums. But you can extract texts at about 400MB/s+, and sha1sum at
about 60MB/s. Compression is much much slower than that.

John
=:->
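The extract-and-check pass John describes could, in principle, look like the following sketch. `verify_texts` and its inputs are hypothetical stand-ins for bzrlib's record stream and inventory data, not real API; the point is that hashing each fulltext against the recorded sha1 is cheap relative to recompression:

```python
import hashlib

def verify_texts(records, expected_sha1):
    """Re-hash each fetched fulltext and compare against the inventory.

    records: iterable of ((file_id, revision), fulltext_bytes) pairs.
    expected_sha1: mapping from (file_id, revision) to hex sha1 strings,
    as would be recorded in the inventory.
    """
    for key, fulltext in records:
        actual = hashlib.sha1(fulltext).hexdigest()
        if actual != expected_sha1[key]:
            raise ValueError("sha1 mismatch for %r" % (key,))

texts = [(("file-1", "rev-1"), b"hello world\n")]
expected = {("file-1", "rev-1"): hashlib.sha1(b"hello world\n").hexdigest()}
verify_texts(texts, expected)  # passes silently when content matches
```

A corrupted text would raise ValueError, which is the property John wants fetch to guarantee before content lands in the target repository.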

Robert Collins (lifeless) wrote :

On Mon, 2009-08-31 at 20:25 +0000, John A Meinel wrote:
>
>
> I think we should explicitly test it again. IIRC it is non trivial. I
> don't remember how long 'bzr pack' takes in a Launchpad tree, but that
> is exactly what you would be doing on the initial branch. Which seems
> very wasteful if the source is already mostly/completely packed.

Doing a full pack took 11m35s on a branch of db-devel a couple of weeks
old.

> I *would* probably like to extract all the texts and check their
> sha1sums. But you can extract texts at about 400MB/s+, and sha1sum at
> about 60MB/s. Compression is much much slower than that.

Yup. It's just that I've looked at the code and the problem is that the
heuristic for whether to make a new group needs the text size, but to
get the text size we have to extract the text.

I'm going to do some measurements: always extracting would be the
_simplest_ way forward, would also solve the fragmentation problem
better than unordered fetching does (because that still doesn't combine
separate pushes to the source repo), and we can land reuse in a more
polished form later.

-Rob

John A Meinel (jameinel) wrote :

Robert Collins wrote:
> On Mon, 2009-08-31 at 20:25 +0000, John A Meinel wrote:
>>
>> I think we should explicitly test it again. IIRC it is non trivial. I
>> don't remember how long 'bzr pack' takes in a Launchpad tree, but that
>> is exactly what you would be doing on the initial branch. Which seems
>> very wasteful if the source is already mostly/completely packed.
>
> doing a full pack took 11m35 on a branch of db-devel a couple of weeks
> old.

So you and I seem to have very different thresholds for what is reasonable.

$ time bzr branch bzr-2a/bzr.dev copy
real 0m59.077s

$ time bzr pack bzr-2a
real 2m1.454s

So it takes 2x longer to repack all the content than it did to just copy
it. By that estimate, it would be 3m+ to do a 'bzr branch' outside of a
shared repo. (Or 3x longer than it currently does.)

Now yes, if you are only doing this sort of thing over a remote
connection, the time is dominated by the network transfer time.

However, I can currently get 250KB/s download, and a fully packed bzr is
37MB. So that would only take 37*1024/250/60 = 2m30s to download. So the
pack time is slightly longer than the download time. If they could be
done perfectly in parallel, then the increase in time would be small. My
guess, though, is that it would be a fair amount longer (at least 1min,
possibly the full 2min longer).
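As a sanity check on the arithmetic above, a minimal calculation using the figures from this comment (a 37MB fully packed repository and a 250KB/s connection):

```python
# Rough download-time estimate from the numbers in the comment above.
repo_mib = 37        # fully packed bzr.dev repository, MiB
rate_kib_s = 250     # observed download rate, KiB/s

seconds = repo_mib * 1024 / rate_kib_s
print(f"{seconds / 60:.1f} min")  # ≈ 2.5 min, matching the 2m30s estimate
```

So the ~2min pack time is indeed comparable to the transfer time, which is why overlap between the two matters.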

>
>> I *would* probably like to extract all the texts and check their
>> sha1sums. But you can extract texts at about 400MB/s+, and sha1sum at
>> about 60MB/s. Compression is much much slower than that.
>
> Yup. It's just that I've looked at the code and the problem is that the
> heuristic for whether to make a new group needs the text size, but to
> get the text size we have to extract the text.

I'm not really sure why that would be necessary. I suppose there are
lots of heuristics one could use.

Certainly if we strictly needed it, we could put that data into the
Factory object, since we have it in the index. Probably the big loss is
that it would change the serialization on the wire for the Remote requests.

>
> I'm going to do some measurements: always extracting would be the
> _simplest_ way forward, would also solve the fragmentation problem
> better than unordered fetching does (because that still doesn't combine
> separate pushes to the source repo), and we can land reuse in a more
> polished form later.
>
> -Rob
>

I'm not very concerned about 'extracting always'. I'm concerned about
'compressing always'... :)

I just got a new laptop, and these are the times that I see:
TIMEIT -s "b = Branch.open('bzr-2a-test/bzr.dev');
r = b.repository; r.lock_read()
k = r.texts.keys()
r.unlock();
" "r.lock_read()
nbytes = 0
for record in r.texts.get_record_stream(k, 'unordered', True):
  nbytes += len(record.get_bytes_as('fulltext'))
r.unlock()
print nbytes
"

This says that I'm decompressing 2,820,077,108 or 2.6GB, and it is
taking 4.43s/loop. Which is 607MB/s.

Now I believe the above construction also includes the time to read the
indexes, etc, but I won't guarantee that.

openssl speed sha1 says:
type 8192 bytes
sha1 362010.20k

or 362MB/s to compute the sha1 of data.

So it is taking roughly 2min to compress everything, but we can
*extract* all of that content in about 4s. (Not exactly fair, because we
also have inventory/chk/etc content involved in a 'bzr pack'.)

If I change the above loop to be:
 osutils.sha_string(record.get_bytes_as('fulltext'))

The time changes to 12s/loop, or 224MB/s. (About right when you consider
that sha1 is roughly 1/2 the speed of decompression, so you do 1
decompression and 2 sha1 ticks, or 3x original speed.)

Still, extracting and sha1summing everything is an order of magnitude
faster than the 120s to compress it all.

I would really *like* it if we would extract all the content and make
sure the (file_id, revision) => sha1sum from the inventory matches the
content before it is put into the repository.....
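The 607MB/s figure reported above can be reproduced from the raw numbers; a minimal check, assuming MB here means MiB:

```python
# Decompression throughput from the timeit run in the comment above.
nbytes = 2_820_077_108   # total fulltext bytes extracted per loop
secs = 4.43              # seconds per loop reported by timeit

mib_per_s = nbytes / secs / 2**20
print(f"{mib_per_s:.0f} MiB/s")  # ≈ 607, matching the comment
```

The same arithmetic applied to the 12s/loop sha1 run gives roughly 224MiB/s, consistent with the restored text above.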

Changed in bzr:
status: In Progress → Fix Committed
Robert Collins (lifeless) wrote :

There is a step toward this in 2.0 now: we generate a single stream object locally for all the groups in (say) 'texts', which we didn't do before. This will let the heuristic that John has been putting together work with bzr:// servers as well as over VFS transports.

Changed in bzr:
status: Fix Committed → In Progress
Robert Collins (lifeless) wrote :

On Mon, 2009-08-31 at 22:07 +0000, John A Meinel wrote:

A little stale now, but I wanted to address a couple of points:

> So you and I seem to have very different thresholds for what is
> reasonable.
>
> $ time bzr branch bzr-2a/bzr.dev copy
> real 0m59.077s
>
> $ time bzr pack bzr-2a
> real 2m1.454s
>
> So it takes 2x longer to repack all the content than it did to just
> copy
> it. By that estimate, it would be 3m+ to do a 'bzr branch' outside of
> a
> shared repo. (Or 3x longer than it currently does.)

I tested with Launchpad, and saw 8m and 11m respectively.

I'm not sure why bzr itself shows greater overhead. Perhaps it's the
specific machine, or some measurement issue.

> So it is taking roughly 2min to compress everything, but we can
> *extract* all of that content in about 4s. (Not exactly fair, because
> we
> also have inventory/chk/etc content involved in a 'bzr pack'.)
>
>
> If I change the above loop to be:
> osutils.sha_string(record.get_bytes_as('fulltext'))
>
> The time changes to 12s/loop, or 224MB/s. (About right when you
> consider
> that sha1 is roughly 1/2 the speed of decompression, so you do 1
> decompression and 2 sha1 ticks, or 3x original speed.)
>
> Still, extracting and sha1summing everything is an order of magnitude
> faster than the 120s to compress it all.

That's true, but it doesn't correspond to the moderate (30% by my
measurement, 100% by yours) difference we see at the command line: there
is more to it than this.

> I would really *like* it if we would extract all the content and make
> sure the (file_id, revision) => sha1sum from the inventory matches the
> content before it is put into the repository.....

me too :(

-Rob

Changed in bzr:
status: In Progress → Fix Released
description: updated