Comment 1 for bug 494012

John A Meinel (jameinel) wrote : Re: [Bug 494012] [NEW] bzr needs excessive amount of bandwidth for commiting (2a)


Ernst wrote:
> Public bug reported:
>
> I'm running bzr 2.0.2 on Ubuntu 9.10. My server uses sftp and thus is
> 'dumb'. The size of my repository is 222M (according to du -hs .bzr) and
> it has 207 revisions (bzr revno). It uses database format 2a.
>
> Yesterday, I committed some trivial changes (two files: one text file of a couple of KB, one PDF of 1.8 MB). However, this commit took a long time and transferred about 70 MB. This is quite a lot; luckily I was at my university, where the upload bandwidth is quite high, but still, it took much longer than normal. So two problems:
> - Unexpectedly, a lot of data is transferred
> - Unexpectedly, the commit takes a lot of time
>
> This is because bzr started to pack the server repository. I think this should not happen without warning and user confirmation; if, for example, you have a flaky 3G connection (with a data limit) or are in a hurry, this behavior is certainly not desired.
> For example, I always commit at the end of the day just before logging off; then such a delay is not desirable.
>
> Hopefully, this behavior can be improved.
>
> The corresponding log entries in .bzr.log:
...

> 282.682 Auto-packing repository <bzrlib.repofmt.groupcompress_repo.GCRepositoryPackCollection object at 0x910fb2c> completed

You are auto-packing over a 'dumb' transport, which means we have to
download and re-upload the content. You could

1) Run the smart server, where autopacking is done server side.
2) Interrupting the auto-pack is 'safe'. I'm not sure whether the branch
will be updated and unlocked, but doing "bzr push ... ^C; bzr
break-lock; bzr push" should make sure everything is uploaded and the
branch history is correct. Note that the *next* push will also try to
autopack until it succeeds.

I guess if you are doing it with a bound branch and 'commit', it is a
bit harder to trigger at the right times.

3) I'm pretty sure there is already an open bug about having autopack be
something you can disable, but it certainly is something we want to do
automatically so that people don't have to worry about it in normal
operation.

4) I'll also note that once a large auto-pack is done, it will be an
order of magnitude more commits before we do it again. (e.g., if you
commit from scratch, we repack everything at 10 commits, 100 commits,
1,000 commits, 10,000 commits, etc.)
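That schedule can be sketched roughly in Python. This models the digit-sum pack policy as I understand it (the number of pack files is capped at the sum of the digits of the revision count, so a full repack only happens when the count rolls over a power of ten); it is an illustration, not the actual bzrlib code:

```python
def max_packs(revision_count):
    """Digit-sum cap on pack files: 9 revisions -> up to 9 packs,
    10 revisions -> 1 pack (everything repacked), 11 -> 2, and so on."""
    return sum(int(d) for d in str(revision_count))

# Full repacks of the whole repository happen only at the rollovers:
for n in (9, 10, 99, 100, 207, 1000):
    print(n, max_packs(n))
```

At 207 revisions (the reporter's repository) the cap is 9 packs, so the next whole-repository repack would not be due until around revision 1,000.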

So the likelihood of you encountering this again soon is quite low.
You can also manually run "bzr pack sftp://" at an opportune time, to
reduce the chance that this will happen automatically.

5) If you are interested in (3), I think a config option such as
"disable_autopack=True", read via
"Branch.get_config().get_user_option('autopack')", would be reasonable.
(Possibly as a repository-level config?) Then you could set all your
sftp locations to not autopack automatically.
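Such a check might look like the following sketch. The option name is the hypothetical one proposed above, and a plain dict stands in for the real `Branch.get_config()` object, so this runs without bzrlib:

```python
def autopack_allowed(config):
    """Return False only when the user explicitly disabled autopack.

    `config` stands in for Branch.get_config(); looking up 'autopack'
    mirrors get_user_option('autopack') in bzrlib. The option itself
    is the proposed (hypothetical) setting, not an existing one.
    """
    value = config.get("autopack")
    if value is None:
        return True  # default: keep autopacking automatic
    return str(value).lower() not in ("false", "0", "no")

print(autopack_allowed({}))                     # default behaviour
print(autopack_allowed({"autopack": "False"}))  # sftp location opted out
```

The deliberate default-to-True keeps today's behavior for everyone who has not set the option.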

6) I'll also note that sftp performance is often fairly poor, just
because of sftp limitations. (Reading a file requires an OPEN call, a
READ call, and then a CLOSE call, so it is 3 round trips to get all the
content. We do try to prefetch, etc., when we can.)
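As a back-of-the-envelope illustration of why those round trips hurt (the file count and latency below are made-up numbers, and prefetching would reduce the real cost):

```python
def sftp_read_latency(num_files, rtt_seconds, round_trips_per_file=3):
    """Latency cost of reading files over sftp, assuming the worst
    case: OPEN + READ + CLOSE = 3 serialized round trips per file."""
    return num_files * round_trips_per_file * rtt_seconds

# 500 small files over a 50 ms link: 500 * 3 * 0.05 = 75 seconds
# of pure latency, before a single byte of file data is counted.
print(sftp_read_latency(500, 0.05))
```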

John
=:->
