bzr pack leaves files in obsolete_packs, therefore doubles repository size

Bug #326369 reported by Seth
This bug affects 3 people
Affects   Status      Importance   Assigned to   Milestone
Bazaar    Confirmed   Medium       Unassigned

Bug Description

I was more than a little surprised to find that my repository was 27 MB. A single working tree takes about 7 MB, and there was a single branch in the repo. Granted, I had had many branches, but all of them had been removed, so I thought it was time to run 'bzr pack' on the shared repo. Boy, was I wrong.

sehe@intrepid:~/UWV/work$ bzr branches
current
sehe@intrepid:~/UWV/work$ ls -tlrah
total 8.0K
drwxr-xr-x 4 sehe sehe 74 2008-04-24 00:56 .bzr
drwxr-xr-x 4 sehe sehe 31 2009-01-22 08:51 .
drwxr-xr-x 13 sehe sehe 4.0K 2009-01-22 22:31 ..
drwxr-xr-x 13 sehe sehe 4.0K 2009-02-06 22:59 current
sehe@intrepid:~/UWV/work$ du -shc .bzr
27M .bzr
27M total
sehe@intrepid:~/UWV/work$ du -shc .bzr
sehe@intrepid:~/UWV/work$ bzr pack
sehe@intrepid:~/UWV/work$ du -shc .bzr
52M .bzr
52M total
sehe@intrepid:~/UWV/work$ bzr pack
sehe@intrepid:~/UWV/work$ du -shc .bzr
52M .bzr
52M total

You see, my repository was doubled in size. This is not the documented goal of pack, to put it mildly.

Additional info:

sehe@intrepid:~/UWV/work$ bzr version
Bazaar (bzr) 1.10
  Python interpreter: /usr/bin/python 2.5.2
  Python standard library: /usr/lib/python2.5
  bzrlib: /usr/lib/python2.5/site-packages/bzrlib
  Bazaar configuration: /home/sehe/.bazaar
  Bazaar log file: /home/sehe/.bzr.log

Copyright 2005, 2006, 2007, 2008 Canonical Ltd.
http://bazaar-vcs.org/

bzr comes with ABSOLUTELY NO WARRANTY. bzr is free software, and
you may use, modify and redistribute it under the terms of the GNU
General Public License version 2 or later.

sehe@intrepid:~/UWV/work$ bzr plugins
bzrtools 1.10
    Various useful commands for working with bzr.

email
    Allow sending an email after a new commit.

gtk 0.95.0.final.1
    Graphical support for Bazaar using GTK.

launchpad
    Launchpad.net integration plugin for Bazaar.

pqm 1.0dev
    Functionality for controlling a Patch Queue Manager (pqm).

qbzr 0.9.2
    QBzr - Qt-based front end for Bazaar

rebase 0.3
    Rebase support.

stats
    A Simple bzr plugin to generate statistics about the history.

svn 0.4.16
    Support for Subversion branches

vimdiff
    vimdiff plugin for bzr

Seth (bugs-sehe) wrote :

Inspired by similar topic in the bug tracker, I delved a bit deeper and found:

sehe@intrepid:~/UWV/work$ du -shc .bzr/repository/*
4.0K .bzr/repository/format
988K .bzr/repository/indices
0 .bzr/repository/lock
0 .bzr/repository/no-working-trees
26M .bzr/repository/obsolete_packs
4.0K .bzr/repository/pack-names
25M .bzr/repository/packs
0 .bzr/repository/shared-storage
0 .bzr/repository/upload
52M total

So it turns out I was right: my repository had doubled in size. In effect, 'bzr pack' seems to have been a very fancy way to say

cp -a .bzr/repository/packs .bzr/repository/obsolete_packs

OK, perhaps at most 1 MB was shaved off the size of the packs subdir (or not, because that doesn't follow from these figures). But any benefit is more than cancelled out by the fact that I'm now the proud owner of 26 MB of obsolete packs. I'm not sure why these are kept (perhaps as a backup measure?), but I'm pretty sure there is no mention of it in the docs for the pack command, and it seems to grossly violate the principle of least surprise.

I would have guessed that I should run 'bzr pack' before rsync-ing my repository. It turns out that this is exactly what I shouldn't do, because it will (can?) double the volume of the transfer.

- Is there any mention of this behaviour (and the reason for it) in the docs for 'bzr pack'?
- Is there a specific command to get rid of the bloat in obsolete_packs? (clean-tree doesn't, and pack doesn't.)
- Shouldn't the user be warned that backups are taken and are not going to be freed automatically?

Regards,
Seth
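
For readers who want to reproduce the per-directory breakdown shown above without `du`, here is a minimal Python sketch. It is not part of bzr; the function names `tree_size` and `repository_breakdown` are made up for illustration, and it sums apparent file sizes, so the figures may differ slightly from the disk-usage numbers `du` reports.

```python
import os


def tree_size(path):
    """Return the total apparent size in bytes of a file or directory tree."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def repository_breakdown(repository_dir):
    """Map each entry directly under repository_dir to its size in bytes."""
    return {
        name: tree_size(os.path.join(repository_dir, name))
        for name in sorted(os.listdir(repository_dir))
    }


# Example (hypothetical invocation, run from the shared repository root):
#     for name, size in repository_breakdown(".bzr/repository").items():
#         print("%10d  %s" % (size, name))
```

Comparing the `packs` and `obsolete_packs` entries before and after a pack would show the same doubling reported in this bug.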

Robert Collins (lifeless) wrote : Re: [Bug 326369] Re: bzr pack doubles repository size

You shouldn't run 'pack' at all. Generally, bzr takes care of tuning
your repository. You definitely should not run it before rsyncing,
because you don't want to rsync data you already have, and packing all
the data into one file will cause rsync to copy it from scratch.

You're right that the docs here could and should be expanded.

As for the reason, we keep a backup of files removed by a write
operation; that backup is itself removed the next time such a backup
is needed. We do this because various network servers are not as
reliable as we might like, and in the event of a failure where new data
hasn't hit disk but a delete or move has, it's much safer to have issued
a move (which can be reverted) than a delete (which requires an undelete
capability on that filesystem).

-Rob
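
The move-instead-of-delete idea described above can be sketched roughly as follows. This is an illustration of the general pattern only, not bzrlib's actual code; the function name `retire_pack` is hypothetical, and the directory names `packs` and `obsolete_packs` are taken from the listing earlier in this report.

```python
import os


def retire_pack(repository_dir, pack_name):
    """Move a superseded pack aside instead of deleting it.

    Backups left by the previous write operation are cleared first, so
    at most one generation of obsolete files is kept around.
    """
    packs_dir = os.path.join(repository_dir, "packs")
    obsolete_dir = os.path.join(repository_dir, "obsolete_packs")

    # Drop the backups from the previous write operation.
    for name in os.listdir(obsolete_dir):
        os.remove(os.path.join(obsolete_dir, name))

    # A rename can be undone by renaming back; a delete cannot.
    os.rename(
        os.path.join(packs_dir, pack_name),
        os.path.join(obsolete_dir, pack_name),
    )
```

Under this scheme a full repack of a single large pack leaves one complete copy in `obsolete_packs`, which would explain the doubling seen here.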

Seth (bugs-sehe) wrote :

Great explanation :)

I'll remember to remove 'pack' from my bash completions.

Meanwhile, is there a way to clean up the obsoletes? Would it be safe to
simply delete the contents of that dir?


Robert Collins (lifeless) wrote :

On Sat, 2009-02-07 at 00:31 +0000, Seth wrote:
> Great explanation :)
>
> I'll remember to remove 'pack' from bash-completions
>
> Meanwhile, is there a way to clean up the obsoletes? Would it be safe to
> simply delete the contents of that dir?

Yes, that's fine - just leave the directory intact.
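
A minimal sketch of that cleanup in Python, assuming the layout shown earlier (`.bzr/repository/obsolete_packs`) and assuming no other bzr process is writing to the repository at the time. The function name `clear_obsolete_packs` is made up for illustration and is not a bzr command.

```python
import os


def clear_obsolete_packs(repository_dir):
    """Delete the files inside obsolete_packs but keep the directory itself.

    Returns the number of bytes freed.
    """
    obsolete_dir = os.path.join(repository_dir, "obsolete_packs")
    freed = 0
    for name in os.listdir(obsolete_dir):
        path = os.path.join(obsolete_dir, name)
        if os.path.isfile(path):
            freed += os.path.getsize(path)
            os.remove(path)
    return freed


# Example (hypothetical invocation, run from the shared repository root):
#     print(clear_obsolete_packs(".bzr/repository"), "bytes freed")
```

Only regular files are removed, so the `obsolete_packs` directory stays in place as requested.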

Dan Watkins (oddbloke) wrote : Re: bzr pack doubles repository size

Marking this as Invalid as it is intended behaviour.

Changed in bzr:
status: New → Invalid
Larry Gilbert (l2g) wrote :

Rather than mark this ticket "Invalid," shouldn't work be done on the "pack" command or its documentation so users know what to expect or not to expect from it? If users shouldn't touch it, shouldn't it be taken out? Either way, there is work yet to be done here and I don't think "Invalid" is a satisfactory resolution.

Jonathan Lange (jml) wrote : Re: [Bug 326369] Re: bzr pack doubles repository size

On Tue, Sep 29, 2009 at 7:39 PM, Larry Gilbert <email address hidden> wrote:
> Rather than mark this ticket "Invalid," shouldn't work to be done on the
> "pack" command or the documentation of it so users know what to expect
> or not to expect from it?  If users shouldn't touch it, shouldn't it be
> taken out?  Either way, there is work yet to be done here and I don't
> think "Invalid" is a satisfactory resolution.
>

The appropriate thing to do would be to file another bug.

jml

Robert Collins (lifeless) wrote :

On Wed, 2009-09-30 at 07:55 +0000, Jonathan Lange wrote:
> On Tue, Sep 29, 2009 at 7:39 PM, Larry Gilbert <email address hidden> wrote:
> > Rather than mark this ticket "Invalid," shouldn't work to be done on the
> > "pack" command or the documentation of it so users know what to expect
> > or not to expect from it? If users shouldn't touch it, shouldn't it be
> > taken out? Either way, there is work yet to be done here and I don't
> > think "Invalid" is a satisfactory resolution.
> >
>
> The appropriate thing to do would be to file another bug.

Or we can reopen this one. The original bug was indeed about surprise,
and I think it's fine to turn it into a documentation bug.

 status confirmed

Changed in bzr:
status: Invalid → Confirmed
Seth (bugs-sehe) wrote : Re: bzr pack doubles repository size

Thanks for taking this up in such an elegant fashion!

Martin Pool (mbp)
Changed in bzr:
importance: Undecided → Medium
summary: - bzr pack doubles repository size
+ bzr pack leaves files in obsolete_packs, therefore doubles repository
+ size
Jelmer Vernooij (jelmer)
tags: added: packs
Jelmer Vernooij (jelmer)
tags: added: check-for-breezy