Re-packing happens at inconvenient times and blocks further operations

Bug #736001 reported by Julian Edwards
This bug affects 2 people
Affects   Status     Importance  Assigned to  Milestone
Bazaar    Confirmed  Medium      Unassigned

Bug Description

I am just pulling a branch of Launchpad and getting the dreaded "repacking" message. This usually takes a very long time; depending on disk speed, I've had it take 20 minutes.

Rob C suggested that it doesn't need to block writes and could be backgrounded. Another idea of mine would be to have bzr prompt/nag me if it thinks a repack is necessary, for example:

$ bzr pull
Repository needs repacking, do it now [Y/N], or run in the [B]ackground?
[Y/N/B] :

description: updated
Martin Pool (mbp)
Changed in bzr:
status: New → Confirmed
importance: Undecided → Medium
tags: added: feature packs performance
Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 736001] [NEW] Re-packing happens at inconvenient times and blocks further operations

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 3/16/2011 10:21 AM, Julian Edwards wrote:
> Public bug reported:
>
> I am just pulling a branch of Launchpad and getting the dreaded
> "repacking" message. This usually takes a very long time; depending on
> disk speed, I've had it take 20 minutes.
>
> Rob C suggested that it doesn't need to block writes and could be
> backgrounded. Another idea of mine would be to have bzr prompt/nag me
> if it thinks a repack is necessary, for example:
>
> $ bzr pull
> Repository needs repacking, do it now [Y/N], or run in the [B]ackground?
> [Y/N/B] :
>

In general, bzr doesn't want to be interactive so that it can be
scripted, etc. I could certainly see adding some sort of hook to make it
easy to configure this for a given user, though.

In the very immediate term, you can try this:

 bzr branch lp:~jameinel/+junk/bzr-prompt-repack \
     ~/.bazaar/plugins/prompt_repack

It traps in the _do_autopack function, and prompts you to see if you
really want to autopack right now.

Note that I haven't tested the '[b]ackground' functionality. But I did
test [N]ow and [p]ostpone.

background should work, though it will probably do naughty things to
your terminal.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2AqnIACgkQJdeBCYSNAAO9RwCgwye6AGozTjU7l9GHA8H7VlJD
h/8AnAgQiNAsnFH1AOrcoJm093c4iAIF
=fABT
-----END PGP SIGNATURE-----

Julian Edwards (julian-edwards) wrote :

I understand the need to script. It's trivial to detect if you're running on
a terminal or not though, and bzr should cope with that.

Or it can even just print out something like:
"Your repository needs repacking because <hand wave>, run `bzr pack`"

I appreciate you offering the plugin, but I don't think that's a long-term
solution to usability.

John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 3/16/2011 1:37 PM, Julian Edwards wrote:
> I understand the need to script. It's trivial to detect if you're running on
> a terminal or not though, and bzr should cope with that.
>
> Or it can even just print out something like:
> "Your repository needs repacking because <hand wave>, run `bzr pack`"
>
> I appreciate you offering the plugin, but I don't think that's a long-term
> solution to usability.
>

If it seriously *regularly* spends 20 minutes repacking, then we have a
serious issue. Any chance you can actually quantify how often it happens?

I actually was proposing the plugin as a way for you to experiment with
it, and see how often bzr actually does repack, without you noticing.

As for "isatty", I would say that in places where we've tried to do
that, we've had many bug reports about incorrect detection. It really
isn't as easy as people seem to think.

Like running in a pipeline, and stderr is still a tty, but stdin isn't,
so you don't actually have a way to respond, etc.

It is something where many commands potentially grow a prompt at a
random time that people don't expect. It is one thing to prompt for
something like "bzr shelve" where that is the expected workflow. But all
update/push/pull/even stuff like 'missing' might trigger a repack. (Not
sure if missing fetches or not).

I agree the plugin isn't a full answer to usability. You were
complaining about behavior X, but I don't think you realize how often X
actually happens without you noticing it. That, and the plugin was a
30-min hack that was easy enough to illustrate the point.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2AyloACgkQJdeBCYSNAAOlvQCfabIc55odsVU9Eq4AED2hxgji
VO0AoMVb4eAITheCkETMDmp7GWc4RpeT
=SD2m
-----END PGP SIGNATURE-----
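
The tty-detection pitfall described above (stderr is still a terminal but stdin is not, so a prompt is visible yet cannot be answered) can be sketched roughly as follows. This is only an illustration, not code from bzr or from the prompt_repack plugin; `can_prompt` and `repack_decision` are hypothetical names, and the [n]ow / [p]ostpone wording is borrowed loosely from the plugin description.

```python
import sys


def can_prompt(stdin=None, stderr=None):
    """Return True only if we can both show a prompt and read an answer.

    Checking a single stream is not enough: in a pipeline stderr may still
    be a terminal while stdin is a pipe, so the question would be shown
    but nobody could respond to it.
    """
    stdin = sys.stdin if stdin is None else stdin
    stderr = sys.stderr if stderr is None else stderr
    for stream in (stdin, stderr):
        isatty = getattr(stream, "isatty", None)
        if isatty is None or not isatty():
            return False
    return True


def repack_decision(default="now"):
    """Hypothetical helper: ask only when interactive, else use the default."""
    if not can_prompt():
        return default
    sys.stderr.write("Repository needs repacking, do it [n]ow or [p]ostpone? ")
    sys.stderr.flush()
    answer = sys.stdin.readline().strip().lower()
    return "postpone" if answer.startswith("p") else "now"
```

Even with both checks, this does not cover every case raised in the thread (wrappers, IDEs, and ptys can still report a terminal that no human is watching), which is part of why detection is harder than it looks.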

Julian Edwards (julian-edwards) wrote :

I didn't mean to sound like I was complaining, sorry if I did! I just wanted
to stimulate a conversation around exploring more detail, which I certainly
have. :)

> If it seriously regularly spends 20 minutes repacking, then we have a
> serious issue. Any chance you can actually quantify how often it happens?

It's been often enough lately that I noticed, I can't be more accurate, sorry.
It may be partially down to the fact that I have a Launchpad repo on a few
machines and the long wait is obvious if it happens more than once. I also
heard a few other people saying that it's happened to them, so it does explain
the increased exposure I guess.

I totally understand that this might not happen very often in reality, but
when it does it can be annoying. Perhaps another way around it is to provide
a way to override the repack so I can complete my operation that I want done,
and pack it later?

Anyway, I guess it won't bite me again for a while now :)

Cheers.

John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 3/16/2011 3:54 PM, Julian Edwards wrote:
> I didn't mean to sound like I was complaining, sorry if I did! I just wanted
> to stimulate a conversation around exploring more detail, which I certainly
> have. :)
>
>> If it seriously regularly spends 20 minutes repacking, then we have a
>> serious issue. Any chance you can actually quantify how often it happens?
>
> It's been often enough lately that I noticed, I can't be more accurate, sorry.
> It may be partially down to the fact that I have a Launchpad repo on a few
> machines and the long wait is obvious if it happens more than once. I also
> heard a few other people saying that it's happened to them, so it does explain
> the increased exposure I guess.
>
> I totally understand that this might not happen very often in reality, but
> when it does it can be annoying. Perhaps another way around it is to provide
> a way to override the repack so I can complete my operation that I want done,
> and pack it later?
>
> Anyway, I guess it won't bite me again for a while now :)
>
> Cheers.
>

Well, if you have multiple repos, they might trigger close to the same
time if your fetching habits are similar in each. And certainly,
triggering it in one doesn't help the others get repacked.

I've thought about having 'bzr pack --soft'. 'bzr pack' by default
repacks everything, but a '--soft' option could potentially leave really
large pack files alone, or something along those lines.

Certainly we could give a config option to prompt if it is going to
repack more than N revisions. Though again that brings up surprising
behavior that happens very infrequently.

Other options would be "don't repack files over a certain size/num
revisions" and just let the user know they might want to run 'bzr pack'
manually. But there again, you don't really want to prompt on every
command that gets run.

We could spawn a process that does the auto-packing in the background,
but people don't really expect a command line vcs to do that sort of thing.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2A1ZcACgkQJdeBCYSNAAOV0gCdF5VJPnP3UWmvetrRzQ5Z5hMc
iHAAn2zI7FQIi/D+VXTwSVCQKcyIyvgf
=Di9J
-----END PGP SIGNATURE-----
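
As a rough illustration of the 'bzr pack --soft' idea above (leave really large pack files alone), the selection step might look something like this. No such option exists in the thread beyond the suggestion itself; `PackInfo`, `split_for_soft_pack`, and the size threshold are all hypothetical, and real pack selection in bzr is more involved.

```python
from typing import List, NamedTuple, Tuple


class PackInfo(NamedTuple):
    """Hypothetical stand-in for a pack file: just a name and a size."""
    name: str
    size_bytes: int


def split_for_soft_pack(
    packs: List[PackInfo], max_size_bytes: int
) -> Tuple[List[PackInfo], List[PackInfo]]:
    """Partition packs into (to_repack, left_alone).

    Packs larger than max_size_bytes are left untouched; only the
    smaller ones are candidates for being combined.
    """
    to_repack = [p for p in packs if p.size_bytes <= max_size_bytes]
    left_alone = [p for p in packs if p.size_bytes > max_size_bytes]
    # Combining fewer than two packs achieves nothing, so skip the work.
    if len(to_repack) < 2:
        return [], list(packs)
    return to_repack, left_alone
```

The point of the sketch is only that the expensive rewrite of the biggest pack is avoided; whether size or revision count is the better cut-off is left open, as it is in the comment above.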

Martin Pool (mbp) wrote :

'^z bg' ought to work to put it into the background.

However, it may be that bzr doesn't update the working tree until it's
finished repacking, in which case the users won't actually be able to
continue with their work until it completes. Possibly we should do
all the stuff the user cares about first and then repack last of all.
Then they can background it, minimize the window, or whatever, and let
it run. Or we could even optionally background ourselves.

Martin

John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 3/17/2011 6:38 AM, Martin Pool wrote:
> '^z bg' ought to work to put it into the background.
>
> However, it may be that bzr doesn't update the working tree until it's
> finished repacking, in which case the users won't actually be able to
> continue with their work until it completes. Possibly we should do
> all the stuff the user cares about first and then repack last of all.
> Then they can background it, minimize the window, or whatever, and let
> it run. Or we could even optionally background ourselves.
>
> Martin

I'm pretty sure the fetch and repacking happen first. The Branch pointer
probably isn't even updated until the repacking is finished, and during
that time the Branch is probably write locked. So you can't simply
switch to another terminal and re-do the 'bzr up/pull' (you have the
content locally, but everything is marked locked.)

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2BsEcACgkQJdeBCYSNAAMdVACbBMqLTbMEeCXiKY7k68sg9RGq
pVYAn2oGYs1dd3EB1irHKSYxevRr+RhX
=eYWr
-----END PGP SIGNATURE-----

Martin Pool (mbp) wrote :

I think we'd have to change it to:

* pull everything down without repacking
* set the branch pointer
* update the tree
* repack
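
The four steps above can be sketched as a simple ordering of operations. The callables here are hypothetical placeholders, not bzrlib APIs, and the sketch assumes (without confirmation from the thread) that a repack left for last can be abandoned without harming the data already fetched.

```python
def pull_then_repack(fetch, set_branch_pointer, update_tree, repack):
    """Run the four steps in the proposed order.

    All four arguments are hypothetical callables standing in for the
    real operations. Returns True if the repack ran to completion.
    """
    fetch()               # 1. pull everything down without repacking
    set_branch_pointer()  # 2. the branch now points at the new tip
    update_tree()         # 3. the working tree is usable from here on
    try:
        repack()          # 4. slow housekeeping, done last
    except KeyboardInterrupt:
        # Assumption: everything the user cares about is already in
        # place, so interrupting here only postpones the housekeeping.
        return False
    return True
```

With this ordering, backgrounding or interrupting the command during step 4 would no longer hold up the user's actual work, which is the motivation given in the preceding comments.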

Robert Collins (lifeless) wrote :

What I was saying might make sense is to fork and spin off a repacker
- both on server side or local operations.
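
A minimal sketch of "spin off a repacker" for the local case, assuming it is acceptable to simply launch `bzr pack` as a detached child process in the repository directory. `spawn_background_repack` is a hypothetical helper, not part of bzr, and it does nothing about the locking concern raised earlier in the thread: if the repository or branch is write locked, the spawned process still has to contend with that.

```python
import subprocess
import sys


def spawn_background_repack(repo_path, command=("bzr", "pack")):
    """Start `command` in repo_path without waiting for it to finish.

    Returns the Popen object so the caller can poll or wait if it wants.
    """
    kwargs = {
        "cwd": repo_path,
        "stdin": subprocess.DEVNULL,
        "stdout": subprocess.DEVNULL,
        "stderr": subprocess.DEVNULL,
    }
    if sys.platform != "win32":
        # Put the child in its own session so that Ctrl-C in the
        # terminal is not delivered to the repacker as well.
        kwargs["start_new_session"] = True
    return subprocess.Popen(list(command), **kwargs)
```

Silencing the child's output avoids the "naughty things to your terminal" problem mentioned earlier, at the cost of hiding any errors; a real implementation would probably want to log them somewhere.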

John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 3/17/2011 8:49 AM, Robert Collins wrote:
> What I was saying might make sense is to fork and spin off a repacker
> - both on server side or local operations.
>

What about a config setting for this? I don't think we want it on by
default, because repacking tends to consume a fair amount of memory and
machine resources (CPU, etc).

However, what if you could set:

 autopack_threshold = 1000 # Number of revisions to handle specially
 autopack_action = skip-with-warning, background

Just a thought.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2Ea+4ACgkQJdeBCYSNAAOfYgCeM8ghSUC7EjN20dciPeDZyMqH
DK0AoNnI5K1TdoHfjLAVZzaBZan8/bU7
=lRhz
-----END PGP SIGNATURE-----
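
One way to read the two proposed settings is as a small policy function: repacks at or below the threshold proceed as usual, and larger ones get the configured special handling. The sketch below treats 'skip-with-warning' and 'background' as alternative values for autopack_action, which is an interpretation rather than something the comment spells out; `choose_autopack_action` is a hypothetical name and neither setting exists in bzr.

```python
def choose_autopack_action(revisions_to_repack, threshold=1000,
                           action="skip-with-warning"):
    """Return what to do: 'pack', 'skip-with-warning' or 'background'.

    threshold and action mirror the proposed autopack_threshold and
    autopack_action settings.
    """
    valid_actions = ("skip-with-warning", "background")
    if action not in valid_actions:
        raise ValueError("unknown autopack_action: %r" % (action,))
    if revisions_to_repack <= threshold:
        # Small repacks stay synchronous, as they are today.
        return "pack"
    return action
```

For example, with the default threshold of 1000 a repack touching 500 revisions would run normally, while one touching 100000 revisions would be skipped with a warning or handed to a background process, depending on the configured action.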

Martin Pool (mbp) wrote :

I think making it configurable would be ok. The furthest I would
probably go at this point is to give a way to disable automatic
repacking, for users who promise to make other arrangements to repack
from time to time.

Martin

Julian Edwards (julian-edwards) wrote :

On Tuesday 22 March 2011 09:16:05 you wrote:
> I think making it configurable would be ok. The furthest I would
> probably go at this point is to give a way to disable automatic
> repacking, for users who promise to make other arrangements to repack
> from time to time.

It might be worth adding instructions for the (new) config in the output when
you start doing the auto-packing. That way the user can immediately see a way
to avoid the lengthy repack if they're in a hurry.

Martin Packman (gz) wrote :

There's bug 494012 about wanting a way to disable automatic repacking completely, and bug 602614 about OOM errors due to repacking.

John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/22/2011 10:48 AM, Julian Edwards wrote:
> On Tuesday 22 March 2011 09:16:05 you wrote:
>> I think making it configurable would be ok. The furthest I would
>> probably go at this point is to give a way to disable automatic
>> repacking, for users who promise to make other arrangements to repack
>> from time to time.
>
> It might be worth adding instructions for the (new) config in the output when
> you start doing the auto-packing. That way the user can immediately see a way
> to avoid the lengthy repack if they're in a hurry.
>

Note that my Launchpad repository *just* rolled over 100k revisions,
which is the "really really big and long" repack. That is partially
because it also includes the dependencies as part of one-big-repository.

However, that means that +/- a couple months people working with
Launchpad are going to be hitting a big repack. In my case it took 13
minutes, which is certainly invasive. (Happened to be at a time that I
wanted to 'move quickly', too.)

100k inventories, 550k chk pages, and 369k file texts do take a while to
repack.

I wonder if we want to just have a maximum-auto-repack setting, with a
default of something like 10k revisions. And a possible warning if we
think we should repack more than that.

We might consider even defaulting to more like 1k revisions.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2MlQsACgkQJdeBCYSNAAOfbgCeNx45yyQjp/8Y9UJxSvX0Eyc2
5LkAoNhIrh9Rx91As8XDqmI/MYvak/d5
=YZYl
-----END PGP SIGNATURE-----

Robert Collins (lifeless) wrote :

On Sat, Mar 26, 2011 at 2:13 AM, John A Meinel <email address hidden> wrote:
> I wonder if we want to just have a maximum-auto-repack setting, with a
> default of something like 10k revisions. And a possible warning if we
> think we should repack more than that.
>
> We might consider even defaulting to more like 1k revisions.

The mega repack also happened to me - and saved about 30MB of disk -
the incrementally new content packed into just a couple of MB on top
of the primary pack.

-Rob

Julian Edwards (julian-edwards) wrote :

This just happened to me again, at the worst possible time.

I was trying to commit a new revision and push some changes for someone else to see, as my battery was dying. bzr started packing on the commit, and 2 minutes later the machine closed down while it was still doing it.

So to answer John's question from earlier:
> Any chance you can actually quantify how often it happens?

The last time I noticed was when I reported this bug. So it's not *that* often, but when it does....

I'm going to remember to install your plugin this time!

Brian de Alwis (slyguy) wrote :

I've just been bitten by this when pulling down a set of changes to a large Git repo (16k revisions). Repacking took 37 minutes on a 2.5GHz MacBook Pro with 8GB RAM.

648.407 Auto-packing repository GCRepositoryPackCollection(CHKInventoryRepository('file:///Users/bsd/Manumitting/Projects/e4/.bzr/repository/')), which has 28 pack files, containing 28102 revisions. Packing 21 files into 1 affecting 21102 revisions
648.776 repacking 21102 revisions
726.003 repacking 21102 inventories
875.144 repacking chk: 21093 id_to_entry roots, 3558 p_id_map roots, 160508 total keys
2125.787 repacking 105322 texts
2608.808 repacking 0 signatures
2638.929 Auto-packing repository GCRepositoryPackCollection(CHKInventoryRepository('file:///Users/bsd/Manumitting/Projects/e4/.bzr/repository/')) completed
2643.960 Packing repository GCRepositoryPackCollection(CHKInventoryRepository('file:///Users/bsd/Manumitting/Projects/e4/.bzr/repository/')), which has 8 pack files, containing 28102 revisions with hint ['58d32c698f0f21f898dfdd77332f3234', '6994968d782e2e28d45db4545de9fd87', '73b643691d3cbebe29b2c64486a462f2'].
2644.338 repacking 21102 revisions
2657.347 repacking 21102 inventories
2677.471 repacking chk: 21093 id_to_entry roots, 3558 p_id_map roots, 160508 total keys
2788.576 repacking 105322 texts
2885.674 repacking 0 signatures

I plan to install John's plugin too :-/
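
To see where the time goes in a log like the one above, the per-phase durations can be computed from the leading timestamps (seconds since the command started). The sketch below is a small stand-alone script, not part of bzr; `phase_durations` is a hypothetical helper, and the sample text reproduces only the first auto-pack pass, with the long repository description dropped from the final line.

```python
import re

# First auto-pack pass from the log above. The last line is abbreviated:
# the repository description has been left out.
LOG = """\
648.776 repacking 21102 revisions
726.003 repacking 21102 inventories
875.144 repacking chk: 21093 id_to_entry roots, 3558 p_id_map roots, 160508 total keys
2125.787 repacking 105322 texts
2608.808 repacking 0 signatures
2638.929 Auto-packing repository completed
"""


def phase_durations(log_text):
    """Return [(phase, seconds)] for each 'repacking' line in log_text.

    A phase is assumed to last until the timestamp of the next line.
    """
    events = []
    for line in log_text.splitlines():
        match = re.match(r"\s*(\d+\.\d+)\s+(.*)", line)
        if match:
            events.append((float(match.group(1)), match.group(2)))
    durations = []
    for (start, message), (end, _next_message) in zip(events, events[1:]):
        if message.startswith("repacking "):
            words = message.split()
            # "repacking chk: ..." -> "chk"
            # "repacking 21102 revisions" -> "revisions"
            phase = words[2] if words[1].isdigit() else words[1].rstrip(":")
            durations.append((phase, round(end - start, 3)))
    return durations


if __name__ == "__main__":
    for phase, seconds in phase_durations(LOG):
        print("%-12s %9.3f s" % (phase, seconds))
```

On these numbers the chk phase takes about 1251 of the roughly 1990 seconds in the first pass, with texts next at about 483 seconds, so the chk pages account for the bulk of that repack.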

Jelmer Vernooij (jelmer)
tags: added: check-for-breezy