don't open repository unless/until needed

Bug #112028 reported by Alexander Belchenko
Affects: Bazaar
Status: Confirmed
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Several bzr commands that work only with the working tree object or the branch object (status, add, remove, tag(s), and perhaps some others) should not open the repository at all. We need some sort of lazy open for the repository. Opening the repository every time is just a waste of time when there is no real need to read data from it.
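
As a rough illustration of what a "lazy open" could look like, here is a minimal Python sketch. FakeRepository and LazyBranch are hypothetical stand-ins invented for this example, not bzrlib classes, and the path is a placeholder. The sketch only shows that nothing is opened until the repository is first accessed.

```python
class FakeRepository:
    """Hypothetical stand-in for a repository; counts how often it is opened."""

    open_count = 0

    def __init__(self, location):
        FakeRepository.open_count += 1
        self.location = location


class LazyBranch:
    """Hypothetical branch object that defers opening its repository."""

    def __init__(self, repo_location):
        self._repo_location = repo_location
        self._repository = None  # nothing has been opened yet

    @property
    def repository(self):
        # Open on first access only, then reuse the cached object.
        if self._repository is None:
            self._repository = FakeRepository(self._repo_location)
        return self._repository

    def nick(self):
        # A branch-only operation: it never touches the repository.
        return "my-branch"


branch = LazyBranch("/path/to/shared-repo")  # placeholder path
branch.nick()
opened_before = FakeRepository.open_count  # no repository access so far

branch.repository
branch.repository
opened_after = FakeRepository.open_count  # opened once, then cached
```

With this shape, a command that only needs branch data never pays for the repository, but a missing repository is also not noticed until the first access.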

My real use case (performance question aside): sometimes I move old branches out of the shared repository into a "limbo" folder. This limbo folder is not a subdirectory of the shared repo; usually it is one level up. After I move a branch I cannot run 'bzr st' as a simple check before I completely remove the data from limbo.

Aaron Bentley (abentley) wrote : Re: [Bug 112028] don't open repository until needed

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexander Belchenko wrote:
> Public bug reported:
>
> Several commands of bzr that works only with working tree object or
> branch object, like status, add, remove, tag(s), may be some others? --
> should not open repository at all. We need some sort of lazy open for
> repository. Open repository every time just waste of time without any
> real needs to read data from repository.

But
 - most commands do need the repository
 - a lazy-open approach introduces new failure modes for dubious gain
 - you have produced malformed trees by moving them outside the repo,
and it's fine for bzr to fail on malformed trees.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGOeHn0F+nu1YWqI0RArC0AJ9rXlarHq7zDvHiBdh+6t0tn3b1jwCfcodA
A/38IY1nwUfZAmQQCCHMyKk=
=skbi
-----END PGP SIGNATURE-----

Martin Pool (mbp) wrote :

I'm not so sure about lazily opening, but rather just explicitly
opening it when we need it, and fixing code that opens it but doesn't
need it.
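
A minimal sketch of the explicit alternative, for comparison. The function names (open_tree, open_repository, cmd_add, cmd_log) are hypothetical and only model the idea; they are not bzrlib's API. Each command opens exactly what it needs, and the log list records what was opened.

```python
def open_tree(path, opened):
    """Hypothetical: open only the working tree at `path`."""
    opened.append("tree")
    return {"path": path}


def open_repository(tree, opened):
    """Hypothetical: explicitly open the repository behind `tree`."""
    opened.append("repository")
    return {"for_tree": tree["path"]}


def cmd_add(path, opened):
    # 'add' in this sketch works on the working tree alone,
    # so the repository is never opened.
    tree = open_tree(path, opened)
    return "added in " + tree["path"]


def cmd_log(path, opened):
    # 'log' in this sketch needs history, so it opens the
    # repository explicitly after opening the tree.
    tree = open_tree(path, opened)
    repo = open_repository(tree, opened)
    return "history for " + repo["for_tree"]


opened_by_add = []
cmd_add("limbo/old-branch", opened_by_add)

opened_by_log = []
cmd_log("limbo/old-branch", opened_by_log)
```

The cost of this design is that every call site has to decide whether it needs the repository.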

> - most commands do need the repository

Whether it's "most" or not there are certainly many that do not.

--
Martin

John A Meinel (jameinel) wrote : Re: don't open repository until needed

I'm fine with failing on malformed working trees. But what I really want is to not open Branch unless we need to.

This is because in a lightweight checkout, you can do "bzr status" without connecting to the remote machine at all. At this point, we could do it only for BranchReference formats (so a BranchReference stays as a proxy and only connects on request).
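
To make the "stays as a proxy and only connects on request" idea concrete, here is a small Python sketch. RemoteBranch and BranchReferenceProxy are hypothetical names made up for this example, the URL is a placeholder, and no real network connection is made; the counter simply stands in for one.

```python
class RemoteBranch:
    """Hypothetical remote branch; constructing it stands in for connecting."""

    connections = 0

    def __init__(self, url):
        RemoteBranch.connections += 1
        self.url = url

    def last_revision(self):
        return "rev-1"


class BranchReferenceProxy:
    """Hypothetical proxy that connects only when branch data is requested."""

    def __init__(self, url):
        self.url = url
        self._target = None

    def _connect(self):
        if self._target is None:
            self._target = RemoteBranch(self.url)
        return self._target

    def __getattr__(self, name):
        # Called only for attributes the proxy itself lacks, so reading
        # proxy.url stays local while anything else triggers a connection.
        return getattr(self._connect(), name)


proxy = BranchReferenceProxy("bzr+ssh://example.com/branch")  # placeholder URL
local_url = proxy.url
connections_before = RemoteBranch.connections

revision = proxy.last_revision()
connections_after = RemoteBranch.connections
```

A status-like operation that reads only local attributes would leave the connection count untouched in this model.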

I also like Aaron's idea about making BranchReference a Branch6(7) with "use this repository" pointing to a remote repository, and "bound" pointing to the master branch. Though that would still probably want to open the repository.

So I'm confirming this as a wishlist item, because we at least want this for lightweight checkouts.

Changed in bzr:
importance: Undecided → Wishlist
status: Unconfirmed → Confirmed
Aaron Bentley (abentley) wrote : Re: [Bug 112028] don't open repository until needed

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
> I'm not so sure about lazily opening, but rather just explicitly
> opening it when we need it, and fixing code that opens it but doesn't
> need it.

I count 124 calls to WorkingTree.open* and 127 calls to Branch.open*. You
are proposing to update all 251 calls?

I think this is premature optimization. Has anyone observed a
performance difference? You shouldn't, because your working tree's
repository should always be local or on a fast link.

I think it's a good thing for operations to fail if the repository is
missing. The sooner we inform people about that problem, the easier it
will be to fix. If status only fails when there are pending merges, you
may not know about the problem until it is too late.

>> - most commands do need the repository
>
> Whether it's "most" or not there are certainly many that do not.

Any command that takes a revision spec may use the repository. This
includes "status", Alexander's example. Status will also use the
repository if there are pending merges.

I like the current API. I think it's less friendly to require most
Branch.open* and WorkingTree.open* calls to be followed by
branch.open_repository().

So I think the speed claim is premature optimization. I think that the
desire for operations to succeed when the tree is malformed is
wrongheaded. I think explicitly opening the repository is a less
friendly API than our current one. I think the gains do not justify the
costs.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGOfZn0F+nu1YWqI0RAtMnAJ44UMn+bL27ZGnuEzO1ZDJAqqpcfwCgh9w7
M9BvzFpHH11ZONBkNt4+mvk=
=5A+I
-----END PGP SIGNATURE-----

John A Meinel (jameinel) wrote :

I noticed a large difference for "bzr status" with a lightweight checkout over bzr+ssh://. Partly that is because spawning a bzr process on my server takes approx 1s.

If you want to test it yourself you can do:

cd ~/.bazaar/plugins
bzr checkout lp:bzr-hello hello
bzr hello bzr+ssh://localhost

(that plugin connects, and then sends ~20 hello requests to time how long it takes to start up the remote 'bzr serve', and then how much a single round-trip request takes.)

"time bzr status" in a bzr.dev tree is ~1s ("time bzr rocks" on that machine is 600ms), so the ~1s spent spawning the remote process roughly doubles the time for status.

So even when it is on the local network, not connecting to the branch and repository is advantageous.

And even more so if someone wants to make a lightweight checkout of a Launchpad branch (think of what people do with CVS and Sourceforge all the time). They don't want to hack on it, they just want a way to have the latest code available. Lightweight checkouts mean you don't have to wait for them to download all of history. That is fixable in the longer term with shallow branches, history horizons, etc. But just not connecting is a nice first step.

Robert Collins (lifeless) wrote : Re: [Bug 112028] Re: don't open repository unless/until needed

On Fri, 2007-11-30 at 16:13 +0000, John A Meinel wrote:
> I noticed a large difference for "bzr status" when with a lightweight
> checkout over bzr+ssh://. Partly that is because spawning a bzr process
> on my server takes approx 1s.
>
> If you want to test it yourself you can do:
>
> cd ~/.bazaar/plugins
> bzr checkout lp:bzr-hello hello
> bzr hello bzr+ssh://localhost
>
> (that plugin connects, and then sends ~20 hello requests to time how
> long it takes to start up the remote 'bzr serve', and then how much a
> single round-trip request takes.)
>
> Since "time bzr status" in a bzr.dev tree is ~1s, (time bzr rocks on
> that machine is 600ms)
>
> So even when it is on the local network, not connecting to the branch
> and repository is advantageous.
>
> And even more so if someone wants to make a lightweight checkout of a
> Launchpad branch (think of what people do with CVS and Sourceforge all
> the time). They don't want to hack on it, they just want a way to have
> the latest code available. Lightweight checkouts mean you don't have to
> wait for them to download all of history. That is fixable in the longer
> term with shallow branches, history horizons, etc. But just not
> connecting is a nice first step.

We've had this discussion before. I know I think that lazy connecting is
a fundamentally bad idea. We set tree policy from the repository for
instance. Users expect bzr operations to work, and most won't if the
repository is damaged or absent.

-Rob

--
GPG key available at: <http://www.robertcollins.net/keys.txt>.

Jelmer Vernooij (jelmer)
tags: added: check-for-breezy