bzr merge takes a long time to decide there is nothing to do

Bug #103757 reported by Mark Shuttleworth
4
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Medium
John A Meinel

Bug Description

I just merged one tree into another, then tried to do it again. It would seem to be straightforward to detect that this is a noop, but bzr took a few seconds to do so.

Related branches

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 103757] Bzr takes a long time to decide there is nothing to do

On Fri, 2007-04-06 at 14:34 +0000, Mark Shuttleworth wrote:
> Public bug reported:
>
> I just merged one tree into another, then tried to do it again. It would
> seem to be straightforward to detect that this is a noop, but bzr took a
> few seconds to do so.

We need some more details to tell what was going on here. Were both
trees local, or was the merge from a remote branch to a local tree?
Had you committed after the first merge ?

Thanks,
Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.

Revision history for this message
Mark Shuttleworth (sabdfl) wrote : Re: [Bug 103757] Bzr takes a long time to decide there is nothing to do

Both branches were local, and I had committed.

Mark

Revision history for this message
Martin Pool (mbp) wrote :

<sabdfl> peregrine% time bzr merge ../rocketfuel ~/canonical/onezero-ui-cleanup
 Nothing to do.
 bzr merge ../rocketfuel 44.65s user 0.74s system 99% cpu 45.726 total
<sabdfl> peregrine% time bzr missing ../rocketfuel ~/canonical/onezero-ui-cleanup
 You have 6 extra revision(s):
  ...
 bzr missing ../rocketfuel 2.70s user 0.07s system 98% cpu 2.806 total

I don't see a good reason why merge should be slower than missing...

Changed in bzr:
importance: Undecided → Medium
status: Unconfirmed → Confirmed
Revision history for this message
Aaron Bentley (abentley) wrote : Re: [Bug 103757] Re: Bzr takes a long time to decide there is nothing to do

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
> <sabdfl> peregrine% time bzr merge ../rocketfuel ~/canonical/onezero-ui-cleanup
> Nothing to do.
> bzr merge ../rocketfuel 44.65s user 0.74s system 99% cpu 45.726 total
> <sabdfl> peregrine% time bzr missing ../rocketfuel ~/canonical/onezero-ui-cleanup
> You have 6 extra revision(s):
> ...
> bzr missing ../rocketfuel 2.70s user 0.07s system 98% cpu 2.806 total
>
>
> I don't see a good reason why merge should be slower than missing...

Missing doesn't try to find a common ancestor.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGG3df0F+nu1YWqI0RApczAJ9L0TCtVDrPTZ8Y5YekITTDCFa/RwCfXH7Y
jrQFDRYHuSu4RXGitOCNGHU=
=40ad
-----END PGP SIGNATURE-----

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

Why try to find a common ancestor when the tip of the other branch is in the ancestry of the tip of this one?

Revision history for this message
John A Meinel (jameinel) wrote :

Yeah, I think the 'common ancestor' code could even just shortcut with a "tip in other" check. I'll try to look around a bit with the code.

Revision history for this message
John A Meinel (jameinel) wrote :

I'm attaching a possible "shortcut" for the case when one tip is contained in the other ancestry.

So far, all tests have passed, but there aren't a lot of direct tests for 'common_ancestry'.

I'm pretty sure this shortcut is 'correct'. In the sense that if you are looking at the tip being present, then no other revision should really be considered the 'base' revision.

The current algorithm detects the merge base as the common ancestor which is farthest from the NULL revision. Because each node's distance is calculated as (max(parent_distances) + 1), it seems like the tip will always be the base as long as it is present on both sides.

I'll do some timing tests with it, but conceptually it should help.

Revision history for this message
Aaron Bentley (abentley) wrote : Re: [Bug 103757] Re: bzr merge takes a long time to decide there is nothing to do

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John A Meinel wrote:
> Yeah, I think the 'common ancestor' code could even just shortcut with a
> "tip in other" check. I'll try to look around a bit with the code.

True, but I think merge shouldn't take such a long time even when there
is something to do. Since the distance from NULL only changes when
ghosts are filled in, we should cache that information, and
incrementally update it.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGG92r0F+nu1YWqI0RAk2xAJ9LCl5voxqrwp+2M6Kin0R4OnvNsgCeI9yf
evlKwSG3D4biEHLAF3cWfE4=
=vfWV
-----END PGP SIGNATURE-----

Revision history for this message
John A Meinel (jameinel) wrote :

I grabbed a Launchpad devel tree (revno 3872), and looked at the merge history and branched out one of the merges. These are my timings on my Mac.

$ time bzr missing --theirs-only ../other
4.77s user 0.82s system 83% cpu 6.697 total

$ time bzr merge ../other
Nothing to do.
82.32s user 6.75s system 64% cpu 2:17.89 total

With the patch
$ time ~/dev/bzr/shortcut-common-ancestor/bzr merge ../other
Nothing to do.
11.32s user 2.62s system 66% cpu 20.949 total

So it still is slower than "bzr missing", but it seems to be a lot faster than it used to be. (140s => 21s).

My guess is the expense over missing is just all the extra graph generation calls. Since we generate the combined graph, and then extract the separate graphs from it.)

I worked on the patch a bit more, and the attached patch gets the time down to:
9.55s user 1.40s system 81% cpu 13.484 total

(I also associated the relevant bzr branch to this bug).

Basically noticing that set.intersection() doesn't require a set for the parameter, just an iterable, so we don't have to generate multiple set() objects, and that we can do the lookup check in the dictionary returned by get_ancestry().

I think there is still some fat that could be trimmed. Because I'm seeing "Nothing to do" be printed fairly early, and then it continues to hang before bzr exits. I don't know what it is doing during that time.

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

This looks like great progress - thanks John!

Revision history for this message
John A Meinel (jameinel) wrote :

Merged the patch a few minutes ago.

Changed in bzr:
assignee: nobody → jameinel
status: Confirmed → Fix Released
Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 103757] Re: bzr merge takes a long time to decide there is nothing to do

On Tue, 2007-04-10 at 16:23 +0000, John A Meinel wrote:
>
> The current algorithm detects the merge base as the common ancestor
> which is farthest from the NULL revision. Because each node's distance
> is calculated as (max(parent_distances) + 1), it seems like the tip
> will
> always be the base as long as it is present on both sides.
>
> I'll do some timing tests with it, but conceptually it should help.

I think our graph routine there needs optimising in general too -
2minutes is very wrong.

-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.