bzr diff in 2a to historical revision too slow

Bug #405166 reported by Paul Sladen
This bug affects 1 person
Affects  Status     Importance  Assigned to  Milestone
Bazaar   Confirmed  Medium      Unassigned

Bug Description

Per bug #213718, Rob says I should file this. It's quicker than whatever the example in question was, but it should be indicative:

$ cd launchpad

$ echo Cold cache; time bzr diff -r 7675.852.10 > /dev/null
Cold cache
real 1m0.019s
user 0m31.734s
sys 0m2.100s

$ echo Hot cache; time bzr diff -r 7675.852.10 > /dev/null
real 0m22.269s
user 0m21.033s
sys 0m1.176s

(FWIW, I think at the time I was trying to discover why 'launchpad/database/sampledata/current-dev.sql' contains 709 consecutive blank lines, which are variously attributed to different commits; see git blame launchpad/database/sampledata/current-dev.sql.)

(Be aware that the Launchpad bzr tree is 280MB+ and may take 3+ hours to fetch via 'bzr get lp:launchpad'.)

Revision history for this message
Martin Pool (mbp) wrote :

I tested this and it did take 28 seconds on a fairly cold cache (without dropping buffers.) It produces roughly an 8MB diff. (You realize this gives the whole diff to your working tree, not the changes in that one commit?)

For me, getting the diff from the wt to its basis revision is 0.4s, which is more what it should be.

Paul, in general if you're reporting a performance problem, attaching the lsprof output helps diagnosis. (We should mention that in the bug-filing guidelines...)

Changed in bzr:
status: New → Confirmed
importance: Undecided → Medium
summary: - Bzr diff performance takes >5 seconds
+ bzr diff in 2a to historical revision too slow
Revision history for this message
Martin Pool (mbp) wrote :

It's not immediately obvious to me what's wrong in this profile, but maybe it will be to someone else.

Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 405166] Re: bzr diff in 2a to historical revision too slow

Martin Pool wrote:
> It's not immediately obvious to me what's wrong in this profile, but
> maybe it will be to someone else.
>
> ** Attachment added: "diff.callgrind"
> http://launchpadlibrarian.net/29590051/diff.callgrind
>

My immediate guess is that we don't have an optimized path from WT =>
arbitrary rev. So we have to build 2 inventories and status them, and
*then* have to extract some XXMB of raw text out and then run diff
against all of that.

If the *diff* is 8MB, that would make the raw texts probably 16+MB,
which probably takes a while to extract the right versions, and it would
be a *lot* of data to diff.

John
=:->
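
A minimal sketch of the unoptimized path John describes, under loud assumptions: each tree is modelled as a plain `{path: text}` dict standing in for an inventory plus its file texts, and `naive_tree_diff` is a hypothetical name, not a bzrlib API. It only illustrates why the cost grows with the amount of raw text extracted for the changed files, not how bzr actually implements diff.

```python
import difflib


def naive_tree_diff(old_tree, new_tree):
    """Diff two trees given as {path: text} dicts (hypothetical stand-ins
    for inventories; a real inventory would compare ids/hashes, not texts).

    Returns (unified_diff_text, characters_of_raw_text_extracted).
    """
    out = []
    extracted = 0
    # Step 1: "status" the two trees to find paths whose content differs.
    for path in sorted(set(old_tree) | set(new_tree)):
        old = old_tree.get(path, "")
        new = new_tree.get(path, "")
        if old == new:
            continue
        # Step 2: both full texts are needed before any diffing can start.
        extracted += len(old) + len(new)
        # Step 3: run a line-based diff over the two full texts.
        out.extend(difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile="old/" + path,
            tofile="new/" + path,
        ))
    return "".join(out), extracted


if __name__ == "__main__":
    old = {"a.txt": "one\ntwo\n", "b.txt": "same\n"}
    new = {"a.txt": "one\n2\n", "b.txt": "same\n", "c.txt": "new\n"}
    text, extracted = naive_tree_diff(old, new)
    print(text)
    print("raw text extracted:", extracted)
```

In this toy model the work in steps 2 and 3 is proportional to the size of the changed files on both sides, which is consistent with the estimate that an 8MB diff implies 16+MB of raw text to extract and compare.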

Revision history for this message
Robert Collins (lifeless) wrote :

On Tue, 2009-07-28 at 02:49 +0000, John A Meinel wrote:
>
> My immediate guess is that we don't have an optimized path from WT =>
> arbitrary rev. So we have to build 2 inventories and status them, and
> *then* have to extract some XXMB of raw text out and then run diff
> against all of that.
>
> If the *diff* is 8MB, that would make the raw texts probably 16+MB,
> which probably takes a while to extract the right versions, and it
> would be a *lot* of data to diff.

This is a remaining area of optimisation for 2a - the plan is to do
delta composition: compose the (basis, tree) delta with the
(arbitrary, basis) delta.

-Rob
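
A minimal sketch of what delta composition could look like, under stated assumptions: a delta is modelled as a `{path: (old_id, new_id)}` dict, where the ids are opaque content identifiers and `None` means the file is absent on that side. `compose_deltas` is a hypothetical helper, not bzrlib code, and renames are ignored for simplicity.

```python
def compose_deltas(first, second):
    """Compose delta `first` (A -> B) with delta `second` (B -> C) into A -> C.

    Each delta maps path -> (old_id, new_id); None means "not present".
    Hypothetical model only: renames and directories are not handled.
    """
    result = {}
    for path in set(first) | set(second):
        if path in first and path in second:
            old, mid = first[path]
            mid_again, new = second[path]
            if mid != mid_again:
                raise ValueError("deltas do not share a middle state: %s" % path)
        elif path in first:
            old, new = first[path]
        else:
            old, new = second[path]
        # Changes that cancel out (A == C) drop out of the composed delta.
        if old != new:
            result[path] = (old, new)
    return result


if __name__ == "__main__":
    arbitrary_to_basis = {"a": ("a1", "a2"), "b": (None, "b1"), "c": ("c1", "c2")}
    basis_to_tree = {"a": ("a2", "a3"), "c": ("c2", "c1"), "d": ("d1", None)}
    print(compose_deltas(arbitrary_to_basis, basis_to_tree))
```

The appeal of this approach, as far as the sketch goes, is that only paths mentioned in either delta are touched, so the two full inventories never need to be built and compared.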

Revision history for this message
Paul Sladen (sladen) wrote :

John, yeah, there's ~22MB of uncompressed files in the source tree.

Jelmer Vernooij (jelmer)
tags: added: 2a diff performance
Jelmer Vernooij (jelmer)
tags: added: check-for-breezy