annotate performance on development-rich--root

Bug #374726 reported by Robert Collins on 2009-05-11
This bug affects 1 person
Affects: Bazaar
Importance: High
Assigned to: Vincent Ladeuil

Bug Description

Annotate in the new format may have become slower. We need to measure this and, if it is slower, analyse it and either fix it or decide not to.


Changed in bzr:
importance: Undecided → High
status: New → Triaged
Changed in bzr:
assignee: nobody → Vincent Ladeuil (vila)
Martin Pool (mbp) wrote :

needed for brisbane-core; at least we need to know what the performance is and whether it's acceptable

Changed in bzr:
milestone: none → 2.0
Vincent Ladeuil (vila) wrote :

First results: the performance is worse.

It can be acceptably worse (~15%) or dramatically worse (x20).

An aggravating factor is the number of pack files. Issuing a bzr pack brings that down to x3.

Looking closer with --lsprof reveals an explosion in the number of calls to
BTreeGraphIndex._get_nodes() from ~3500 to 300.000 in the 3x case.
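
As a rough illustration of how a call-count explosion like this can be measured, here is a minimal sketch using the standard-library cProfile module as a stand-in for --lsprof. The functions get_nodes and annotate_like_workload are hypothetical toys, not bzrlib code; the only point is that the same lookups issued in smaller batches show up as many more calls in the profile.

```python
import cProfile
import pstats


def get_nodes(index, keys):
    """Hypothetical stand-in for an index lookup; not bzrlib code."""
    return [index[k] for k in keys if k in index]


def annotate_like_workload(index, keys, batch_size):
    """Look keys up in batches; smaller batches mean more get_nodes calls."""
    found = []
    for start in range(0, len(keys), batch_size):
        found.extend(get_nodes(index, keys[start:start + batch_size]))
    return found


def count_calls(func_name, workload, *args):
    """Profile workload(*args) and return how often func_name was called."""
    profiler = cProfile.Profile()
    profiler.runcall(workload, *args)
    stats = pstats.Stats(profiler).stats
    # Each value is (primitive calls, total calls, tottime, cumtime, callers).
    return sum(nc for (_file, _line, name), (_cc, nc, _tt, _ct, _callers)
               in stats.items() if name == func_name)
```

Comparing count_calls("get_nodes", annotate_like_workload, index, keys, 1) with the same call at a large batch size gives the kind of before/after call count quoted above, without saying anything about where the extra calls come from in bzr itself.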

I've yet to understand whether it comes from chk node access (doubtful) or is groupcompress related (likely). If the latter, we may need to tweak some cache for annotate.

Any feedback welcome in the meantime :)

John A Meinel (jameinel) wrote :

So was the source you were using completely unpacked, which would mean the grouping was far from optimal? Or was it a recent conversion, which would mean the grouping was poor but not terrible? Etc.

If this is the mysql source, a plain conversion will have 5 10k packs and 6 1k packs. The more interesting distribution is closer to one 56k-item pack, five 100-item packs, five 10-item packs, and then some small ones, because honestly, post conversion we should pack. From then on, someone will either be doing a single big fetch or slowly committing new ones.

I highly expect the distribution to be skewed from what would happen from a simple "bzr commit 50k times".

I *am* surprised if the 'much much worse' is truly because of BTree code, and not because of much more expensive PatienceDiff calls. Are you running on files like sql/sql_parse.cc? ISTR that disabling the knit fast path was 60x slower for that file.

If the problem *is* chk nodes, something is wrong. As 'bzr annotate foo' shouldn't touch the CHK nodes *at all*.

Now, there also is a factor that GC doesn't pre-fetch multiple blocks at a time. So if you have tons of small groups, rather than a properly packed grouping, I would expect a lot more index hits.
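
A toy model of that last point, assuming a reader that keeps only the most recently fetched block and reads texts in storage order (both are simplifying assumptions, and count_block_fetches is a hypothetical helper, not anything in bzrlib):

```python
def count_block_fetches(text_keys, group_size):
    """Count block fetches when only the most recently read block is kept.

    Texts are assumed to be stored in order, group_size texts per block.
    """
    fetches = 0
    current_block = None
    for position, _key in enumerate(text_keys):
        block = position // group_size
        if block != current_block:
            # The text lives in a block we are not holding: fetch it.
            fetches += 1
            current_block = block
    return fetches
```

With one text per group every read is a fetch, while larger groups amortise the fetch over many texts. Real access patterns are not sequential, so this only shows the direction of the effect, not its size.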

I don't quite understand:
 "Issuing a bzr pack brings that down to x3."
Does that bring the x20 down to x3?
And:
 "BTreeGraphIndex._get_nodes() from ~3500 to 300.000 in the 3x case"
I assume that is 300k requests down to 3.5k requests when it brings the time down from x20 to x3.

So *if* the overhead is in BTree, then it is definitely something worth spending time looking at, as I certainly thought the problem would be in the diff code, not the text extraction code.

By the way, the file that seems to have the most history is 'sql/mysqld.cc', which by my quick check has 3719 revisions, but others have just about as many:
[(2060, 'sp1f-sql_table.cc-1
 (2155, 'sp1f-mysqltestrun.pl-2004
 (2229, 'sp1f-manual.texi-19
 (2397, 'sp1f-ha_ndbcluster.cc-200
 (2655, 'sp1f-mysql_priv.h-1
 (2840, 'sp1f-sql_yacc.yy-19
 (2860, 'sp1f-configure.in-1
 (3355, 'sp1f-sql_select.cc-
 (3683, 'sp1f-sql_parse.cc-1
 (3719, 'sp1f-mysqld.cc-1970
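
How the listing above was produced is not stated. One plausible way to get a similar result, assuming the input is a collection of (file_id, revision_id) text keys, is sketched below; most_revised_files is a hypothetical helper.

```python
from collections import Counter


def most_revised_files(text_keys, top=10):
    """Return the `top` (count, file_id) pairs with the most revisions.

    text_keys is assumed to be an iterable of (file_id, revision_id) pairs.
    """
    counts = Counter(file_id for file_id, _revision_id in text_keys)
    # Ascending by count, so the busiest file comes last, as in the listing.
    return sorted((n, file_id) for file_id, n in counts.items())[-top:]
```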

If the solution is simply 'bzr pack' in the 'I just converted' case, I'm fine with that. As long as you don't have to pack when you have the 1x56k + 2x1000 + 5x100 + 5x10 + 5x1 case. The latter certainly being a lot harder to simulate. In the past, I think I approximated it by doing a conversion, and then pulling the various branches into another repository. So you got 5.0, then 5.1, then 5.1-ndb, then 6.0, then 6.0-ndb, etc. It isn't perfect, but it at least gives you *some* distribution.

You could also play some tricks with selecting revisions from 'bzr ancestry' and branching them across at appropriate times. (In general, the important thing is that real-world copy code means we are likely to have 1x50k rather than 5x10k, even though a plain conversion generates 5x10k.)
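
To make the two pack distributions concrete, here is a small arithmetic sketch. The (packs, items per pack) figures are the ones given above; the summarize helper and the variable names are made up for illustration.

```python
def summarize(distribution):
    """distribution: list of (number_of_packs, items_per_pack) pairs.

    Returns (total packs, total items, share of items in the largest pack).
    """
    packs = sum(count for count, _size in distribution)
    items = sum(count * size for count, size in distribution)
    largest = max(size for _count, size in distribution)
    return packs, items, largest / items


# 5 10k packs and 6 1k packs, as produced by a plain conversion.
plain_conversion = [(5, 10000), (6, 1000)]
# The 1x56k + 2x1000 + 5x100 + 5x10 + 5x1 case.
packed_then_used = [(1, 56000), (2, 1000), (5, 100), (5, 10), (5, 1)]
```

In the plain conversion the largest pack holds less than a fifth of the items, whereas in the second case a single pack holds more than 95% of them, even though that case has more pack files overall.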

Vincent Ladeuil (vila) wrote :

The associated branch fixes a wtf related to gc where we weren't properly reusing the revision graph.
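
The branch itself is not shown here. Purely as an illustration of the general idea of reusing an already-loaded graph rather than asking the index again, a minimal sketch with hypothetical classes (CountingIndex and CachedGraph are not bzrlib names) might look like this:

```python
class CountingIndex:
    """Hypothetical backend that counts how often it is queried."""

    def __init__(self, parent_map):
        self._parent_map = parent_map
        self.calls = 0

    def get_parent_map(self, keys):
        self.calls += 1
        return {k: self._parent_map[k] for k in keys if k in self._parent_map}


class CachedGraph:
    """Remember parents already fetched so repeated queries skip the index."""

    def __init__(self, index):
        self._index = index
        self._cache = {}

    def get_parent_map(self, keys):
        missing = [k for k in keys if k not in self._cache]
        if missing:
            # Only keys we have never seen go to the backend.
            self._cache.update(self._index.get_parent_map(missing))
        return {k: self._cache[k] for k in keys if k in self._cache}
```

Asking a CachedGraph for the same ancestry several times hits the backend once; asking the CountingIndex directly hits it every time.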

Changed in bzr:
status: Triaged → Fix Committed
Vincent Ladeuil (vila) wrote :

Marking in progress again for the next target: revno processing for the whole ancestry.

Changed in bzr:
status: Fix Committed → In Progress

John A Meinel (jameinel) wrote :

Vincent Ladeuil wrote:
> Marking in progress again for the next target: revno processing for the
> whole ancestry.
>
> ** Changed in: bzr
> Status: Fix Committed => In Progress
>

I would probably prefer to open a new bug, since the core of this one is
fixed, but either way, I guess.

John
=:->


Robert Collins (lifeless) wrote :

Vila, is this still a blocker for 2.0? It's been a week since you put this back to in-progress, but AIUI the main issue is fixed?

Vincent Ladeuil (vila) wrote :

This isn't a blocker anymore. As John said, the core problem reported here has been fixed.

I'll file a specific bug for the revno processing.

Vincent Ladeuil (vila) wrote :

Really 1.17, but that milestone is not available anymore (why? 1.16 is still available).

Changed in bzr:
milestone: 2.0 → 1.18rc1
status: In Progress → Fix Released