calculating testament sha1 is very slow

Bug #755642 reported by Jelmer Vernooij
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
Confirmed
Medium
Unassigned

Bug Description

bzr-svn and bzr-git have started using testament sha's as ways to verify that revisions have roundtripped correctly. Unfortunately this process is very slow (taking up 90% of the time when fetching roundtripped revisions).

Martin Pool (mbp)
tags: added: performance
Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 755642] Re: calculating testament sha1 is very slow

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 4/10/2011 1:28 AM, Martin Pool wrote:
> ** Tags added: performance
>

Testament sha1 is currently designed to be an O(tree) validator. So we
have to load all the inventory and generate it.

The only way to have it ultimately scale is if we could only compute a
subset of the whole testament (tree based). However, the main problem is
that it becomes highly subject to how you encode the tree. And we've
tried to have Testament be decoupled from any given repository
implementation.

I see 2 options

1) Tie it to the current tree-based CHK repository implementation.
2) Create a new (semi-arbitrary) layout, but make the intermediate nodes
'cacheable' and able to be looked up based on some easy-to-compute factor.

I'm not sure what an "easy-to-compute" factor would be. Arguably it
could be the tree shape, but I don't know how you'd be able to tell that
nothing in a subtree has been modified without actually recursing into
that tree (at least with our current design).

That said, there could be some 'cheap' wins, but it would still be an
O(tree) operation.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2hs34ACgkQJdeBCYSNAAMngACfZ3HLbfZxD5JZRXAsWz4xm8Xo
GYIAnRLUGUc4LZacFxsVFTYD+wUEwq5p
=5M/I
-----END PGP SIGNATURE-----

Revision history for this message
Jelmer Vernooij (jelmer) wrote : Re: [Bug 755642] [NEW] calculating testament sha1 is very slow

On Sat, 2011-04-09 at 16:37 +0000, Jelmer Vernooij wrote:
> Public bug reported:
>
> bzr-svn and bzr-git have started using testament sha's as ways to verify
> that revisions have roundtripped correctly. Unfortunately this process
> is very slow (taking up 90% of the time when fetching roundtripped
> revisions).

This is also relevant for build from branch into primary, where we were
looking at using GPG signed testaments as a way of verifying the
authenticity of a revision.

Cheers,

Jelmer

Revision history for this message
Jelmer Vernooij (jelmer) wrote : Re: [Bug 755642] Re: calculating testament sha1 is very slow

On Sun, 2011-04-10 at 13:41 +0000, John A Meinel wrote:
> On 4/10/2011 1:28 AM, Martin Pool wrote:
> > ** Tags added: performance
> >
>
> Testament sha1 is currently designed to be an O(tree) validator. So we
> have to load all the inventory and generate it.
>
> The only way to have it ultimately scale is if we could only compute a
> subset of the whole testament (tree based). However, the main problem is
> that it becomes highly subject to how you encode the tree. And we've
> tried to have Testament be decoupled from any given repository
> implementation.
>
> I see 2 options
>
> 1) Tie it to the current tree-based CHK repository implementation.
> 2) Create a new (semi-arbitrary) layout, but make the intermediate nodes
> 'cacheable' and able to be looked up based on some easy-to-compute factor

Revision history for this message
Martin Pool (mbp) wrote :

I think (1) is probably a reasonable way to go: a validator that
changes only when the tree format changes will be stable enough to be
useful. We'll have to check firstly that it actually is stable across
everything but a format upgrade, and secondly that it does include
every important aspect of the tree.

Martin

Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 755642] [NEW] calculating testament sha1 is very slow

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 4/10/2011 8:48 PM, Jelmer Vernooij wrote:
> On Sat, 2011-04-09 at 16:37 +0000, Jelmer Vernooij wrote:
>> Public bug reported:
>>
>> bzr-svn and bzr-git have started using testament sha's as ways to verify
>> that revisions have roundtripped correctly. Unfortunately this process
>> is very slow (taking up 90% of the time when fetching roundtripped
>> revisions).
>
> This is also relevant for build from branch into primary, where we were
> looking at using GPG signed testaments as a way of verifying the
> authenticity of a revision.
>
> Cheers,
>
> Jelmer
>

Except aren't you only caring about a single revision there? It takes
maybe a second or 2 to compute a testament (I could see a bit more for
large trees, but my iter_entries_by_dir() changes either helps a ton, or
can be tweaked to help.)

I don't think this would be a significant fraction of the time spent in
BuildFromBranch.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk2iux4ACgkQJdeBCYSNAANf2QCfVD8JvxnjvXd8ziKX7RiUVJgu
F5gAnjEE4fltOBK/Tbvd2d0H5WOTAjj6
=Xo7U
-----END PGP SIGNATURE-----

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

On Mon, 2011-04-11 at 08:26 +0000, John A Meinel wrote:
> On 4/10/2011 8:48 PM, Jelmer Vernooij wrote:
> > On Sat, 2011-04-09 at 16:37 +0000, Jelmer Vernooij wrote:
> >> Public bug reported:
> >>
> >> bzr-svn and bzr-git have started using testament sha's as ways to verify
> >> that revisions have roundtripped correctly. Unfortunately this process
> >> is very slow (taking up 90% of the time when fetching roundtripped
> >> revisions).
> >
> > This is also relevant for build from branch into primary, where we were
> > looking at using GPG signed testaments as a way of verifying the
> > authenticity of a revision.
> Except aren't you only caring about a single revision there? It takes
> maybe a second or 2 to compute a testament (I could see a bit more for
> large trees, but my iter_entries_by_dir() changes either helps a ton, or
> can be tweaked to help.)
>
> I don't think this would be a significant fraction of the time spent in
> BuildFromBranch.
No, but if we're changing the testament format then we should probably
make sure to do it before we start using the old format for bfbip.

Cheers,

Jelmer

Jelmer Vernooij (jelmer)
tags: added: check-for-breezy
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.