Comment 4 for bug 562666

John A Meinel (jameinel) wrote : Re: [Bug 562666] Re: 2a fetch is very cpu intensive

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

William Grant wrote:
> A checkout of the above imported kernel over my 10Mbps high-latency
> Internet connection took a little under four hours. Even a local branch
> is slow -- an lsprof'd run (output attached) took 96 minutes, while a
> non-lsprof'd one looks like it will take around an hour (it's not done
> yet, though).
>
> ** Attachment added: "lsprof of local kernel branch"
> http://launchpadlibrarian.net/44115461/lsprof
>

Note that lsprof is showing that the bulk of the time is spent
processing the inventory pages, looking for (file_id, revision_id) keys
that need to be fetched.

Specifically, it is claiming that of the 5.8k seconds needed to run the
whole branch operation, 4.2k seconds are in _filter_text_keys, with 1.7k
seconds spent deserialising the pages to pull out the text keys.

There seem to be 1.4M pages, which trigger 110M text key checks.
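
As a rough back-of-the-envelope check on those figures (a sketch only:
the inputs are the rounded numbers quoted above, the variable names are
made up for illustration, and the per-check cost assumes all of the time
in _filter_text_keys is spread evenly over the checks):

```python
# Rounded figures taken from the lsprof summary quoted above.
total_seconds = 5.8e3      # whole branch operation
filter_seconds = 4.2e3     # cumulative time in _filter_text_keys
pages = 1.4e6              # inventory pages processed
key_checks = 110e6         # text key checks triggered

# Average number of text keys examined per inventory page.
keys_per_page = key_checks / pages

# Share of the total run time spent filtering text keys.
filter_fraction = filter_seconds / total_seconds

# Average cost of one key check, in microseconds (assumption: all of the
# filter time is attributed evenly to the individual checks).
microseconds_per_check = filter_seconds / key_checks * 1e6

print("keys per page: %.1f" % keys_per_page)
print("share of run time: %.0f%%" % (filter_fraction * 100))
print("microseconds per check: %.1f" % microseconds_per_check)
```

That works out to a bit under 80 keys per page, with a little over 70%
of the run spent in the filter.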

Some of that time we can't avoid. However, I'm also guessing that a huge
part of it is the overhead of hitting so many text keys that the gc
overhead becomes large (an expense partially caused by fetching
everything at once).

I don't know that there is a lot of tuning we can do here, but there is
probably some.

The other issue is that a lot of the keys are probably duplicates, so we
end up spending a fair amount of time determining that.

One option might be to store a 'text_keys'-like set with the serialized
form, and have a quicker check to see whether the serialized form has
already been read, thus avoiding deserializing it again...
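
A minimal sketch of that idea, not the actual bzrlib code: the page
format, parse_text_keys and TextKeyCollector are all hypothetical names
invented for illustration. It remembers a digest of each serialized page
it has seen, so a repeated page is skipped without being deserialized:

```python
import hashlib


def parse_text_keys(page_bytes):
    """Hypothetical stand-in for deserialising an inventory page.

    Assumes a toy format: one "file_id revision_id" pair per line.
    """
    keys = set()
    for line in page_bytes.splitlines():
        file_id, revision_id = line.split(b" ", 1)
        keys.add((file_id, revision_id))
    return keys


class TextKeyCollector(object):
    """Collect (file_id, revision_id) keys, parsing each distinct page once."""

    def __init__(self):
        self._seen_pages = set()   # digests of serialized pages already read
        self.text_keys = set()     # union of the text keys found so far
        self.pages_parsed = 0      # how many pages were actually deserialized

    def add_page(self, page_bytes):
        digest = hashlib.sha1(page_bytes).digest()
        if digest in self._seen_pages:
            # Same serialized form as before: its keys are already in
            # self.text_keys, so skip the expensive deserialization.
            return
        self._seen_pages.add(digest)
        self.pages_parsed += 1
        self.text_keys.update(parse_text_keys(page_bytes))


collector = TextKeyCollector()
for page in [b"f1 r1\nf2 r1", b"f1 r1\nf2 r1", b"f1 r2"]:
    collector.add_page(page)
print(collector.pages_parsed, len(collector.text_keys))
```

If the pages are already addressed by a content hash, that existing key
could presumably be used in place of recomputing a digest here.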

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvFUMEACgkQJdeBCYSNAAOL+ACgywYy7226U/sPNWh0NGONvWu1
9zAAn3elUU3gf0Mi+eCrSHCsV8XO62/X
=kzgY
-----END PGP SIGNATURE-----