many get_parent_map calls during walk to common revisions when branching into empty repository

Bug #388269 reported by Michael Hudson-Doyle
This bug affects 7 people
Affects Status Importance Assigned to Milestone
Fix Released
John A Meinel
John A Meinel

Bug Description

Copying data when a long history has to be walked to determine what to copy is slow. This commonly shows up when copying into a shared repository created with 'bzr init-repo'. Over a dumb protocol btree-caching kicks in and the random layout of graph data vs index order means that it is appreciably faster for this part of the fetch operation.

See bug 435048 for more analysis

Related branches

Martin Pool (mbp)
Changed in bzr:
importance: Undecided → High
status: New → Confirmed
status: Confirmed → New
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

bzr-svn also has this problem inside "bzr svn-import" .

Changed in bzr:
status: New → Triaged
description: updated
tags: added: hpss
Martin Pool (mbp)
summary: - getting a branch into an empty shared repository takes a long time to
- figure out revisions to send
+ many get_parent_map calls during walk to common revisions when branching
+ into empty repository
Changed in bzr:
status: Triaged → Confirmed
description: updated
tags: added: fetch performance
Martin Pool (mbp)
tags: added: affects-linaro
John A Meinel (jameinel)
Changed in bzr:
assignee: nobody → John A Meinel (jameinel)
Revision history for this message
John A Meinel (jameinel) wrote :
Download full text (3.8 KiB)

A little bit of background for the gcc-linaro case. In bzr-2.4 we start requesting all the tagged revisions as heads, though this turns out not to help as much as we might like.
The first request is for 2690 tips, of which only 64 are found. (lots of tagged revs that aren't in the history).
We then get enough data back in that one request to step 4 times (64-78 revs). The next request is for 84 tips, which gives us back 16k compressed data, enough to make 32 more steps (at about 80 each time).

Seems pretty good so far.

The number of search tips starts dropping pretty rapidly after that point, though we seem to still do a decent job at filling in extra history. Overall I only see 52 Repository.get_parent_map RPC calls, though while walking the graph we make 71,595 steps. We are walking 10-at-a-time from step 1300, and one-at-a-time from step 6534 on. So there are a *lot* of steps that we do one-by-one on the graph. Which makes some sense, since gcc's history gets pretty linear.

I'm still trying to sort out where the time is spent, given how few RPCs we seem to actually be making. With my extra debugging, it is taking 335s (5.6min) to find all the revisions we want to transmit.

One bit that showed up as a point of concern is the _get_parent_map_rpc code. Which always walks the entire cached dict to formulate the next query. Which looks like it is going to be O(N^2) in nature. Since by the end of all this walking, we'll have something like 100k entries in the cache. However, lsprof shows that *by far* the bulk of the time is in the actual RPC.

I think, though, that the logic about start and stop revisions is causing us harm. Specifically, this is the time spent in each get_parent_map rpc call:
2.3, 0.7, 0.6, 0.8, 1.0, 0.8, 1.0, 1.3, 1.4, 1.8, 1.5, 2.1, 2.0, 3.1, 3.2, 3.7, 2.9, 2.8, 4.2, 4.9, 6.1, 5.0, 5.5, 5.3, 5.5, 5.5, 5.8, 5.9, 6.4, 6.2, 7.0, 6.6, 7.1, 7.0, 7.5, 7.4, 7.7, 8.0, 8.2, 8.2, 8.6, 8.9, 9.1, 8.9, 9.1, 9.4, 9.5, 11.6, 11.1, 10.0, 10.4, 13.8

If you graph that, it follows an almost completely straight line (R^2 = 0.95), with each request taking 0.2s longer than the previous one.

I'll need to profile the server side and see what might be causing this.

I have another .bzr.log where I was logging the call sizes. And I noticed that "key_count" gets really large for the recipe.

The final request looks like this:
327.535 Doing _get_parent_map_rpc, 110260 cache size, 5 start_set, 110256 result_parents, 1 stop_keys, 110255 included_keys, 1 keys
327.535 keys: set(['svn-v4:138bc75d-0d04-0410-961f-82ee72b054a4:trunk:1152']) body: 'svn-v4:138bc75d-0d04-0410-961f-82ee72b054a4:branches/gcc-4_6-branch:175472 ...\n110260'
327.535 hpss call w/body: 'Repository.get_parent_map', '+branch/gcc-linaro/', 'include-missing:', 'svn-v4:138bc75d-0d04-0410-961f-82ee72b054a4:trunk:1152' ('svn-v4:138bc75d-0d04'...)
327.535 378 bytes
337.595 result: ('ok',)
337.595 2526 body bytes read

So that is only 5 start keys, but 110256 keys in the search set. Does that mean we are starting the search over the *entire ancestry* from scratch on each request.... ugh. This does not seem to be the most efficient way to do a get...


Revision history for this message
John A Meinel (jameinel) wrote :

So to be clear, I think the slowdown is in

SmartServerRepositoryGetParentMap and the fact that it is doing:
        search_result, error = self.recreate_search_from_recipe(
            repository, body_lines)

Which is causing us to walk 100,000 revisions to find out what the next 1000 revisions are.

Also note, that by the time we get to even 20k revisions searched, we are pretty much stepping one-by-one. Now, these 100k revs *are* used on the server to make sure that we don't return parents that we already know about. However, I'm pretty sure that while this is probably good when N searched is small, it is pretty terrible when N searched is big.

Revision history for this message
John A Meinel (jameinel) wrote :

My current thought is this. We are trying to tell the server about the revisions we've already seen, so it doesn't send us those again.

However, locality says that if I'm currently looking at X, then really only revisions 'near' X are going to be encountered again. This isn't strictly true, because of history shortcutting (daggy fixes, in particular, mean that you'll skip a lot of history in one half of the search, which you'll run into again after a long time in the other half of the search.)
While it isn't strictly true, it could certainly limit the effort tremendously.

Having greatest-distance-from-origin (generation count), would be another way to limit this. Instead of walking all parents equally, you'd favor parents that have higher generation number, presuming that parents with low generation number are skipping all of the interesting revisions. You could do a little bit of both, but you could come up with better batches this way. (So if you had 100 parents to search, and 2 of them are at GDFO 1000, and the other 98 of them are at GDFO 10000, you can skip searching the 2 low ones for now.)

Anyway, I'm going to try something that just generates a fake 'local' search, and see how that performs.

Revision history for this message
John A Meinel (jameinel) wrote :

In the associated branch, I use this logic:

If the total walked-revisions exceeds 10,000, start at the current search tips, find their immediate children, and set that as the search graph.

I can easily extend this to something that walks the child graph until we hit something like 1k revs, or something else interesting-but-not-obscene.

In doing so, the time for initial discovery drops (against launchpad) from 5.5min down to 1m15s.

I still need a fair amount of polish and testing on the changes, but I think it shows promise.

Revision history for this message
John A Meinel (jameinel) wrote :

I believe the goal of this 'replicate my search' is so that stuff like history shortcuts don't end up asking the server to walk the same history twice. For example:

 B C
 | |
 D |
 | |
 E |
 | |
 F |

if we start walking at G, we'll get to A relatively quickly. When the other side catches up to finding "B", it knows that it doesn't have to keep walking and giving information about A.

However, the *client* is also walking that same information, and will not make requests for A. So the benefit is only in avoiding what would be walked by a single RPC.

Specifically, in the above search, we would benefit telling the server that we've seen A, if the server was at say D, and then it could tell us about B, and then "skip to the end" of the A line.

From what I can tell, this is only about efficiency. If the server sends us back information about A, we'll just either ignore it, or put it into the cache replacing what we already have there.

We save a few bytes-on-the-wire not re-transmitting data we've sent. We lose work on the server because it still has to walk all of those revisions to find out where the client has walked. (And it does the work in request N that it did for all the requests before it.)

The client can't know exactly what the server will walk again, because it hasn't seen B yet. The client does know about N and A.

The client could approximate how many revisions the server will walk. The server-side code is to walk until we have 250,000 bytes of revision-ids to send. At average sizes, that is about 4,000 revisions.

So at best, sending a full search will save the server from sending us 4k revs that we already know about, which happen to be parents of a current search tip, that we already have searched via an older search tip. And also, instead of sending us the 4k we already know about, the server will send us the 4k that we are actually interested in (on the other side of that chain.)

I'm guessing that having us send a search that matches the most recent 1000 or less revisions would give us most of the same benefit, and avoid the O(N^2) behavior.

I'm going to try to think deeply about the involved graphs, because I think we can do something nice here, that will get us 99% of the benefit, without doing a lot of redundant work (by the client or the server).

Revision history for this message
John A Meinel (jameinel) wrote :

The simple 1-revision search method does show more transfer for bzr, though;
23.269 Transferred: 2550kB (110.9kB/s r:1244kB w:1306kB)
15.020 Transferred: 3801kB (258.3kB/s r:1917kB w:1884kB)

That is the amount transferred before actual data streaming. Notice that it is quite a bit faster, but this is over my loopback.

The same requests to lp:bzr gives:
47.896 Transferred: 2545kB (54.5kB/s r:1242kB w:1303kB)
50.192 Transferred: 3796kB (78.0kB/s r:1915kB w:1881kB)

Note than now the extra data transfer actually makes them equally fast. Though I think we can do something smarter than just a single skipping revision. And certainly, *some* sort of limiter to avoid N^2 makes a lot of sense.

John A Meinel (jameinel)
Changed in bzr:
status: Confirmed → In Progress
Revision history for this message
Martin Pool (mbp) wrote :

So, using John's current branch (r6035) of lp:~jameinel/bzr/2.4-too-much-walking-388269, and branching lp:gcc-linaro/4.6 into a new empty repository (as reported by michaelh in, in Australia with roughly a 500ms rtt.

After 1600s, it's still doing get_parent_map calls, and the requests are now each only about 113b.

Revision history for this message
Martin Pool (mbp) wrote :

it's still running for me after 5400s so unfortunately I think this patch is not enough.

Revision history for this message
John A Meinel (jameinel) wrote :

We looked a bit at it, and I think there 2 things going on with the bug.

1) The bit I fixed, which is with a single branch, walking most of its ancestry whenever we ask for more data from the remote server.

2) With a *stacked* configuration. We end up asking the stacked branch "do you have X" it replies "no", then we go to the fallback repository and ask "do you have X", and it replies "yes, X=>Y, Y=>Z, Z=>A". Then we step to Y, and *ask the stacked repository* "Do you have Y", and it says "No", so then we go to the fallback repository and find it in the cache, going to Z. etc.

So basically, we have a read-ahead cache but the 'cache' only exists for the fallback, and we only ask the fallback after we've asked the stacked repository if it knows anything.

The fix basically seems that we just need to expose the fallback cache up to a higher level. Something along the lines of:

a) Stacked: Do you have X in your cache? (No)
b) Fallbacks: Do you have X in your cache? (No)
c) Stacked: RemoteCall Do you have X? (No)
d) Fallbacks: RemoteCall Do you have X? (Yes, and here is more stuff for the cache)

Or combine the cache somehow. Or notice that when we found X it was from the fallback repository, so search for Y there first (locality).

Revision history for this message
Martin Pool (mbp) wrote :

John's branch <> for me gives decent performance branching from lp:gcc-linaro into a new repository: after about 90s it is done looking at the ancestry and then it starts downloading data at roughly wire speed.

However, branching from a stacked branch is not helped by this - we could either split off a follow on bug or leave this open.

In the short term perhaps we should unstack the key linaro series branches; that will probably help them a bit in other ways anyhow.

Revision history for this message
Martin Pool (mbp) wrote :

OK, and branching trunk using John's variant completed in 27m, which is about 324kB/s on average, and 63 round trips. So, it could be better still, but it's at least the right order of magnitude for ssh bandwidth to London from here.

Revision history for this message
John A Meinel (jameinel) wrote :

Vincent and I chatted about this, and I think we've settled that we need a new method on "ParentsProvider", which exposes the fact that some data may be cached, while other data may require an expensive call. This makes it a little bit risky for a stable series, but not unbearably so. I'll plan on progressing by starting with a 2.4 branch, land it in, smoke test it for a while, and then we can decide if we want to backport it.

Note that I think the patch I have already put together for this branch is still useful, it is orthogonal to the next patch.

Revision history for this message
John A Meinel (jameinel) wrote :

landed in trunk

Changed in bzr:
milestone: none → 2.5b1
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.