Stream fetch fails "insert_from_broken_repo" test

Bug #389141 reported by John A Meinel on 2009-06-18
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Robert Collins

Bug Description

Hash: SHA1

So in trying to land my PackStreamSource patch, I ran into an issue with
streaming fetch, in general. Namely the test:

 bzr selftest -s bt.test_repository test_insert_from_broken_repo

Now, #1 this is only a test of a single format, when it really seems
like it should be a per_repo or per_inter_repo test.

Changing from InterPackRepository to PackStreamSource causes the
exception to not be raised, but as near as I can tell, that is true for
*any* StreamSource (at the moment).

The part that the test is triggering is:

1) Insert a text into the repository (fid, rev1c)
2) Build a new revision rev3 which does *not* reference rev1c, but the
*text* (fid, rev3) claims (fid, rev1c) as a parent.
3) During the insert, we call _check_references, which calls
"_get_external_refs()". This ends up seeing (fid, rev1c) being
referenced, but not present in the repository.

With StreamSink.insert_stream() things change. There are 2 differences:

Namely for missing entries it does:

for prefix, versioned_file in (
                ('texts', self.target_repo.texts),
... ):
                if versioned_file is None:
                missing_keys.update((prefix,) + key for key in

It then does a single "fetch me whatever is missing in the first pass",
and then fails if there is still anything missing.

The issue breaks down in 2 ways:

1) It is somewhat random whether (fid, rev1c) is going to be a
compression parent. It certainly wouldn't for --2a repos, as it stands
it randomly is because the text sizes are 0, so you always get a delta
chain of length 1. (1 fulltext, 1 delta)
Because (fid, rev1c) ends up being a compression parent, it goes ahead
and fetches the text, and then nothing is missing. (sort of good, in
that it fills in what would otherwise be missing, though that missing
entry is a bad reference.)

2) So to make it fail, I tried increasing the length of the bogus
references, adding (fid, rev3) => (fid, rev1c) => (fid, rev1d)

However, the get_stream_for_missing_keys() will write a fulltext for
(fid, rev1c) because that is how we designed it.

Which means that again there are no missing *compression* parents. What
*is* missing is the actual parent text.

Now the check that Packer was doing, would be to actually check
"get_missing_parent_keys()". Though this would break with stacking (but
note that Packer doesn't get used for stacked repos...)

One option would be something that would:
 get missing_compression_parents
     and missing_parents_that_arent_in_fallback

Except with the way that stacking works with Smart requests, you don't
*have* fallbacks that you can query...

So at this point I feel like I'm out of options. I can't write a runtime
missing parent check that will work with Smart + Stacking, and the
'get_stream_for_missing_keys()' will fill in the compression parent.

I've decided to just disable that test for now, and submit this as a bug
report to Launchpad so we don't forget about it. But I would certainly
be interested in any feedback people have about possible solutions.


  affects bzr
  status triaged
  importance medium
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla -


Tags: 2a Edit Tag help

Related branches

Robert Collins (lifeless) wrote :

2a requires that we address this now.

AIUI the problem is solved now; that is we can copy successfully from this sort of corruption.

So why not change to test to handle both:
 - error
 - the text is extractable.

That should cover all bases, and I'm working on that now.

tags: added: 2a
Changed in bzr:
status: Triaged → Fix Released
assignee: nobody → Robert Collins (lifeless)
milestone: none → 2.0
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers