hpss data_stream from pack repository has deltas out of order, fails with KnitCorrupt exception

Bug #164637 reported by Martin Pool
Affects: Bazaar
Status: Fix Released
Importance: Critical
Assigned to: Martin Pool

Bug Description

Pulling from one pack repository into another using hpss can fail like this:

  File "/home/mbp/repo/trunk/bzrlib/workingtree.py", line 1529, in pull
    possible_transports=possible_transports)
  File "/home/mbp/repo/trunk/bzrlib/decorators.py", line 165, in write_locked
    return unbound(self, *args, **kwargs)
  File "/home/mbp/repo/trunk/bzrlib/branch.py", line 1697, in pull
    run_hooks=run_hooks)
  File "/home/mbp/repo/trunk/bzrlib/decorators.py", line 165, in write_locked
    return unbound(self, *args, **kwargs)
  File "/home/mbp/repo/trunk/bzrlib/branch.py", line 1489, in pull
    self.update_revisions(source, stop_revision)
  File "/home/mbp/repo/trunk/bzrlib/decorators.py", line 165, in write_locked
    return unbound(self, *args, **kwargs)
  File "/home/mbp/repo/trunk/bzrlib/branch.py", line 1458, in update_revisions
    self.fetch(other, stop_revision)
  File "/home/mbp/repo/trunk/bzrlib/decorators.py", line 165, in write_locked
    return unbound(self, *args, **kwargs)
  File "/home/mbp/repo/trunk/bzrlib/branch.py", line 286, in fetch
    pb=nested_pb)
  File "/home/mbp/repo/trunk/bzrlib/repository.py", line 862, in fetch
    return inter.fetch(revision_id=revision_id, pb=pb, find_ghosts=find_ghosts)
  File "/home/mbp/repo/trunk/bzrlib/decorators.py", line 165, in write_locked
    return unbound(self, *args, **kwargs)
  File "/home/mbp/repo/trunk/bzrlib/repository.py", line 2528, in fetch
    pb=pb)
  File "/home/mbp/repo/trunk/bzrlib/fetch.py", line 103, in __init__
    self.__fetch()
  File "/home/mbp/repo/trunk/bzrlib/fetch.py", line 132, in __fetch
    self._fetch_everything_for_revisions(revs, pp)
  File "/home/mbp/repo/trunk/bzrlib/fetch.py", line 408, in _fetch_everything_for_revisions
    self.to_repository.insert_data_stream(data_stream)
  File "/home/mbp/repo/trunk/bzrlib/repository.py", line 792, in insert_data_stream
    (format, data_list, StringIO(knit_bytes).read))
  File "/home/mbp/repo/trunk/bzrlib/knit.py", line 759, in insert_data_stream
    (version_id, parents[0]))
KnitCorrupt: Knit text:tar_exporter.py-20051114235828-1f6349a2f090a5d0 corrupt: line-delta in version <email address hidden> from stream references missing parent <email address hidden>

Martin Pool (mbp) wrote :

The problem is that the knit indexes for pack repositories don't return their versions in compression-topo-sorted order. The data stream code assumes that they do, so it streams out-of-order deltas, and the client crashes trying to apply them.
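To illustrate the failure mode, here is a minimal sketch in plain Python. It is not bzrlib code: `compression_parent`, `compression_topo_sort` and `apply_stream` are hypothetical names, and the mapping is assumed to contain every basis it refers to. A receiver that applies line-deltas one at a time rejects a delta whose basis has not arrived yet, so the sender has to emit each basis before the deltas built on it.

```python
def compression_topo_sort(compression_parent):
    """Return version ids ordered so every delta follows its basis.

    compression_parent maps version_id -> basis version_id, or None when
    the version is stored as a fulltext. (Hypothetical structure.)
    """
    ordered = []
    emitted = set()
    for version_id in compression_parent:
        # Walk up the delta chain until we hit a fulltext or a version
        # that is already scheduled, then emit the chain basis-first.
        chain = []
        current = version_id
        while current is not None and current not in emitted:
            emitted.add(current)
            chain.append(current)
            current = compression_parent[current]
        ordered.extend(reversed(chain))
    return ordered


def apply_stream(stream_order, compression_parent):
    """Mimic a client applying deltas in the order they arrive."""
    present = set()
    for version_id in stream_order:
        basis = compression_parent[version_id]
        if basis is not None and basis not in present:
            raise ValueError(
                f"delta {version_id!r} references missing parent {basis!r}")
        present.add(version_id)
    return present


# An index that happens to list versions newest-first.
index = {"rev-3": "rev-2", "rev-1": None, "rev-2": "rev-1"}

# Streaming in raw index order fails on the first delta:
#   apply_stream(list(index), index)  raises ValueError
# Sorting first lets the same client apply everything.
applied = apply_stream(compression_topo_sort(index), index)
```

The sort only trusts the parent relationships, never the order in which the index lists its versions, which is the property the pack indexes do not provide.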

Changed in malone:
assignee: nobody → mbp
importance: Undecided → Critical
status: New → Confirmed
Martin Pool (mbp)
Changed in bzr:
milestone: none → 1.0rc1
Martin Pool (mbp) wrote :

There are get_data_stream methods at two levels: the repository and the versionedfile. (Possibly one of them should be renamed.)

The repository data stream is a sequence of (key, vfdatastream) pairs, where the key says which versionedfile to apply the data to, and vfdatastream is the bytes from that versionedfile's get_data_stream method.
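The two levels can be pictured with a toy model. Everything here is an assumption made for illustration: `ToyVersionedFile`, the tab-separated record format, and the key tuples are invented and do not reflect bzrlib's real classes or wire format. Only the shape, a repository stream of (key, bytes) pairs wrapping per-file streams, comes from the description above.

```python
class ToyVersionedFile:
    """Hypothetical stand-in for a versioned file; not bzrlib's class."""

    def __init__(self, texts=None):
        self.texts = dict(texts or {})

    def get_data_stream(self, version_ids):
        # Serialise the requested versions as "id<TAB>text" lines.
        records = (f"{v}\t{self.texts[v]}" for v in version_ids)
        return "\n".join(records).encode("utf-8")

    def insert_data_stream(self, data):
        for record in data.decode("utf-8").splitlines():
            version_id, text = record.split("\t", 1)
            self.texts[version_id] = text


def get_repository_stream(files, wanted):
    """Yield (key, vf_bytes) for each file holding a wanted version."""
    for key, vf in files.items():
        version_ids = [v for v in vf.texts if v in wanted]
        if version_ids:
            yield key, vf.get_data_stream(version_ids)


def insert_repository_stream(files, stream):
    """Route each per-file byte stream to the file named by its key."""
    for key, data in stream:
        files.setdefault(key, ToyVersionedFile()).insert_data_stream(data)


source = {
    ("file", "a"): ToyVersionedFile({"r1": "hello", "r2": "world"}),
    ("inventory",): ToyVersionedFile({"r1": "inv1"}),
}
target = {}
insert_repository_stream(target, get_repository_stream(source, {"r1"}))
```

In this picture the repository level only routes bytes by key, so the ordering of deltas is entirely the concern of the per-file `get_data_stream`.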

This bug lies in the implementation of the KnitVersionedFile method, which assumes the index returns versions in compression-topo-sorted order, so it seems appropriate to fix and test it there, without reference to the repository.

There are multiple implementations of the knit's internal objects (including access_method and index), but most of the knit tests run only on the default implementation, which corresponds to the older .knit files. This is not necessarily unreasonable, provided we retest the specific parts that are expected to vary per component.
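One way to retest the part that varies is to run the same ordering check against each index implementation. The sketch below uses the standard unittest module with two made-up index classes; `InsertionOrderIndex`, `ArbitraryOrderIndex`, `get_parent` and `stream_order` are hypothetical names, not bzrlib's test infrastructure.

```python
import unittest


class InsertionOrderIndex:
    """Hypothetical index that lists versions in the order they were added."""

    def __init__(self, parents):
        self._parents = dict(parents)

    def get_versions(self):
        return list(self._parents)

    def get_parent(self, version):
        return self._parents[version]


class ArbitraryOrderIndex(InsertionOrderIndex):
    """Hypothetical index whose listing order is unrelated to the deltas."""

    def get_versions(self):
        return sorted(self._parents, reverse=True)


def stream_order(index):
    """Order versions basis-first without trusting get_versions() order."""
    ordered, seen = [], set()
    for version in index.get_versions():
        chain = []
        while version is not None and version not in seen:
            seen.add(version)
            chain.append(version)
            version = index.get_parent(version)
        ordered.extend(reversed(chain))
    return ordered


PARENTS = {"a": None, "b": "a", "c": "b"}


class StreamOrderTests(unittest.TestCase):
    index_classes = (InsertionOrderIndex, ArbitraryOrderIndex)

    def test_basis_precedes_delta(self):
        for index_class in self.index_classes:
            with self.subTest(index=index_class.__name__):
                index = index_class(PARENTS)
                order = stream_order(index)
                position = {v: i for i, v in enumerate(order)}
                self.assertEqual(sorted(order), sorted(PARENTS))
                for version in order:
                    basis = index.get_parent(version)
                    if basis is not None:
                        self.assertLess(position[basis], position[version])
```

The check asserts a property of the output (every basis precedes its delta) rather than one fixed ordering, so the same test applies to any index implementation.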

We should also document on the knit .versions method that the order of its result is not guaranteed.

Changed in bzr:
status: Confirmed → In Progress
Revision history for this message
Martin Pool (mbp) wrote :
Changed in bzr:
status: In Progress → Fix Committed
Changed in bzr:
status: Fix Committed → Fix Released