Packing a stacked branch removes parent inventories (in --2a)

Bug #412198 reported by John A Meinel
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Critical
John A Meinel

Bug Description

As part of investigating bug #402778 and bug #393677 I came across this fact.

The "GCPacker" code explicitly copies a given set of revisions (either by passing in the revision keys, or by looking at source.revisions.keys()) and then it *only* copies inventory texts that match those keys.

This breaks stacking invariants when the stacked-on branch is packed, because it *doesn't* copy across the parent inventories that we so carefully uploaded.

Related branches

Revision history for this message
John A Meinel (jameinel) wrote :

So as I did deeper, it looks like things get worse.

1) As near as I can tell, packing for *all* formats removes basis inventories. See this text in pack_repo.Packer._copy_inventory_texts:
    def _copy_inventory_texts(self):
...
        # select inventory keys
        inv_keys = self._revision_keys # currently the same keyspace, and note that
        # querying for keys here could introduce a bug where an inventory item
        # is missed, so do not change it to query separately without cross
        # checking like the text key check below.

2) We only notice on --2a targets because of this:
        if hints and self.target._format.pack_compresses:
            self.target.pack(hint=hints)

Since only --2a considers "self.target._format.pack_compresses" we only run pack with the hints in the case of 2a repositories.

However, the next autopack or someone running "bzr pack" on a stacked branch *will* cause the loss of parent inventory texts and that *will* cause problem interacting with streaming over bzr+ssh.

summary: - Packing a stacked branch in --2a format removes basis inventories
+ Packing a stacked branch removes parent inventories
Revision history for this message
Robert Collins (lifeless) wrote : Re: Packing a stacked branch removes parent inventories

John - that code uses None to mean 'copy all items'. Remember that Packer /had/ two use cases:
 - fetch by making a pack
 - packing

When doing a pack, self._revision_keys is None, and all inventories are preserved (in knit-style pack repos). When fetching by making a pack, we want to copy only those revisions.

This is essentially dead code now though, as we no longer fetch using packer, do we?

Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 412198] Re: Packing a stacked branch removes parent inventories

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> John - that code uses None to mean 'copy all items'. Remember that Packer /had/ two use cases:
> - fetch by making a pack
> - packing
>
> When doing a pack, self._revision_keys is None, and all inventories are
> preserved (in knit-style pack repos). When fetching by making a pack, we
> want to copy only those revisions.
>
> This is essentially dead code now though, as we no longer fetch using
> packer, do we?
>

We no longer fetch using Packer, correct.
This code *is* used during pack operations (bzr pack and autopack)

I'll admit to not tracing to see for sure that _revision_keys was None,
and you could be right.

I can say that GCPacker does *not* work that way:
    def _copy_revision_texts(self):
        source_vf, target_vf = self._build_vfs('revision', True, False)
        if not self.revision_keys:
            # We are doing a full fetch, aka 'pack'
            self.revision_keys = source_vf.keys()

So self.revision_keys is always filled out.

We *can* change the inventory code to be:
    def _copy_inventory_texts(self):
        source_vf, target_vf = self._build_vfs('inventory', True, True)
        self._copy_stream(source_vf, target_vf, self.revision_keys,
                          'inventories', self._get_filtered_inv_stream, 2)

=>

    def _copy_inventory_texts(self):
        source_vf, target_vf = self._build_vfs('inventory', True, True)
        inv_keys = source_vf.keys()
        self._copy_stream(source_vf, target_vf, inv_keys,
                          'inventories', self._get_filtered_inv_stream, 2)

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkqCItgACgkQJdeBCYSNAAMOogCg0PpOfiFBRh5o7VERYVni1kYv
MXwAoMZaI0m0BZ2ocGkfszd3jUP9DByn
=Z2fe
-----END PGP SIGNATURE-----

Revision history for this message
Robert Collins (lifeless) wrote :

On Wed, 2009-08-12 at 02:03 +0000, John A Meinel wrote:

> We no longer fetch using Packer, correct.
> This code *is* used during pack operations (bzr pack and autopack)
>
> I'll admit to not tracing to see for sure that _revision_keys was None,
> and you could be right.

Conversely I haven't debugged to be confident its not getting
accidentally set in 1.9 formats. It would make sense to me to remove the
support for revision_keys as we're not using it to fetch anymore (though
perhaps reconcilechecker needs it still..)

-Rob

Revision history for this message
Robert Collins (lifeless) wrote : Re: Packing a stacked branch removes parent inventories

:!./bzr pack
> /home/robertc/source/baz/bzr.dev/bzrlib/repofmt/pack_repo.py(836)_copy_inventory_texts()
-> inv_keys = self._revision_keys # currently the same keyspace, and note that
(Pdb) self._revision_keys
None

So I'm pretty sure this doesn't affect non gc repositories.

John A Meinel (jameinel)
summary: - Packing a stacked branch removes parent inventories
+ Packing a stacked branch removes parent inventories (in --2a)
Changed in bzr:
status: In Progress → Fix Committed
Vincent Ladeuil (vila)
Changed in bzr:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.