Bazaar Version Control System

'bzr push' copies the entire repository if there is a BzrDir but not a Branch

Reported by John A Meinel on 2009-10-30
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
High
John A Meinel
2.0
Undecided
Unassigned
2.3
High
John A Meinel
bzr (Ubuntu)
Undecided
Unassigned
Natty
High
Jelmer Vernooij

Bug Description

I don't know what is specifically wrong, but I don't want to release the next version without getting a handle on this.

I just did some work to set up the release branches for bzr 2.0.2 and bzr 2.1.0b2. I then went to push that work to a staging area on 'babune' because I was at a coffee house and babune is closer to Launchpad than I am. (So a few pushes to a shared repo, and then push from there to launchpad.)

However, as I tried to push it seemed to be pushing *way* too much content, given that the remote repository should have all of bzr.dev.

When I instrumented it, I got:

fetching revisions: 11086
streaming inventories: 11086
streaming chk, excluding: 276
streamed 17737 chk pages
found text keys: 33378

So somehow my small incremental branch is trying to push 11k revisions up to the remote side.

'bzr missing --include-merges --mine ../bzr.dev' claims that I only have 7 new revisions, which is obviously << 11k.

Also, in the '-Dhpss' traceback I see lines like:

12.543 Using fetch logic to copy between CHKInventoryRepository('file:///C:/Users/jameinel/dev/bzr/.bzr/repository/')(<RepositoryFormat2a>) and RemoteRepository(bzr+ssh://<email address hidden>:26662/home/jam/dev/bzr-test/.bzr/)(<RemoteRepositoryFormat>)
12.552 fetch up to rev {None}
12.553 hpss call: 'get', '/home/jam/dev/bzr-test/.bzr/branch-format'
12.553 (to bzr+ssh://<email address hidden>:26662/home/jam/dev/bzr-test/lp/2.1.0b3-dev/)
12.694 result: ('ok',)
12.694 35 body bytes read
12.695 hpss call: 'get', '/home/jam/dev/bzr-test/.bzr/repository/format'
12.695 (to bzr+ssh://<email address hidden>:26662/home/jam/dev/bzr-test/lp/2.1.0b3-dev/)
12.835 result: ('ok',)
12.836 54 body bytes read
12.836 hpss call: 'stat', '/home/jam/dev/bzr-test/.bzr/repository'
12.836 (to bzr+ssh://<email address hidden>:26662/home/jam/dev/bzr-test/lp/2.1.0b3-dev/)
12.976 result: ('stat', '4096', '040755')
12.979 hpss call: 'get', '/home/jam/dev/bzr-test/.bzr/repository/pack-names'
12.979 (to bzr+ssh://<email address hidden>:26662/home/jam/dev/bzr-test/lp/2.1.0b3-dev/)
13.132 result: ('ok',)
13.132 696 body bytes read
13.139 RemoteSSHTransport.readv 3 offsets => 1 coalesced => 1 requests (1)
13.139 hpss call w/readv: 'readv', '/home/jam/dev/bzr-test/.bzr/repository/indices/f82954caa7b2bf4e81d54c0dd172c542.rix'
13.139 7 bytes in readv request
13.325 result: ('readv',)
13.493 10520 body bytes read
13.497 RemoteSSHTransport.readv 2 offsets => 1 coalesced => 1 requests (1)
13.497 hpss call w/readv: 'readv', '/home/jam/dev/bzr-test/.bzr/repository/indices/f82954caa7b2bf4e81d54c0dd172c542.rix'
13.497 9 bytes in readv request
13.682 result: ('readv',)
13.704 6424 body bytes read

Which show that it starts out using a "RemoteRepository" but then seems to quickly "ensure_real" determine the exact remote format, and then probe the remote indexes directly, rather than using the smarter 'get_parent_map' implementations.

I'm sure it is still using GCCHKStreamSource when it is done, because that is the code I instrumented to get the debug values.

I will post more as I figure out more.

John A Meinel (jameinel) wrote :

As near as I can tell, it is trying to copy my entire repository, rather than *just* copying the ancestry of the branch. Poking some more to figure out why.

John A Meinel (jameinel) wrote :

So, I tried pushing again, this time to a *different* directory, and I got:
fetching revisions: 5
streaming inventories: 5
streaming chk, excluding: 2
streamed 19 chk pages
found text keys: 12

Which is what I expected to get the first time. Now, I've had some troubles with these pushes, so the remote 'branch' may not be in a perfect state.

Specifically, it appears that there *is* a .bzr directory, and it has a .bzr/branch-format but it does *not* have a .bzr/branch directory.

So perhaps this can be downgraded to:
  bzr push to existing bzrdir (without branch) copies entire repository to remote

The ensure real should only happen if the server is old: what bzr
version is on babune?

-Rob

Looking at the code, it appears that if the bzrdir exists but there is no branch, but there *is* a repository, we end up in this code path:

        if br_to is None:
            # We have a repository but no branch, copy the revisions, and then
            # create a branch.
            repository_to.fetch(source.repository, revision_id=revision_id)
            br_to = source.clone(self, revision_id=revision_id)
            if source.get_push_location() is None or remember:
                source.set_push_location(br_to.base)
            push_result.stacked_on = None
            push_result.branch_push_result = None
            push_result.old_revno = None
            push_result.old_revid = _mod_revision.NULL_REVISION
            push_result.target_branch = br_to
            push_result.master_branch = None
            push_result.workingtree_updated = False

If we *do* have a branch we use:
                push_result.branch_push_result = source.push(br_to,
                    overwrite, stop_revision=revision_id)

This ends up going through InterBranch.push() which winds around again through '_basic_push' and seems to end up in InterBranch.update_revisions (only now having swapped source and target, so it is target.update_revisions(source)), which seems to indicate that we call
 remote_branch.fetch(local_branch, stop_revision)

(note that I'm quite horribly confused by the double indirection through InterBranch, and the fact that the default .update_revisions does this:
    def update_revisions(self, other, stop_revision=None, overwrite=False,
                         graph=None):
...
        return InterBranch.get(other, self).update_revisions(stop_revision,
            overwrite, graph)

^- Which as near as I can tell makes 'other' the self.source attribute for InterBranch, which looking at update_revisions hints that it would then be fetching "remote => local" which certainly doesn't make any sense.

And if we don't have a bzrdir we use:
            br_to = br_from.create_clone_on_transport(to_transport,
                revision_id=revision_id, stacked_on=stacked_on,
                create_prefix=create_prefix, use_existing_dir=use_existing_dir)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> The ensure real should only happen if the server is old: what bzr
> version is on babune?
>
> -Rob
>

2.0.1 as near as I can tell.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkrrYZIACgkQJdeBCYSNAAOyQQCfbcdgKAlXSOv/G3CuXnsLLh8e
ib0AoJAR9krd/F5BjBhLkX+6DqOqQROb
=q497
-----END PGP SIGNATURE-----

Downgrading from critical because it only triggers in somewhat exotic circumstances. (canceling a push into a shared repo could create the target bzrdir but not the target branch, pushing again will trigger this sort of bug.)
I'd still probably like to see the fix in the 2.0 series and preferably for 2.0.2

Changed in bzr:
importance: Critical → High
John A Meinel (jameinel) on 2009-10-31
summary: - 'bzr push' severely degraded
+ 'bzr push' copies the entire repository if there is a BzrDir but not a
+ Branch
John A Meinel (jameinel) wrote :

I think I understand it. I haven't probed deep enough to be sure but this line:

if br_to is None:
            # We have a repository but no branch, copy the revisions, and then
            # create a branch.
            repository_to.fetch(source.repository, revision_id=revision_id)

Looks like it is doing the right thing, but I think it is subtly wrong. Namely, 'revision_id' is a string only if the user specifies "bzr push -r XXXX", otherwise it is just "None".

This handled in the other code paths with a:

  if stop_revision is None:
    stop_revision = the_branch.last_revision()

But that isn't being done in this code path.

Jelmer Vernooij (jelmer) on 2011-02-01
tags: added: push stackin
tags: added: stacking
removed: stackin
Martin Pool (mbp) wrote :

I've taken the milestone off because it doesn't sound like this will be done in 2.1, and certainly in a release that's already out.

Changed in bzr:
milestone: 2.1.0b2 → none
John A Meinel (jameinel) on 2011-03-23
Changed in bzr:
status: In Progress → Fix Released
milestone: none → 2.4b2
Jelmer Vernooij (jelmer) on 2011-06-08
Changed in bzr (Ubuntu):
status: New → Fix Released
Changed in bzr (Ubuntu Natty):
status: New → In Progress
Jelmer Vernooij (jelmer) on 2011-06-10
Changed in bzr (Ubuntu Natty):
importance: Undecided → High
assignee: nobody → Jelmer Vernooij (jelmer)

Accepted bzr into natty-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in bzr (Ubuntu Natty):
status: In Progress → Fix Committed
tags: added: verification-needed
Clint Byrum (clint-fewbar) wrote :

Hello John, or anyone else affected,

Accepted bzr into natty-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Jelmer Vernooij (jelmer) wrote :

Verified by running the bzr testsuite from the package in a clean natty install.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package bzr - 2.3.4-0ubuntu1

---------------
bzr (2.3.4-0ubuntu1) natty-proposed; urgency=low

  * New upstream release.
   + Fix bzr version number in deprecation warnings. LP: #794960
   + Prevent write attemps on remote branch during "bzr up". LP: #786980
   + Fix conflict handling when two trees involved in a merge have different
     root ids. LP: #805809

bzr (2.3.3-0ubuntu1) natty-proposed; urgency=low

  * New upstream release.
   + Fixes deprecation warning on newer versions of Python. LP: #760435
   + Stops 'bzr push' from copying entire repository if a .bzr directory is
     present without a branch. LP: #465517
   + Fixes undefined local variable error when waiting for lock. LP: #733136
   + Fixes lock contention issues pushing to a bound branch. LP: #733350
   + Transfers less data creating a new stacked branch. LP: #737234
   + Several fixes to the test suite, making it more robust. LP: #654733,
      LP: #751824
   + 'bzr merge --pull --preview' actually shows a preview rather than
     actually merging. LP: #760152
   + bzr smart server now supports UTF-8 user names. LP: #659763
   + user identity can now be set based on username and /etc/mailname, not
     requiring it to be set manually. LP: #616878
   + stacking is now fully transitive. LP: #715000
   + makes in-terminal crash report of plugins much shorter. LP: #716389
 -- Jelmer Vernooij <email address hidden> Thu, 14 Jul 2011 21:12:58 +0200

Changed in bzr (Ubuntu Natty):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers