Pushing a loom takes a long time even though it's stacked

Bug #651656 reported by Barry Warsaw on 2010-09-29
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu Distributed Development
High
Unassigned

Bug Description

This may be Bazaar or Launchpad code hosting instead of UDD, but let's start here. :)

I created a branch of a local mirror of lp:python called issue9807. I loomified the issue9807 branch and created a thread to do some development work in. After my branch was ready, I 'bzr record'd and tried to push it to lp:~barry/python/issue9807. bzr told me it was stacking the branch correctly, but it still took a *huge* amount of time to push the branch.

% time bzr push lp:~barry/python/issue9807
Using default stacking branch /~python-dev/python/py3k at lp-60913168:///~barry/python
Created new stacked branch referring to /~python-dev/python/py3k.

real 89m9.467s
user 2m39.700s
sys 0m3.010s

Wow, that sucks.

When you have a network performance bug, please put

debug_flags = hpss

into your ~/.bazaar/bazaar.conf, or -Dhpss on the command line, and
attach the excerpt from .bzr.log, so we can see where the time goes.

Thanks

Barry Warsaw (barry) wrote :
James Westby (james-w) wrote :

Hi,

It's doing a million billion get_parent_map calls. I don't know which component is
at fault either.

Thanks,

James

Changed in udd:
status: New → Confirmed
importance: Undecided → High

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Oct 04, 2010, at 03:56 PM, John Arbash Meinel wrote:

>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>On 9/29/2010 6:52 PM, Barry Warsaw wrote:
>
>> % time bzr push lp:~barry/python/issue9807
>> Using default stacking branch /~python-dev/python/py3k at
>> lp-60913168:///~barry/python Created new stacked branch referring
>> to /~python-dev/python/py3k.
>>
>> real 89m9.467s
>> user 2m39.700s
>> sys 0m3.010s
>>
>Do you still have this in ~/.bzr.log? There should be a "bytes" used
>put there, in case it was actually copying content. (you can get this
>to print on the terminal always with "-Dbytes")
>
>I know James said he saw get_parent_map calls, which would not transfer
>much content.

I don't have the bzr.log any more but it's totally reproducible. I'll run the
push again with -Dhpss and -Dbytes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iQIcBAEBCAAGBQJMqksqAAoJEBJutWOnSwa/EcMP/Rdrj3gcrM/TV20y5us3A938
1Ps18kdRiv32yJKhOTQCsf/m5vVbblT0H7osM5a2rdzLL4hMWayK2BYLyI7is+op
oYZ6V27u4j0sZiIbMZGMaNTUK0zDCa2AN6tiGaDdAoawIOVJpgP7HzUCA4iyQLfO
KNXPn58kSwBUDD4H9v7bcXx1okzvrrje0RvvjGtnNkf1Xc7kXMpx0+h6ULi6fbkL
uPQxAUnHmFcM0i0JOaFwoMtK70cOqA/CZ5n6sSudlyAuCvPrYjW2zqY7150duGVA
EnYp29qIU3n+ttLtik6QbvvXbXX1paf+ueIDklu3wDlF36jwdjLpSHBrhVme+x76
YSyRu7g2FlqhlC1KkJHD/wtYtaOBVMrSBObu2xXyirr7358yjDg5ZpCgW5LrwSOz
GqRNC3tpGmSSXUV5LMdN6xhOll2WtK0QRO+mk1U0oI2h3GasanKqnFtuhkuKTg9z
qeq5gR7iiENjVSd+Vuz3kkvmrT2sqtLHHZndrr7kWVHYJByHfcHd48ErakCpc2m1
+oYy1PZFN2MVTMDrDVc0MIWhT+mt4QEDE0Ic8rPcURDo9SKNA1TDTqDifkDsxDfF
gZNPfrWUxSAq8/yUz8j5vkvdklFP6XOzsu4aKwggoCuPY2ChfuDHYzIWKsoBJAVJ
OTWwNM2jTy+74yquGWTF
=d9Ek
-----END PGP SIGNATURE-----

loom pushes each thread separately; its a bug [mostly] addressed in
the underlying bzr codebase these days, should be a small stretch for
someone familiar with streaming fetch (e.g. andrew) to fix.

-Rob

Barry Warsaw (barry) wrote :

% time bzr push -Dhpss -Dbytes lp:~barry/python/timing
Using default stacking branch /~python-dev/python/py3k at lp-82576784:///~barry/python
Created new stacked branch referring to /~python-dev/python/py3k.
Transferred: 31633kB (5.0kB/s r:7802kB w:23831kB)
HPSS calls: 43480 (64 vfs) SmartSSHClientMedium(bzr+ssh://<email address hidden>/)

real 104m57.374s
user 2m58.510s
sys 0m7.590s

Barry Warsaw (barry) wrote :
Vincent Ladeuil (vila) wrote :

@barry: did you retry recently with a bzr version >= 2.2 ?

Vincent Ladeuil (vila) wrote :

grr, > 2.2 (you did try with 2.2)

Barry Warsaw (barry) wrote :

Unfortunately lp:python doesn't exist any more so I can't reproduce this exactly. However, I branched lp:launchpad, added four threads, and pushed it:

@limelight[/tmp/python/loomtest:1032]% time bzr push -Dhpss -Dbytes lp:~barry/launchpad/loomtest2
Using default stacking branch /+branch-id/24637 at lp-85464912:///~barry/launchpad
Created new stacked branch referring to /+branch-id/24637.
Transferred: 12702kB (7.2kB/s r:4549kB w:8153kB)
HPSS calls: 13420 (63 vfs) SmartSSHClientMedium(bzr+ssh://<email address hidden>/)

real 29m24.819s
user 0m22.290s
sys 0m4.410s

I'll attach ~/.bzr.log again. Please note, I haven't tried to push an unloomified LP yet.

Barry Warsaw (barry) wrote :
John A Meinel (jameinel) wrote :

This is definitely a "get_parent_map" spinning bug. There are about 13k calls to get_parent_map in that file. What confuses me is the keys it is sending:
42.578 hpss call w/body: 'Repository.get_parent_map', '~barry/launchpad/loomtest2/', 'include-missing:', '<email address hidden>' ('<email address hidden>'...)
42.579 51 bytes
42.693 result: ('ok',)
42.693 101 body bytes read
42.693 hpss call w/body: 'Repository.get_parent_map', '~barry/launchpad/loomtest2/', 'include-missing:', '<email address hidden>' ('<email address hidden>'...)
42.694 51 bytes
42.806 result: ('ok',)
42.806 102 body bytes read

Note that "<email address hidden>..." is present pretty much every line. (grep says that out of 13280 get_parent_map calls, 13280 also have the <email address hidden>-* line.)

Something is certainly weird that it isn't walking farther. I suppose it considers the revision a ghost? And it is unable to push it out of the queue.

Anyway, walking all the history looks like walking forever to find what revs are present/missing, but it seems to be doing something really wrong at the same time.

John A Meinel wrote:
> 42.693 hpss call w/body: 'Repository.get_parent_map', '~barry/launchpad/loomtest2/', 'include-missing:', '<email address hidden>' ('<email address hidden>'...)
> 42.694 51 bytes
> 42.806 result: ('ok',)
> 42.806 102 body bytes read
>
> Note that "<email address hidden>..." is present pretty much every line.
> (grep says that out of 13280 get_parent_map calls, 13280 also have the
> <email address hidden>-* line.)

That's the body of the get_parent_map request, which is the “current search
state”. IIRC that's basically the client saying “I already have these, so don't
resend them” so that looks fairly normal to me.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers