overly large delta creation when fetching from 2a repositories

Bug #390563 reported by Stuart Bishop on 2009-06-22
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Bazaar
Critical
John A Meinel
1.16
Critical
Unassigned
Launchpad itself
High
Unassigned

Bug Description

Bzr does not correctly calculate deltas in 2a branches, resulting in AbsentContentFactory errors when pulling from a stacked branch and overly large network transfers with non-stacked branches.

Workarounds
========

Use SFTP:// or HTTP:// (without a smart server) URLs to read from the branch.

Symptoms
========

Traceback (most recent call last):
... File "/usr/lib/python2.6/dist-packages/bzrlib/smart/message.py", line 336, in read_streamed_body
    _translate_error(self._body_error_args)
  File "/usr/lib/python2.6/dist-packages/bzrlib/smart/message.py", line 359, in _translate_error
    raise errors.ErrorFromSmartServer(error_tuple)
ErrorFromSmartServer: Error received from smart server: ('error', "Absent factory for ('duplicatebughandling-20090605130533-0b86u1hktjps17o7-1', '<email address hidden>')")

Robert Collins (lifeless) wrote :

Not bug 365615, probably the older stacking bug with damaged branches.

Changed in bzr:
status: New → Incomplete
Martin Pool (mbp) wrote :
Download full text (3.7 KiB)

So I just got this myself, trying to branch from Launchpad:

mbp@grace% bzr branch lp:~mbp/bzr/progress progress
bzr: ERROR: bzrlib.errors.ErrorFromSmartServer: Error received from smart server: ('error', "Absent factory for ('ostools.py-20060731163025-npjffm46rgnkl50d-1', '<email address hidden>')")

Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 831, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 1026, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 643, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.6/dist-packages/bzrlib/builtins.py", line 1147, in run
    source_branch=br_from)
  File "/usr/lib/python2.6/dist-packages/bzrlib/bzrdir.py", line 1176, in sprout
    result_repo.fetch(source_repository, revision_id=revision_id)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 1552, in fetch
    find_ghosts=find_ghosts, fetch_spec=fetch_spec)
  File "/usr/lib/python2.6/dist-packages/bzrlib/decorators.py", line 192, in write_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 3126, in fetch
    pb=pb, find_ghosts=find_ghosts)
  File "/usr/lib/python2.6/dist-packages/bzrlib/fetch.py", line 82, in __init__
    self.__fetch()
  File "/usr/lib/python2.6/dist-packages/bzrlib/fetch.py", line 108, in __fetch
    self._fetch_everything_for_search(search)
  File "/usr/lib/python2.6/dist-packages/bzrlib/fetch.py", line 136, in _fetch_everything_for_search
    stream, from_format, [])
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 4029, in insert_stream
    return self._locked_insert_stream(stream, src_format, is_resume)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 4058, in _locked_insert_stream
    for substream_type, substream in stream:
  File "/usr/lib/python2.6/dist-packages/bzrlib/remote.py", line 1754, in missing_parents_chain
    for kind, substream in stream:
  File "/usr/lib/python2.6/dist-packages/bzrlib/smart/repository.py", line 467, in record_stream
    for bytes in byte_stream:
  File "/usr/lib/python2.6/dist-packages/bzrlib/smart/message.py", line 336, in read_streamed_body
    _translate_error(self._body_error_args)
  File "/usr/lib/python2.6/dist-packages/bzrlib/smart/message.py", line 359, in _translate_error
    raise errors.ErrorFromSmartServer(error_tuple)
ErrorFromSmartServer: Error received from smart server: ('error', "Absent factory for ('ostools.py-20060731163025-npjffm46rgnkl50d-1', '<email address hidden>')")

bzr 1.17dev on python 2.6.2 (linux2)
arguments: ['/usr/bin/bzr', 'branch', 'lp:~mbp/bzr/progress', 'progress']
encoding: 'UTF-8', fsenc: 'UTF-8', lang: 'en_AU.UTF-8'
plugins:
  explorer /home/mbp/.bazaar/plugins/explorer [0.1dev]
  gtk /home/mbp/.bazaar/plugins/gtk [0.96.0.dev.1]
  launchpad /us...

Read more...

Changed in bzr:
importance: Undecided → Critical
status: Incomplete → Confirmed
Martin Pool (mbp) wrote :
Download full text (8.0 KiB)

It's reproducible for me running that same command. I have an hpss trace for it:

Mon 2009-06-22 20:48:07 +1000
0.026 bzr arguments: [u'-Dhpss', u'branch', u'lp:~mbp/bzr/progress', u'progress']
0.043 looking for plugins in /home/mbp/.bazaar/plugins
[32345] 2009-06-22 20:48:07.761 WARNING: Unable to load plugin 'gtk'. It requested API version (1, 15, 0) of module <module 'bzrlib' from '/home/mbp/bzr/trunk/bzrlib/__init__.pyc'> but the minimum exported version is (1, 17, 0), and the maximum is (1, 17, 0)
0.119 looking for plugins in /home/mbp/bzr/trunk/bzrlib/plugins
0.137 looking for plugins in /usr/lib/python2.6/dist-packages/bzrlib/plugins
0.138 Plugin name netrc_credential_store already loaded
0.138 Plugin name launchpad already loaded
0.141 encoding stdout as sys.stdout encoding 'UTF-8'
2.426 hpss: Built a new medium: SmartSSHClientMedium
2.428 hpss call: 'BzrDir.open', '~mbp/bzr/progress/'
2.428 (to bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/)
2.473 ssh implementation is OpenSSH
8.555 result: ('yes',)
8.556 hpss call: 'BzrDir.open_branchV2', '~mbp/bzr/progress/'
8.556 (to bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/)
8.930 result: ('branch', 'Bazaar Branch Format 7 (needs bzr 1.6)\n')
8.930 hpss call: 'BzrDir.find_repositoryV3', '~mbp/bzr/progress/'
8.930 (to bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/)
9.478 result: ('ok', '', 'no', 'no', 'yes', 'Bazaar RepositoryFormatKnitPack6 (bzr 1.9)\n')
9.483 hpss call: 'Branch.get_stacked_on_url', '~mbp/bzr/progress/'
9.484 (to bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/)
9.871 result: ('ok', '/~bzr/bzr/trunk')
9.872 hpss call: 'BzrDir.open', '~bzr/bzr/trunk/'
9.872 (to bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/)
10.311 result: ('yes',)
10.311 hpss call: 'BzrDir.open_branchV2', '~bzr/bzr/trunk/'
10.311 (to bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/)
10.763 result: ('branch', 'Bazaar Branch Format 7 (needs bzr 1.6)\n')
10.763 hpss call: 'BzrDir.find_repositoryV3', '~bzr/bzr/trunk/'
10.763 (to bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/)
11.405 result: ('ok', '', 'no', 'no', 'yes', 'Bazaar RepositoryFormatKnitPack6 (bzr 1.9)\n')
11.405 hpss call: 'Branch.get_stacked_on_url', '~bzr/bzr/trunk/'
11.405 (to bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/)
11.831 result: ('NotStacked',)
11.832 hpss call: 'Branch.last_revision_info', '~mbp/bzr/progress/'
11.832 (to bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/)
12.212 result: ('ok', '4148', '<email address hidden>')
12.216 hpss call: 'BzrDir.cloning_metadir', '~mbp/bzr/progress/', 'False'
12.216 (to bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/)
12.613 result: ('Bazaar-NG meta directory, format 1\n', 'Bazaar RepositoryFormatKnitPack6 (bzr 1.9)\n', ('branch', 'Bazaar Branch Format 7 (needs bzr 1.6)\n'))
12.644 Using fetch logic to copy between RemoteRepository(bzr+ssh://bazaar.launchpad.net/%7Embp/bzr/progress/.bzr/)(<Re...

Read more...

Martin Pool (mbp) wrote :

Perhaps unsurprisingly (if we believe the error is on the server not in a bad request), I get the same error using bzr.1.15. 1.13 can branch from this, but it's not using the get_stream rpc.

Martin Pool (mbp) wrote :

sftp can be used as a workaround.

summary: - absent factory exception from smart server when merging
+ absent factory exception from smart server when merging or branching
Martin Pool (mbp) wrote :

10:19 <lifeless> just run the fixer, then try the command that was erroring for you yesterday
10:23 <lifeless> http://launchpadlibrarian.net/26166834/fix-branch.py is the fixer

Running this reports:

[0] mbp@grace% python ~/Desktop/fix-branch.py bzr+ssh://bazaar.launchpad.net/~mbp/bzr/progress
Missing inventories: set([('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',), ('<email address hidden>',)])
python ~/Desktop/fix-branch.py 1.15s user 0.10s system 1% cpu 1:18.23 total

Robert Collins (lifeless) wrote :

Ok Martin, that means that what you encountered was the known, fixed, bug with xml based inventory stacked branches. Whatever is going on with the launchpad branch is likely conceptually the same, but its got different code involved.

summary: - absent factory exception from smart server when merging or branching
+ absent factory exception from smart server when streaming 2a stacked
+ branches

BLocker; we can't ship 2.0 with this open.

description: updated
tags: added: brisbane-core hpss
Changed in bzr:
milestone: none → 2.0
Robert Collins (lifeless) wrote :

I've filed an RT ticket to get the logs from the server. And a separate bug on launchpad-code to get future errors logged via OOPS

Changed in bzr:
status: Confirmed → Triaged
Robert Collins (lifeless) wrote :

RT 34739

description: updated
Robert Collins (lifeless) wrote :
Robert Collins (lifeless) wrote :

The older [bzr rotated] server side log

Robert Collins (lifeless) wrote :

Unfortunately, neither log appears to include the backtrace we're interested in.

Robert Collins (lifeless) wrote :

Branch tip: <email address hidden>
graph.PendingAncestryResult(['<email address hidden>'], b.repository).get_keys()

['<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>']

after copying the repo down with lftp, remove the stacked parameter
source = b.repository._get_source(b.repository._format)
stream = source.get_stream(graph.PendingAncestryResult(['<email address hidden>'], b.repository))
>>> for kind, sub in stream:
... print kind
... for record in sub:
... print record.key
... print len(record.get_bytes_as(record.storage_kind))
texts
('test_branchnavigatio-20080715041246-hnlwxkpgg4kh6a26-1', '<email address hidden>')
Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
AttributeError: 'AbsentContentFactory' object has no attribute 'get_bytes_as'

>>> b.repository.texts.keys()
set([('x_Guilherme_Salgado_<email address hidden>_Tue_Mar__1_17:18:32_2005_9482.0', '<email address hidden>'), ('x_Mark_Shuttleworth_<email address hidden>_Sun_Oct__3_21:57:56_2004_7518.0', '<email address hidden>'), ('productportletrequir-20080521004812-mklios0rjxaf3tw9-1', '<email address hidden>'), ('productportletlicens-20080521010624-zdh01mv1lymdws2r-1', '<email address hidden>'), ('x_Alexander_Limi_<email address hidden>_Thu_Aug__5_13:49:18_2004_668.0', '<email address hidden>'), ('newsitem.py-20071111180134-yjrqsxy1qks03jm4-1', '<email address hidden>'), ('x_Guilherme_Salgado_<email address hidden>_Mon_Dec__6_20:18:11_2004_28538.0', '<email address hidden>')])

Robert Collins (lifeless) wrote :

The complete set of texts that bzr tries to send is much larger - 157 items long.

Robert Collins (lifeless) wrote :

I believe this is the problem. Specifically, because it is considering 'whole nodes' not 'interesting items', we see all the items in new pages, not just the items that are new in those pages.

        chk_bytes = self.from_repository.chk_bytes
        def _filter_id_to_entry():
            for record, items in chk_map.iter_interesting_nodes(chk_bytes,
                        self._chk_id_roots, uninteresting_root_keys):
                for name, bytes in items:
                    # Note: we don't care about name_utf8, because we are always
                    # rich-root = True
                    _, file_id, revision_id = bytes_to_info(bytes)
                    self._text_keys.add((file_id, revision_id))
                if record is not None:
                    yield record

To me this reinforces the need to push this diff code into chk_map.py, and have a single implementation of it.

description: updated
description: updated
Robert Collins (lifeless) wrote :

So, further investigation shows that we *do* have an appropriate function in chk_map, which is being used.

However, it is clearly outputting uninteresting items, and I don't know why yet. I need to EOD though.

John A Meinel (jameinel) wrote :

Just to follow up on Robert's "further investigation". iter_interesting_nodes filters interesting keys, so the line:
        for record, items in chk_map.iter_interesting_nodes(chk_bytes,
                        self._chk_id_roots, uninteresting_root_keys):
                for name, bytes in items:

Those items have already gone through a set difference.

Still need to probe into why it seems we are getting more than we should. *my* contention is that something about the *values* is genuinely different because of the buggy parents issues. (one side claims more per-file parents than the other side does.)

Robert Collins (lifeless) wrote :

Partial fix - catching the commonest case (two subtrees changed, two pushes to a stacked repo needed to trigger it) - has landed. More work is needed to completely fix, but this is a 99% fix we should get out to users.

Changed in bzr:
status: Triaged → In Progress
tags: added: lp-needs
Changed in launchpad-code:
importance: Undecided → Critical
Robert Collins (lifeless) wrote :

mwhudson says the lp branch for this is getting tested now.

Jonathan Lange (jml) wrote :

We've landed the fix on our bzr branch. Waiting for the rollout.

Changed in launchpad-code:
milestone: none → 2.2.6
status: New → Fix Committed
Martin Pool (mbp) wrote :

Was released in 1.16.1

Martin Pool (mbp) wrote :

Also now fixed in trunk

Changed in bzr:
status: In Progress → Fix Released
Robert Collins (lifeless) wrote :

Partly fixed in trunk.

Changed in bzr:
status: Fix Released → In Progress
Tim Penhey (thumper) wrote :

Moving launchpad-code back to triaged, as we are awaiting the full fix from bzr.

Changed in launchpad-code:
status: Fix Committed → Triaged
milestone: 2.2.6 → 2.2.7
Martin Pool (mbp) wrote :
Download full text (4.4 KiB)

Robert says this error can still occur when the tree depth is different.

----

Tim reported this error yesterday; could this be a regression of the same thing?

henning@i-5f426f36$ bzr branch bzr+ssh://bazaar.launchpad.net/~launchpad-pqm/launchpad/devel /var/launchpad/test
Warning: Permanently added 'bazaar.launchpad.net' (RSA) to the list of known hosts.
Branched 8712 revision(s).
henning@i-5f426f36$ cd /var/launchpad/test; bzr merge bzr+ssh://bazaar.launchpad.net/~henninge/launchpad/bug-383880
bzr: ERROR: bzrlib.errors.ErrorFromSmartServer: Error received from smart server: ('error', "'AbsentContentFactory' object has no attribute 'get_bytes_as'")

Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 729, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 924, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.5/site-packages/bzrlib/commands.py", line 560, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.5/site-packages/bzrlib/builtins.py", line 3573, in run
    location, revision, remember, possible_transports, pb)
  File "/usr/lib/python2.5/site-packages/bzrlib/builtins.py", line 3680, in _get_merger_from_branch
    other_revision_id, base_revision_id, other_branch, base_branch)
  File "/usr/lib/python2.5/site-packages/bzrlib/merge.py", line 204, in from_revision_ids
    merger.set_other_revision(other, other_branch)
  File "/usr/lib/python2.5/site-packages/bzrlib/merge.py", line 345, in set_other_revision
    self._maybe_fetch(other_branch, self.this_branch, self.other_rev_id)
  File "/usr/lib/python2.5/site-packages/bzrlib/merge.py", line 362, in _maybe_fetch
    target.fetch(source, revision_id)
  File "/usr/lib/python2.5/site-packages/bzrlib/decorators.py", line 192, in write_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/lib/python2.5/site-packages/bzrlib/branch.py", line 532, in fetch
    pb=pb)
  File "/usr/lib/python2.5/site-packages/bzrlib/repository.py", line 1552, in fetch
    find_ghosts=find_ghosts, fetch_spec=fetch_spec)
  File "/usr/lib/python2.5/site-packages/bzrlib/decorators.py", line 192, in write_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/lib/python2.5/site-packages/bzrlib/repository.py", line 3126, in fetch
    pb=pb, find_ghosts=find_ghosts)
  File "/usr/lib/python2.5/site-packages/bzrlib/fetch.py", line 82, in __init__
    self.__fetch()
  File "/usr/lib/python2.5/site-packages/bzrlib/fetch.py", line 108, in __fetch
    self._fetch_everything_for_search(search)
  File "/usr/lib/python2.5/site-packages/bzrlib/fetch.py", line 136, in _fetch_everything_for_search
    stream, from_format, [])
  File "/usr/lib/python2.5/site-packages/bzrlib/repository.py", line 4029, in insert_stream
    return self._locked_insert_stream(stream, src_format, is_resume)
  File "/usr/lib/python2.5/site-packages/bzrlib/repository.py", line 4058, in _locked_insert_stream
    for substream_type, substream in stream:
  File "/usr/lib/python2.5/site-packages/bzrlib/remote.py", line 1754, in missing_parents_chain
    for kind, substrea...

Read more...

Martin Pool (mbp) wrote :

Well, the client side traceback looks the same; we'd like to get the server side traceback to know.

On Tue, 2009-06-30 at 00:19 +0000, Tim Penhey wrote:
> Moving launchpad-code back to triaged, as we are awaiting the full fix
> from bzr.

I think LP code should be fine with the hotfix so far - in 1.17 you'll
get the complete fix, but its something I'd be less happy hotfixing as
it requires deeper changes to the code to make work - John is rewriting
the code path.

The fix so far is a 99% fix and in practice I don't think LP will see
any of the unsolved cases before 1.17 is available.

-Rob

Robert Collins (lifeless) wrote :

On Tue, 2009-06-30 at 00:19 +0000, Martin Pool wrote:
> Robert says this error can still occur when the tree depth is different.
>
> ----
>
> Tim reported this error yesterday; could this be a regression of the
> same thing?

This is almost certainly a different bug. Tim, please file a new bug
report for this.

-Rob

Robert Collins (lifeless) wrote :

Actually, scratch that; misread the backtrace - it *could* be the same.

Please:
 - get the server backtrace as Martin says.
 - have the user upgrade to 1.16.1 or bzr nightlies.

Thanks,
Rob

Tim Penhey (thumper) wrote :

On Tue, 30 Jun 2009 12:35:59 Robert Collins wrote:
> On Tue, 2009-06-30 at 00:19 +0000, Tim Penhey wrote:
> > Moving launchpad-code back to triaged, as we are awaiting the full fix
> > from bzr.
>
> I think LP code should be fine with the hotfix so far - in 1.17 you'll
> get the complete fix, but its something I'd be less happy hotfixing as
> it requires deeper changes to the code to make work - John is rewriting
> the code path.
>
> The fix so far is a 99% fix and in practice I don't think LP will see
> any of the unsolved cases before 1.17 is available.

We did see an issue though :(

I talked to poolie about it.

However I get your point. Perhaps not worth cherry picking - is that your
rationale?

Robert Collins (lifeless) wrote :

On Tue, 2009-06-30 at 01:27 +0000, Tim Penhey wrote:

> We did see an issue though :(

We need to analyse that particular case in detail. I haven't pulled a
copy of the branch yet to do that. I see three cases:
 - its the same problem but the current fix is less complete than
thought.
 - its the corner cases we know about and have in progress.
 - its something else.

> I talked to poolie about it.
>
> However I get your point. Perhaps not worth cherry picking - is that
> your rationale?

Not *safe* to cherry pick is what I mean. Deep changes need a little
bedding down.

-Rob

According to rockstar in today's Launchpad Prod. Meeting:

<rockstar> Ursinha, there have been multiple team discussions about it.
<Ursinha> rockstar, is that bug really critical?
<rockstar> Ursinha, so, yes, it's really critical, but we're dependent on bzr fixing it.
<rockstar> Ursinha, we may downgrade it for the sake of not misusing Critical. We're still discussing it.
<Ursinha> rockstar, can you update the bug as soon as you have news, pretty please?
<rockstar> Ursinha, will do.

Ursula Junque (ursinha) wrote :

<thumper> Ursinha: the critical bug isn't really critical
<thumper> just high
<thumper> should be fixed with 1.17 release of bzr

Changed in launchpad-code:
importance: Critical → High
Robert Collins (lifeless) wrote :

The critical component was truely critical and has been cherrypicked into lp production some time ago. The lp task would be appropriately closed, I think.

The remaining work is for the complete-case, and I think that that is is still pending a final review.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> The critical component was truely critical and has been cherrypicked
> into lp production some time ago. The lp task would be appropriately
> closed, I think.

This is hitting us any time we use bundles to create/update branches.
It's a pretty high-frequency, high-irritation event.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpWrtcACgkQ0F+nu1YWqI1rOACeI9HDQ6GQL6xJEH05P/xDmW4H
vU0An0Gip9U/xDVipomgV/8M5oIP+XUm
=bBYT
-----END PGP SIGNATURE-----

Robert Collins (lifeless) wrote :

On Fri, 2009-07-10 at 03:00 +0000, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Robert Collins wrote:
> > The critical component was truely critical and has been cherrypicked
> > into lp production some time ago. The lp task would be appropriately
> > closed, I think.
>
> This is hitting us any time we use bundles to create/update branches.
> It's a pretty high-frequency, high-irritation event.

We should definitely solve your problem. However, I think thats a
different bug. ACF is a very generic error, its like 'segfault' for C
programs. Getting an ACF means something is wrong, but doesn't indicate
if its missing data or bad delta logic.

Bug 390563 - *this* bug - is 'when all required data is present, bzr
still fails to generate a stream'. And the fix that has been
cherrypicked should be robust except when the tree height changes.

We should debug the bundle based one. I think it is bug 393349, as JML
has suggested.

-Rob

Martin Pool (mbp) on 2009-07-13
Changed in bzr:
assignee: nobody → John A Meinel (jameinel)
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> Robert Collins wrote:
>> The critical component was truely critical and has been cherrypicked
>> into lp production some time ago. The lp task would be appropriately
>> closed, I think.
>
> This is hitting us any time we use bundles to create/update branches.
> It's a pretty high-frequency, high-irritation event.
>
> Aaron

so... is this happening when:

1) applying a bundle to an existing stacked branch?
2) applying a bundle that was created *from* a stacked branch?

If I was guessing, I could easily say that the bundle code:

1) Doesn't contain the fulltext for inventories of parent revisions,
   nor does it contain the data for all texts that are referenced by the
   inventories that it *does* have.
2) Thus when creating a stacked branch from the bundle, we don't have
   the parent inventories to *insert* into the new stacked repository.
   (Without going to the backing repository and pulling them out)
3) And thus the stacked branch is now broken.

This points the failure at the "insert data from a bundle" code, causing
it to generate invalid stacked repositories.

This would all be simplified quite a bit if we just made a bundle
effectively just a way to transfer the data for the stream that you get
out of StreamSource. There is a small caveat that the bundle *also*
needs to think of itself as not containing any data, and thus does the
same "what is missing" checks that a stacked repository does.

So put another way:

1) BundleRepository just sets itself up as a stacked repository, stacked
on top of a hypothetical repository that only has the minimal data
defined by the target revision.

2) Alternatively we could go to the real target branch and use that.
However at present we have a habit of setting submit_location to a local
mirror, and obviously that will generate the wrong information about
"what data do you have that I don't".

I'll see if I can trivially recreate this.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpeQA4ACgkQJdeBCYSNAANpAwCfajcwh5TE87OZ9yHFrjm2UmNk
R14AnjjmiO/7i8bVrbDMMvCibPUdbsSq
=gcg6
-----END PGP SIGNATURE-----

Aaron Bentley (abentley) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John A Meinel wrote:
> Aaron Bentley wrote:
>> This is hitting us any time we use bundles to create/update branches.
>> It's a pretty high-frequency, high-irritation event.
>
>> Aaron
>
> so... is this happening when:
>
> 1) applying a bundle to an existing stacked branch?

After applying the bundle to the stacked branch, our branch puller is
unable to pull the branch.

> 2) applying a bundle that was created *from* a stacked branch?

That's not the case I was describing. I don't think anyone's creating
bundles from stacked branches.

> If I was guessing, I could easily say that the bundle code:
>
> 1) Doesn't contain the fulltext for inventories of parent revisions,

Bundles aren't meant to contain fulltexts of any inventories.

> nor does it contain the data for all texts that are referenced by the
> inventories that it *does* have.

It contains the data for the texts that are referenced by the
inventories it does have, as long as the referenced texts are new. If
they are present in the basis revision, they will not be included.

> 2) Thus when creating a stacked branch from the bundle, we don't have
> the parent inventories to *insert* into the new stacked repository.

Why do we need the parent inventories?

> (Without going to the backing repository and pulling them out)

add_mpdiffs reconstructs the fulltexts and calls
VersionedFile.add_lines, so yes, it would hit the backing repository to
regenerate the fulltexts. But it usually wouldn't try to insert
inventories that were already present in the backing repository.

> 3) And thus the stacked branch is now broken.
>
> This points the failure at the "insert data from a bundle" code, causing
> it to generate invalid stacked repositories.
>
> This would all be simplified quite a bit if we just made a bundle
> effectively just a way to transfer the data for the stream that you get
> out of StreamSource.

I started work on that at the sprint.

> There is a small caveat that the bundle *also*
> needs to think of itself as not containing any data

Pardon?

>, and thus does the
> same "what is missing" checks that a stacked repository does.
>
> So put another way:
>
> 1) BundleRepository just sets itself up as a stacked repository, stacked
> on top of a hypothetical repository that only has the minimal data
> defined by the target revision.
>
> 2) Alternatively we could go to the real target branch and use that.

In the Launchpad case, we are stacked on the real target branch.

> However at present we have a habit of setting submit_location to a local
> mirror, and obviously that will generate the wrong information about
> "what data do you have that I don't".

If the head revision of the local mirror is accurate or stale, it may
bundle more data than is strictly needed, but not too little.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpeRlYACgkQ0F+nu1YWqI3OnQCdEitj+5LO9P19+wBuPj7VcQkw
OhIAn02lm1hJDHvewisLgPTjkDlo/jlc
=Xf9G
-----END PGP SIGNATURE-----

John A Meinel (jameinel) wrote :
Download full text (7.4 KiB)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> John A Meinel wrote:
>> Aaron Bentley wrote:
>>> This is hitting us any time we use bundles to create/update branches.
>>> It's a pretty high-frequency, high-irritation event.
>>> Aaron
>> so... is this happening when:
>
>> 1) applying a bundle to an existing stacked branch?
>
> After applying the bundle to the stacked branch, our branch puller is
> unable to pull the branch.
>

So if you follow the rest of my emails, it turns out that creating
bundles from --2a formats is just 100% broken. regardless stacking or
not (chk pages are not being inserted into the bundle, and the raw text
from 'repo.inventories' is not sufficient without them). But there are
still issues to be discussed with how "insert_revisions" is going to
interact with stacking.

>> If I was guessing, I could easily say that the bundle code:
>
>> 1) Doesn't contain the fulltext for inventories of parent revisions,
>
> Bundles aren't meant to contain fulltexts of any inventories.
>
>> nor does it contain the data for all texts that are referenced by the
>> inventories that it *does* have.
>
> It contains the data for the texts that are referenced by the
> inventories it does have, as long as the referenced texts are new. If
> they are present in the basis revision, they will not be included.
>
>> 2) Thus when creating a stacked branch from the bundle, we don't have
>> the parent inventories to *insert* into the new stacked repository.
>
> Why do we need the parent inventories?

Stacking requirements. The specific issues are:

1) When accessing a stacked branch via bzr+ssh the *server process* does
not have access to the backing repository. This was by design, and not
just accidental fallout.

  a) Because of this choice, when *streaming* data from the smart
     server, the stacked branch needs enough information to generate a
     stream for all revisions that it has, and then the client will ask
     the fallback directly for another stream for all the remaining
     data.

2) To stream the data for a set of revisions, we need the text keys that
should be transmitted. This was changed recently from:

  set_of_file_keys_in_new.intersect(revision_ids_being_transmitted)

to

  set_of_file_keys_seen_in_new.difference(
    set_of_file_keys_in_all_available_parents)

This was done so that we could be more resilient in the face of ghosts,
and inventory inconsistencies, etc.
We have two things that we wanted to trust that we've found out we can't:

  file_key == (file_id, revision) #always true
  ie = inventory(revision)[file_id]
  ie.revision == file_key[1] # not always true

This is the inventory inconsistency issue. The other is the ghosts issue:

  if file_key in inventory then either:
    file_revision == inventory.revision # the text was in this revision
      or
    file_revision in set_of_revision_being_transmitted

The issue is with round-tripping and bzr-svn. Basically, if you did work
in bzr, and push that into svn, you end up with last-modified revision
information being stored in svn which points to a bzr revision_id which
was *not* pushed to svn. (you merge a bzr change that modifi...

Read more...

Robert Collins (lifeless) wrote :

Can we move this thread to the right bug please?

Follow-up-to bug 393349

Cheers,
Rob

Aaron Bentley (abentley) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John A Meinel wrote:
> Aaron Bentley wrote:
>> Why do we need the parent inventories?
>
> Stacking requirements. The specific issues are:
>
> 1) When accessing a stacked branch via bzr+ssh the *server process* does
> not have access to the backing repository.

It would be nice to support server-side bundle application, but it is
not a requirement. Bundles are deltas, and the client usually needs to
read from the repository in order to install them. If that requires
that fallbacks be available, I think that's a reasonable extension of
the current behaviour.

> 1) If you insert an line-delta for the XML inventory into the stacked
> branch, then it only references newly introduced texts, and does not
> reference all possible texts.

All possible in what context? We certainly don't want to reference
every text in the repository.

> Thus even though we don't have the parent
> inventory later when doing "streaming fetch" we don't try to send more
> data then we have available because it just wasn't referenced.

If you are suggesting that installing an inventory requires installing
all the texts it references, that would require doing it on the client
side, because storing all that in the bundle is pointless waste.

> 2) However, if you are at the end of a delta chain, etc, and you insert
> a "fulltext" for the XML inventory, suddenly when doing streaming fetch
> from that stacked branch, it will see all these text keys that it thinks
> it needs to transmit, for which it cannot find in the local repository.
> [and boom].

I think it doesn't make a lot of sense for a stacked repository to have
a copy of every text referenced in any of its inventories, so I don't
really get this.

> My point was that the streaming code might not look at the
> Branch.last_revision() and instead look at "is revision in
> Repository.all_revision_ids". And if you have a local shared repository,
> it *will* have the revisions you are trying to bundle.

Okay, I will make sure that the new bundle format uses
Branch.last_revision to determine what to send.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpfoGMACgkQ0F+nu1YWqI3sngCffvynD0u6FDE0fBTpLAFhr2DP
9k4AnRKP4mgQLKOuaD3sKmP5b61+66/4
=Xs0R
-----END PGP SIGNATURE-----

John A Meinel (jameinel) wrote :
Download full text (3.5 KiB)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> John A Meinel wrote:
>> Aaron Bentley wrote:
>>> Why do we need the parent inventories?
>> Stacking requirements. The specific issues are:
>
>> 1) When accessing a stacked branch via bzr+ssh the *server process* does
>> not have access to the backing repository.
>
> It would be nice to support server-side bundle application, but it is
> not a requirement. Bundles are deltas, and the client usually needs to
> read from the repository in order to install them. If that requires
> that fallbacks be available, I think that's a reasonable extension of
> the current behaviour.
>
>> 1) If you insert an line-delta for the XML inventory into the stacked
>> branch, then it only references newly introduced texts, and does not
>> reference all possible texts.
>
> All possible in what context? We certainly don't want to reference
> every text in the repository.
>
>> Thus even though we don't have the parent
>> inventory later when doing "streaming fetch" we don't try to send more
>> data then we have available because it just wasn't referenced.
>
> If you are suggesting that installing an inventory requires installing
> all the texts it references, that would require doing it on the client
> side, because storing all that in the bundle is pointless waste.

Assume you have revisions a, b, c

In the base repository you have a & b
in the stacked you have c

All revisions have texts foo, bar, baz
'c' only modifies baz

At that point, we need to be able to create a fulltext for the inventory
at c. Either via
1) A fulltext of c
or
2) A fulltext of a parent, and a delta chain including c.

At that point, the stacked branch has 2 options:

1) If it doesn't have inventory for b, it must have the file content for
foo, bar and baz
2) If it does have inventory b, it only needs the file content for baz.

The basic requirement is that a stacked repository should be independent
of its fallback for the case of streaming fetch. So either we

1) Insert a parent inventory so that we can determine that someone who
has revision 'b' will already have these file keys
or
2) Have the file keys ourselves so that we can ensure someone fetching
'c' has everything they need.

Does that make sense?

>
>> 2) However, if you are at the end of a delta chain, etc, and you insert
>> a "fulltext" for the XML inventory, suddenly when doing streaming fetch
>> from that stacked branch, it will see all these text keys that it thinks
>> it needs to transmit, for which it cannot find in the local repository.
>> [and boom].
>
> I think it doesn't make a lot of sense for a stacked repository to have
> a copy of every text referenced in any of its inventories, so I don't
> really get this.

Which is why we want to insert the inventories for parent revisions, so
that *without hitting the fallback* we can determine what file keys do
not need to be fetched.

John
=:->

>
>> My point was that the streaming code might not look at the
>> Branch.last_revision() and instead look at "is revision in
>> Repository.all_revision_ids". And if you have a local shared repository,
>> it *will* have the revisions you are trying to bundl...

Read more...

Aaron Bentley (abentley) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John A Meinel wrote:
>>> Thus even though we don't have the parent
>>> inventory later when doing "streaming fetch" we don't try to send more
>>> data then we have available because it just wasn't referenced.
>> If you are suggesting that installing an inventory requires installing
>> all the texts it references, that would require doing it on the client
>> side, because storing all that in the bundle is pointless waste.
>
> Assume you have revisions a, b, c
>
> In the base repository you have a & b
> in the stacked you have c
>
> All revisions have texts foo, bar, baz
> 'c' only modifies baz
>
> At that point, we need to be able to create a fulltext for the inventory
> at c. Either via
> 1) A fulltext of c
> or
> 2) A fulltext of a parent, and a delta chain including c.
>
> At that point, the stacked branch has 2 options:
>
> 1) If it doesn't have inventory for b, it must have the file content for
> foo, bar and baz

Why?

> 2) If it does have inventory b, it only needs the file content for baz.
>
> The basic requirement is that a stacked repository should be independent
> of its fallback for the case of streaming fetch. So either we
>
> 1) Insert a parent inventory so that we can determine that someone who
> has revision 'b' will already have these file keys
> or

Why do we care about that? Why can't we just send the keys we have and
let the client worry about getting the keys we didn't send?

> 2) Have the file keys ourselves so that we can ensure someone fetching
> 'c' has everything they need.
>
> Does that make sense?

Not yet.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpf2bwACgkQ0F+nu1YWqI3rigCeKKGJVBz7PJvizKG3RviVpFXv
8vAAn1rhpas7k39tARvv1HzdTFfE2RVH
=oZpW
-----END PGP SIGNATURE-----

summary: - absent factory exception from smart server when streaming 2a stacked
- branches
+ overly large delta creation when fetching from 2a repositories
Tim Penhey (thumper) wrote :

I'll move this to 2.2.8 in the hope that there is a fix in Bazaar in time to deploy to LP.

Changed in launchpad-code:
milestone: 2.2.7 → 2.2.8

On Thu, 2009-07-23 at 23:03 +0000, Tim Penhey wrote:
> I'll move this to 2.2.8 in the hope that there is a fix in Bazaar in
> time to deploy to LP.

LP shouldn't be stressing about this bug now; the critical component
landed in LP shortly after it was opened.

-Rob

Tim Penhey (thumper) on 2009-07-23
Changed in launchpad-code:
milestone: 2.2.8 → 2.2.7
status: Triaged → Fix Released
Robert Collins (lifeless) wrote :

The full fix is present in 1.18

Changed in bzr:
milestone: 2.0 → 1.18
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers