Bazaar

failure to fetch from 1.9 to a 2a over bzr+ssh (revision bdecode failure)

Bug #424444 reported by John A Meinel on 2009-09-04

This bug report is a duplicate of: Bug #427736: 1.9->2a fetch from smart server causes "unknown object type identifier 60" in bencode.

This bug affects 1 person

Affects   Status      Importance   Assigned to       Milestone
Bazaar    Confirmed   Critical     Andrew Bennetts

Bug Description

I'm not sure what is failing here, but the situation is:

1) I have a local heavyweight checkout of a bzr+ssh://myserver branch
2) I'm trying to pull into that branch from lp (which is also then bzr+ssh)
3) I get a traceback with:
    readv_body=readv_body, body_stream=body_stream)
  File "C:\Users\jameinel\dev\bzr\bzr.dev\bzrlib\smart\client.py", line 42, in _send_request
    protocol_version)
  File "C:\Users\jameinel\dev\bzr\bzr.dev\bzrlib\smart\client.py", line 112, in _construct_protocol
    request = self._medium.get_request()
  File "C:\Users\jameinel\dev\bzr\bzr.dev\bzrlib\smart\medium.py", line 699, in get_request
    return SmartClientStreamMediumRequest(self)
  File "C:\Users\jameinel\dev\bzr\bzr.dev\bzrlib\smart\medium.py", line 904, in __init__
    raise errors.TooManyConcurrentRequests(self._medium)
TooManyConcurrentRequests: The medium 'SmartSSHClientMedium(connected=True, username=None, host='juju.arbash-meinel.com', port=None)' has reached its concurrent request limit. Be sure to finish_writing and finish_reading on the currently open request.

The branch I am pulling from is:
lp:~johnf-inodes/bzr/ppa-doc

Which is a 1.9 format branch.

My best guess is that something about the conversion code is triggering a code path that is trying to open multiple connections to my master branch. I'm investigating now.

Revision history for this message

John A Meinel (jameinel) wrote on 2009-09-04:

I'll note that after the fetch fails, it leaves the master branch in a write-locked state.

So it is possible that the TooManyConcurrentRequests issue arises only because we are getting a different exception while streaming, and that the unlock code is what triggers the TooManyConcurrentRequests failure, masking the real failure.

John A Meinel (jameinel) wrote on 2009-09-04:

So I fetched into a different branch, and found the real error:
File "C:\Users\jameinel\dev\bzr\work\bzrlib\remote.py", line 1912, in missing_parents_rev_handler
revision = self.serialiser.read_revision_from_string(revision_bytes)
File "C:\Users\jameinel\dev\bzr\work\bzrlib\chk_serializer.py", line 104, in read_revision_from_string
ret = bencode.bdecode(text)
File "_bencode_pyx.pyx", line 218, in bzrlib._bencode_pyx.bdecode
File "_bencode_pyx.pyx", line 83, in bzrlib._bencode_pyx.Decoder.decode
File "_bencode_pyx.pyx", line 113, in bzrlib._bencode_pyx.Decoder._decode_object

So it seems there is something seriously wrong with johnf's ppa branch, but I don't quite understand what yet.
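For readers unfamiliar with bencode, here is a minimal sketch of how a decoder that dispatches on the first byte ends up reporting an "unknown object type identifier". It is a toy decoder written for illustration, not bzrlib's `_bencode_pyx`; the names `bdecode` and `bdecode_prefix` and the sample inputs are assumptions. The value 60 quoted in the title of bug #427736 is the ASCII code for `<`, which would be consistent with non-bencode text (for example XML) reaching the decoder, though this report does not confirm that.

```python
def bdecode_prefix(data: bytes, pos: int = 0):
    """Decode one bencoded object at pos; return (value, next_pos)."""
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    lead = data[pos]
    if lead == ord("i"):                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == ord("l"):                      # list: l<items>e
        pos += 1
        items = []
        while pos < len(data) and data[pos] != ord("e"):
            item, pos = bdecode_prefix(data, pos)
            items.append(item)
        if pos >= len(data):
            raise ValueError("unterminated list")
        return items, pos + 1
    if ord("0") <= lead <= ord("9"):          # string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    # Dictionaries are omitted to keep the sketch short.
    raise ValueError("unknown object type identifier %d" % lead)


def bdecode(data: bytes):
    """Decode a complete bencoded value, rejecting trailing bytes."""
    value, end = bdecode_prefix(data)
    if end != len(data):
        raise ValueError("junk after the decoded object")
    return value
```

With this sketch, `bdecode(b"l5:helloi42ee")` returns `[b"hello", 42]`, while any input starting with `<` raises `ValueError` mentioning identifier 60.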

Now we have 2 bugs
1) something created an invalid bencode stream
2) getting an error during streaming can cause TooManyConcurrentRequests during unlock, suppressing the real error
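The second problem can be sketched in isolation. The following is a toy model, not bzrlib code: `ToyMedium`, `decode_stream`, `unlock` and both `fetch_*` functions are hypothetical, and the one-request-at-a-time limit is an assumption taken from the error message above. It shows cleanup that needs a new request while the failed streaming request is still open, and one way to keep the original error visible (Python 3 semantics).

```python
class TooManyConcurrentRequests(Exception):
    """Stand-in for the bzrlib error of the same name."""


class ToyMedium:
    """Hypothetical medium that allows only one open request at a time."""

    def __init__(self):
        self.request_open = False

    def get_request(self):
        if self.request_open:
            raise TooManyConcurrentRequests("request limit reached")
        self.request_open = True

    def finish_request(self):
        self.request_open = False


def decode_stream():
    # Stand-in for the revision decode that fails mid-stream.
    raise ValueError("unknown object type identifier 60")


def unlock(medium):
    # Unlocking needs a request of its own.
    medium.get_request()
    medium.finish_request()


def fetch_masking(medium):
    medium.get_request()          # streaming request, left open on error
    try:
        decode_stream()
        medium.finish_request()
    finally:
        unlock(medium)            # raises here and hides the decode error


def fetch_preserving(medium):
    medium.get_request()
    try:
        decode_stream()
        medium.finish_request()
    except Exception:
        try:
            unlock(medium)        # best effort only
        except TooManyConcurrentRequests:
            pass                  # real code would log this instead
        raise                     # re-raise the original decode error
    unlock(medium)
```

In this model `fetch_masking` surfaces `TooManyConcurrentRequests` to the caller, while `fetch_preserving` surfaces the `ValueError`. Note that the second version still fails to unlock in the error case, which matches the observation that the master branch is left write-locked.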

Robert Collins (lifeless) wrote on 2009-09-06:

Do we get an actual exception here? Or is it perhaps a pyx parser bug?

Robert Collins (lifeless) wrote on 2009-09-06:

Looking at this, I don't think it's directly tied to 2a, so I'm untargeting it from 2.0: there is no reason to think it's going to be widespread at this point.

I've done the following:
branched ppa-doc to /tmp [succeeds]
branched my 2.0 branch to /tmp to get a clean environment
merged from ppa-doc [succeeds]
recreated /tmp/2.0
in ppa-doc done 'bzr serve' with bzr.dev
in 2.0 done bzr merge bzr://localhost [succeeds]

So - I can't reproduce the bug with current code, and the networking layer does seem able to work.

Perhaps it is a bug in the 1.17 codebase launchpad is using?

Changed in bzr:
milestone:	2.0 → none

John A Meinel (jameinel) on 2009-09-08

summary:

- failure to fetch from 1.9 to a 2a heavy checkout of bzr+ssh
+ failure to fetch from 1.9 to a 2a over bzr+ssh (revision bdecode
+ failure)

Robert Collins (lifeless) wrote on 2009-09-28:

I think this was determined to be a bug in the 1.17 smart server verb; we probably need to stop using that verb to avoid the bug. Andrew, assigning to you to get your commentary, not to ask you to fix :)

Changed in bzr:
assignee:	John A Meinel (jameinel) → Andrew Bennetts (spiv)

Andrew Bennetts (spiv) wrote on 2009-09-28:

Yes, I believe this is in the 1.17 verb. Newer servers will refuse this request in this situation (cross-format fetch to 2a, IIRC), and newer clients should be using a newer verb without this bug.

We should perhaps make newer clients also fall back to vfs rather than use the potentially doomed verb in this situation, so that newer clients with older servers won't fail either. (At a glance, RemoteStreamSource could check self.from_repository._format.network_name() and self.to_format.network_name() before deciding whether or not it is safe to try the older verb.)
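A rough sketch of the kind of check suggested above, under stated assumptions: `choose_fetch_path`, its parameters and its return values are hypothetical and not part of bzrlib, the format names are placeholders rather than real network names, and treating any cross-format fetch as unsafe for the older verb is a conservative guess rather than the confirmed condition.

```python
def choose_fetch_path(source_format_name: str,
                      target_format_name: str,
                      server_has_newer_verb: bool) -> str:
    """Pick a fetch strategy from the two repository format names.

    Returns one of "newer-verb", "older-verb" or "vfs" (labels invented
    for this sketch).
    """
    if server_has_newer_verb:
        # Newer servers offer a verb without this bug.
        return "newer-verb"
    if source_format_name == target_format_name:
        # Same format on both sides: assume the older verb is safe.
        return "older-verb"
    # Cross-format fetch against an older server: avoid the older verb.
    return "vfs"


# Placeholder names standing in for the values of network_name().
OLD_FORMAT = "placeholder-format-1.9"
NEW_FORMAT = "placeholder-format-2a"

print(choose_fetch_path(OLD_FORMAT, NEW_FORMAT, server_has_newer_verb=False))
```

The point of the design is that the decision is made on the client before any request is sent, so an older server never sees the request that would fail.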

Launchpad is now running 2.0.0 on the server though, so perhaps this isn't Critical importance anymore?

Andrew Bennetts (spiv) wrote on 2009-10-06:

This appears to be the same as bug 427736 (same branch, even!), which is now fixed.
