>>>>> Miriup writes:

 > I've just found these code bits and just wanted to put a quick update
 > here. As far as I can see, this mechanism kicks in in the following
 > circumstances (_readv in http/__init__.py):

 > - errors.ShortReadvError: short read; the range is actually incomplete
 > - errors.InvalidRange
 > - errors.InvalidHttpRange: when the range headers are malformed.

 > However, none of these apply when reading the boundary marker
 > (read_boundary in http/response.py) and it is missing.

Correct.

 > I will first attempt fitting the necessary adaptation there.

Great.

 > I'll see if I can use the existing mechanism for issues in
 > Content-Range's (_degrade_range_hint in http/__init__.py). According
 > to their documentation the degradation flag propagates whenever the
 > connection object is cloned - not sure when this is done, though.

Hmm, yeah, it's a bit unclear and hairy (and there are edge cases where
it's bogus):

- it starts at None,
- it's set once the first multi-range request succeeds (in whatever
  form, including retries),
- it propagates by cloning (see __init__ in HttpTransportBase).

So if cloning occurs before the first successful request, each clone has
to rediscover it. A better way would be to make this attribute part of
the connection parameters (which contain only the authentication data so
far). Don't worry about that too much in a first step if you feel it's
too complex.

Anyway, the place where you should detect the bug and raise a specific
exception is probably http/response.py; catching it should occur where
the hint is handled.

 > Even if it doesn't propagate and we're not doing requests in
 > parallel, we'll be using the same HTTP request object for the
 > remaining requests --- and that's the one that's keeping the HTTP
 > connection open.

You mean connection, not request, here, right?
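To illustrate the clone-propagation caveat above: a clone made before the
first successful multi-range request starts with no hint and has to
rediscover the server's behaviour itself. This is only a minimal sketch of
that mechanism; the class and attribute names here are hypothetical
illustrations, not the actual bzrlib implementation.

```python
# Hypothetical sketch of how a range hint propagates by cloning.
# Not the real bzrlib code; names are illustrative only.

class Transport:
    def __init__(self, base, from_transport=None):
        self.base = base
        if from_transport is not None:
            # Cloning copies the hint as it is *now*; a clone made
            # before the first successful request inherits None.
            self._range_hint = from_transport._range_hint
        else:
            self._range_hint = None  # unknown until a request succeeds

    def clone(self, relpath):
        return Transport(self.base + relpath, from_transport=self)

t = Transport('http://example.com/')
early = t.clone('branch/')   # cloned before any request was made
t._range_hint = 'multi'      # first multi-range request succeeded
late = t.clone('other/')     # cloned after discovery

# early._range_hint is still None and must rediscover the capability;
# late._range_hint is 'multi'. Storing the hint in shared connection
# parameters instead would avoid this rediscovery.
```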
The connection attribute in the request object is the one that is shared
between the transports; this is achieved via the request objects, but we
create a request object for... each request.

 >> This is also bad for the client and in the end for the user as it
 >> means the connection will have to be re-established, incurring a
 >> latency penalty.

 > Regarding latency I think flag-on-fail is a good solution. A user not
 > affected by evil proxies will not take a performance hit. For the
 > rest of us this re-establishment will happen exactly once per HTTP
 > client object; then the degraded mode gets activated and successive
 > requests over the same connection will request individual ranges and
 > the connection will be kept alive. This is definitely preferable to a
 > broken checkout. :)

Yes, that's the idea.

 > Regarding a config option I have a few reasons to refrain from doing
 > so:

 > 1. For an ordinary user it will be very difficult to determine
 > whether he's affected by this issue or not. He'd have to be aware of
 > a transparent proxy being in place, and he'd have to be aware that
 > those bazaar requests are quite uncommon. Most of the web works very
 > well without HTTP multipart responses. ;)

But they are almost always used by bzr when supported by the server. And
once it's established that a server is affected, the option can be set
in such a case.

 > 2. Affected are only users who fetch with bazaar through HTTP. That
 > means amongst the people behind evil proxies only people are hit that
 > are *not* fetching sources using typical developer transport
 > connections (like https or ssh).

Oh, you're right, https is immune!

 > Most projects offer .tar.gz's and you'd ordinarily download these. I
 > stumbled over the issue myself only because I was writing a live
 > ebuild for Gentoo that checks out another project using bazaar.

 > 3. /me would have to fiddle with two probably quite distant (in terms
 > of cross references) parts of bazaar. I leave that to you pros.
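The flag-on-fail idea discussed above — one failed multipart request flips
the client into degraded mode, after which it only asks for individual
ranges over the kept-alive connection — could look roughly like this. All
names here (MissingBoundary, FakeTransport, _readv_multi, ...) are
hypothetical stand-ins, not the actual bzrlib API.

```python
# Hypothetical sketch of flag-on-fail range degradation.

class MissingBoundary(Exception):
    """The multipart boundary marker was absent from the response,
    e.g. because a transparent proxy merged the ranges."""

class FakeTransport:
    def __init__(self, proxy_strips_multipart):
        self._range_hint = 'multi'               # start optimistic
        self._proxy_strips = proxy_strips_multipart
        self.requests = []                       # for illustration

    def _readv_multi(self, offsets):
        self.requests.append('multi')
        if self._proxy_strips:
            # Simulate an evil proxy mangling the multipart response.
            raise MissingBoundary("boundary marker not found")
        return [(start, b'data') for start, _ in offsets]

    def _readv_single(self, offsets):
        self.requests.append('single')
        return [(start, b'data') for start, _ in offsets]

    def readv(self, offsets):
        if self._range_hint == 'multi':
            try:
                return self._readv_multi(offsets)
            except MissingBoundary:
                # Flag on fail: degrade exactly once per client object.
                self._range_hint = 'single'
        return self._readv_single(offsets)

evil = FakeTransport(True)
evil.readv([(0, 10), (100, 10)])  # multipart fails once, retried as single
evil.readv([(200, 10)])           # goes straight to single ranges
```

Users behind a well-behaved proxy never hit the fallback, so they pay no
penalty; affected users pay the re-establishment cost only on the first
request.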
:-P No problem with that. Correctly detecting the behavior is the most
important part; we can help you with the rest or even handle it
ourselves.