Comment 34 for bug 198646

Revision history for this message
Dirk T (miriup) wrote :

Vincent Ladeuil wrote:
> I suggest you look at the bzr http client implementations which already
> implement a fallback mechanism when encountering servers which doesn't
> implement multiple range requests.
> I'm even surprised it didn't trigger in this case.

I've just found these code bits and wanted to put a quick update here. As far as I can see, this mechanism kicks in under the following circumstances (_readv in http/__init__.py):

- errors.ShortReadvError: short read; the range is actually incomplete
- errors.InvalidRange
- errors.InvalidHttpRange: when the range headers are malformed.
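To illustrate the pattern, here is a rough, self-contained sketch of what that fallback looks like: try a coalesced multi-range request, and on one of the errors above degrade to issuing a single request per range. All class and attribute names here are hypothetical stand-ins, not bzr's actual code:

```python
class ShortReadvError(Exception):
    """Stand-in for errors.ShortReadvError: server returned fewer bytes."""

class FakeTransport:
    """Toy transport modelling the multi-range -> single-range fallback."""

    def __init__(self, data, supports_multirange=True):
        self._data = data
        self._supports_multirange = supports_multirange
        self._range_hint = 'multi'   # degraded to 'single' on failure

    def _coalesce_readv(self, offsets):
        # Pretend to issue one multi-range request; a broken proxy
        # mangles the multipart response, simulated by raising.
        if not self._supports_multirange:
            raise ShortReadvError('multipart response was mangled')
        return [(start, self._data[start:start + length])
                for start, length in offsets]

    def readv(self, offsets):
        if self._range_hint == 'multi':
            try:
                return self._coalesce_readv(offsets)
            except ShortReadvError:
                # Degrade once; every later call skips straight to
                # single-range requests on the same connection.
                self._range_hint = 'single'
        # One request per range: slower, but proxy-safe.
        return [(start, self._data[start:start + length])
                for start, length in offsets]
```

The point of the sketch is that the degradation happens at most once per transport object; after that, successive reads stay in single-range mode.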

However, none of these apply when the boundary marker is missing while it is being read (read_boundary in http/response.py). I will first attempt to fit the necessary adaptation there.

I'll see if I can use the existing mechanism for issues with Content-Range headers (_degrade_range_hint in http/__init__.py). According to the documentation, the degradation flag propagates whenever the connection object is cloned - not sure when that happens, though. Even if it doesn't propagate and we're not doing requests in parallel, we'll be using the same HTTP request object for the remaining requests --- and that's the one that keeps the HTTP connection open.
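One way the "flag propagates on clone" behaviour could work is for clones to share a mutable hint, so degrading any one of them degrades them all. This is purely a hypothetical sketch of that idea, not bzr's actual transport API:

```python
class RangeHintTransport:
    """Toy transport whose range hint is shared across clones."""

    def __init__(self, base, range_hint=None):
        self.base = base
        # A shared one-element list: clones hold the same object, so a
        # later degradation is visible to every clone of this transport.
        self._range_hint = range_hint if range_hint is not None else ['multi']

    def clone(self, relpath):
        # Pass the same list object, not a copy.
        return RangeHintTransport(self.base + '/' + relpath,
                                  self._range_hint)

    def degrade_range_hint(self):
        self._range_hint[0] = 'single'
```

Under this scheme, whether degradation on the parent reaches an already-existing clone depends entirely on whether the hint is shared or copied at clone time, which is exactly the propagation question raised above.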

> This is also bad for the client and in the end for the user as it means
> the connection will have to be re-established incurring latency penalty.

Regarding latency, I think flag-on-fail is a good solution. A user not affected by evil proxies will not take a performance hit. For the rest of us, the re-establishment will happen exactly once per HTTP client object; after that the degraded mode is active, successive requests over the same connection will request individual ranges, and the connection will be kept alive. This is definitely preferable to a broken checkout. :)

Regarding a config option I have a few reasons to refrain from doing so:

1. For an ordinary user it will be very difficult to determine whether he's affected by this issue or not. He'd have to be aware of a transparent proxy being in place, and he'd have to know that these bazaar requests are quite uncommon. Most of the web works very well without HTTP multipart responses. ;)
2. Only users who fetch with bazaar over HTTP are affected. That means, among the people behind an evil proxy, only those are hit who are *not* fetching sources over typical developer transports (like https or ssh). Most projects offer .tar.gz's, and you'd ordinarily download those. I only stumbled over the issue myself because I was writing a live ebuild for Gentoo that checks out another project using bazaar.
3. /me would have to fiddle with two probably quite distant (in terms of cross-references) parts of bazaar. I leave that to you pros. :-P