'Connection reset by peer' error when branching repository

Bug #229076 reported by Sebastian Pölsterl
Affects: Bazaar
Status: Fix Released
Importance: Undecided
Assigned to: Vincent Ladeuil

Bug Description

I ran the following command:

$ bzr branch http://www.gnome.org/~sebp/bzr/python-dvb
bzr: ERROR: Connection error: while sending GET /%7Esebp/bzr/python-dvb/.bzr/branch-format: (104, 'Connection reset by peer')

Opening the URL in the browser works fine.

Vincent Ladeuil (vila) wrote :

Can you try prefixing the http URL with "nosmart+", i.e. "nosmart+http://www.gnome.org/~sebp/bzr/python-dvb" to help us diagnose the problem ?

Sebastian Pölsterl (sebp) wrote :

The error stays the same and no additional information has been printed.

Vincent Ladeuil (vila) wrote :

I can reproduce it with the urllib http implementation, but the pycurl implementation just works :-/

So, I suppose you don't have pycurl installed (what bzr/OS versions are you using ?).

Installing it should solve the immediate problem (pycurl will be used by default if found).

Changed in bzr:
status: New → Confirmed
assignee: nobody → vila

Sebastian Pölsterl (sebp) wrote :

I'm using 1.5rc1 on hardy from the PPA at http://ppa.launchpad.net/lamothe/ubuntu

I guess the package should depend on python-pycurl then.

John A Meinel (jameinel) wrote :

What would be helpful is your ~/.bzr.log from a "bzr branch -Dhttp ..." Though if vila is able to reproduce it, he should already have the information he needs.

That should give the HTTP request that is being made, and possibly help us understand why we are getting a 104, connection reset.

As for depending on pycurl... it shouldn't be strictly necessary, and there are some drawbacks to using pycurl for the transport. (The biggest drawback is that it doesn't interrupt well with ^C, such that during a long download it goes unresponsive until the download completes.)

So fixing urllib is a bigger priority, as we have more control over it. Certainly if installing pycurl works around this problem, that is a valid workaround.

Vincent Ladeuil (vila) wrote :

No need for the .bzr.log, a single telnet session exhibits the problem:

telnet www.gnome.org 80
Trying 209.132.176.176...
Connected to window.gnome.org.
Escape character is '^]'.
GET / HTTP/1.1
GET / HTTP/1.1
Connection closed by foreign host.

Puzzling...

Vincent Ladeuil (vila) wrote :

wireshark trace for the telnet session above

Vincent Ladeuil (vila) wrote :

wireshark trace for 'bzr branch http+pycurl://www.gnome.org/~sebp/bzr/python-dvb toto2', interrupted quickly to keep the trace short while still showing the beginning, for comparison with the telnet session.

Martin Pool (mbp) wrote :

I suspect that either something on their server or in the network is aborting the connection. I'm not sure but I believe that the application closing the socket should never generate a RST in this way, so it may be a firewall of some kind that's deciding to assassinate this connection.

It might be good to talk to the GNOME sysadmins and ask if they have any kind of firewall that could be causing this. I'll ask mneptok.

I would guess that something is strictly enforcing HTTP/1.1 behaviour and a bare GET is not compliant. We should see if sending the other headers does give the right behaviour.

Martin Pool (mbp) wrote :

The Host header is required by HTTP/1.1 (iirc)

If the User-Agent field is missing, the server resets the connection.

It looks like if the request is not sent all in the one packet it is dropped, which again makes me suspect a somewhat dodgy firewall.

We should check that all the headers are sent, and that they're buffered and flushed as one unit.

hope that helps.
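
The advice above (send Host and User-Agent, and hand the whole request to the socket in one write) can be sketched roughly as follows. This is only an illustration, not bzr's code: build_request, fetch_status_line and the "example-client/0.1" agent string are made-up names, and a single sendall() call does not strictly guarantee a single TCP packet.

```python
import socket


def build_request(host, path="/", user_agent="example-client/0.1"):
    """Return a complete HTTP/1.1 GET request as one bytes object.

    Hypothetical helper: includes the Host header that HTTP/1.1 requires
    and a User-Agent header, terminated by the blank line.
    """
    lines = [
        "GET %s HTTP/1.1" % path,
        "Host: %s" % host,
        "User-Agent: %s" % user_agent,
        "Connection: close",
        "",  # end of the header block
        "",  # yields the final CRLF CRLF after joining
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_status_line(host, path="/", port=80, timeout=10):
    """Send the request with a single sendall() and return the status line.

    Needs network access; shown only to illustrate the single write.
    """
    request = build_request(host, path)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)  # one write for request line plus headers
        reply = sock.recv(4096)
    return reply.split(b"\r\n", 1)[0]
```

Comparing the status line returned with and without the User-Agent entry would be one way to repeat the telnet experiment from a script.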

Vincent Ladeuil (vila) wrote :

I don't know if I should cry or laugh...

The problem is that we use User-agent instead of User-Agent...

There is even a FIXME mentioning that in _urllib2_wrappers.py.

Now, I'd really like to know *why* we get a connection reset when nowhere in the RFCs have I seen the case of header names described as significant.
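
For illustration, here is one way a client can end up sending 'User-agent' without anyone typing it that way: Python's standard urllib normalises header names with str.capitalize(), which lowercases everything after the first character. The snippet uses Python 3's urllib.request only as a sketch; the URL and agent string are placeholders, and the thread does not show the actual code in _urllib2_wrappers.py.

```python
import urllib.request

# Placeholder URL; no request is actually sent here.
req = urllib.request.Request("http://example.com/")

# The caller spells the header in the conventional way...
req.add_header("User-Agent", "example-client/0.1")

# ...but the Request object stores the name after str.capitalize().
sent_names = [name for name, _value in req.header_items()]

# str.capitalize() keeps only the first character in upper case.
capitalized = "User-Agent".capitalize()  # 'User-agent'
```

A server or firewall that matches header names case-sensitively would therefore not see a 'User-Agent' header in such a request.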

Vincent Ladeuil (vila) wrote :

By the way, httplib always sends the request with its headers in a single packet.

Robert Collins (lifeless) wrote : Re: [Bug 229076] Re: 'Connection reset by peer' error when branching repository

4.2 Message Headers

   HTTP header fields, which include general-header (section 4.5),
   request-header (section 5.3), response-header (section 6.2), and
   entity-header (section 7.1) fields, follow the same generic format as
   that given in Section 3.1 of RFC 822 [9]. Each header field consists
   of a name followed by a colon (":") and the field value. Field names
   are case-insensitive.

That is all :)

-Rob
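
As a small illustration of the "field names are case-insensitive" rule quoted above, Python's email.message.Message (the RFC 822 style container that the standard HTTP client classes build on) looks headers up regardless of case. The header value is a placeholder.

```python
from email.message import Message

headers = Message()
headers["User-agent"] = "example-client/0.1"  # stored as sent by the client

# Lookups ignore the case of the field name.
lookup_variants = [
    headers["User-Agent"],
    headers["user-agent"],
    headers["USER-AGENT"],
]
```

A receiver that follows the RFC would treat all three spellings as the same field.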

Vincent Ladeuil (vila)
Changed in bzr:
milestone: none → 1.6
status: Confirmed → Fix Released

Olav Vitters (ovitters) wrote :

The reset is due to the firewall on the machine (to prevent a DDoS). It is sending the reset on purpose (for the botnet). However, it obviously shouldn't block valid clients. It is just very difficult to block the bots, especially with multiple sites on the machine. I'm going to allow the lower-case user-agent. Note: this might take a while.

Vincent Ladeuil (vila) wrote :

'User-agent' was used and triggered the bug, not 'user-agent'.

Previous versions of bzr, still in the wild, use 'User-agent'.
