Fail to parse boundary if multiple Content-Type headers are given
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Bazaar | Confirmed | High | Unassigned | |
Breezy | Triaged | Medium | Unassigned | |
Bug Description
See the trailing discussion from bug #198646
If an HTTP server returns multiple Content-Type headers for a multi-part response, we fail to properly parse the boundary="" string. This happens with both urllib and pycurl.
It seems that bzr.savannah.org is returning headers like:
7.658 < HTTP/1.0 206 Partial content
7.658 < Connection: Keep-Alive
< Content-Type: multipart/
< Date: Thu, 31 Jul 2008 17:21:46 GMT
< Server: Apache/2.2.3 (Debian) DAV/2
< Last-Modified: Thu, 26 Jun 2008 22:04:56 GMT
< ETag: "1c9c203-
< Accept-Ranges: bytes
< Content-Type: application/plain
< Via: 1.0 hinet-C233.10
The first one is a valid entry for multi-part content, and includes a clear description of the boundary string that will be used for parsing.
What seems to be happening is that both pycurl and urllib are concatenating the Content-Type strings to get:
content_type = 'multipart/
And then finding boundary = '"zrbUwOxpkyBKv
It is unknown at this time whether Savannah is incorrect to return multiple Content-Type fields, or whether urllib and pycurl are incorrect to concatenate them before parsing the boundary.
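To illustrate the failure mode, here is a minimal sketch, not the actual bzr, urllib, or pycurl code. It assumes the duplicate header values are joined with ", " before parsing. The boundary value `EXAMPLE_BOUNDARY`, the `multipart/byteranges` subtype, and the helper names `parse_boundary` and `boundary_from_headers` are all hypothetical, since the real values in the trace above are truncated.

```python
import re

def parse_boundary(content_type):
    """Extract the boundary parameter from a single Content-Type value."""
    match = re.search(r'boundary="?([^";,]+)"?', content_type)
    return match.group(1) if match else None

def boundary_from_headers(values):
    """Return the boundary from the first multipart Content-Type value.

    Each header value is inspected on its own, so a second, unrelated
    Content-Type header cannot leak into the boundary string.
    """
    for value in values:
        if value.strip().lower().startswith('multipart/'):
            return parse_boundary(value)
    return None

# Hypothetical header values standing in for the truncated ones in the trace.
headers = [
    'multipart/byteranges; boundary="EXAMPLE_BOUNDARY"',
    'application/plain',
]

# Assumed failure mode: join the values, then take everything after
# 'boundary=' as the boundary string.
joined = ', '.join(headers)
naive = joined.split('boundary=', 1)[1]

# Per-header parsing keeps the boundary clean.
safe = boundary_from_headers(headers)
```

In this sketch `naive` keeps the opening quote and picks up the second header's value, while `safe` is just the boundary token. Whether the right fix is on the client or the server side remains the open question noted above.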
Changed in bzr:
status: Triaged → Confirmed
tags: added: check-for-breezy
tags: removed: check-for-breezy
Changed in brz:
status: New → Triaged
importance: Undecided → Medium
There are http log traces available in bug #198646.
I'm marking this as 'High' because it means that bzr branches on Savannah cannot be reliably accessed. (Access works as long as you don't need a multi-part request; otherwise it fails.)