hardy regression: reading from a urllib2 file descriptor happens byte-at-a-time
Bug #214183, reported by James Troup
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
python2.5 (Ubuntu) | Fix Released | Undecided | Unassigned |
Bug Description
Binary package hint: python2.5
Hi,
When reading from a urllib2 file descriptor, python2.5 in hardy will
read the data a byte at a time, regardless of how much you ask for.
python2.4 will read the data in 8K chunks.
This has enough of a performance impact that it increases the download
time for a large file over a gigabit LAN from 10 seconds to 34
minutes. (!)
Trivial/obvious example code:
f = urllib2.urlopen("http://
while 1:
    chunk = f.read()
... and then strace it to see the recv()'s chugging along, one byte at
a time.
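
The read loop from the report can be sketched self-contained, with io.BytesIO standing in for the urllib2 response so no network is needed (the data size and CHUNK constant are illustrative, not from the report):

    import io

    # Stand-in for f = urllib2.urlopen(...): 1 MiB of data behind a
    # file-like object. Over a real socket, each read() here is where
    # hardy's python2.5 degenerated into one recv() per byte.
    f = io.BytesIO(b"x" * (1 << 20))

    CHUNK = 8192  # python2.4 read the socket in 8K chunks
    total = 0
    while 1:
        chunk = f.read(CHUNK)
        if not chunk:
            break
        total += len(chunk)

    print(total)  # 1048576

With an in-memory object the loop is fast either way; the pathology only shows up when every underlying read is a separate recv() syscall, which is what the strace above makes visible.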
--
James
This bug was fixed in the package python2.5 - 2.5.2-2ubuntu3
---------------
python2.5 (2.5.2-2ubuntu3) hardy; urgency=low
* Fix urllib2 file descriptor happens byte-at-a-time, reverting
a fix for excessively large memory allocations when calling .read()
on a socket object wrapped with makefile(). LP: #214183.
-- Matthias Klose <email address hidden> Tue, 08 Apr 2008 23:27:23 +0200
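
The reverted change concerned the buffered file object that makefile() wraps around a socket. As a rough illustration of why that buffering matters, here is a Python 3 sketch (CountingRaw is a made-up stand-in for the socket, recording how large each low-level read is, much as strace shows recv() sizes):

    import io

    class CountingRaw(io.RawIOBase):
        """Raw stream that records the size of every low-level read
        request, analogous to watching recv() calls in strace."""
        def __init__(self, data):
            self._buf = io.BytesIO(data)
            self.read_sizes = []

        def readable(self):
            return True

        def readinto(self, b):
            self.read_sizes.append(len(b))  # size requested from the "socket"
            return self._buf.readinto(b)

    raw = CountingRaw(b"x" * 100000)
    buffered = io.BufferedReader(raw, buffer_size=8192)

    # 100,000 one-byte application-level reads...
    while buffered.read(1):
        pass

    # ...become only a handful of 8K-sized reads at the raw layer.
    print(len(raw.read_sizes))

With buffering working correctly, tiny application reads coalesce into large reads against the kernel; with it broken (the hardy regression), every application read reached the socket directly, one byte per recv().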