hardy regression: reading from a urllib2 file descriptor happens byte-at-a-time

Bug #214183 reported by James Troup
Affects: python2.5 (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: python2.5

Hi,

When reading from a urllib2 file descriptor, python2.5 in hardy will
read the data one byte at a time, regardless of how much you ask for.
python2.4 will read the data in 8K chunks.

This has enough of a performance impact that it increases download
time for a large file over a gigabit LAN from 10 seconds to 34
minutes. (!)

Trivial/obvious example code:

    import urllib2
    f = urllib2.urlopen("http://launchpadlibrarian.net/13214672/nexuiz-data_2.4.orig.tar.gz")
    while 1:
        chunk = f.read()  # returns the whole body, but each underlying recv() fetches 1 byte
        if not chunk:
            break

... and then strace it to see the recv() calls chugging along, one byte
at a time.

--
James
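
To put numbers on the reported 10-second vs. 34-minute difference, here is a
sketch of the same download with timing added and an explicit 8K request per
read (the URL is the one from the report); per the bug, the size argument
makes no difference to the underlying recv() pattern. Running it under
strace -e trace=network shows the one-byte recv() calls directly.

    import time
    import urllib2

    f = urllib2.urlopen("http://launchpadlibrarian.net/13214672/nexuiz-data_2.4.orig.tar.gz")
    start = time.time()
    total = 0
    while 1:
        chunk = f.read(8192)  # explicit 8K request; the buggy layer still recv()s 1 byte
        if not chunk:
            break
        total += len(chunk)
    print "read %d bytes in %.1f seconds" % (total, time.time() - start)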


Revision history for this message
Launchpad Janitor (janitor) wrote:

This bug was fixed in the package python2.5 - 2.5.2-2ubuntu3

---------------
python2.5 (2.5.2-2ubuntu3) hardy; urgency=low

  * Fix urllib2 file descriptor reads happening byte-at-a-time by
    reverting a fix for excessively large memory allocations when
    calling .read() on a socket object wrapped with makefile().
    LP: #214183.

 -- Matthias Klose <email address hidden> Tue, 08 Apr 2008 23:27:23 +0200

Changed in python2.5:
status: New → Fix Released
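
For context on what was reverted: urllib2 responses read through a buffered
file-like object created by the socket's makefile() method, and a healthy
wrapper of that kind pulls data from the socket in large recv() chunks no
matter how the caller sizes its read() calls. Below is a simplified,
hypothetical sketch of that expected behavior (it is not the actual
socket.py code); the regression in this report amounts to the wrapper
issuing recv(1) instead of buffer-sized recv() calls.

    class BufferedSocketFile(object):
        """Hypothetical, simplified makefile()-style wrapper (not socket.py)."""

        def __init__(self, sock, bufsize=8192):
            self._sock = sock
            self._rbufsize = bufsize  # healthy behavior: up to 8K per recv()
            self._rbuf = ""

        def read(self, size=-1):
            data = self._rbuf
            if size < 0:
                # Read to EOF using bufsize-sized recv() calls.
                chunks = [data]
                while 1:
                    chunk = self._sock.recv(self._rbufsize)
                    if not chunk:
                        break
                    chunks.append(chunk)
                self._rbuf = ""
                return "".join(chunks)
            # Refill with recv() calls of at least bufsize bytes and serve
            # the caller from the buffer, keeping any excess for next time.
            while len(data) < size:
                chunk = self._sock.recv(max(self._rbufsize, size - len(data)))
                if not chunk:
                    break
                data += chunk
            self._rbuf = data[size:]
            return data[:size]

If the recv() size calculation collapses to 1, the same read() loop turns
every byte into its own system call, which matches the strace behavior
James describes above.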