KeyboardInterrupt thrown by urlgrab

Bug #776555 reported by Andrew on 2011-05-03
This bug affects 1 person
Affects Status Importance Assigned to Milestone
urlgrabber (Fedora)
Won't Fix
urlgrabber (Ubuntu)

Bug Description

TypeError: start() got an unexpected keyword argument 'size'
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/", line 552, in __bootstrap_inner
  File "/home/anrobins/development/jdev-manager/", line 314, in run
  File "/home/anrobins/development/jdev-manager/", line 53, in start
    progress_obj = self._progress_obj)
  File "/usr/lib/pymodules/python2.7/urlgrabber/", line 618, in urlgrab
    return default_grabber.urlgrab(url, filename, **kwargs)
  File "/usr/lib/pymodules/python2.7/urlgrabber/", line 982, in urlgrab
    return self._retry(opts, retryfunc, url, filename)
  File "/usr/lib/pymodules/python2.7/urlgrabber/", line 886, in _retry
    r = apply(func, (opts,) + args, {})
  File "/usr/lib/pymodules/python2.7/urlgrabber/", line 968, in retryfunc
    fo = PyCurlFileObject(url, filename, opts)
  File "/usr/lib/pymodules/python2.7/urlgrabber/", line 1063, in __init__
  File "/usr/lib/pymodules/python2.7/urlgrabber/", line 1351, in _do_open
  File "/usr/lib/pymodules/python2.7/urlgrabber/", line 1481, in _do_grab
  File "/usr/lib/pymodules/python2.7/urlgrabber/", line 1276, in _do_perform
    raise KeyboardInterrupt

This code was working in maverick with python 2.6. It looks like is calling the curl_obj with incorrect arguments.

This could be related to bug 776500:

Before I was using "check_timestamp", now I am trying "simple"

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: python-urlgrabber 3.9.1-4
ProcVersionSignature: Ubuntu 2.6.38-9.43-generic
Uname: Linux 2.6.38-9-generic x86_64
Architecture: amd64
Date: Tue May 3 10:47:55 2011
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release amd64 (20110427.1)
PackageArchitecture: all
 PATH=(custom, user)
SourcePackage: urlgrabber
UpgradeStatus: No upgrade log present (probably fresh install)

Andrew (andrew-rw-robinson) wrote :
Andrew (andrew-rw-robinson) wrote :


The problem is that the code is not matching the documentation.

In the python docs for grabber (pydoc urlgrabber.grabber), the progress_obj is clearly documented as to what arguments are supported:

      progress_obj = None

        a class instance that supports the following methods:
          po.start(filename, url, basename, length, text)
          # length will be None if unknown
          po.update(read) # read == bytes read so far

But, in the actual source shipped with natty, this is the code that calls this method:

    def _retrieve(self, buf):
            if not self._prog_running:
                if self.opts.progress_obj:
                    size = self.size + self._reget_length
                    self._prog_running = True

            self._amount_read += len(buf)
            return len(buf)
        except KeyboardInterrupt:
            return -1

Note how "size" and "text" are passed as named arguments instead of positional arguments?

The correct code should be:


This would correctly match the documentation.

Alternative would be to use length=size and not size=size in the code as the documentation shows the parameter name as "length", not "size".

Andrew (andrew-rw-robinson) wrote :

Work-around is to define the start method as:
  def start(self, filename, url, basename, size, text):
    length = size
instead of:
  def start(self, filename, url, basename, length, text):

Andrew (andrew-rw-robinson) wrote :

Another work-around that should work with the correct code and the broken code:

  def start(self, filename, url, basename, *args, **kwargs):
    if len(args) >= 1:
      length = args[0]
      if len(args) >= 2:
        text = args[1]
        text = kwargs.get('text')
      length = kwargs.get('size') if kwargs.has_key('size') else kwargs.get('length')
      text = kwargs.get('text')

Created attachment 497172
Patch to fix the bug

Description of problem:
I found the bug in Ubuntu Natty:

The progress_obj is documented as:
po.start(filename, url, basename, length, text)

But it is called as:
po.start(filename, url, basename, size=..., text=...)

Regression (worked fine in Ubuntu Maverick)

Version-Release number of selected component (if applicable):
Ubuntu version of python-urlgrabber: 3.9.1-4

How reproducible:
100% reproducible

Steps to Reproduce:
1. Pass a progress_obj to the grabber
2. Implement the documented API
3. Code fails when called incorrectly by

Actual results:
Code fails with a KeyboardInterrupt

Expected results:
Should work as documented, and as it used to work

Additional info:
See details in ubuntu bug:

Andrew (andrew-rw-robinson) wrote :

Since there is no activity from the maintainer yet, I filed an upstream bug:

tags: added: patch

This code has been like this pretty much forever (2005 the text param. was added), I can see how you might be confused if you just looked at the documentation though. Alas. we can't just remove the param name from the call, as the API has other params:

    def start(self, filename=None, url=None, basename=None,
              size=None, now=None, text=None): our only sane option is to treat it as a documentation bug, and fix that. Sorry, if that doesn't help you much.

Daniel T Chen (crimsun) on 2011-07-20
Changed in urlgrabber (Ubuntu):
status: New → Triaged
Daniel T Chen (crimsun) wrote :

Upstream has decided to fix the documentation instead; see changeset 674d545ee303aa99701ffb982536851572d8db77.

Changed in urlgrabber (Ubuntu):
status: Triaged → Won't Fix

The documentation was fixed upstream.

This request was evaluated by Red Hat Engineering for inclusion in a Red
Hat Enterprise Linux maintenance release.

Red Hat does not currently plan to provide this change in a Red Hat
Enterprise Linux update release for currently deployed products.

With the goal of minimizing risk of change for deployed systems, and in
response to customer and partner requirements, Red Hat takes a
conservative approach when evaluating enhancements for inclusion in
maintenance updates for currently deployed products. The primary
objectives of update releases are to enable new hardware platform
support and to resolve critical defects.

Changed in urlgrabber (Fedora):
importance: Unknown → Undecided
status: Unknown → Won't Fix
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.