Ubuntu

KeyboardInterrupt thrown by urlgrab

Reported by Andrew on 2011-05-03
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
urlgrabber (Fedora)
Unknown
Unknown
urlgrabber (Ubuntu)
Undecided
Unassigned

Bug Description

TypeError: start() got an unexpected keyword argument 'size'
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/home/anrobins/development/jdev-manager/downloadUi.py", line 314, in run
    self._control._downloader.start()
  File "/home/anrobins/development/jdev-manager/download.py", line 53, in start
    progress_obj = self._progress_obj)
  File "/usr/lib/pymodules/python2.7/urlgrabber/grabber.py", line 618, in urlgrab
    return default_grabber.urlgrab(url, filename, **kwargs)
  File "/usr/lib/pymodules/python2.7/urlgrabber/grabber.py", line 982, in urlgrab
    return self._retry(opts, retryfunc, url, filename)
  File "/usr/lib/pymodules/python2.7/urlgrabber/grabber.py", line 886, in _retry
    r = apply(func, (opts,) + args, {})
  File "/usr/lib/pymodules/python2.7/urlgrabber/grabber.py", line 968, in retryfunc
    fo = PyCurlFileObject(url, filename, opts)
  File "/usr/lib/pymodules/python2.7/urlgrabber/grabber.py", line 1063, in __init__
    self._do_open()
  File "/usr/lib/pymodules/python2.7/urlgrabber/grabber.py", line 1351, in _do_open
    self._do_grab()
  File "/usr/lib/pymodules/python2.7/urlgrabber/grabber.py", line 1481, in _do_grab
    self._do_perform()
  File "/usr/lib/pymodules/python2.7/urlgrabber/grabber.py", line 1276, in _do_perform
    raise KeyboardInterrupt
KeyboardInterrupt

This code was working in maverick with python 2.6. It looks like grabber.py is calling the curl_obj with incorrect arguments.

This could be related to bug 776500:
https://bugs.launchpad.net/bugs/776500

Before I was using "check_timestamp", now I am trying "simple"

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: python-urlgrabber 3.9.1-4
ProcVersionSignature: Ubuntu 2.6.38-9.43-generic 2.6.38.4
Uname: Linux 2.6.38-9-generic x86_64
Architecture: amd64
Date: Tue May 3 10:47:55 2011
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release amd64 (20110427.1)
PackageArchitecture: all
ProcEnviron:
 LANGUAGE=en_US:en
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: urlgrabber
UpgradeStatus: No upgrade log present (probably fresh install)

Andrew (andrew-rw-robinson) wrote :
Andrew (andrew-rw-robinson) wrote :

Update:

The problem is that the code is not matching the documentation.

In the python docs for grabber (pydoc urlgrabber.grabber), the progress_obj is clearly documented as to what arguments are supported:

      progress_obj = None

        a class instance that supports the following methods:
          po.start(filename, url, basename, length, text)
          # length will be None if unknown
          po.update(read) # read == bytes read so far
          po.end()

But, in the actual source shipped with natty, this is the code that calls this method:

    def _retrieve(self, buf):
        try:
            if not self._prog_running:
                if self.opts.progress_obj:
                    size = self.size + self._reget_length
                    self.opts.progress_obj.start(self._prog_reportname,
                                                 urllib.unquote(self.url),
                                                 self._prog_basename,
                                                 size=size,
                                                 text=self.opts.text)
                    self._prog_running = True
                    self.opts.progress_obj.update(self._amount_read)

            self._amount_read += len(buf)
            self.fo.write(buf)
            return len(buf)
        except KeyboardInterrupt:
            return -1

Note how "size" and "text" are passed as named arguments instead of positional arguments?

The correct code should be:

                    self.opts.progress_obj.start(self._prog_reportname,
                                                 urllib.unquote(self.url),
                                                 self._prog_basename,
                                                 size,
                                                 self.opts.text)

This would correctly match the documentation.

Alternative would be to use length=size and not size=size in the code as the documentation shows the parameter name as "length", not "size".

Andrew (andrew-rw-robinson) wrote :

Work-around is to define the start method as:
  def start(self, filename, url, basename, size, text):
    length = size
    ...
instead of:
  def start(self, filename, url, basename, length, text):
    ...

Andrew (andrew-rw-robinson) wrote :

Another work-around that should work with the correct code and the broken code:

  def start(self, filename, url, basename, *args, **kwargs):
    if len(args) >= 1:
      length = args[0]
      if len(args) >= 2:
        text = args[1]
      else:
        text = kwargs.get('text')
    else:
      length = kwargs.get('size') if kwargs.has_key('size') else kwargs.get('length')
      text = kwargs.get('text')

Andrew (andrew-rw-robinson) wrote :

Since there is no activity from the maintainer yet, I filed an upstream bug:

https://bugzilla.redhat.com/show_bug.cgi?id=702457

tags: added: patch
Daniel T Chen (crimsun) on 2011-07-20
Changed in urlgrabber (Ubuntu):
status: New → Triaged
Daniel T Chen (crimsun) wrote :

Upstream has decided to fix the documentation instead; see changeset 674d545ee303aa99701ffb982536851572d8db77.

Changed in urlgrabber (Ubuntu):
status: Triaged → Won't Fix
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.