Recommended packages are excessive

Bug #662423 reported by mlissner
This bug affects 1 person
Affects: python-scrapy (Ubuntu)
Status: In Progress
Importance: Undecided
Assigned to: Ignace Mouzannar

Bug Description

Binary package hint: python-scrapy

I just ran sudo aptitude install python-scrapy, and without paying much attention also got a number of totally unrelated packages, such as python-django and python-guppy.

I know what django is, but I don't know what python-guppy is, and I shouldn't need to in order to install python-scrapy.

My understanding of aptitude is that it only installs what's necessary for a package, and in my opinion, this shouldn't include a bunch of unrelated cruft.

If I'm wrong, please close, and I'll look into it further on my end, but if not, the python-scrapy recommended packages should be revised as soon as possible.

Pablo Hoffman (pablohoffman) wrote :

The Scrapy version in the Ubuntu/Debian python-scrapy package is quite old. I suggest you use the Ubuntu packages provided by the Scrapy project instead: http://doc.scrapy.org/topics/ubuntu.html (APT repo available too).

It would be great if we could synchronize the Debian package provided by the Scrapy project with the one available in Ubuntu/Debian, but I'm not sure what the procedure for that is.

Ignace Mouzannar (ghantoos) wrote :

Hello,

python-guppy is recommended by python-scrapy because one of its files tries to import it [1]:
/
| try:
|     import guppy
|     hpy = guppy.hpy()
| except ImportError:
|     hpy = None
\

As for the outdated package in Debian/Ubuntu, that is due to the Debian freeze. I have uploaded version 0.9 of python-scrapy to experimental, as it cannot go into squeeze (due to the freeze). I will work on packaging the latest version and upload it to experimental (Debian). Then a synchronization request can be filed on Launchpad asking the Ubuntu teams to review the package and upload it to the Ubuntu repositories.

If you need any help on this part, I would be glad to give you a hand.

Cheers,
 Ignace M

[1] http://hg.scrapy.org/scrapy/file/8ffd63e657f0/scrapy/telnet.py
[2] http://packages.qa.debian.org/p/python-scrapy/news/20100823T164725Z.html

Ignace Mouzannar (ghantoos) wrote :

I am closing this bug and setting it to "Invalid". Please reopen it if you find it necessary.

Regards,
 Ignace M

Changed in python-scrapy (Ubuntu):
status: New → Invalid
assignee: nobody → Ignace Mouzannar (ghantoos)

mlissner (mlissner-michaeljaylissner) wrote :

I see that guppy is imported there...but what about requiring python-django? Surely, a scraping utility doesn't need django installed, right?

Changed in python-scrapy (Ubuntu):
status: Invalid → New

Ignace Mouzannar (ghantoos) wrote :

A simple grep shows that it does:
----
$ grep -R django scrapy/
scrapy/tests/test_djangoitem/models.py:from django.db import models
scrapy/tests/test_djangoitem/models.py: app_label = 'test_djangoitem'
scrapy/tests/test_djangoitem/__init__.py:from scrapy.contrib_exp.djangoitem import DjangoItem, Field
scrapy/tests/test_djangoitem/__init__.py:os.environ['DJANGO_SETTINGS_MODULE'] = 'scrapy.tests.test_djangoitem.settings'
scrapy/tests/test_djangoitem/__init__.py: import django
scrapy/tests/test_djangoitem/__init__.py: django = None
scrapy/tests/test_djangoitem/__init__.py:if django:
scrapy/tests/test_djangoitem/__init__.py: django_model = Person
scrapy/tests/test_djangoitem/__init__.py: if not django:
scrapy/contrib_exp/djangoitem.py: if cls.django_model:
scrapy/contrib_exp/djangoitem.py: cls._model_meta = cls.django_model._meta
scrapy/contrib_exp/djangoitem.py: django_model = None
scrapy/contrib_exp/djangoitem.py: model = self.django_model(**modelargs)
----
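
The grep above matches every occurrence of the string, including names such as django_model. As a rough sketch (not part of the package), the following lists only real import statements using the standard ast module. The function name files_importing is made up for illustration, and files that do not parse under the running Python version are simply skipped.

```python
import ast
import os


def files_importing(root, module):
    """Return sorted (path, lineno) pairs for absolute imports of `module` under `root`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".py"):
                continue
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8") as handle:
                    tree = ast.parse(handle.read(), filename=path)
            except (SyntaxError, UnicodeDecodeError, ValueError):
                # Unparseable file (for example Python 2-only syntax): skipped in this sketch.
                continue
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    names = [alias.name for alias in node.names]
                elif isinstance(node, ast.ImportFrom) and node.level == 0:
                    names = [node.module or ""]
                else:
                    continue
                if any(n == module or n.startswith(module + ".") for n in names):
                    hits.append((path, node.lineno))
    return sorted(hits)
```

Calling files_importing("scrapy/", "django") on a source checkout would then report only the files that actually import django, which helps decide whether the dependency is needed at runtime or only by tests.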

However, it might be interesting to 'suggest' it instead of 'recommending' it. I'll work on this in the next release.

Cheers,
 Ignace M

Changed in python-scrapy (Ubuntu):
status: New → In Progress

Pablo Hoffman (pablohoffman) wrote :

Scrapy contains a lot of additional code in scrapy.contrib that is extra and not required to run the framework. boto is another example of such a dependency, and it is even listed as depends (not recommends). The only 3 depends are: python-libxml2, python-twisted, python-openssl (this is optional but highly recommended and it's listed as suggests now).

I would list as recommended: python-pygments, python-imaging (used by a very popular contrib - the images pipeline), and maybe python-guppy (or perhaps in suggests).

Packages I can't decide whether to put in suggests or recommends:

* ipython (supports an improved shell console - we should probably do the same as django?)
* python-boto (enables S3 backend for images pipeline)
* python-pygments (supports colorized output in scrapy parse command)
* python-guppy (enhances memory debugging capabilities from the telnet console)

python-django doesn't fit in either recommends or suggests, IMO.
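
To make the "extras" idea concrete, here is a minimal sketch that reports which of the optional modules discussed above are importable on a given system. The OPTIONAL_FEATURES table and the available_features name are illustrative only, and the import names (pygments, boto, guppy, IPython) are assumed to be the ones these packages provide.

```python
import importlib.util

# Import name -> feature it enables, as described in the comment above.
# The import names are assumptions made for this sketch.
OPTIONAL_FEATURES = {
    "IPython": "improved shell console",
    "boto": "S3 backend for the images pipeline",
    "pygments": "colorized output in the scrapy parse command",
    "guppy": "memory debugging from the telnet console",
}


def available_features(features=OPTIONAL_FEATURES, find_spec=importlib.util.find_spec):
    """Map each top-level module name to True if it can be imported, else False."""
    return {name: find_spec(name) is not None for name in features}


if __name__ == "__main__":
    for name, present in available_features().items():
        state = "available" if present else "missing"
        print(f"{name}: {state} ({OPTIONAL_FEATURES[name]})")
```

A module that is missing here only disables the matching feature, which is the usual argument for listing the package under recommends or suggests rather than depends.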

Ignace Mouzannar (ghantoos) wrote :

Thanks Pablo for your input!

I will take all of your remarks into account in the upcoming package.

Regards,
 Ignace M

mlissner (mlissner-michaeljaylissner) wrote :

iPython shouldn't be included. It's a tool that's useful for Python developers. Surely, if you installed a scraping utility, you wouldn't expect typing the word python into a console to change, right? I'd be dumbfounded if that happened to me and I wasn't watching carefully what I was installing.

Principle of least surprise.

Pablo Hoffman (pablohoffman) wrote :

mlissner, bear in mind that installing ipython doesn't replace the "python" command, but adds an "ipython" command instead. However, applications can optionally choose to embed an ipython shell (when ipython is available) and otherwise fall back to the regular python shell. This is what Django and Scrapy do.

It could still affect other applications that follow this behaviour, but it's not as critical as replacing a console command.
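
The fallback behaviour described above can be sketched roughly as follows. This is not the code Django or Scrapy actually use; pick_shell and start_shell are made-up names, and the sketch assumes a modern IPython that exposes start_ipython.

```python
import code
import importlib.util


def pick_shell(find_spec=importlib.util.find_spec):
    """Return "ipython" if the IPython module is importable, else "python"."""
    return "ipython" if find_spec("IPython") is not None else "python"


def start_shell(namespace=None):
    """Open an interactive shell, preferring IPython when it is installed."""
    namespace = dict(namespace or {})
    if pick_shell() == "ipython":
        # Optional dependency: imported only after we know it is present.
        from IPython import start_ipython
        start_ipython(argv=[], user_ns=namespace)
    else:
        # Plain interactive console from the standard library.
        code.interact(local=namespace)
```

Either way the "python" command itself is untouched; only applications that make this kind of check change behaviour once ipython is installed.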

Pablo Hoffman (pablohoffman) wrote :

Ignace, thanks for taking care of this. We have just released Scrapy 0.14, so you may want to bump the version as part of the changes for the upcoming package. Bear in mind that Scrapy now depends on a separate package: w3lib, so you would have to add a python-w3lib package as well.

Perhaps this is a good time to synchronize the official Ubuntu package with the official Scrapy package [1]. But it's probably worth creating a separate ticket to discuss that.

[1] https://github.com/scrapy/scrapy/tree/master/debian

mlissner (mlissner-michaeljaylissner) wrote :

Pablo, true, but at least in Ubuntu, when you install iPython, it replaces the python command with itself, I think. Maybe that's just the django shell...not positive.

Ignace Mouzannar (ghantoos) wrote : Re: [Bug 662423] Re: Recommended packages are excessive

On Thu, Nov 17, 2011 at 15:21, Pablo Hoffman <email address hidden> wrote:
> Ignace, thanks for taking care of this. We have just released Scrapy
> 0.14, so you may want to bump the version as part of the changes for the
> upcoming package. Bear in mind that Scrapy now depends on a separate
> package: w3lib, so you would have to add a python-w3lib package as well.
>
> Perhaps this is a good time to synchronize Ubuntu official package with
> Scrapy official package [1].  But it's probably worth creating a
> separate ticket to discuss that.
>
> [1] https://github.com/scrapy/scrapy/tree/master/debian

Sure! I will work on it by the end of next week, I'll keep you posted.

Cheers,
 Ignace M
