Publish binary "wheel" packages to pypi (Accepted as PEP 427)

Bug #1176147 reported by Kenneth Knowles
This bug affects 5 people
Affects: lxml
Status: Opinion
Importance: Wishlist
Assigned to: scoder

Bug Description

The binary packaging format called "wheel" is now accepted as PEP 427 (alongside PEP 376) and lxml is a prime candidate. Since wheel is already supported by distribute ("python setup.py bdist_wheel") I presume this is very easy for those platforms that the lxml team has easy access to.

scoder (scoder) wrote :

Well, it's additional overhead and I don't see the advantage on Linux. It's best to target that platform with a source build, IMHO.

Changed in lxml:
importance: Undecided → Wishlist
status: New → Opinion

Kenneth Knowles (kenn-knowles) wrote :

My use case may actually be very common: I develop an open source project on GitHub that uses Travis CI. This means rebuilding the testing environment with each build, or else doing hacky things. Any open source Python project that depends on lxml could/should be doing something similar. By uploading a wheel you might save 2-10 minutes from each build, of which we certainly have dozens per day, so the overall savings for just typing "python setup.py bdist_wheel upload" could be huge.

Continuing to use a source distribution is *not* an option. The question is simply whether we can stick with the standard PyPI or have to roll something ourselves that will, unfortunately, not benefit the rest of the community.

scoder (scoder) wrote :

But what should the binaries look like?

Creating static binaries isn't an option on Linux because it would prevent users from upgrading libxml2/libxslt independently of lxml. Dynamically linked binaries don't work because both the API and the feature set of libxml2 change over time, so it would either fail to install for users or fail to take advantage of their specific local installation (including bug fixes and sometimes even security fixes).

A source build is really the best option, IMHO, and it may be the only one that actually works in practice.

BTW, lxml builds quite quickly when you disable all C compiler optimisations (-O0). There's really no reason not to do that in a CI environment.

Jan Vlčinský (jan-vlcinsky) wrote :

I have to second Kenneth: having an lxml distribution available in wheel format is a real time saver, and it is about to change the working style of thousands of Python app developers. Having lxml installed in a fraction of a second rather than minutes makes a real difference.

re: what the binaries should look like:
When a new lxml version is released, build the wheel using the wheel tool. Yes, it would be statically linked.

re: static build preventing users from upgrading libxml2/libxslt independently of lxml:
Simply not true. The option to build from source still remains, and anyone who is really interested can update that way.

re: dynamically linked binaries don't work:
True. But nobody is asking for dynamic linking; in fact, it could be a source of problems.

re: source build is really the best option:
Kenneth summarized the advantages very well. The overall savings would be really huge. Just count: measure the standard compile time, measure the sped-up compile time (which almost no one uses, since hardly anyone knows how to set this option in a standard application buildout procedure), and measure the time needed to install from a wheel. Take the difference and multiply by the number of developers, or even the number of builds.
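
To make that calculation concrete, here is a minimal sketch in Python. All numbers are placeholder assumptions, not measurements; plug in your own compile time, wheel install time, and build count. The function name is made up for this example.

```python
def minutes_saved_per_day(compile_min, wheel_install_min, builds_per_day):
    """Minutes saved per day when each build installs a wheel instead of compiling."""
    return (compile_min - wheel_install_min) * builds_per_day


# Assumed, illustrative inputs (not measured): a 5 minute source build,
# a 0.1 minute wheel install, and 24 builds per day.
saved = minutes_saved_per_day(compile_min=5.0, wheel_install_min=0.1, builds_per_day=24)
print(f"roughly {saved:.0f} minutes saved per day")
```

Running the same function once with the standard compile time and once with the sped-up compile time shows how much of the gap the faster source build already closes.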

re: lxml builds quite quickly:
I do not know what is meant by "quite quickly". Is it quicker than installing from a wheel?
I wanted to test compilation without optimization, but the lxml documentation does not currently show an easy way to do it; it points me to a special page dealing with this topic. I am not a C programmer, though. I am developing Python applications and want to have lxml integrated into my test and build environment efficiently.
And even more important: why should I switch off optimization? Am I supposed to run CI tests using a differently compiled component than in production?

There are pros and cons to having a wheel format for lxml. They should be evaluated and compared. Please do so from the user's point of view.

Milan Holub (milan-holub-z) wrote :

I'd like to vote for lxml being packaged as a wheel. The reasons were already mentioned in previous comments, especially the time savings in CI environments.

scoder (scoder) wrote :

I don't buy the CI argument. For that use case, I'm sure you'd rather keep packages locally than depend on PyPI being up and fast in the first place. And if you keep your packages locally anyway, you can just as well keep your own binary package around, instead of relying on some externally provided binary build to match whatever your code will actually find in production.

The problem here is not so much that lxml can't be provided as a binary package. It's rather that I consider it a bad idea to provide binary packages with a specific vanilla version of a statically linked libxml2/libxslt, rather than whatever your operating system distribution provides as a security-patched installation of those libraries. And providing dynamically linked binary packages is just screaming for hassle.

Kenneth Knowles (kenn-knowles) wrote :

There is no possibility of keeping packages "locally" as this concept does not really exist in our setting. Travis CI (and other hosted CI services) provide a fresh and clean VM with each build. This is superior in terms of finding bugs in a project's packaging and installation process, and corresponds to an effective way of deploying a cloud service, so artifacts of prior installs cannot interfere with an updated version.

However, because of lxml and a variety of other such large libraries, the rather monolithic project I am working on has resorted to pre-building a virtualenv compatible with Travis, which we then clone to Travis for each build. We lose the testing of our packaging, but gain a more fluid workflow by skipping the install time. A good tradeoff today, but one that I hope will not be permanent.

scoder (scoder) wrote :

For clarification, I added a paragraph to the docs (on GitHub) that shows how to set the CFLAGS in a test environment. For example, you can speed up the build by installing lxml like this:

    CFLAGS=-O0 pip install lxml

And no, I don't see a problem with using different C compiler flags for a test environment and production.
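
If the test environment is set up from a Python script rather than a shell, the same setting can be passed through the environment of the pip subprocess. A minimal sketch, assuming pip is available for the running interpreter; the helper names `pip_install_command` and `install_unoptimised` are made up for this example:

```python
import os
import subprocess
import sys


def pip_install_command(package, cflags="-O0"):
    """Build the pip command and an environment with CFLAGS set (nothing is run)."""
    env = dict(os.environ)  # copy, so the parent process environment is untouched
    env["CFLAGS"] = cflags  # note: replaces any CFLAGS value already set
    cmd = [sys.executable, "-m", "pip", "install", package]
    return cmd, env


def install_unoptimised(package):
    """Run the install; roughly equivalent to `CFLAGS=-O0 pip install <package>`."""
    cmd, env = pip_install_command(package)
    subprocess.check_call(cmd, env=env)
```

Calling `install_unoptimised("lxml")` would then start the source build with C compiler optimisations disabled.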

scoder (scoder) wrote :

It seems that wheel uploads for any platform other than Windows are currently disallowed by PyPI. I can't build Windows packages myself (and the existing binary installers seem to work well enough there), so this ticket is currently a theoretical one.

Sascha Peilicke (saschpe) wrote :

Besides the valid (IMO) points that others raised already, wheels on PyPI have other advantages. Currently, I need to have gcc / libxml2-devel and libxslt-devel available on my system in order to be able to "pip install lxml". While I can fix that on my workstation, it's rather annoying in other environments (including CI). Since external dependencies can't be handled by pip/setuptools, I'm left to whatever the system has to offer. On Fedora, I need to use yum and openSUSE has zypper. However, if lxml wheels were published, pip would prefer those and skip the compilation altogether, hence avoiding the need for gcc & co.

And "pip install ..." becomes much faster. This is a valid concern if you run it several thousand times a day (ok, CI again).

scoder (scoder) wrote :

I understand and see the advantages of wheels. However, the disadvantages of this format for binary packages that depend on external libraries are also clear from my POV.

Plus, there is no guarantee that a wheel built for, say, CPython 3.3.4 works for an installation of CPython 3.3.0. It usually should, but it may not, in some cases. I also don't see a way to specify that "this wheel requires glibc 2.65 and thus won't work on your Ubuntu 12.04 LTS, sorry".

Dealing with these kinds of tiny incompatibilities is definitely not what I want to invest my time in.
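
To illustrate the granularity problem: the compatibility information of a wheel lives in its file name, which PEP 427 defines as `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. The sketch below splits such a name into its parts. The function name and the example file name are hypothetical, chosen only for illustration.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into the components defined by PEP 427."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None  # the build tag is optional
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of components: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# Hypothetical file name for a CPython 3.3 build on 64-bit Linux.
tags = parse_wheel_filename("lxml-3.3.5-cp33-cp33m-linux_x86_64.whl")
print(tags["python"], tags["abi"], tags["platform"])
```

The python tag `cp33` names only the major and minor version, so a build made on 3.3.4 and one made on 3.3.0 carry the same tag, and a plain `linux_x86_64` platform tag says nothing about the C library version.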

Sascha Peilicke (saschpe) wrote :

I see. I guess if patch-level Python releases break ABI, that's an upstream bug. But it can't be guaranteed that this won't happen, true. For external libraries, I would have to check if and how wheel supports specifying the SONAMEs needed.

scoder (scoder) wrote :

> I guess if patch-level Python releases break ABI, that's an upstream bug

It's not about breaking the ABI so much as about bugs getting fixed in CPython point releases, for example. Cython generated C code will typically stop working around them if you build on a later version, so that the work-around will be gone if you then try to run the binary module on an older point version that has the bug. It's this kind of subtle issue that I'd hate to push users into (and thus myself, because if they come complaining, then it takes time on both sides to debug it).

anatoly techtonik (techtonik) wrote :

This is sad - https://stackoverflow.com/questions/5178416/pip-install-lxml-error

Is it possible to provide wheels for those users who need it and leave source for everybody else?

scoder (scoder) wrote :

Setting to "won't fix" to reduce further discussion (although it may not have an impact on people who do not read the rest of the ticket before they reply).

Changed in lxml:
assignee: nobody → scoder (scoder)
status: Opinion → Won't Fix

anatoly techtonik (techtonik) wrote :

The tl;dr problem can be alleviated with a proper summary of the reasons in the ticket-closing message.

Markus Unterwaditzer (untitaker) wrote :

It would be nice if lxml provided wheels specifically targeted at Ubuntu 12.04, because that's what Travis is using. It would help avoid having to rebuild lxml, which is time consuming even with -O0.

Danilo Bargen (gwrtheyrn) wrote :

I don't think CI is a big issue; -O0 reduces Travis CI time by about 2 minutes.

On the other hand, installing lxml on Windows is a *huge* pain (just take a look at http://stackoverflow.com/search?q=windows+lxml). Maybe you or someone else could provide wheels for Windows, either on PyPI or for manual download? The current binary distributions (.exe) are hard to install in a virtualenv.

Hugo (hugovk) wrote :

-O0 is much quicker, but wheels would be nice.

See also http://pythonwheels.com/ -- lxml is the 6th most-downloaded package on PyPI and the second most-downloaded package without a wheel.

scoder (scoder) wrote :

I uploaded wheels for lxml 3.3.5 that I generated from the Windows installers with "wheel convert". I cannot test them myself, so please verify that they install correctly and work as expected.

Relevant tickets for Windows builds:

https://github.com/zopefoundation/zope.wineggbuilder/issues/2

https://github.com/zopefoundation/zope.wineggbuilder/issues/3

https://github.com/zopefoundation/zope.wineggbuilder/issues/7

scoder (scoder) wrote :

Changing back to "Opinion". If you can help out with the Windows build setup, please refer to the tickets I posted above. I think it's pretty much clear what needs to be done; it only needs a bit of work and testing. Please do not lament here if you are unwilling to invest this time yourself.

Changed in lxml:
status: Won't Fix → Opinion

jfs (jfs+lp) wrote :

I don't know whether it is relevant here but I can install numpy on Ubuntu without compilation via `pip install numpy` (it uses a binary wheel even on Linux).

numpy provides binary wheels for Windows, OS X, (some) Linux: https://pypi.python.org/pypi/numpy
