Importing lxml.etree fails on pypy

Bug #1273709 reported by Choongmin Lee
This bug affects 2 people
Affects: lxml
Status: Fix Released
Importance: Low
Assigned to: scoder

Bug Description

I checked it on Ubuntu 12.04.4 LTS and Mac OS X 10.9.1.

On Ubuntu 12.04.4 LTS, I downloaded pypy 2.2.1 64-bit binary and made a virtualenv with pypy.

Then installed lxml using pip:

    $ pip install lxml
    Downloading/unpacking lxml
      Downloading lxml-3.3.0.tar.gz (3.4MB): 3.4MB downloaded
      Running setup.py (path:/home/choongmin/.virtualenvs/pypy/build/lxml/setup.py) egg_info for package lxml
        /opt/pypy-2.2.1-linux64/lib-python/2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
          warnings.warn(msg)
        Building lxml version 3.3.0.
        Building without Cython.
        Using build configuration of libxslt 1.1.26
        Building against libxml2/libxslt in the following directory: /usr/lib/x86_64-linux-gnu

        warning: no previously-included files found matching '*.py'
    Installing collected packages: lxml
      Running setup.py install for lxml
        /opt/pypy-2.2.1-linux64/lib-python/2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
          warnings.warn(msg)
        Building lxml version 3.3.0.
        Building without Cython.
        Using build configuration of libxslt 1.1.26
        Building against libxml2/libxslt in the following directory: /usr/lib/x86_64-linux-gnu
        warning: build_py: byte-compiling is disabled, skipping.

        building 'lxml.etree' extension
        cc -O2 -fPIC -Wimplicit -I/usr/include/libxml2 -I/home/choongmin/.virtualenvs/pypy/build/lxml/src/lxml/includes -I/home/choongmin/.virtualenvs/pypy/include -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -w
        cc -shared build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -L/usr/lib/x86_64-linux-gnu -lxslt -lexslt -lxml2 -lz -lm -o build/lib.linux-x86_64-2.7/lxml/etree.pypy-22.so
        building 'lxml.objectify' extension
        cc -O2 -fPIC -Wimplicit -I/usr/include/libxml2 -I/home/choongmin/.virtualenvs/pypy/build/lxml/src/lxml/includes -I/home/choongmin/.virtualenvs/pypy/include -c src/lxml/lxml.objectify.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.objectify.o -w
        cc -shared build/temp.linux-x86_64-2.7/src/lxml/lxml.objectify.o -L/usr/lib/x86_64-linux-gnu -lxslt -lexslt -lxml2 -lz -lm -o build/lib.linux-x86_64-2.7/lxml/objectify.pypy-22.so
        warning: install_lib: byte-compiling is disabled, skipping.

    Successfully installed lxml
    Cleaning up...

Then when I try to import lxml.etree, it raises ImportError:

    $ python
    Python 2.7.3 (87aa9de10f9c, Nov 24 2013, 18:48:13)
    [PyPy 2.2.1 with GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    And now for something completely different: ``if switzerland were where greece
    is (on islands) would they all be connected by bridges?''
    >>>> import lxml.etree
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: unable to load extension module '/home/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so': /home/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so: undefined symbol: PyUnicode_Concat
    >>>>

When I do the same with CPython 2.7.3, it works fine. Version info:

    Python : sys.version_info(major=2, minor=7, micro=3, releaselevel='final', serial=0)
    lxml.etree : (3, 3, 0, 0)
    libxml used : (2, 7, 8)
    libxml compiled : (2, 7, 8)
    libxslt used : (1, 1, 26)
    libxslt compiled : (1, 1, 26)

A similar failure occurs on Mac OS X 10.9.1. On OS X, I installed pypy, libxml2, and libxslt using Homebrew. Installing lxml:

    $ pip install lxml
    Downloading/unpacking lxml
      Downloading lxml-3.3.0.tar.gz (3.4MB): 3.4MB downloaded
      Running setup.py (path:/Users/choongmin/.virtualenvs/pypy/build/lxml/setup.py) egg_info for package lxml
        /usr/local/Cellar/pypy/2.2.1/libexec/lib-python/2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
          warnings.warn(msg)
        Building lxml version 3.3.0.
        Building without Cython.
        Using build configuration of libxslt 1.1.28

        warning: no previously-included files found matching '*.py'
    Installing collected packages: lxml
      Running setup.py install for lxml
        /usr/local/Cellar/pypy/2.2.1/libexec/lib-python/2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
          warnings.warn(msg)
        Building lxml version 3.3.0.
        Building without Cython.
        Using build configuration of libxslt 1.1.28
        warning: build_py: byte-compiling is disabled, skipping.

        building 'lxml.etree' extension
        cc -O2 -fPIC -Wimplicit -arch i386 -arch x86_64 -I/usr/include/libxml2 -I/Users/choongmin/.virtualenvs/pypy/build/lxml/src/lxml/includes -I/Users/choongmin/.virtualenvs/pypy/include -c src/lxml/lxml.etree.c -o build/temp.macosx-10.9-x86_64-2.7/src/lxml/lxml.etree.o -w -flat_namespace
        cc -shared -undefined dynamic_lookup -arch i386 -arch x86_64 build/temp.macosx-10.9-x86_64-2.7/src/lxml/lxml.etree.o -lxslt -lexslt -lxml2 -lz -lm -o build/lib.macosx-10.9-x86_64-2.7/lxml/etree.pypy-22.so
        building 'lxml.objectify' extension
        cc -O2 -fPIC -Wimplicit -arch i386 -arch x86_64 -I/usr/include/libxml2 -I/Users/choongmin/.virtualenvs/pypy/build/lxml/src/lxml/includes -I/Users/choongmin/.virtualenvs/pypy/include -c src/lxml/lxml.objectify.c -o build/temp.macosx-10.9-x86_64-2.7/src/lxml/lxml.objectify.o -w -flat_namespace
        cc -shared -undefined dynamic_lookup -arch i386 -arch x86_64 build/temp.macosx-10.9-x86_64-2.7/src/lxml/lxml.objectify.o -lxslt -lexslt -lxml2 -lz -lm -o build/lib.macosx-10.9-x86_64-2.7/lxml/objectify.pypy-22.so
        warning: install_lib: byte-compiling is disabled, skipping.

    Successfully installed lxml
    Cleaning up...

Then import lxml.etree:

    $ python
    Python 2.7.3 (87aa9de10f9c, Nov 24 2013, 20:57:21)
    [PyPy 2.2.1 with GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    And now for something completely different: ``and now for something completely
    different''
    >>>> import lxml
    >>>> lxml.etree
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'module' object has no attribute 'etree'
    >>>> import lxml.etree
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: unable to load extension module '/Users/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so': dlopen(/Users/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so, 6): Symbol not found: _PyByteArray_AS_STRING
      Referenced from: /Users/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so
      Expected in: flat namespace
     in /Users/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so
    >>>>
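
The two tracebacks above name the missing symbol in different formats: the Linux loader prints `undefined symbol: NAME`, while the OS X loader prints `Symbol not found: _NAME` (Mach-O prefixes C symbols with an underscore). A minimal sketch for pulling the symbol name out of either message follows. The helper name `missing_symbol` is hypothetical, and the patterns only cover the two message formats quoted in this report.

```python
import re

# Patterns for the two loader error formats quoted in this report.
_PATTERNS = (
    re.compile(r"undefined symbol: (\w+)"),    # Linux loader message
    re.compile(r"Symbol not found: _?(\w+)"),  # OS X loader message; drop the Mach-O "_" prefix
)


def missing_symbol(message):
    """Return the C symbol named in a loader error message, or None."""
    for pattern in _PATTERNS:
        match = pattern.search(message)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    # Messages abbreviated from the tracebacks in this report.
    print(missing_symbol("etree.pypy-22.so: undefined symbol: PyUnicode_Concat"))
    print(missing_symbol("Symbol not found: _PyByteArray_AS_STRING"))
```

Passing `str(exc)` from a caught `ImportError` to this helper should give the name of the C-API function that the interpreter does not export, which is the piece of information needed to look for a workaround.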

It works with CPython 2.7.6. This is the version information on OS X:

    Python : sys.version_info(major=2, minor=7, micro=6, releaselevel='final', serial=0)
    lxml.etree : (3, 3, 0, 0)
    libxml used : (2, 9, 0)
    libxml compiled : (2, 9, 0)
    libxslt used : (1, 1, 28)
    libxslt compiled : (1, 1, 28)
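
Version listings like the ones above can be produced with a short script. The sketch below is one possible way to do it, not necessarily the script used for this report. It relies on the version tuples that `lxml.etree` exposes (`LXML_VERSION`, `LIBXML_VERSION`, `LIBXML_COMPILED_VERSION`, `LIBXSLT_VERSION`, `LIBXSLT_COMPILED_VERSION`). The function name `collect_versions` and the extra "Implementation" row are assumptions added for illustration. The import is guarded so that the script still reports the interpreter when `lxml.etree` cannot be loaded.

```python
import platform
import sys


def collect_versions():
    """Gather interpreter and, if importable, lxml/libxml2/libxslt versions."""
    info = [
        ("Implementation", platform.python_implementation()),
        ("Python", tuple(sys.version_info[:3])),
    ]
    try:
        from lxml import etree
    except ImportError as exc:
        # On an affected PyPy build, this branch records the loader error.
        info.append(("lxml.etree", "import failed: %s" % exc))
        return info
    info.extend([
        ("lxml.etree", etree.LXML_VERSION),
        ("libxml used", etree.LIBXML_VERSION),
        ("libxml compiled", etree.LIBXML_COMPILED_VERSION),
        ("libxslt used", etree.LIBXSLT_VERSION),
        ("libxslt compiled", etree.LIBXSLT_COMPILED_VERSION),
    ])
    return info


if __name__ == "__main__":
    for label, value in collect_versions():
        print("%-18s: %s" % (label, value))
```

The "Implementation" row distinguishes CPython from PyPy, which matters here because both interpreters report Python 2.7.x.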

scoder (scoder) wrote :

PyPy isn't currently tested, mostly because I don't have a CI server with a running PyPy installation (nor do I really care much about PyPy myself).

In any case, PyPy's C-API emulation isn't great (to put it *really* kindly), so this doesn't come as a surprise. I've pushed a couple of fixes into Cython but can't test them. One is this:

"""
     } else
 #endif /* __PYX_DEFAULT_STRING_ENCODING_IS_ASCII || __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT */

+#if !CYTHON_COMPILING_IN_PYPY
 #if PY_VERSION_HEX >= 0x02060000
     if (PyByteArray_Check(o)) {
         *length = PyByteArray_GET_SIZE(o);
         return PyByteArray_AS_STRING(o);
     } else
 #endif
+#endif
     {
         char* result;
         int r = PyBytes_AsStringAndSize(o, &result, length);
"""

(i.e. add another preprocessor condition around the Py2.6 test), and the other is to replace all occurrences of "PyUnicode_Concat" with "PyNumber_Add" in the code. Both of these need to be done in lxml.etree.c and lxml.objectify.c.

Obviously, the correct place to fix these would be PyPy, but at least the work-arounds aren't difficult to implement.
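
The second work-around (the rename) can be applied mechanically to the generated C files. The following is a minimal sketch, assuming a plain textual substitution is sufficient. The function names `replace_unicode_concat` and `patch_file` are hypothetical and not part of lxml or Cython. The first work-around, the `CYTHON_COMPILING_IN_PYPY` guard, still has to be added by hand as shown in the diff above.

```python
import re

# Whole-identifier match, so longer names that merely contain
# "PyUnicode_Concat" are left untouched.
_CONCAT = re.compile(r"\bPyUnicode_Concat\b")


def replace_unicode_concat(c_source):
    """Return (patched_source, replacement_count) for C source text."""
    return _CONCAT.subn("PyNumber_Add", c_source)


def patch_file(path):
    """Patch the file at `path` in place; return the number of replacements."""
    with open(path, "rb") as handle:
        # latin-1 maps every byte to one character, so the decode/encode
        # round trip preserves the file whatever its real encoding is.
        original = handle.read().decode("latin-1")
    patched, count = replace_unicode_concat(original)
    if count:
        with open(path, "wb") as handle:
            handle.write(patched.encode("latin-1"))
    return count


if __name__ == "__main__":
    sample = "result = PyUnicode_Concat(left, right);"
    print(replace_unicode_concat(sample)[0])
```

Going by the compiler commands in the build log, the files to patch would be `src/lxml/lxml.etree.c` and `src/lxml/lxml.objectify.c` in the unpacked source tree. Note that `PyNumber_Add` performs a generic `+`, so it accepts operand types that `PyUnicode_Concat` would reject. For two unicode operands the result is the same concatenation.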

scoder (scoder) wrote :

I've uploaded a source distribution here; it would be nice if you could test it.

http://lxml.de/files/lxml-3.3.1pre.tar.gz

scoder (scoder) wrote :

Works for me now.

Changed in lxml:
assignee: nobody → scoder (scoder)
importance: Undecided → Low
milestone: none → 3.3
status: New → Fix Committed

Choongmin Lee (choongmin) wrote :

It works for me too. Thanks!

scoder (scoder) wrote :

Fixed in lxml 3.3.1.

Changed in lxml:
status: Fix Committed → Fix Released