Importing lxml.etree fails on pypy

Bug #1273709 reported by Choongmin Lee on 2014-01-28
This bug affects 2 people

Affects: lxml
Status: Fix Released
Importance: Low
Assigned to: scoder
Milestone: 3.3

Bug Description

I reproduced this on Ubuntu 12.04.4 LTS and Mac OS X 10.9.1.

On Ubuntu 12.04.4 LTS, I downloaded pypy 2.2.1 64-bit binary and made a virtualenv with pypy.

Then installed lxml using pip:

    $ pip install lxml
    Downloading/unpacking lxml
      Downloading lxml-3.3.0.tar.gz (3.4MB): 3.4MB downloaded
      Running setup.py (path:/home/choongmin/.virtualenvs/pypy/build/lxml/setup.py) egg_info for package lxml
        /opt/pypy-2.2.1-linux64/lib-python/2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
          warnings.warn(msg)
        Building lxml version 3.3.0.
        Building without Cython.
        Using build configuration of libxslt 1.1.26
        Building against libxml2/libxslt in the following directory: /usr/lib/x86_64-linux-gnu

        warning: no previously-included files found matching '*.py'
    Installing collected packages: lxml
      Running setup.py install for lxml
        /opt/pypy-2.2.1-linux64/lib-python/2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
          warnings.warn(msg)
        Building lxml version 3.3.0.
        Building without Cython.
        Using build configuration of libxslt 1.1.26
        Building against libxml2/libxslt in the following directory: /usr/lib/x86_64-linux-gnu
        warning: build_py: byte-compiling is disabled, skipping.

        building 'lxml.etree' extension
        cc -O2 -fPIC -Wimplicit -I/usr/include/libxml2 -I/home/choongmin/.virtualenvs/pypy/build/lxml/src/lxml/includes -I/home/choongmin/.virtualenvs/pypy/include -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -w
        cc -shared build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -L/usr/lib/x86_64-linux-gnu -lxslt -lexslt -lxml2 -lz -lm -o build/lib.linux-x86_64-2.7/lxml/etree.pypy-22.so
        building 'lxml.objectify' extension
        cc -O2 -fPIC -Wimplicit -I/usr/include/libxml2 -I/home/choongmin/.virtualenvs/pypy/build/lxml/src/lxml/includes -I/home/choongmin/.virtualenvs/pypy/include -c src/lxml/lxml.objectify.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.objectify.o -w
        cc -shared build/temp.linux-x86_64-2.7/src/lxml/lxml.objectify.o -L/usr/lib/x86_64-linux-gnu -lxslt -lexslt -lxml2 -lz -lm -o build/lib.linux-x86_64-2.7/lxml/objectify.pypy-22.so
        warning: install_lib: byte-compiling is disabled, skipping.

    Successfully installed lxml
    Cleaning up...

Then when I try to import lxml.etree, it raises ImportError:

    $ python
    Python 2.7.3 (87aa9de10f9c, Nov 24 2013, 18:48:13)
    [PyPy 2.2.1 with GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    And now for something completely different: ``if switzerland were where greece
    is (on islands) would they all be connected by bridges?''
    >>>> import lxml.etree
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: unable to load extension module '/home/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so': /home/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so: undefined symbol: PyUnicode_Concat
    >>>>

When I do the same for python 2.7.3, it works fine. Version info:

    Python : sys.version_info(major=2, minor=7, micro=3, releaselevel='final', serial=0)
    lxml.etree : (3, 3, 0, 0)
    libxml used : (2, 7, 8)
    libxml compiled : (2, 7, 8)
    libxslt used : (1, 1, 26)
    libxslt compiled : (1, 1, 26)
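
A listing like the one above can be produced from the version attributes that lxml.etree exposes (LXML_VERSION, LIBXML_VERSION, LIBXML_COMPILED_VERSION, LIBXSLT_VERSION, LIBXSLT_COMPILED_VERSION). The following is only a sketch: the helper name `version_report` and the exact line format are assumptions for illustration, not something taken from this report.

```python
import sys


def version_report(etree):
    """Build version lines from an already-imported lxml.etree module.

    `version_report` is a hypothetical helper name; the attributes it
    reads are the version tuples published by lxml.etree.
    """
    rows = [
        ("Python", sys.version_info),
        ("lxml.etree", etree.LXML_VERSION),
        ("libxml used", etree.LIBXML_VERSION),
        ("libxml compiled", etree.LIBXML_COMPILED_VERSION),
        ("libxslt used", etree.LIBXSLT_VERSION),
        ("libxslt compiled", etree.LIBXSLT_COMPILED_VERSION),
    ]
    return ["%s : %s" % (label, value) for label, value in rows]


if __name__ == "__main__":
    try:
        from lxml import etree
    except ImportError as exc:
        # On an affected PyPy install, this branch is the one that runs.
        print("lxml.etree could not be imported: %s" % exc)
    else:
        print("\n".join(version_report(etree)))
```

Comparing the "used" and "compiled" tuples is a quick way to rule out a libxml2/libxslt mismatch before suspecting the interpreter.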

Something similar happens on Mac OS X 10.9.1. On OS X, I installed pypy, libxml2, and libxslt using Homebrew. Installing lxml:

    $ pip install lxml
    Downloading/unpacking lxml
      Downloading lxml-3.3.0.tar.gz (3.4MB): 3.4MB downloaded
      Running setup.py (path:/Users/choongmin/.virtualenvs/pypy/build/lxml/setup.py) egg_info for package lxml
        /usr/local/Cellar/pypy/2.2.1/libexec/lib-python/2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
          warnings.warn(msg)
        Building lxml version 3.3.0.
        Building without Cython.
        Using build configuration of libxslt 1.1.28

        warning: no previously-included files found matching '*.py'
    Installing collected packages: lxml
      Running setup.py install for lxml
        /usr/local/Cellar/pypy/2.2.1/libexec/lib-python/2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
          warnings.warn(msg)
        Building lxml version 3.3.0.
        Building without Cython.
        Using build configuration of libxslt 1.1.28
        warning: build_py: byte-compiling is disabled, skipping.

        building 'lxml.etree' extension
        cc -O2 -fPIC -Wimplicit -arch i386 -arch x86_64 -I/usr/include/libxml2 -I/Users/choongmin/.virtualenvs/pypy/build/lxml/src/lxml/includes -I/Users/choongmin/.virtualenvs/pypy/include -c src/lxml/lxml.etree.c -o build/temp.macosx-10.9-x86_64-2.7/src/lxml/lxml.etree.o -w -flat_namespace
        cc -shared -undefined dynamic_lookup -arch i386 -arch x86_64 build/temp.macosx-10.9-x86_64-2.7/src/lxml/lxml.etree.o -lxslt -lexslt -lxml2 -lz -lm -o build/lib.macosx-10.9-x86_64-2.7/lxml/etree.pypy-22.so
        building 'lxml.objectify' extension
        cc -O2 -fPIC -Wimplicit -arch i386 -arch x86_64 -I/usr/include/libxml2 -I/Users/choongmin/.virtualenvs/pypy/build/lxml/src/lxml/includes -I/Users/choongmin/.virtualenvs/pypy/include -c src/lxml/lxml.objectify.c -o build/temp.macosx-10.9-x86_64-2.7/src/lxml/lxml.objectify.o -w -flat_namespace
        cc -shared -undefined dynamic_lookup -arch i386 -arch x86_64 build/temp.macosx-10.9-x86_64-2.7/src/lxml/lxml.objectify.o -lxslt -lexslt -lxml2 -lz -lm -o build/lib.macosx-10.9-x86_64-2.7/lxml/objectify.pypy-22.so
        warning: install_lib: byte-compiling is disabled, skipping.

    Successfully installed lxml
    Cleaning up...

Then import lxml.etree:

    $ python
    Python 2.7.3 (87aa9de10f9c, Nov 24 2013, 20:57:21)
    [PyPy 2.2.1 with GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    And now for something completely different: ``and now for something completely
    different''
    >>>> import lxml
    >>>> lxml.etree
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'module' object has no attribute 'etree'
    >>>> import lxml.etree
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: unable to load extension module '/Users/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so': dlopen(/Users/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so, 6): Symbol not found: _PyByteArray_AS_STRING
      Referenced from: /Users/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so
      Expected in: flat namespace
     in /Users/choongmin/.virtualenvs/pypy/site-packages/lxml/etree.pypy-22.so
    >>>>
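
The two import failures name the missing C-API symbol in different formats: the Linux loader prints "undefined symbol: NAME", while dyld on OS X prints "Symbol not found: _NAME", with the leading underscore that Mach-O puts in front of C symbols. A minimal sketch for pulling the symbol name out of either message follows. The helper name `missing_symbol` is hypothetical, and the patterns only cover the two message shapes shown in this report.

```python
import re

# Patterns modelled on the two ImportError messages in this report.
# Other platforms or loader versions may word the error differently.
_LINUX = re.compile(r"undefined symbol: (\w+)")
_DARWIN = re.compile(r"Symbol not found: _?(\w+)")


def missing_symbol(message):
    """Return the C symbol named in a loader error message, or None."""
    for pattern in (_LINUX, _DARWIN):
        match = pattern.search(message)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    try:
        import lxml.etree  # noqa: F401
    except ImportError as exc:
        print("missing C-API symbol: %s" % missing_symbol(str(exc)))
```

Applied to the messages above, this yields PyUnicode_Concat on Ubuntu and PyByteArray_AS_STRING on OS X, so the two platforms stop at different missing symbols.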

It works with python 2.7.6. This is the version information on OS X:

    Python : sys.version_info(major=2, minor=7, micro=6, releaselevel='final', serial=0)
    lxml.etree : (3, 3, 0, 0)
    libxml used : (2, 9, 0)
    libxml compiled : (2, 9, 0)
    libxslt used : (1, 1, 28)
    libxslt compiled : (1, 1, 28)

scoder (scoder) wrote :

PyPy isn't currently tested, mostly because I don't have a CI server with a running PyPy installation (nor do I really care much about PyPy myself).

In any case, PyPy's C-API emulation isn't great (to put it *really* kindly), so this doesn't come as a surprise. I've pushed a couple of fixes into Cython but can't test them. One is this:

"""
     } else
 #endif /* __PYX_DEFAULT_STRING_ENCODING_IS_ASCII || __PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT */

+#if !CYTHON_COMPILING_IN_PYPY
 #if PY_VERSION_HEX >= 0x02060000
     if (PyByteArray_Check(o)) {
         *length = PyByteArray_GET_SIZE(o);
         return PyByteArray_AS_STRING(o);
     } else
 #endif
+#endif
     {
         char* result;
         int r = PyBytes_AsStringAndSize(o, &result, length);
"""

(i.e. add another preprocessor condition around the Py2.6 test) and the other is to replace all occurrences of "PyUnicode_Concat" by "PyNumber_Add" in the code. Both of these need to be done in lxml.etree.c and lxml.objectify.c.

Obviously, the correct place to fix these would be PyPy, but at least the work-arounds aren't difficult to implement.
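
The second work-around (replacing "PyUnicode_Concat" with "PyNumber_Add") could be applied to the generated C files with a small script along these lines. This is only a sketch under stated assumptions: the helper names `replace_concat` and `patch_file` are hypothetical, the files are assumed to be UTF-8 text, and the PyByteArray preprocessor guard from the diff above still has to be added separately.

```python
import io
import re

# Match the bare identifier only, so longer names that merely contain it
# (for example a prefixed macro name) are left untouched.
_CONCAT = re.compile(r"\bPyUnicode_Concat\b")


def replace_concat(c_source):
    """Return (patched_source, replacement_count) for one C source string."""
    return _CONCAT.subn("PyNumber_Add", c_source)


def patch_file(path):
    """Rewrite one generated C file in place; return the replacement count.

    Assumes the file is UTF-8 encoded, which is an assumption of this
    sketch rather than something stated in the report.
    """
    with io.open(path, encoding="utf-8") as handle:
        original = handle.read()
    patched, count = replace_concat(original)
    if count:
        with io.open(path, "w", encoding="utf-8") as handle:
            handle.write(patched)
    return count
```

Run from an unpacked lxml source tree, this would be called once per file, e.g. `patch_file("src/lxml/lxml.etree.c")` and `patch_file("src/lxml/lxml.objectify.c")`, before rebuilding.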

scoder (scoder) wrote :

I've uploaded a source distribution here; it would be nice if you could test it.

http://lxml.de/files/lxml-3.3.1pre.tar.gz

scoder (scoder) wrote :

Works for me now.

Changed in lxml:
assignee: nobody → scoder (scoder)
importance: Undecided → Low
milestone: none → 3.3
status: New → Fix Committed

Choongmin Lee (choongmin) wrote :

It works for me too. Thanks!

scoder (scoder) wrote :

Fixed in lxml 3.3.1.

Changed in lxml:
status: Fix Committed → Fix Released