Lxml segfault

Bug #183791 reported by Konrad Wojas
Affects: lxml
Status: Won't Fix
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

We are building a Zope3 application that uses lxml for XSLT transformations.
Recently we discovered a weird bug, which we haven't been able to isolate. The server
segfaults on a specific page we use for diagnostics. On this page the following trivial
XSLT transformation is used to convert a label with markup into plain text:

from lxml import etree as e  # assumed import; the report does not show how `e` is bound

_flatten_sheet = e.XSLT(e.fromstring('''
    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      ><xsl:output method="text" encoding="UTF-8" /></xsl:stylesheet>
'''))

Not very efficient, but it was a work in progress. This sheet is called many times (about 1000)
for all labels that are matched in an XML file. If a label is matched, the corresponding node
is passed through this style sheet. One node can be transformed multiple times.

If we run this code, the process segfaults. If we make a deepcopy() of each node before passing
it for transformation, no segfault occurs.
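
The deepcopy() workaround described above might look roughly like the following. This is a minimal sketch, not the reporter's actual code: it is written for a current lxml on Python 3 rather than lxml 1.3.6 on Python 2.5, the `from lxml import etree as e` alias is assumed, and `flatten_label` and the sample XML are hypothetical names used only for illustration.

```python
import copy

from lxml import etree as e  # assumed alias; the report does not show its imports

# Same stylesheet as in the report: it defines no templates, so XSLT's
# built-in rules emit only the text content of the input.
_flatten_sheet = e.XSLT(e.fromstring('''
    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      ><xsl:output method="text" encoding="UTF-8" /></xsl:stylesheet>
'''))


def flatten_label(node):
    """Hypothetical helper: transform a detached copy of `node`."""
    # deepcopy() gives the transform its own document, so the original
    # tree is never handed to the XSLT machinery.
    node_copy = copy.deepcopy(node)
    return str(_flatten_sheet(node_copy))


# Illustrative input only; the report does not include the real XML.
doc = e.fromstring('<doc><label>Save <b>all</b> changes</label></doc>')
label = doc[0]
flat_text = flatten_label(label)
```

The cost is one extra copy per call, which adds up when the same node is transformed many times.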

We ran the Zope process through valgrind and received the following interesting output:

==31919== Invalid read of size 4
==31919== at 0x74B29A4: xmlFreeNodeList (in /data/usr/lib/libxml2.so.2.6.27)
==31919== by 0x74B29E5: xmlFreeNodeList (in /data/usr/lib/libxml2.so.2.6.27)
==31919== by 0x74B29E5: xmlFreeNodeList (in /data/usr/lib/libxml2.so.2.6.27)
==31919== by 0x74B280D: xmlFreeDoc (in /data/usr/lib/libxml2.so.2.6.27)
==31919== by 0x73CE79C: __pyx_tp_dealloc_5etree__Document (etree.c:2494)
==31919== by 0x73D3450: __pyx_tp_dealloc_5etree__Element (etree.c:11530)
==31919== by 0x80F337E: (within /data/usr/bin/python2.5)
==31919== by 0x80F3838: _PyObject_GC_Malloc (in /data/usr/bin/python2.5)
==31919== by 0x809B55C: PyType_GenericAlloc (in /data/usr/bin/python2.5)
==31919== by 0x73933B5: __pyx_tp_new_5etree__Element (etree.c:40345)
==31919== by 0x73AFDFB: __pyx_f_5etree__elementFactory (etree.c:6738)
==31919== by 0x73D4969: __pyx_f_5etree__makeElement (etree.c:12905)
==31919== Address 0x702F73BE is not stack'd, malloc'd or (recently) free'd
==31919==
==31919== Process terminating with default action of signal 11 (SIGSEGV)
==31919== Access not within mapped region at address 0x702F73BE
==31919== at 0x74B29A4: xmlFreeNodeList (in /data/usr/lib/libxml2.so.2.6.27)
==31919== by 0x74B29E5: xmlFreeNodeList (in /data/usr/lib/libxml2.so.2.6.27)
==31919== by 0x74B29E5: xmlFreeNodeList (in /data/usr/lib/libxml2.so.2.6.27)
==31919== by 0x74B280D: xmlFreeDoc (in /data/usr/lib/libxml2.so.2.6.27)
==31919== by 0x73CE79C: __pyx_tp_dealloc_5etree__Document (etree.c:2494)
==31919== by 0x73D3450: __pyx_tp_dealloc_5etree__Element (etree.c:11530)
==31919== by 0x80F337E: (within /data/usr/bin/python2.5)
==31919== by 0x80F3838: _PyObject_GC_Malloc (in /data/usr/bin/python2.5)
==31919== by 0x809B55C: PyType_GenericAlloc (in /data/usr/bin/python2.5)
==31919== by 0x73933B5: __pyx_tp_new_5etree__Element (etree.c:40345)
==31919== by 0x73AFDFB: __pyx_f_5etree__elementFactory (etree.c:6738)
==31919== by 0x73D4969: __pyx_f_5etree__makeElement (etree.c:12905)

We weren't able to reproduce this outside of Zope, but valgrind still gives some output about
lxml which I'm not really able to interpret. It's attached to this message.

This behaviour has been seen on 32-bit Ubuntu, 64-bit Ubuntu and OS X, so it seems to be
platform-independent. The output above is from lxml 1.3.6 with Python 2.5 on Ubuntu, installed from an egg.

Any clue? Any suggestions as to how to isolate this problem? We can work around it, but it's
nasty.

scoder (scoder) wrote :
scoder (scoder) wrote :

Sorry, you actually stated that it was 1.3.6. Could you check if it still occurs with 2.0beta1?

scoder (scoder) wrote :

Also, using lxml 2.0, the above stylesheet can be replaced by a call to 'tostring(..., method="text")', which would provide an easy work-around.
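
A minimal sketch of that tostring() replacement could look like the following. It assumes a current lxml on Python 3; `flatten_label` and the sample XML are hypothetical and only illustrate the call. The `with_tail=False` argument is an addition here, used so that text following the label's closing tag is not included.

```python
from lxml import etree


def flatten_label(node):
    """Hypothetical helper: return the text content of `node` without XSLT."""
    # method="text" serializes only text nodes; with_tail=False leaves out
    # any text that follows the element's closing tag.
    raw = etree.tostring(node, method="text", encoding="UTF-8", with_tail=False)
    return raw.decode("UTF-8")


# Illustrative input only; the report does not include the real XML.
doc = etree.fromstring('<doc><label>Save <b>all</b> changes</label> trailing</doc>')
flat_text = flatten_label(doc[0])
```

Because no stylesheet is involved, this avoids the XSLT code path entirely for the plain-text case.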

Konrad Wojas (kwojas) wrote : Re: [Bug 183791] Re: Lxml segfault

On Thu, Jan 17, 2008 at 06:20:43PM -0000, Stefan Behnel wrote:
> Sorry, you actually stated that it was 1.3.6. Could you check if it
> still occurs with 2.0beta1?

Hi Stefan,

Sorry for the late reaction. I will not be able to test it before
Tuesday. Thanks for the swift response.

Cheers,
--
Konrad Wojas

Konrad Wojas (kwojas) wrote :

I've tested it with lxml 2.0alpha1; the problem does not occur with this version.

scoder (scoder) wrote :

Ok, that's good to know. I can't promise that 1.3 will ever be fixed, but once 2.0 final is out, I'll go through the changes to find back-portable fixes.

Changed in lxml:
importance: Undecided → Medium
status: New → In Progress
scoder (scoder)
Changed in lxml:
milestone: none → 1.3.7
scoder (scoder) wrote :

Threading-related bugs in lxml 1.3 will not be fixed. The lxml 2.1 release series provides a safe replacement.

Changed in lxml:
milestone: 1.3.7 → none
status: In Progress → Won't Fix