python crash or exception during XPath query

Bug #1071450 reported by Anselm Kruis on 2012-10-25

Bug Description

If a program executes an XPath query while nodes that are members of the XPath result set are being deleted, a Python crash or an exception can occur.

I attached two scripts to demonstrate this issue. The first script is single-threaded and uses a Python XPath extension function to delete a node that is part of the XPath result set. This example is simple to analyse, but not likely to occur in reality. The second example uses two threads. It is more realistic and reproduces the situation in a real-world application.
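For contrast with the failing single-threaded case, here is a minimal sketch of the ordering that avoids it: let the XPath query run to completion first, and only then delete the matched nodes. This is not one of the attached scripts; the helper name remove_matches and the sample document are made up for illustration, and the sketch assumes the expression selects elements only.

```python
from lxml import etree


def remove_matches(root, expression):
    """Run the XPath query to completion, then delete the matched elements.

    Hypothetical helper for illustration; assumes `expression` selects
    elements (not strings or attribute values).
    """
    # The query has finished once xpath() returns its result list.
    matches = root.xpath(expression)
    for node in matches:
        parent = node.getparent()
        # The root element (or an already detached node) has no parent.
        if parent is not None:
            parent.remove(node)
    return len(matches)


root = etree.fromstring("<root><item/><item/><keep/></root>")
removed = remove_matches(root, "//item")
```

Because no node is removed while the evaluator is still building its result set, the result never refers to freed nodes. Deleting from inside an extension function, as the first attached script does, breaks exactly this ordering.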

For reference
Python : sys.version_info(major=2, minor=7, micro=2, releaselevel='final', serial=0)
lxml.etree : (2, 3, 3, 0)
libxml used : (2, 7, 8)
libxml compiled : (2, 7, 8)
libxslt used : (1, 1, 26)
libxslt compiled : (1, 1, 26)

Details to reproduce the bug:

Expected result (one of the following):
- Output OK
- Python crash (usually a segmentation fault (Unix) or access violation (Windows))
- Exception:
  File "lxml.etree.pyx", line 1447, in lxml.etree._Element.xpath (src/lxml/lxml.etree.c:41736)
  File "xpath.pxi", line 321, in lxml.etree.XPathElementEvaluator.__call__ (src/lxml/lxml.etree.c:117877)
  File "xpath.pxi", line 242, in lxml.etree._XPathEvaluatorBase._handle_result (src/lxml/lxml.etree.c:117079)
  File "extensions.pxi", line 546, in lxml.etree._unwrapXPathObject (src/lxml/lxml.etree.c:112539)
  File "extensions.pxi", line 580, in lxml.etree._createNodeSetResult (src/lxml/lxml.etree.c:112912)
  File "extensions.pxi", line 627, in lxml.etree._unpackNodeSetEntry (src/lxml/lxml.etree.c:113340)
NotImplementedError: Not yet implemented result node type: 509144904

Security Issues
This bug is probably a typical "access after free" (use-after-free) issue. I don't think it is exploitable, but experts should look at it.

Anselm Kruis (a-kruis) wrote :

This is how I discovered the problem.

scoder (scoder) wrote :

Yes, it's currently a bad idea to modify a tree while it's being traversed by other threads. I'm planning to eventually implement a safe locking scheme for the trees but haven't got around to finishing it. (Time, money, and all that jazz.)
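Until lxml provides such a locking scheme, an application can serialise access to a shared tree itself. The following is only a sketch of that idea, not an lxml feature: tree_lock, query_items and delete_one_item are hypothetical names, and it assumes that every thread touching the tree goes through the same lock.

```python
import threading

from lxml import etree

# Hypothetical application-level lock; lxml itself does not provide it.
tree_lock = threading.Lock()

root = etree.fromstring("<root>" + "<item/>" * 100 + "</root>")


def query_items():
    """Run the XPath query while holding the lock and return only a count."""
    with tree_lock:
        return len(root.xpath("//item"))


def delete_one_item():
    """Remove one <item> child while holding the same lock."""
    with tree_lock:
        items = root.findall("item")
        if items:
            root.remove(items[0])


def reader(counts):
    for _ in range(50):
        counts.append(query_items())


def writer():
    for _ in range(50):
        delete_one_item()


counts = []
threads = [
    threading.Thread(target=reader, args=(counts,)),
    threading.Thread(target=writer),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The query function returns a plain count rather than the element list, so nothing it hands back depends on nodes that the other thread may remove after the lock is released. A single coarse lock costs concurrency, but it keeps queries and deletions from overlapping.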


Changed in lxml:
importance: Undecided → High
status: New → Confirmed