Python crash on comment validation

Bug #1255132 reported by Jan Stracinsky on 2013-11-26
This bug affects 1 person
Affects: lxml
Importance: High
Assigned to: scoder

Bug Description

If you attempt to validate a Comment instance (for example, when iterating through the parts of a tree), the entire Python process crashes.

Tested on:
Python : (2, 6, 6, 'final', 0)
lxml.etree : (3, 2, 3, 0)
libxml used : (2, 9, 0)
libxml compiled : (2, 9, 0)
libxslt used : (1, 1, 28)
libxslt compiled : (1, 1, 28)

And can be replicated with:

    from lxml import etree

    schemaString = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="root">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="element1" type="xs:string"/>
            <xs:element name="element2" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>"""

    treeString = """<root>
       <element1/>
       <!--comment-->
       <element2/>
    </root>"""

    schemaTree = etree.fromstring(schemaString)
    schema = etree.XMLSchema(schemaTree)
    tree = etree.fromstring(treeString)

    for part in tree:
        # Parenthesized single-argument print works on both Python 2 and 3.
        print("Validating part '%s'.." % part)
        isPartValid = schema.validate(part)
        print("Part valid - %s." % isPartValid)

The last output message that can be seen is "Validating part '<!--comment-->'..", after which the Python process crashes.

Note that the XSD in this example is not really suited to validating the tree part by part (it would make more sense to validate the whole tree at once), but this is just a simplified example that reproduces the problem.
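For anyone stuck on an affected lxml version, one possible workaround is to keep non-element nodes away from validate() altogether. The sketch below is not part of the original report: it reuses the schema and tree from the test case and assumes that filtering children with tag=etree.Element is acceptable for your use case. The variable names are illustrative only.

```python
from lxml import etree

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="element1" type="xs:string"/>
        <xs:element name="element2" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

TREE = """<root>
   <element1/>
   <!--comment-->
   <element2/>
</root>"""

schema = etree.XMLSchema(etree.fromstring(SCHEMA))
tree = etree.fromstring(TREE)

# Using etree.Element as the tag filter yields only element children,
# so comments and processing instructions never reach validate().
element_parts = list(tree.iterchildren(tag=etree.Element))
validated_tags = [part.tag for part in element_parts]

for part in element_parts:
    is_part_valid = schema.validate(part)
    print("Part '%s' valid - %s." % (part.tag, is_part_valid))

# Validating the whole tree in one call avoids the problem as well,
# because the comment is never passed in as the validation root.
is_tree_valid = schema.validate(tree)
print("Whole tree valid - %s." % is_tree_valid)
```

The per-part results are printed rather than assumed here, since this schema only declares "root" as a global element and the parts are not expected to validate on their own.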

scoder (scoder) wrote :

Thanks for the report and the excellent test case. Here is a fix:

diff -r 74c41e0ab944 src/lxml/apihelpers.pxi
--- a/src/lxml/apihelpers.pxi Fri Nov 15 16:15:42 2013 +0100
+++ b/src/lxml/apihelpers.pxi Thu Nov 28 18:08:22 2013 +0100
@@ -58,7 +58,7 @@
     else:
         raise TypeError, u"Invalid input object: %s" % \
             python._fqtypename(input)
-    if node is None:
+    if node is None or node._c_node.type != tree.XML_ELEMENT_NODE:
         raise ValueError, u"Input object has no element: %s" % \
             python._fqtypename(input)
     _assertValidNode(node)

scoder (scoder) wrote :
Changed in lxml:
assignee: nobody → scoder (scoder)
importance: Undecided → High
status: New → Fix Committed

Jan Stracinsky (jstracinsky963) wrote :

Thank you for fixing the issue so quickly and for the attached patch.

scoder (scoder) wrote :

Fixed in lxml 3.2.5.

Changed in lxml:
status: Fix Committed → Fix Released
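Judging from the patch earlier in this thread, a fixed lxml should reject a comment passed to validate() with a ValueError instead of crashing. The following is a minimal sketch of how calling code might handle that. It assumes lxml 3.2.5 or later (the release named above), and the tiny schema and the names used are made up for illustration.

```python
from lxml import etree

# Hypothetical minimal schema and document, used only for this illustration.
schema = etree.XMLSchema(etree.fromstring(
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
    '<xs:element name="root"/>'
    '</xs:schema>'))
root = etree.fromstring("<root><!--comment--><child/></root>")
comment = root[0]

rejected = False
# Only try this on a version that contains the fix; older versions
# may crash the interpreter instead of raising an exception.
if etree.LXML_VERSION >= (3, 2, 5):
    try:
        schema.validate(comment)
    except ValueError as exc:
        rejected = True
        print("Comment rejected: %s" % exc)
```

Catching ValueError this way lets a loop over mixed tree content skip or report non-element nodes without special-casing them up front.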