Python crash on comment validation

Bug #1255132 reported by Jan Stracinsky
This bug affects 1 person
Affects  Status        Importance  Assigned to  Milestone
lxml     Fix Released  High        scoder

Bug Description

If you attempt to validate a Comment instance (for example, when iterating through the parts of a tree), the entire Python process crashes.

Tested on:
Python : (2, 6, 6, 'final', 0)
lxml.etree : (3, 2, 3, 0)
libxml used : (2, 9, 0)
libxml compiled : (2, 9, 0)
libxslt used : (1, 1, 28)
libxslt compiled : (1, 1, 28)

And can be replicated with:

    from lxml import etree

    schemaString = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="root">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="element1" type="xs:string"/>
            <xs:element name="element2" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>"""

    treeString = """<root>
       <element1/>
       <!--comment-->
       <element2/>
    </root>"""

    schemaTree = etree.fromstring(schemaString)
    schema = etree.XMLSchema(schemaTree)
    tree = etree.fromstring(treeString)

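    # Parenthesized print with a single argument behaves the same on Python 2 and 3.
    for part in tree:
        print("Validating part '%s'.." % part)
        isPartValid = schema.validate(part)  # crashes when part is the comment
        print("Part valid - %s." % isPartValid)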

The last output message that can be seen is "Validating part '<!--comment-->'..", after which the Python process crashes.

Note that the XSD in this example is not really suited to validating the tree part by part (it would make more sense to validate it by passing the whole tree), but this is just a simplified example that reproduces the problem.
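
On affected versions, one possible workaround is to skip non-element nodes before calling validate(). The sketch below is written for Python 3 and is not part of the original report; the helper name validate_element_children is hypothetical. It relies on the fact that in lxml the .tag of a comment or processing instruction is a factory function rather than a string.

```python
from lxml import etree

SCHEMA = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="element1" type="xs:string"/>
        <xs:element name="element2" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

DOC = b"""<root>
  <element1/>
  <!--comment-->
  <element2/>
</root>"""


def validate_element_children(schema, parent):
    """Hypothetical helper: validate only the element children of parent.

    Comments and processing instructions are skipped, so they are never
    passed to schema.validate().
    """
    results = []
    for part in parent:
        # Real elements have a string tag; comments and PIs do not.
        if not isinstance(part.tag, str):
            continue
        results.append((part.tag, schema.validate(part)))
    return results


schema = etree.XMLSchema(etree.fromstring(SCHEMA))
tree = etree.fromstring(DOC)

results = validate_element_children(schema, tree)
for tag, is_valid in results:
    print("Part %s valid - %s." % (tag, is_valid))

# Validating the whole tree is the more natural use of this schema.
print("Whole tree valid - %s." % schema.validate(tree))
```

The comment never reaches the validator, so the loop finishes regardless of whether the installed lxml contains the fix.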

scoder (scoder) wrote :

Thanks for the report and the excellent test case. Here is a fix:

diff -r 74c41e0ab944 src/lxml/apihelpers.pxi
--- a/src/lxml/apihelpers.pxi Fri Nov 15 16:15:42 2013 +0100
+++ b/src/lxml/apihelpers.pxi Thu Nov 28 18:08:22 2013 +0100
@@ -58,7 +58,7 @@
     else:
         raise TypeError, u"Invalid input object: %s" % \
             python._fqtypename(input)
-    if node is None:
+    if node is None or node._c_node.type != tree.XML_ELEMENT_NODE:
         raise ValueError, u"Input object has no element: %s" % \
             python._fqtypename(input)
     _assertValidNode(node)
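
Judging from the patch, a non-element node such as a comment should now be rejected with a ValueError instead of reaching libxml2. A minimal Python 3 sketch of what calling code might then look like follows; the helper name try_validate and the tiny schema are illustrative assumptions, and on an lxml version without the fix this snippet would still crash the interpreter.

```python
from lxml import etree

# Illustrative schema; any valid schema should do for this check.
schema = etree.XMLSchema(etree.fromstring(
    b'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
    b'<xs:element name="root" type="xs:string"/>'
    b'</xs:schema>'))

root = etree.fromstring(b"<root><!--comment--></root>")
comment = root[0]  # the comment node, not an element


def try_validate(schema, node):
    """Hypothetical helper: return the validation result, or the
    ValueError raised when node is rejected as a non-element."""
    try:
        return schema.validate(node)
    except ValueError as exc:
        return exc


outcome = try_validate(schema, comment)
print("Outcome for comment: %r" % (outcome,))
```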

scoder (scoder) wrote :
Changed in lxml:
assignee: nobody → scoder (scoder)
importance: Undecided → High
status: New → Fix Committed
Jan Stracinsky (jstracinsky963) wrote :

Thank you for fixing the issue so quickly and for the attached patch.

scoder (scoder) wrote :

Fixed in lxml 3.2.5.

Changed in lxml:
status: Fix Committed → Fix Released