Error log locations are zeroed out

Bug #1756920 reported by Roma Klapaukh on 2018-03-19
This bug affects 1 person
Affects: lxml
Status: Triaged
Importance: Undecided
Assigned to: Unassigned

Bug Description

Entries in the iterparse context's error_log can have their line and column set to 0 (and their path set to None) when the error comes from schema validation.

This happens when validating through lxml in Python, but not with xmllint, which reports the correct line for the same files.

OS info:
$ uname -a
Darwin Nix.local 17.4.0 Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 2017; root:xnu-4570.41.2~1/RELEASE_X86_64 x86_64

Requested information:
>>> print("%-20s: %s" % ('Python', sys.version_info))
Python : sys.version_info(major=3, minor=6, micro=3, releaselevel='final', serial=0)
>>> print("%-20s: %s" % ('lxml.etree', etree.LXML_VERSION))
lxml.etree : (4, 2, 0, 0)
>>> print("%-20s: %s" % ('libxml used', etree.LIBXML_VERSION))
libxml used : (2, 9, 8)
>>> print("%-20s: %s" % ('libxml compiled', etree.LIBXML_COMPILED_VERSION))
libxml compiled : (2, 9, 8)
>>> print("%-20s: %s" % ('libxslt used', etree.LIBXSLT_VERSION))
libxslt used : (1, 1, 32)
>>> print("%-20s: %s" % ('libxslt compiled', etree.LIBXSLT_COMPILED_VERSION))
libxslt compiled : (1, 1, 32)

Files / code to reproduce:

test.py
--------
#!/usr/bin/env python3

from lxml.etree import parse, XMLSchema, iterparse, XMLSyntaxError

xml_file = 'test.xml'
xsd_file = 'library.xsd'

xsd_document = parse(xsd_file)
schema = XMLSchema(xsd_document)

context = iterparse(xml_file, schema=schema)

try:
    for _, elem in context:
        pass
except XMLSyntaxError:
    # Use a separate name for the log entries so they do not shadow the exception.
    for entry in context.error_log:
        print('At', entry.line, ':', entry.column, '(', entry.path, ')', entry.message)

library.xsd
------------
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:d="http://test.com/library"
   xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
   targetNamespace="http://test.com/library">
   <xs:element name="document">
         <xs:complexType>
         <xs:sequence>
            <xs:element name="metadata" type="xs:string"/>
         </xs:sequence>
      </xs:complexType>
    </xs:element>
</xs:schema>

test.xml
---------
<?xml version="1.0"?>
<d:document xmlns:d="http://test.com/library" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://test.com/library">
  <d:dog/>
</d:document>

Python output:
---------------
$ ./test.py
At 0 : 0 ( None ) Element '{http://test.com/library}dog': This element is not expected. Expected is ( {http://test.com/library}metadata ).

^^ Note that the line is set to 0 rather than 3

xmllint output:
----------------
$ xmllint --version
xmllint: using libxml version 20905
   compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ICU ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib Lzma

$ xmllint --noout --schema library.xsd test.xml
test.xml:3: element dog: Schemas validity error : Element '{http://test.com/library}dog': This element is not expected. Expected is ( {http://test.com/library}metadata ).
test.xml fails to validate

scoder (scoder) wrote :

Probably worth investigating what xmllint does differently here. Or just put a print statement in xmlerror.pxi to see what exact error information libxml2 reports here, and if there is anything else that can be extracted from it.
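
One way to narrow this down might be to compare the iterparse result with a validation of the fully parsed tree. The sketch below is only an illustration, not a fix: it inlines library.xsd and a trimmed test.xml as byte strings (the unused namespace declarations are dropped), and the helper name tree_validation_errors is made up for this example. It parses the document first and then calls XMLSchema.validate() on the tree, so the line, column and path it prints can be set against the zeroed values from the iterparse run above.

```python
from io import BytesIO

from lxml import etree

# Inlined copy of library.xsd from the report.
XSD = b"""<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
   elementFormDefault="qualified"
   targetNamespace="http://test.com/library">
   <xs:element name="document">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="metadata" type="xs:string"/>
         </xs:sequence>
      </xs:complexType>
   </xs:element>
</xs:schema>
"""

# Trimmed copy of test.xml (unused namespace declarations removed).
XML = b"""<?xml version="1.0"?>
<d:document xmlns:d="http://test.com/library">
  <d:dog/>
</d:document>
"""


def tree_validation_errors(xml_bytes, xsd_bytes):
    """Hypothetical helper: parse the whole document, then validate the tree.

    Returns (line, column, path, message) for each entry in the schema's
    error log.
    """
    schema = etree.XMLSchema(etree.parse(BytesIO(xsd_bytes)))
    tree = etree.parse(BytesIO(xml_bytes))
    schema.validate(tree)  # returns False here; details end up in error_log
    return [(e.line, e.column, e.path, e.message) for e in schema.error_log]


errors = tree_validation_errors(XML, XSD)
for line, column, path, message in errors:
    print('At', line, ':', column, '(', path, ')', message)
```

If the location is filled in here but not in the iterparse case, that would suggest the information is lost only on the streaming validation path, which is where a print statement in xmlerror.pxi would be most useful.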

Changed in lxml:
status: New → Triaged