Parse errors in general entities are not properly reported

Bug #1906643 reported by Cardinal Kracker
This bug affects 1 person
Affects   Status   Importance   Assigned to   Milestone
lxml      New      Undecided    Unassigned

Bug Description

Something seems to be wrong with the parser when general entities
are involved.

The following program does not raise an error for an invalid element
inside an external entity unless there is also an error in the root
document:

*** getloc.py
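import os
import sys

from lxml import etree

# Point libxml2 at the local catalog so the public ID resolves to book.dtd.
os.environ["XML_CATALOG_FILES"] = "catalog.xml"
# Load the DTD and validate against it while parsing.
parser = etree.XMLParser(dtd_validation=True, load_dtd=True)
with open(sys.argv[1], "rb") as f:
    tree = etree.parse(f, parser)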
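
To check whether a validation error was recorded but never surfaced as an exception, one could dump the parser's error log after parsing. The following is only a sketch: `format_entry` and `report` are hypothetical helper names, the catalog setup from getloc.py is assumed to be in place already, and `resolve_entities=True` is an assumption added because newer lxml releases (5.0 and later, as far as I know) no longer load external entities by default. Whether the log actually contains an entry for part1.xml is the open question of this report, so no output is shown.

```python
def format_entry(entry):
    """Render one error-log entry as 'file:line:column: LEVEL: message'."""
    return "%s:%s:%s: %s: %s" % (
        entry.filename, entry.line, entry.column,
        entry.level_name, entry.message,
    )


def report(path):
    """Parse `path` with DTD validation and print whatever was logged."""
    # Imported lazily so format_entry() stays usable without lxml installed.
    from lxml import etree

    parser = etree.XMLParser(
        dtd_validation=True,
        load_dtd=True,
        resolve_entities=True,  # assumption: needed on lxml >= 5.0
    )
    try:
        etree.parse(path, parser)
        print("parse returned without raising")
    except etree.XMLSyntaxError as exc:
        print("parse raised:", exc)
    # The log of the last parser run can be read whether or not parse() raised.
    for entry in parser.error_log:
        print(format_entry(entry))
```

Calling `report("book-a.xml")` and `report("book-b.xml")` in the directory holding the files below would show side by side what each run raises and logs.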

*** book.dtd
<!ELEMENT book (chapter)* >
<!ELEMENT chapter (#PCDATA|b)* >
<!ELEMENT b (#PCDATA) >
<!ATTLIST chapter nr CDATA #REQUIRED >

*** catalog.xml
<?xml version="1.0"?>
<!DOCTYPE catalog PUBLIC "-//OASIS//DTD XML Catalogs V1.0//EN"
  "file:///usr/share/xml/schema/xml-core/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<public publicId="-//TEST//DTD BOOK//EN" uri="book.dtd"/>
</catalog>

*** part1.xml
<chapter nr="1">
<x/>
</chapter>
<chapter nr="2">
</chapter>

Running the program with this root document erroneously succeeds,
even though part1.xml contains the undeclared element <x/>:
*** book-a.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book PUBLIC "-//TEST//DTD BOOK//EN" "book.dtd" [
<!ENTITY part1 SYSTEM "part1.xml">
]>
<book>
&part1;
<chapter nr="3">
</chapter>
</book>

This one, which also contains an error (<z/>) in the root document
itself, drops the error from part1:
*** book-b.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book PUBLIC "-//TEST//DTD BOOK//EN" "book.dtd" [
<!ENTITY part1 SYSTEM "part1.xml">
]>
<book>
&part1;
<chapter nr="3">
<z/>
</chapter>
</book>
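
For anyone who wants to reproduce this, the listings above can be written into a scratch directory with a few lines of standard-library Python. This is a convenience sketch, not part of the original report: `write_fixtures` and `ROOT_TEMPLATE` are hypothetical names, and the closing `</book>` tag in the template is an assumption, since the root documents need it to be well-formed.

```python
import os
import tempfile

# The two root documents differ only in the optional <z/> inside chapter 3.
# Assumption: the closing </book> tag is included so the files are well-formed.
ROOT_TEMPLATE = """\
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book PUBLIC "-//TEST//DTD BOOK//EN" "book.dtd" [
<!ENTITY part1 SYSTEM "part1.xml">
]>
<book>
&part1;
<chapter nr="3">
{extra}</chapter>
</book>
"""

FILES = {
    "book.dtd": (
        "<!ELEMENT book (chapter)* >\n"
        "<!ELEMENT chapter (#PCDATA|b)* >\n"
        "<!ELEMENT b (#PCDATA) >\n"
        "<!ATTLIST chapter nr CDATA #REQUIRED >\n"
    ),
    "catalog.xml": (
        '<?xml version="1.0"?>\n'
        '<!DOCTYPE catalog PUBLIC "-//OASIS//DTD XML Catalogs V1.0//EN"\n'
        '  "file:///usr/share/xml/schema/xml-core/catalog.dtd">\n'
        '<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">\n'
        '<public publicId="-//TEST//DTD BOOK//EN" uri="book.dtd"/>\n'
        "</catalog>\n"
    ),
    # External entity with the undeclared element <x/> in chapter 1.
    "part1.xml": (
        '<chapter nr="1">\n<x/>\n</chapter>\n'
        '<chapter nr="2">\n</chapter>\n'
    ),
    "book-a.xml": ROOT_TEMPLATE.format(extra=""),
    "book-b.xml": ROOT_TEMPLATE.format(extra="<z/>\n"),
}


def write_fixtures(directory):
    """Write all test files into `directory`; return a name -> path mapping."""
    paths = {}
    for name, text in FILES.items():
        path = os.path.join(directory, name)
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        paths[name] = path
    return paths
```

After `write_fixtures(tempfile.mkdtemp())`, getloc.py can be run from inside that directory, once with book-a.xml and once with book-b.xml, to compare the two cases.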
