iterparse cannot load entity definitions from local dtd file

Bug #1457230 reported by six degree
This bug affects 1 person
Affects: lxml
Status: Fix Released
Importance: Medium
Assigned to: six degree
Milestone: 3.6.1

Bug Description

$ python ./iterparse_bug.py
Python : sys.version_info(major=2, minor=7, micro=9, releaselevel='final', serial=0)
lxml.etree : (3, 4, 4, 0)
libxml used : (2, 9, 0)
libxml compiled : (2, 9, 0)
libxslt used : (1, 1, 28)
libxslt compiled : (1, 1, 28)
end: leaf
end: root
Traceback (most recent call last):
  File "./iterparse_bug.py", line 14, in <module>
    for action, elem in context:
  File "iterparse.pxi", line 208, in lxml.etree.iterparse.__next__ (src/lxml/lxml.etree.c:131498)
lxml.etree.XMLSyntaxError: Entity 'bar' not defined, line 4, column 14

The Python script, the XML file and the DTD file are attached. Since the files are very short, their contents are also shown here.

The main part of the Python script is:

context = etree.iterparse("foo.xml", load_dtd=True)
for action, elem in context:
    print("%s: %s" % (action, elem.tag))

The XML file is:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo SYSTEM "foo.dtd">
<root>
  <leaf>&bar;</leaf>
</root>

The DTD file is:
<!ELEMENT root (leaf)*>
<!ELEMENT leaf (#PCDATA)>
<!ENTITY bar "&#246;" >

I validated the XML against the DTD with xmllint (xmllint --dtdvalid foo.dtd --noout foo.xml), and validation succeeded.
Parsing the same file with the following Python code

import lxml.etree as etree
parser = etree.XMLParser(load_dtd=True)
etree.parse("foo.xml", parser)

also succeeded.

So it seems that the problem is with the iterparse function.
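
For anyone stuck on an affected lxml release, one possible workaround follows from the observation above that etree.parse with load_dtd=True does resolve the entity: parse the whole file first, then replay the events with etree.iterwalk. This is only a sketch under some assumptions. It targets Python 3 rather than the Python 2.7 from the report, it recreates the attached files in a temporary directory, and the helper name end_events_via_full_parse is made up for illustration. It also gives up the streaming behaviour that iterparse is normally chosen for, so it may not suit very large inputs.

```python
import os
import tempfile

from lxml import etree

# Contents of the two files from the report, recreated here so the
# example is self-contained.
XML = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo SYSTEM "foo.dtd">
<root>
  <leaf>&bar;</leaf>
</root>
"""

DTD = b"""<!ELEMENT root (leaf)*>
<!ELEMENT leaf (#PCDATA)>
<!ENTITY bar "&#246;" >
"""


def end_events_via_full_parse(xml_path):
    """Hypothetical helper: parse the whole document with the DTD loaded,
    then walk the finished tree and collect iterparse-style "end" events."""
    parser = etree.XMLParser(load_dtd=True)
    tree = etree.parse(xml_path, parser)
    collected = []
    for event, elem in etree.iterwalk(tree, events=("end",)):
        collected.append((event, elem.tag, elem.text))
    return collected


with tempfile.TemporaryDirectory() as workdir:
    xml_path = os.path.join(workdir, "foo.xml")
    with open(xml_path, "wb") as handle:
        handle.write(XML)
    # The DTD sits next to the XML file, matching the relative SYSTEM id.
    with open(os.path.join(workdir, "foo.dtd"), "wb") as handle:
        handle.write(DTD)
    walked = end_events_via_full_parse(xml_path)

events = [(event, tag) for event, tag, _ in walked]
leaf_text = walked[0][2]  # text of <leaf>, with &bar; already expanded

for event, tag in events:
    print("%s: %s" % (event, tag))
```

Because the tree is fully built before the walk starts, memory use grows with document size, unlike a true iterparse loop that clears elements as it goes.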

six degree (sixdegreepub) wrote :
description: updated
Changed in lxml:
status: New → Fix Released
scoder (scoder) wrote :

Does "Fix released" mean that you found a version (3.6?) in which this is fixed? Can this ticket be closed then?

scoder (scoder)
Changed in lxml:
milestone: none → 3.6.1
importance: Undecided → Medium
scoder (scoder) wrote :

Ah, sorry. Was confused. Yes, it's obviously fixed in 3.6.1.
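
A script that depends on this behaviour could guard against older releases with a simple version comparison. The sketch below assumes that 3.6.1, the milestone named in this ticket, is a safe lower bound; the thread does not say which release first contained the fix. The names FIXED_IN and has_iterparse_dtd_fix are hypothetical and not part of lxml.

```python
# Assumed lower bound, taken from the milestone on this ticket.
FIXED_IN = (3, 6, 1)


def has_iterparse_dtd_fix(version):
    """Return True if a version tuple is at or above the assumed fix release.

    Only the first three components (major, minor, micro) are compared.
    """
    return tuple(version[:3]) >= FIXED_IN


# Version tuples in the format printed in the bug description.
print(has_iterparse_dtd_fix((3, 4, 4, 0)))
print(has_iterparse_dtd_fix((3, 6, 1, 0)))
```

With lxml installed, the tuple to pass in is etree.LXML_VERSION, which is the value printed as "lxml.etree" in the bug description.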

Changed in lxml:
assignee: nobody → six degree (sixdegreepub)