Resolving entities without a DTD
Bug #267825 reported by
Kovid Goyal
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
lxml | Confirmed | Wishlist | Unassigned |
Bug Description
Hi,
My application needs to process XML files that contain entity references but have no DTD declaring them. I am aware that this is not well-formed XML, but I still need to be able to process the files. Is there a way to tell XMLParser about the entities? Setting resolve_entities to False doesn't work (it still raises an undeclared entity error). Setting recover=True causes the entities to be removed from the tree:
etree.tostring(
gives
'<a>12</a>'
etree.LXML_VERSION
(2, 0, 5, 0)
etree.LIBXML_
(2, 6, 32)
There isn't currently a way to work around such a broken document.
libxml2 follows the XML spec strictly in that it rejects references to
undeclared entities in the absence of a DTD.
ElementTree lacks DTD support and instead lets you specify entities
through a parser-local "entity" dictionary. lxml could potentially support
a similar interface by intercepting entity reference resolution at the
SAX layer (the "getEntity()" callback function).
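
Until something like that exists in the parser, one possible stopgap is to expand the known entities in the text before it ever reaches the parser. The following is only a minimal sketch of that idea, not an lxml feature: the ENTITIES table and the helper names expand_entities and parse_with_entities are made up for illustration, and the standard library's xml.etree.ElementTree is used so the example runs without lxml installed. It assumes the input is already a decoded string, and the regex will also rewrite matches inside comments and CDATA sections.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical entity table: fill in whatever names your documents use.
ENTITIES = {"nbsp": "\u00a0", "copy": "\u00a9"}

# The five entities predefined by XML must be left for the parser.
PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# Matches named references such as &nbsp; but not numeric ones like &#160;.
_ENTITY_RE = re.compile(r"&([A-Za-z_][A-Za-z0-9_.-]*);")


def expand_entities(text, entities):
    """Replace known named entity references with numeric character references."""
    def repl(match):
        name = match.group(1)
        if name in PREDEFINED or name not in entities:
            # Unknown names are left untouched, so the parser still reports them.
            return match.group(0)
        # Numeric references keep the replacement from introducing markup.
        return "".join("&#%d;" % ord(ch) for ch in entities[name])
    return _ENTITY_RE.sub(repl, text)


def parse_with_entities(text, entities=ENTITIES):
    """Expand the known entities, then hand the result to a regular parser."""
    return ET.fromstring(expand_entities(text, entities))


root = parse_with_entities("<a>1&nbsp;2 &amp; 3</a>")
print(repr(root.text))
```

Since the pre-pass only produces ordinary XML text, the expanded string could presumably be passed to etree.fromstring in lxml in the same way. Emitting numeric character references rather than the raw characters means a replacement value containing "<" or "&" cannot change the document structure.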