Comment 3 for bug 673205

Søren Bech Christensen (sbc-x) wrote :

I see the same issue in my Windows/Cygwin environment when using the following construct to parse an XHTML document:

from lxml import etree

parser = etree.XMLParser(load_dtd=True, dtd_validation=True, remove_blank_text=True, attribute_defaults=True)
html = etree.parse(inputhtmlfile, parser)

This produces the following output:

Python : (2, 6, 5, 'final', 0)
lxml.etree : (2, 2, 6, 0)
libxml used : (2, 7, 7)
libxml compiled : (2, 7, 7)
libxslt used : (1, 1, 26)
libxslt compiled : (1, 1, 26)
Traceback (most recent call last):
  File "generatehtml.py", line 61, in <module>
    html = etree.parse(inputhtmlfile,parser)
  File "lxml.etree.pyx", line 2706, in lxml.etree.parse (src/lxml/lxml.etree.c:49958)
  File "parser.pxi", line 1500, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:71797)
  File "parser.pxi", line 1529, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:72080)
  File "parser.pxi", line 1429, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:71175)
  File "parser.pxi", line 975, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:68173)
  File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:64257)
  File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:65178)
  File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64521)
lxml.etree.XMLSyntaxError: Attempt to load network entity http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd

In my Debian environment, which has an older installation, the same code runs without problems:

Python : (2, 5, 2, 'final', 0)
lxml.etree : (2, 1, 1, 0)
libxml used : (2, 6, 32)
libxml compiled : (2, 6, 32)
libxslt used : (1, 1, 24)
libxslt compiled : (1, 1, 24)
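
One possible way to avoid the network lookup, sketched here under assumptions rather than as a confirmed fix for this bug, is to register a custom resolver that answers the DTD request locally. The class name LocalDTDResolver and the tiny stand-in DTD are hypothetical and only keep the sketch runnable offline; in practice the resolver would return the content of a locally saved copy of xhtml1-strict.dtd.

```python
from io import BytesIO

from lxml import etree

XHTML_STRICT_URL = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"

# Assumption: a tiny stand-in DTD so the sketch runs offline. A real setup
# would use the content of a local copy of xhtml1-strict.dtd instead.
STAND_IN_DTD = (
    b"<!ELEMENT html (body)>\n"
    b"<!ELEMENT body (#PCDATA)>\n"
)


class LocalDTDResolver(etree.Resolver):
    """Hypothetical resolver that serves the DTD without network access."""

    def resolve(self, system_url, public_id, context):
        if system_url == XHTML_STRICT_URL:
            return self.resolve_string(STAND_IN_DTD, context)
        return None  # let lxml fall back to its default lookup


parser = etree.XMLParser(load_dtd=True, dtd_validation=True,
                         remove_blank_text=True, attribute_defaults=True)
parser.resolvers.add(LocalDTDResolver())

# Minimal document that points at the same DTD URL as in the traceback.
document = (
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "'
    + XHTML_STRICT_URL.encode("ascii")
    + b'">'
    b"<html><body>hello</body></html>"
)

html = etree.parse(BytesIO(document), parser)
print(html.getroot().tag)
```

If network access is acceptable, passing no_network=False to etree.XMLParser is another option to try; whether either approach matches the behaviour of the older Debian installation is not established here.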