etree.fromstring with a UTF-32 encoded string fails

Bug #1703810 reported by Dale P
Affects: lxml
Status: Fix Released
Importance: Low
Milestone: 3.9.0

Bug Description

I am trying to pass a UTF-32 encoded string (with the XML encoding declaration) to lxml.etree.fromstring. This raises an lxml.etree.XMLSyntaxError exception (Document is empty).

Is this expected? Following the same process for all other encodings that I have tested works fine (UTF-8, UTF-16, ASCII, ISO-8859-1, ISO-8859-2, BIG5, EUC-JP).
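
For anyone reproducing this, it may help to look at the bytes that Python hands to the parser for each encoding. The following is a minimal sketch using only the standard library; the `sample` string is made up for illustration, and the output only describes the input bytes, not why the parse fails.

```python
import codecs

# Hypothetical sample document, used only to compare encoded byte layouts.
sample = "<tag/>"

for name in ("utf-8", "utf-16", "utf-32"):
    encoded = sample.encode(name)
    # Show the total length and the first four bytes (where a BOM would sit).
    print(f"{name:7} {len(encoded):3d} bytes, starts with {encoded[:4].hex()}")

utf32 = sample.encode("utf-32")
# Python's generic "utf-32" codec prepends a 4-byte BOM in native byte order.
has_bom = utf32[:4] in (codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)
print("UTF-32 data starts with a BOM:", has_bom)
```

Note that the byte-order-specific codecs ("utf-32-le", "utf-32-be") do not write a BOM, so they are a useful second data point when narrowing down the failure.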

I have not tested this in libxml2 as I do not know how to.

Python : sys.version_info(major=3, minor=6, micro=0, releaselevel='final', serial=0)
lxml.etree : (3, 8, 0, 0)
libxml used : (2, 9, 4)
libxml compiled : (2, 9, 4)
libxslt used : (1, 1, 29)
libxslt compiled : (1, 1, 29)

from lxml import etree
foo = """<?xml version='1.0' encoding='utf-32'?>\n<tag attrib='123'></tag>"""
etree.fromstring(foo.encode('utf-32'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "src/lxml/lxml.etree.pyx", line 3228, in lxml.etree.fromstring (src/lxml/lxml.etree.c:79594)
  File "src/lxml/parser.pxi", line 1848, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:119113)
  File "src/lxml/parser.pxi", line 1736, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:117793)
  File "src/lxml/parser.pxi", line 1102, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:112037)
  File "src/lxml/parser.pxi", line 595, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:105881)
  File "src/lxml/parser.pxi", line 706, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:107589)
  File "src/lxml/parser.pxi", line 635, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:106443)
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Document is empty, line 1, column 1
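
On an affected version, one possible workaround is to transcode the data to UTF-8 in Python before parsing. The sketch below is an assumption about how that could be done, not the fix that went into lxml; `transcode_to_utf8` is a hypothetical helper name, and the regular expression only handles a simple XML declaration at the very start of the document.

```python
import re

# Matches the encoding pseudo-attribute inside a leading XML declaration.
_DECL_ENCODING = re.compile(r"""^(<\?xml[^>]*?encoding\s*=\s*)(["'])[^"']*\2""")


def transcode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode `data`, rewrite the declared encoding to UTF-8, and re-encode."""
    # The generic "utf-32" codec consumes a leading BOM while decoding.
    text = data.decode(source_encoding)
    # Keep the declaration consistent with the new byte encoding.
    text = _DECL_ENCODING.sub(r"\g<1>\g<2>utf-8\g<2>", text, count=1)
    return text.encode("utf-8")


foo = """<?xml version='1.0' encoding='utf-32'?>\n<tag attrib='123'></tag>"""
utf8_bytes = transcode_to_utf8(foo.encode("utf-32"), "utf-32")
print(utf8_bytes.decode("utf-8"))
```

The resulting `utf8_bytes` can then be passed to `etree.fromstring` in place of the UTF-32 data. This requires knowing the source encoding up front, so it is only a stopgap.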

scoder (scoder)
Changed in lxml:
importance: Undecided → Low
status: New → Confirmed
scoder (scoder)
Changed in lxml:
milestone: none → 3.9.0
status: Confirmed → Fix Committed
scoder (scoder)
Changed in lxml:
status: Fix Committed → Fix Released