Yes, we use incremental parsing because some of the files can be quite big.
You get a clearer error when using "fromstring", which is why I used it. It looks like the BOM is for UTF-16 despite the declared encoding of UTF-8.
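For reference, a mismatch like this can be checked without any parser by looking at the first bytes of the file. The following is only a sketch using the standard library; the helper names `sniff_bom` and `declared_encoding` are made up for illustration, and the declaration lookup assumes the XML declaration itself is stored as ASCII-compatible bytes.

```python
import codecs
import re

# Longer BOMs first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches encoding="..." in an XML declaration stored as ASCII-compatible bytes.
_DECL = re.compile(br'encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')


def sniff_bom(data):
    """Return (encoding, bom_length) for a leading BOM, or (None, 0)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def declared_encoding(data):
    """Return the lower-cased encoding from the XML declaration, or None."""
    match = _DECL.search(data[:200])
    return match.group(1).decode("ascii").lower() if match else None
```

If `sniff_bom` reports a UTF-16 variant while `declared_encoding` reports `utf-8`, the file has the same kind of conflict as the one described here.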
The code and error with iterparse:
it = iterparse("Issues/bug260/xl/worksheets/sheet1.xml")
<lxml.etree.iterparse object at 0x10d865b90>
for e, t in it: print e
Traceback (most recent call last):
  File "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py", line 1, in <module>
    # Used internally for debug sandbox under external interpreter
  File "/Users/charlieclark/Projects/openpyxl/lib/python2.7/site-packages/lxml/etree.so", line 179, in lxml.etree.iterparse.__next__ (src/lxml/lxml.etree.c:124400)
lxml.etree.XMLSyntaxError: Document is empty, line 1, column 1
I'll see if I can come up with a workaround for openpyxl. It's a bit tricky because we interface with files inside a zip archive. But maybe lxml could raise a nicer error, closer to the one you get when fromstring is used?
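One possible shape for such a workaround, sketched with the standard library only: wrap the archive member in a small file-like object that drops a leading BOM and streams the rest, so the parser can still work incrementally. This is not what openpyxl does; `BomSkippingReader` and `open_member` are hypothetical names, and the sketch assumes the bytes after the BOM really are in the declared encoding, which may not hold for every broken file.

```python
import codecs
import zipfile

# Longer BOMs first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF8,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


class BomSkippingReader(object):
    """File-like wrapper that drops a leading BOM and streams the rest."""

    def __init__(self, fileobj):
        self._fileobj = fileobj
        head = fileobj.read(4)  # the longest BOM is four bytes
        for bom in _BOMS:
            if head.startswith(bom):
                head = head[len(bom):]
                break
        self._pending = head  # bytes already read but not yet handed out

    def read(self, size=-1):
        if size is None or size < 0:
            data = self._pending + self._fileobj.read()
            self._pending = b""
            return data
        data, self._pending = self._pending[:size], self._pending[size:]
        if len(data) < size:
            data += self._fileobj.read(size - len(data))
        return data


def open_member(archive, member):
    """Open `member` from a zip archive (path or file object), skipping any BOM."""
    return BomSkippingReader(zipfile.ZipFile(archive).open(member))
```

The object returned by `open_member` only needs a `read` method, so it can be passed to `iterparse` in place of a filename. Because only the first four bytes are inspected up front, large worksheets are not loaded into memory as a whole.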