Python crashes when setting elem.text during etree.iterparse
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
lxml | Won't Fix | Undecided | Unassigned | |
Bug Description
I am running an XML parser that sets elem.text during iterparse, and it seems to make Python crash every time it parses certain files (see the attached file for an illustration).
The crash can be reproduced on at least two computers running Windows 7 SP1.
Removing the assignment to elem.text (lines 17-18 of script.py in the attached file) seems to stop the crash.
--
script.py (same as the attached file)
--
#!/usr/bin/env python3
import os
import platform
import traceback
import re
import lxml.etree as etree
def main():
    fsrc = 'data.xml'
    for event, elem in etree.iterparse
        if event == 'start':
            if re.search(
                if elem.text is not None:
        elif event == 'end':
            if re.search(
                if elem.tail is not None:
if __name__ == "__main__":
    if platform.system() == 'Windows' and not 'PROMPT' in os.environ:
        try:
            main()
        except Exception:
    else:
        main()
--
Python : sys.version_
lxml.etree : (4, 1, 1, 0)
libxml used : (2, 9, 5)
libxml compiled : (2, 9, 5)
libxslt used : (1, 1, 30)
libxslt compiled : (1, 1, 30)
description: | updated |
summary:
- Python crashes when setting elem.text or elem.tail during etree.iterparse
+ Python crashes when setting elem.text during etree.iterparse
I agree that it shouldn't crash, but this is difficult to prevent and your usage example is explicitly forbidden in the docs.
http://lxml.de/parsing.html#modifying-the-tree
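For anyone landing here with the same problem, below is a minimal sketch of a pattern that, as I read the linked section, stays within what the docs allow: touch an element only on its 'end' event, and rewrite a child's tail from the parent's 'end' event, by which point the child's tail has been parsed. The helper names (collapse_whitespace, rewrite_on_end) and the sample document are hypothetical, and the whitespace substitution only stands in for whatever the original script did with re. This is not a confirmed fix for the crash above. The stdlib fallback import is there only so the sketch runs without lxml installed.

```python
import io
import re

try:
    import lxml.etree as etree  # the library this report is about
except ImportError:
    # Assumption: the stdlib module is close enough for this sketch.
    import xml.etree.ElementTree as etree


def collapse_whitespace(value):
    """Hypothetical transformation; stands in for the report's own re logic."""
    return re.sub(r"\s+", " ", value)


def rewrite_on_end(source):
    """Rewrite text only on 'end' events, when the element is fully parsed."""
    root = None
    for event, elem in etree.iterparse(source, events=("end",)):
        if elem.text is not None:
            elem.text = collapse_whitespace(elem.text)
        # Children are descendants of elem, so their tails are complete here.
        # elem.tail itself is left alone: it may not have been parsed yet.
        for child in elem:
            if child.tail is not None:
                child.tail = collapse_whitespace(child.tail)
        root = elem  # the last 'end' event belongs to the root element
    return root


sample = io.BytesIO(
    b"<doc><item>a   b</item>  tail   text  <item>c\n\nd</item></doc>"
)
root = rewrite_on_end(sample)
result = etree.tostring(root)
print(result)
```

If the rewrite does not have to happen while streaming, parsing the whole file first with etree.parse() and then walking the tree with iter() avoids the question entirely, at the cost of holding the full document in memory.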