Activity log for bug #1743420

Date Who What changed Old value New value Message
2018-01-15 16:43:36 danny0838 bug added bug
2018-01-15 16:43:36 danny0838 attachment added crash demo https://bugs.launchpad.net/bugs/1743420/+attachment/5037413/+files/crash.zip
2018-01-15 16:47:53 danny0838 description
  Old value:
    I try to run a XML parser which sets elem.text during an iterparse, and it seems to cause the Python to crash for certain file. (check attached file for an illustration)

    Removing the setting of elem.text (Line 17-18 in script.py of the attached file) stops the crash.

    --
    Python : sys.version_info(major=3, minor=6, micro=4, releaselevel='final', serial=0)
    lxml.etree : (4, 1, 1, 0)
    libxml used : (2, 9, 5)
    libxml compiled : (2, 9, 5)
    libxslt used : (1, 1, 30)
    libxslt compiled : (1, 1, 30)
  New value:
    I try to run a XML parser which sets elem.text during an iterparse, and Python crashes when parsing certain files. (check attached file for an illustration)

    This crash can be reproduced on at least 2 computers running Windows 7 SP1.

    Removing the setting of elem.text (Line 17-18 in script.py of the attached file) seems to stop the crash.

    --
    Python : sys.version_info(major=3, minor=6, micro=4, releaselevel='final', serial=0)
    lxml.etree : (4, 1, 1, 0)
    libxml used : (2, 9, 5)
    libxml compiled : (2, 9, 5)
    libxslt used : (1, 1, 30)
    libxslt compiled : (1, 1, 30)
2018-01-15 16:49:51 danny0838 description
  Old value:
    I try to run a XML parser which sets elem.text during an iterparse, and Python crashes when parsing certain files. (check attached file for an illustration)

    This crash can be reproduced on at least 2 computers running Windows 7 SP1.

    Removing the setting of elem.text (Line 17-18 in script.py of the attached file) seems to stop the crash.

    --
    Python : sys.version_info(major=3, minor=6, micro=4, releaselevel='final', serial=0)
    lxml.etree : (4, 1, 1, 0)
    libxml used : (2, 9, 5)
    libxml compiled : (2, 9, 5)
    libxslt used : (1, 1, 30)
    libxslt compiled : (1, 1, 30)
  New value:
    I try to run a XML parser which sets elem.text during an iterparse, and it seems to cause the Python to crash for certain file. (check attached file for an illustration)

    Removing the setting of elem.text (Line 17-18 in script.py of the attached file) stops the crash.

    -- script.py (same as the attached file) --
    #!/usr/bin/env python3
    import os
    import platform
    import traceback
    import re
    import lxml.etree as etree

    def main():
        fsrc = 'data.xml'
        for event, elem in etree.iterparse(fsrc, events=('start', 'end')):
            print(event, elem.tag, elem.attrib, elem.text, elem.tail)
            if event == 'start':
                tag_name = elem.tag
                if re.search(r'^_.*_$', tag_name):
                    tag_name = tag_name[1:-1]
                    if elem.text is not None:
                        elem.text = re.sub(r'^\n', r'', elem.text)
            elif event == 'end':
                tag_name = elem.tag
                if re.search(r'^_.*_$', tag_name):
                    tag_name = tag_name[1:-1]
                    if elem.tail is not None:
                        elem.tail = re.sub(r'^\n', r'', elem.tail)
                elem.clear()

    if __name__ == "__main__":
        if platform.system() == 'Windows' and not 'PROMPT' in os.environ:
            try:
                main()
            except Exception:
                traceback.print_exc()
            os.system('pause')
        else:
            main()

    --
    Python : sys.version_info(major=3, minor=6, micro=4, releaselevel='final', serial=0)
    lxml.etree : (4, 1, 1, 0)
    libxml used : (2, 9, 5)
    libxml compiled : (2, 9, 5)
    libxslt used : (1, 1, 30)
    libxslt compiled : (1, 1, 30)
2018-01-15 16:51:21 danny0838 description
  Old value:
    I try to run a XML parser which sets elem.text during an iterparse, and it seems to cause the Python to crash for certain file. (check attached file for an illustration)

    Removing the setting of elem.text (Line 17-18 in script.py of the attached file) stops the crash.

    -- script.py (same as the attached file) --
    #!/usr/bin/env python3
    import os
    import platform
    import traceback
    import re
    import lxml.etree as etree

    def main():
        fsrc = 'data.xml'
        for event, elem in etree.iterparse(fsrc, events=('start', 'end')):
            print(event, elem.tag, elem.attrib, elem.text, elem.tail)
            if event == 'start':
                tag_name = elem.tag
                if re.search(r'^_.*_$', tag_name):
                    tag_name = tag_name[1:-1]
                    if elem.text is not None:
                        elem.text = re.sub(r'^\n', r'', elem.text)
            elif event == 'end':
                tag_name = elem.tag
                if re.search(r'^_.*_$', tag_name):
                    tag_name = tag_name[1:-1]
                    if elem.tail is not None:
                        elem.tail = re.sub(r'^\n', r'', elem.tail)
                elem.clear()

    if __name__ == "__main__":
        if platform.system() == 'Windows' and not 'PROMPT' in os.environ:
            try:
                main()
            except Exception:
                traceback.print_exc()
            os.system('pause')
        else:
            main()

    --
    Python : sys.version_info(major=3, minor=6, micro=4, releaselevel='final', serial=0)
    lxml.etree : (4, 1, 1, 0)
    libxml used : (2, 9, 5)
    libxml compiled : (2, 9, 5)
    libxslt used : (1, 1, 30)
    libxslt compiled : (1, 1, 30)
  New value:
    I try to run a XML parser which sets elem.text during an iterparse, and it seems to cause the Python to always crash on parsing certain files. (check attached file for an illustration)

    The crash can be reproduced on at least 2 computers running Windows 7 SP1.

    Removing the setting of elem.text (Line 17-18 in script.py of the attached file) stops the crash.

    -- script.py (same as the attached file) --
    #!/usr/bin/env python3
    import os
    import platform
    import traceback
    import re
    import lxml.etree as etree

    def main():
        fsrc = 'data.xml'
        for event, elem in etree.iterparse(fsrc, events=('start', 'end')):
            print(event, elem.tag, elem.attrib, elem.text, elem.tail)
            if event == 'start':
                tag_name = elem.tag
                if re.search(r'^_.*_$', tag_name):
                    tag_name = tag_name[1:-1]
                    if elem.text is not None:
                        elem.text = re.sub(r'^\n', r'', elem.text)
            elif event == 'end':
                tag_name = elem.tag
                if re.search(r'^_.*_$', tag_name):
                    tag_name = tag_name[1:-1]
                    if elem.tail is not None:
                        elem.tail = re.sub(r'^\n', r'', elem.tail)
                elem.clear()

    if __name__ == "__main__":
        if platform.system() == 'Windows' and not 'PROMPT' in os.environ:
            try:
                main()
            except Exception:
                traceback.print_exc()
            os.system('pause')
        else:
            main()

    --
    Python : sys.version_info(major=3, minor=6, micro=4, releaselevel='final', serial=0)
    lxml.etree : (4, 1, 1, 0)
    libxml used : (2, 9, 5)
    libxml compiled : (2, 9, 5)
    libxslt used : (1, 1, 30)
    libxslt compiled : (1, 1, 30)
2018-01-15 16:52:01 danny0838 description
  Old value:
    I try to run a XML parser which sets elem.text during an iterparse, and it seems to cause the Python to always crash on parsing certain files. (check attached file for an illustration)

    The crash can be reproduced on at least 2 computers running Windows 7 SP1.

    Removing the setting of elem.text (Line 17-18 in script.py of the attached file) stops the crash.

    -- script.py (same as the attached file) --
    #!/usr/bin/env python3
    import os
    import platform
    import traceback
    import re
    import lxml.etree as etree

    def main():
        fsrc = 'data.xml'
        for event, elem in etree.iterparse(fsrc, events=('start', 'end')):
            print(event, elem.tag, elem.attrib, elem.text, elem.tail)
            if event == 'start':
                tag_name = elem.tag
                if re.search(r'^_.*_$', tag_name):
                    tag_name = tag_name[1:-1]
                    if elem.text is not None:
                        elem.text = re.sub(r'^\n', r'', elem.text)
            elif event == 'end':
                tag_name = elem.tag
                if re.search(r'^_.*_$', tag_name):
                    tag_name = tag_name[1:-1]
                    if elem.tail is not None:
                        elem.tail = re.sub(r'^\n', r'', elem.tail)
                elem.clear()

    if __name__ == "__main__":
        if platform.system() == 'Windows' and not 'PROMPT' in os.environ:
            try:
                main()
            except Exception:
                traceback.print_exc()
            os.system('pause')
        else:
            main()

    --
    Python : sys.version_info(major=3, minor=6, micro=4, releaselevel='final', serial=0)
    lxml.etree : (4, 1, 1, 0)
    libxml used : (2, 9, 5)
    libxml compiled : (2, 9, 5)
    libxslt used : (1, 1, 30)
    libxslt compiled : (1, 1, 30)
  New value:
    I try to run a XML parser which sets elem.text during an iterparse, and it seems to cause the Python to always crash on parsing certain files. (check attached file for an illustration)

    The crash can be reproduced on at least 2 computers running Windows 7 SP1.

    Removing the setting of elem.text (Line 17-18 in script.py of the attached file) seems to stop the crash.

    -- script.py (same as the attached file) --
    #!/usr/bin/env python3
    import os
    import platform
    import traceback
    import re
    import lxml.etree as etree

    def main():
        fsrc = 'data.xml'
        for event, elem in etree.iterparse(fsrc, events=('start', 'end')):
            print(event, elem.tag, elem.attrib, elem.text, elem.tail)
            if event == 'start':
                tag_name = elem.tag
                if re.search(r'^_.*_$', tag_name):
                    tag_name = tag_name[1:-1]
                    if elem.text is not None:
                        elem.text = re.sub(r'^\n', r'', elem.text)
            elif event == 'end':
                tag_name = elem.tag
                if re.search(r'^_.*_$', tag_name):
                    tag_name = tag_name[1:-1]
                    if elem.tail is not None:
                        elem.tail = re.sub(r'^\n', r'', elem.tail)
                elem.clear()

    if __name__ == "__main__":
        if platform.system() == 'Windows' and not 'PROMPT' in os.environ:
            try:
                main()
            except Exception:
                traceback.print_exc()
            os.system('pause')
        else:
            main()

    --
    Python : sys.version_info(major=3, minor=6, micro=4, releaselevel='final', serial=0)
    lxml.etree : (4, 1, 1, 0)
    libxml used : (2, 9, 5)
    libxml compiled : (2, 9, 5)
    libxslt used : (1, 1, 30)
    libxslt compiled : (1, 1, 30)
2018-01-15 16:54:02 danny0838 summary
  Old value: Python crashes when setting elem.text or elem.tail during etree.iterparse
  New value: Python crashes when setting elem.text during etree.iterparse
2018-01-15 18:01:56 scoder lxml: status New Won't Fix
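
The log does not record why the status was set to Won't Fix. As general background, the Python documentation for ElementTree's iterparse states that at a 'start' event only the tag and attributes are defined, while text, tail, and children may still be incomplete; lxml's documentation carries a similar caution. The following is a minimal sketch, assuming the cleanup in the reported script can be deferred until the parser has finished with the relevant nodes. `parse_and_trim` is a hypothetical helper, not part of lxml or of the attached script. It listens for 'end' events only, trims an element's text once the element is complete, and trims each child's tail once the parent is complete. Unlike the reported script, it applies the trimming to every element and does not call elem.clear(). The fallback import of the standard-library module is only there so the sketch runs where lxml is not installed.

```python
import io
import re

try:
    import lxml.etree as etree
except ImportError:
    # Assumption: the standard library's iterparse is close enough for this sketch.
    import xml.etree.ElementTree as etree


def parse_and_trim(source):
    """Parse `source` incrementally and strip one leading newline
    from each element's text and from each child's tail.

    Returns the root element (the last element to receive an 'end' event).
    """
    root = None
    for _event, elem in etree.iterparse(source, events=('end',)):
        # At 'end', the element's own text is fully parsed.
        if elem.text is not None:
            elem.text = re.sub(r'^\n', '', elem.text)
        # A child's tail lies inside this element, so it is complete
        # by the time this element ends.
        for child in elem:
            if child.tail is not None:
                child.tail = re.sub(r'^\n', '', child.tail)
        root = elem
    return root


if __name__ == "__main__":
    sample = io.BytesIO(b"<root>\nfirst<_a_>\nhello</_a_>\ntail</root>")
    tree_root = parse_and_trim(sample)
    for node in tree_root.iter():
        print(node.tag, repr(node.text), repr(node.tail))
```

Whether this avoids the crash described in this bug has not been verified here; it only illustrates keeping modifications out of 'start' events.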