Heap corruption (0xC0000374) during xmlFreeNodeList
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
lxml | Fix Released | High | Unassigned | | |
Bug Description
Python crash with heap corruption/
The XML file is below. Basically, it is loaded and a new node is inserted before <test>. The code is distributed over several classes; moving it into a single file also makes the crash disappear.
I'm a bit stuck on how to debug this further.
WinDBG Session (full session attached):
=======
!heap
*******
* *
* HEAP ERROR DETECTED *
* *
*******
Details:
Heap address: 00000000014d0000
Error address: 000000000e756fdc
Error type: HEAP_FAILURE_
Details: The caller tried to a free a block at an invalid
Follow-up: Check the error's stack trace to find the culprit.
Stack trace:
Heap Address NT/Segment Heap
Version Info:
=============
Python : sys.version_
lxml.etree : (4, 2, 1, 0)
libxml used : (2, 9, 5)
libxml compiled : (2, 9, 5)
libxslt used : (1, 1, 30)
libxslt compiled : (1, 1, 30)
XML File:
=========
<?xml version="1.0" encoding="UTF-8"?>
<assistant module="9" name="" key="cleaning">
<step text="" key="">
</step>
</assistant>
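For orientation, the operation described in the report (load the file, insert a new node before an existing child) might look roughly like the sketch below. This is not the reporter's code. The tag name `inserted` and the helper `insert_before_step` are hypothetical, and the sketch assumes the target is the `<step>` element shown in the file above. It is not confirmed to reproduce the crash; the report notes that a single-file version did not crash.

```python
try:
    from lxml import etree  # the library this report is about
except ImportError:
    # Fallback so the sketch still runs without lxml installed.
    # The crash itself is specific to lxml/libxml2.
    import xml.etree.ElementTree as etree

XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<assistant module="9" name="" key="cleaning">
<step text="" key="">
</step>
</assistant>"""


def insert_before_step(root, new_node):
    """Insert new_node directly before the first <step> child of root."""
    step = root.find("step")
    position = list(root).index(step)
    root.insert(position, new_node)
    return root


root = etree.fromstring(XML)
# The new element is created outside of root's document and then moved into it.
new_node = etree.Element("inserted")  # hypothetical tag name
insert_before_step(root, new_node)
print([child.tag for child in root])  # ['inserted', 'step']
```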
Changed in lxml: | |
importance: | Undecided → High |
status: | New → Fix Committed |
milestone: | none → 4.2.4 |
Changed in lxml: | |
status: | Fix Committed → Fix Released |
I tracked down the root cause: a node is moved between two documents while its name is still stored in the dictionary of the old document. As soon as the new document is freed, it tries to free the memory for the node name. Because the name is not stored in the new document's dictionary, the OS is called to free the memory. This results in a crash because that address is part of a larger allocated memory pool and is not validly aligned.
It looks to me as if the etree.insert function does not handle moving the element between documents correctly. I opened pull request https://github.com/lxml/lxml/pull/268, which fixes the issue.
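The ownership problem described above can be illustrated with a small toy model in pure Python. It is only a sketch of the mechanism, not libxml2's or lxml's actual code: every class and function name here (`ToyDict`, `Doc`, `move_naive`, `move_fixed`, `InvalidFree`) is made up for illustration, and `move_fixed` shows one plausible repair (re-interning the name in the target dictionary), which is not necessarily what the pull request does.

```python
class InvalidFree(Exception):
    """Raised where the real process would hit heap corruption."""


class Name:
    def __init__(self, text, pooled):
        self.text = text
        self.pooled = pooled  # True if the string lives inside a dictionary pool


class ToyDict:
    """Toy stand-in for a per-document string dictionary backed by a pool."""

    def __init__(self):
        self._names = {}

    def lookup(self, text):
        # Intern the string: one pooled Name object per distinct text.
        if text not in self._names:
            self._names[text] = Name(text, pooled=True)
        return self._names[text]

    def owns(self, name):
        # Ownership is by identity (like a pointer check), not by string value.
        return self._names.get(name.text) is name


def heap_free(name):
    # Pooled strings are slices of a larger block and must not be freed one by one.
    if name.pooled:
        raise InvalidFree("freeing %r, which points into another pool" % name.text)


class Node:
    def __init__(self, name):
        self.name = name


class Doc:
    def __init__(self):
        self.dict = ToyDict()
        self.nodes = []

    def new_node(self, text):
        node = Node(self.dict.lookup(text))
        self.nodes.append(node)
        return node

    def free(self):
        # Names owned by this document's dictionary are released with the pool;
        # anything else is handed to the general-purpose free.
        for node in self.nodes:
            if not self.dict.owns(node.name):
                heap_free(node.name)
        self.nodes.clear()


def move_naive(node, src, dst):
    # Moves the node but leaves its name pointing into src's dictionary.
    src.nodes.remove(node)
    dst.nodes.append(node)


def move_fixed(node, src, dst):
    # Re-interns the name in dst's dictionary before attaching the node.
    src.nodes.remove(node)
    node.name = dst.dict.lookup(node.name.text)
    dst.nodes.append(node)
```

In this model, freeing the target document after `move_naive` raises `InvalidFree`, while the same sequence with `move_fixed` completes cleanly.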