Heap corruption (0xC0000374) during xmlFreeNodeList

Bug #1773749 reported by Alexander Weggerle
This bug affects 1 person
Affects: lxml
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: 4.2.4

Bug Description

Python crashes with heap corruption (HEAP_FAILURE_INVALID_ARGUMENT) during garbage collection. The crash only appears in a larger application, where it is 100% reproducible. I was not able to extract a standalone reproducer from the application. Even changes to code unrelated to XML make the problem disappear. Could this be an issue with Python's memory management or with multithreading?
The XML file is shown below. Basically, it is loaded and a new node is inserted before <test>. This logic is spread over several classes; moving the code into a single file also makes the crash disappear.
I'm a bit stuck on how to debug this further.

WinDBG Session (full session attached):
=======================================
!heap
**************************************************************
* *
* HEAP ERROR DETECTED *
* *
**************************************************************

Details:

Heap address: 00000000014d0000
Error address: 000000000e756fdc
Error type: HEAP_FAILURE_INVALID_ARGUMENT
Details: The caller tried to a free a block at an invalid
            (unaligned) address.
Follow-up: Check the error's stack trace to find the culprit.

Stack trace:
                00007ffc9ce4c8d3: ntdll!RtlFreeHeap+0x0000000000000143
                0000000059cfcabc: msvcr90!free+0x000000000000001c
                000000018016b2f5: etree!xmlFreeNodeList+0x0000000000000155
                000000018016b240: etree!xmlFreeNodeList+0x00000000000000a0
                000000018016b240: etree!xmlFreeNodeList+0x00000000000000a0
                000000018016b240: etree!xmlFreeNodeList+0x00000000000000a0
                000000018016b240: etree!xmlFreeNodeList+0x00000000000000a0
                000000018016b240: etree!xmlFreeNodeList+0x00000000000000a0
                000000018016b240: etree!xmlFreeNodeList+0x00000000000000a0
                000000018016ae54: etree!xmlFreeDoc+0x00000000000000d4
                000000018005c634: etree!__pyx_tp_dealloc_4lxml_5etree__Document+0x0000000000000034
                00000001800dc425: etree!__pyx_tp_dealloc_4lxml_5etree__Element+0x0000000000000065
                0000000058d1eedb: python27!dict_dealloc+0x00000000000000db
                0000000058d62e30: python27!subtype_dealloc+0x00000000000002f0
                0000000058d1eedb: python27!dict_dealloc+0x00000000000000db
                0000000058d62b10: python27!subtype_clear+0x0000000000000130

        Heap Address NT/Segment Heap

             1510000 NT Heap
              f30000 NT Heap
             33e0000 NT Heap
             14d0000 NT Heap
             c7c0000 NT Heap
             c740000 NT Heap
            1e070000 NT Heap

Version Info:
=============
Python : sys.version_info(major=2, minor=7, micro=14, releaselevel='final', serial=0)
lxml.etree : (4, 2, 1, 0)
libxml used : (2, 9, 5)
libxml compiled : (2, 9, 5)
libxslt used : (1, 1, 30)
libxslt compiled : (1, 1, 30)

XML File:
=========
<?xml version="1.0" encoding="UTF-8"?>
<assistant module="9" name="" key="cleaning">
    <step text="" key="">
        <test></test>
    </step>
</assistant>

Alexander Weggerle (weggerlea) wrote:

Tracked down the root cause: a node is moved between two documents while its name is still stored in the dictionary of the old document. When the new document is freed, it tries to free the memory for the node name. Because the name is not stored in the new document's dictionary, the OS is asked to free the memory. This crashes because that address lies inside a larger allocated memory pool and is not validly aligned.
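
To make the ownership problem above concrete, here is a toy model in pure Python. It is not libxml2 or lxml code: the names Heap, NameDict and free_node_name, the pool size and the addresses are all made up for illustration. It only assumes what the analysis above says, namely that a document skips freeing names its own dictionary owns and hands every other name to the allocator.

```python
class Heap:
    """Toy allocator (hypothetical): only block start addresses may be freed."""

    def __init__(self):
        self._next = 0x1000
        self._blocks = {}  # block start address -> size

    def alloc(self, size):
        addr = self._next
        self._blocks[addr] = size
        self._next += (size + 15) // 16 * 16  # keep block starts 16-byte aligned
        return addr

    def free(self, addr):
        if addr not in self._blocks:
            # Stands in for HEAP_FAILURE_INVALID_ARGUMENT in the real crash.
            raise RuntimeError("invalid free of 0x%x" % addr)
        del self._blocks[addr]


class NameDict:
    """Toy per-document dictionary: names live inside one pooled heap block."""

    def __init__(self, heap, pool_size=256):
        self._start = heap.alloc(pool_size)
        self._end = self._start + pool_size
        self._used = 0      # no overflow handling; this is only a sketch
        self._names = {}

    def intern(self, name):
        if name not in self._names:
            self._names[name] = self._start + self._used
            self._used += len(name) + 1  # name plus terminator
        return self._names[name]

    def owns(self, addr):
        return self._start <= addr < self._end


def free_node_name(heap, doc_dict, name_addr):
    """Free a node name the way the analysis above describes."""
    if doc_dict.owns(name_addr):
        return "left to dictionary"  # the dictionary releases its whole pool later
    heap.free(name_addr)             # not owned: treated as a separate allocation
    return "freed"


heap = Heap()
old_dict = NameDict(heap)  # dictionary of the document the node came from
new_dict = NameDict(heap)  # dictionary of the document the node was moved to

old_dict.intern("step")
name_addr = old_dict.intern("test")  # lands in the middle of the old pool

# Freed through the old document: the dictionary owns the name, nothing happens.
same_doc = free_node_name(heap, old_dict, name_addr)

# Freed through the new document: the name is unknown there, so the pooled
# address is passed to the allocator, which rejects it.
try:
    outcome = free_node_name(heap, new_dict, name_addr)
except RuntimeError as exc:
    outcome = str(exc)

print(same_doc)
print(outcome)
```

In this model the second call fails because 0x1005 is an interior address of the old dictionary's pool rather than the start of a block, which mirrors the "invalid (unaligned) address" reported by WinDBG.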

It looks to me as if the etree.insert function does not handle moving the element between documents correctly. I opened pull request https://github.com/lxml/lxml/pull/268, which fixes the issue.
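
For reference, the kind of operation involved looks roughly like the sketch below: a node that belongs to one document is inserted before <test> in another. This is not a reproducer (none could be isolated from the application), and the donor document and the tag name "inserted" are made up for illustration. The fallback import is only there so the snippet runs where lxml is not installed; the crash concerns lxml alone.

```python
try:
    from lxml import etree  # the library this report is about
except ImportError:
    # Assumption: the stdlib module offers the same calls used below, so the
    # sketch still runs without lxml. It is not affected by this bug.
    import xml.etree.ElementTree as etree

XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<assistant module="9" name="" key="cleaning">
    <step text="" key="">
        <test></test>
    </step>
</assistant>
"""

# Target document: the XML file from the bug description.
root = etree.fromstring(XML)
step = root.find("step")
test = step.find("test")

# Source document: a second, independent tree (hypothetical content).
donor = etree.fromstring(b"<donor><inserted/></donor>")
new_node = donor.find("inserted")

# Insert the foreign node directly before <test>.
step.insert(list(step).index(test), new_node)

# Drop the reference to the source document; the moved node now lives on
# only as part of the target document.
del donor

tags = [child.tag for child in step]
print(tags)  # ['inserted', 'test']
```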

scoder (scoder)
Changed in lxml:
importance: Undecided → High
status: New → Fix Committed
milestone: none → 4.2.4
scoder (scoder)
Changed in lxml:
status: Fix Committed → Fix Released