element with xmlns attribute is not rendered properly
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
beautifulsoup4 (Ubuntu) | Fix Released | Undecided | Unassigned |
lxml (Ubuntu) | Invalid | Undecided | Unassigned |
Bug Description
lxml 4.4.0 introduces an issue with the following XML:
<?xml version="1.0" encoding="utf-8"?>
<NAMM_PO version="2009.2" xmlns="http://
<Id>TEST_ID</Id>
<NAMM_PO>
This is the output XML:
<?xml version="1.0" encoding="utf-8"?>
<NAMM_PO version="2009.2" xmlns:="http://
<Id>TEST_ID</Id>
<NAMM_PO/
Note the missing closing tag of the NAMM_PO element, the additional NAMM_PO element, and the addition of a colon to the xmlns attribute.
Version info:
Python : sys.version_
lxml.etree : (4, 4, 0, 0)
libxml used : (2, 9, 9)
libxml compiled : (2, 9, 9)
libxslt used : (1, 1, 33)
libxslt compiled : (1, 1, 33)
To repro:
Open and unarchive attached 'lxml_4_
The input is not well-formed XML (as posted, its last line is a second opening <NAMM_PO> tag rather than a closing one), so I guess you used the "recover" option to get it parsed at all.
Just reject invalid input instead.
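For anyone wanting to follow that advice, here is a minimal sketch of a well-formedness check that runs before the data reaches a lenient parser. It uses Python's standard-library xml.etree.ElementTree rather than lxml so that it runs without extra dependencies; lxml's default parser (without recover=True) is likewise strict and raises etree.XMLSyntaxError on input like this. The namespace URI is a placeholder because the one in the report is truncated, and is_well_formed is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (assumption): the real one is truncated in the report.
PLACEHOLDER_NS = b"http://example.com/ns"

# Input shaped like the one in the report: the last line opens a second
# NAMM_PO element instead of closing the first one.
MALFORMED = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b'<NAMM_PO version="2009.2" xmlns="' + PLACEHOLDER_NS + b'">\n'
    b'<Id>TEST_ID</Id>\n'
    b'<NAMM_PO>\n'
)

# The same document with a proper closing tag.
WELL_FORMED = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b'<NAMM_PO version="2009.2" xmlns="' + PLACEHOLDER_NS + b'">\n'
    b'<Id>TEST_ID</Id>\n'
    b'</NAMM_PO>\n'
)


def is_well_formed(data: bytes) -> bool:
    """Return True if a strict XML parser accepts the data, False otherwise."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED))
    print(is_well_formed(MALFORMED))
```

Rejecting the document at this point avoids relying on whatever a recovering parser decides to do with the broken markup, which is where the odd output in the report comes from.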