namespace map argument of target's start method in lxml.etree.XMLParser changes between 4.3.5 and 4.4.0
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
lxml | New | Undecided | Unassigned | |
Bug Description
Please see my findings at https:/
test.py prints the contents of the namespace map.
run.sh runs the test on the 2 versions between which the change was introduced.
Dockerfile is needed by run.sh.
Relevant part of attached test.py:
DFXP_BASE_MARKUP = '''
<tt xmlns="http://
xmlns:tts="http://
'''
from lxml import etree
class X(object):
def start(self, name, attrs, nsmap={}):
if nsmap:
p = etree.XMLParser
p.feed(
My output from run.sh (results.txt) is:
Python : sys.version_
lxml.etree : (4, 4, 0, 0)
libxml used : (2, 9, 9)
libxml compiled : (2, 9, 9)
libxslt used : (1, 1, 33)
libxslt compiled : (1, 1, 33)
{'': 'http://
---
Python : sys.version_
lxml.etree : (4, 3, 5, 0)
libxml used : (2, 9, 9)
libxml compiled : (2, 9, 9)
libxslt used : (1, 1, 33)
libxslt compiled : (1, 1, 33)
{None: 'http://
---
I consider this a potential bug because it breaks compatibility with existing software, and I could not find the expected format of the namespace map in the docs or release notes. As the output above shows, the default namespace is keyed by None on 4.3.5 but by an empty string on 4.4.0. See, for example, beautifulsoup4 and the following test:
#!/usr/bin/env python3
DFXP_BASE_MARKUP = '''
<tt xmlns="http://
'''
from bs4 import BeautifulSoup
b = BeautifulSoup(
print(b.prettify())
On lxml-4.4.0 it produces invalid XML.
On lxml-4.3.5 the output is valid.
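Code that has to run under both versions could work around the difference by normalizing the namespace map before using it. The following is only a minimal sketch of that idea, not a fix in lxml itself. The helper `normalize_nsmap`, the class `NormalizingTarget`, and the `urn:example:` URIs are hypothetical names invented for illustration. The sketch assumes that the only difference between the two versions is the key of the default namespace (`None` versus `''`), as in the output above.

```python
def normalize_nsmap(nsmap):
    """Return a copy of nsmap with the default namespace keyed by None.

    Hypothetical helper: it accepts both the 4.3.5-style key (None) and
    the 4.4.0-style key ('') and always produces the None form.
    """
    if not nsmap:
        return {}
    normalized = dict(nsmap)
    if "" in normalized:
        default_uri = normalized.pop("")
        # Keep an existing None entry if, for some reason, both are present.
        normalized.setdefault(None, default_uri)
    return normalized


class NormalizingTarget(object):
    """Parser target sketch that records each start tag with a normalized nsmap."""

    def __init__(self):
        self.seen = []

    def start(self, name, attrs, nsmap=None):
        # nsmap may arrive with either key style, depending on the version.
        self.seen.append((name, normalize_nsmap(nsmap)))

    def close(self):
        return self.seen
```

An instance of such a class would be passed as `etree.XMLParser(target=NormalizingTarget())`, in the same way as `X` in test.py. Downstream code then only has to handle the `None` key.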