namespace map argument of target's start method in lxml.etree.XMLParser changes between 4.3.5 and 4.4.0

Bug #1842026 reported by Michał Trybus
This bug affects 1 person
Affects: lxml
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

Please see my findings at https://stackoverflow.com/questions/57687116/is-there-a-contract-for-namespace-map-argument-of-targets-start-method-in-lxml

test.py prints the contents of the namespace map passed to the target's start method.
run.sh runs the test against the two versions between which the change was introduced (4.3.5 and 4.4.0).
Dockerfile is needed by run.sh.

Relevant part of attached test.py:

DFXP_BASE_MARKUP = '''
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:tts="http://www.w3.org/ns/ttml#styling" />
'''

from lxml import etree

class X(object):
    def start(self, name, attrs, nsmap={}):
        if nsmap:
            print("{}".format(dict(nsmap)))

p = etree.XMLParser(target=X())
p.feed(DFXP_BASE_MARKUP)

My output from run.sh (results.txt) is:
Python : sys.version_info(major=3, minor=7, micro=4, releaselevel='final', serial=0)
lxml.etree : (4, 4, 0, 0)
libxml used : (2, 9, 9)
libxml compiled : (2, 9, 9)
libxslt used : (1, 1, 33)
libxslt compiled : (1, 1, 33)
{'': 'http://www.w3.org/ns/ttml', 'tts': 'http://www.w3.org/ns/ttml#styling'}
---
Python : sys.version_info(major=3, minor=7, micro=4, releaselevel='final', serial=0)
lxml.etree : (4, 3, 5, 0)
libxml used : (2, 9, 9)
libxml compiled : (2, 9, 9)
libxslt used : (1, 1, 33)
libxslt compiled : (1, 1, 33)
{None: 'http://www.w3.org/ns/ttml', 'tts': 'http://www.w3.org/ns/ttml#styling'}
---
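
Until the expected format is clarified, a target can normalize the key itself so that it sees the same map on both versions. The following is only a minimal sketch: normalize_nsmap and NormalizingTarget are hypothetical names, not part of lxml, and using None for the default namespace is an arbitrary choice made to match the 4.3.5 output above.

```python
def normalize_nsmap(nsmap):
    """Return a dict copy of nsmap with the default namespace keyed by None.

    Accepts both shapes shown in results.txt: '' (4.4.0) and None (4.3.5).
    """
    normalized = {}
    for prefix, uri in dict(nsmap or {}).items():
        # '' and None both collapse to None; real prefixes are kept as-is.
        normalized[prefix or None] = uri
    return normalized


class NormalizingTarget(object):
    """Parser target (hypothetical) that records normalized namespace maps."""

    def __init__(self):
        self.seen = []

    def start(self, name, attrs, nsmap=None):
        if nsmap:
            self.seen.append(normalize_nsmap(nsmap))

    def close(self):
        return self.seen


# The two maps printed by run.sh normalize to the same mapping.
new_style = {'': 'http://www.w3.org/ns/ttml',
             'tts': 'http://www.w3.org/ns/ttml#styling'}
old_style = {None: 'http://www.w3.org/ns/ttml',
             'tts': 'http://www.w3.org/ns/ttml#styling'}
assert normalize_nsmap(new_style) == normalize_nsmap(old_style)
```

An instance would be passed to the parser the same way as X() in test.py, i.e. etree.XMLParser(target=NormalizingTarget()).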

The key for the default namespace changes from None in 4.3.5 to '' in 4.4.0. I consider this a potential bug because it breaks compatibility with existing software, and I couldn't find the expected format of the namespace map in the docs or release notes. See for example beautifulsoup4 and the following test:

#!/usr/bin/env python3

DFXP_BASE_MARKUP = '''
<tt xmlns="http://www.w3.org/ns/ttml" xmlns:tts="http://www.w3.org/ns/ttml#styling" />
'''

from bs4 import BeautifulSoup

b = BeautifulSoup(DFXP_BASE_MARKUP, 'lxml-xml')
print(b.prettify())

On lxml-4.4.0 it produces invalid XML.
On lxml-4.3.5 the output is valid.
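
The valid/invalid distinction can be checked mechanically rather than by eye. This is a rough sketch that assumes "invalid" here means "not well-formed"; is_well_formed is a hypothetical helper, and it uses only the standard library parser so that the check does not depend on the lxml version under test.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as namespace-aware XML, else False."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

In the test above, print(b.prettify()) could be followed by print(is_well_formed(b.prettify())) to record the result for each lxml version.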

Michał Trybus (komar007) wrote :
description: updated