Activity log for bug #1842026

Date Who What changed Old value New value Message
2019-08-30 07:22:22 Michał Trybus bug added bug
2019-08-30 07:22:22 Michał Trybus attachment added test_lxml.tar.gz https://bugs.launchpad.net/bugs/1842026/+attachment/5285744/+files/test_lxml.tar.gz
2019-08-30 09:34:50 Michał Trybus description

Old value:

Please see my findings at https://stackoverflow.com/questions/57687116/is-there-a-contract-for-namespace-map-argument-of-targets-start-method-in-lxml

test.py prints the contents of the namespace map.
run.sh runs the test on the 2 versions between which the change was introduced.
Dockerfile is needed by run.sh.

My output from run.sh (results.txt) is:

    Python              : sys.version_info(major=3, minor=7, micro=4, releaselevel='final', serial=0)
    lxml.etree          : (4, 4, 0, 0)
    libxml used         : (2, 9, 9)
    libxml compiled     : (2, 9, 9)
    libxslt used        : (1, 1, 33)
    libxslt compiled    : (1, 1, 33)
    {'': 'http://www.w3.org/ns/ttml', 'tts': 'http://www.w3.org/ns/ttml#styling'}
    ---
    Python              : sys.version_info(major=3, minor=7, micro=4, releaselevel='final', serial=0)
    lxml.etree          : (4, 3, 5, 0)
    libxml used         : (2, 9, 9)
    libxml compiled     : (2, 9, 9)
    libxslt used        : (1, 1, 33)
    libxslt compiled    : (1, 1, 33)
    {None: 'http://www.w3.org/ns/ttml', 'tts': 'http://www.w3.org/ns/ttml#styling'}
    ---

I consider this a potential bug, because it breaks compatibility with existing software and I couldn't find the expected format of the namespace map in the docs or release notes. See for example beautifulsoup4 and the following test:

    #!/usr/bin/env python3
    DFXP_BASE_MARKUP = '''
    <tt xmlns="http://www.w3.org/ns/ttml" xmlns:tts="http://www.w3.org/ns/ttml#styling" />
    '''
    from bs4 import BeautifulSoup
    b = BeautifulSoup(DFXP_BASE_MARKUP, 'lxml-xml')
    print(b.prettify())

On lxml-4.4.0 it produces invalid XML. On lxml-4.3.5 the output is valid.

New value:

Please see my findings at https://stackoverflow.com/questions/57687116/is-there-a-contract-for-namespace-map-argument-of-targets-start-method-in-lxml

test.py prints the contents of the namespace map.
run.sh runs the test on the 2 versions between which the change was introduced.
Dockerfile is needed by run.sh.

Relevant part of attached test.py:

    DFXP_BASE_MARKUP = '''
    <tt xmlns="http://www.w3.org/ns/ttml" xmlns:tts="http://www.w3.org/ns/ttml#styling" />
    '''
    from lxml import etree

    class X(object):
        def start(self, name, attrs, nsmap={}):
            if nsmap:
                print("{}".format(dict(nsmap)))

    p = etree.XMLParser(target=X())
    p.feed(DFXP_BASE_MARKUP)

My output from run.sh (results.txt) is:

    Python              : sys.version_info(major=3, minor=7, micro=4, releaselevel='final', serial=0)
    lxml.etree          : (4, 4, 0, 0)
    libxml used         : (2, 9, 9)
    libxml compiled     : (2, 9, 9)
    libxslt used        : (1, 1, 33)
    libxslt compiled    : (1, 1, 33)
    {'': 'http://www.w3.org/ns/ttml', 'tts': 'http://www.w3.org/ns/ttml#styling'}
    ---
    Python              : sys.version_info(major=3, minor=7, micro=4, releaselevel='final', serial=0)
    lxml.etree          : (4, 3, 5, 0)
    libxml used         : (2, 9, 9)
    libxml compiled     : (2, 9, 9)
    libxslt used        : (1, 1, 33)
    libxslt compiled    : (1, 1, 33)
    {None: 'http://www.w3.org/ns/ttml', 'tts': 'http://www.w3.org/ns/ttml#styling'}
    ---

I consider this a potential bug, because it breaks compatibility with existing software and I couldn't find the expected format of the namespace map in the docs or release notes. See for example beautifulsoup4 and the following test:

    #!/usr/bin/env python3
    DFXP_BASE_MARKUP = '''
    <tt xmlns="http://www.w3.org/ns/ttml" xmlns:tts="http://www.w3.org/ns/ttml#styling" />
    '''
    from bs4 import BeautifulSoup
    b = BeautifulSoup(DFXP_BASE_MARKUP, 'lxml-xml')
    print(b.prettify())

On lxml-4.4.0 it produces invalid XML. On lxml-4.3.5 the output is valid.
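
The difference reported above is only in the key used for the default namespace: `''` on lxml 4.4.0 and `None` on lxml 4.3.5. A target that has to work on both versions could normalize that key before using the map. The following is a minimal sketch of that idea, not a fix taken from lxml or beautifulsoup4; the names `normalize_nsmap` and `NsmapCollector` are hypothetical, and choosing `None` as the canonical key is an assumption made only to match the older output.

```python
def normalize_nsmap(nsmap, default_key=None):
    """Return a plain dict in which the default-namespace prefix is `default_key`.

    Both '' and None are treated as "default namespace", since the report
    shows one lxml version using '' and the other using None.
    """
    normalized = {}
    for prefix, uri in dict(nsmap or {}).items():
        if prefix is None or prefix == "":
            prefix = default_key
        normalized[prefix] = uri
    return normalized


class NsmapCollector(object):
    """Parser target sketch: records the normalized nsmap of each start event."""

    def __init__(self):
        self.seen = []

    def start(self, name, attrs, nsmap=None):
        # nsmap is only non-empty on elements that declare namespaces.
        if nsmap:
            self.seen.append(normalize_nsmap(nsmap))

    def end(self, name):
        pass

    def data(self, data):
        pass

    def close(self):
        return self.seen
```

An instance can be passed the same way as `X()` in test.py, i.e. `etree.XMLParser(target=NsmapCollector())`. With the normalization in place, the two maps shown in results.txt compare equal.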