FTBFS 2.2.1-1 HTML CDATA handling
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
soupsieve (Debian) | Fix Released | Unknown | |
soupsieve (Ubuntu) | Fix Released | Undecided | Unassigned |
Bug Description
soupsieve fails to build from source (FTBFS) because of test suite failures [1].
Following upstream's bug report [2] leads to [3], which identifies the root cause: lxml is built against libxml2 >= 2.9.11, where CDATA is no longer stripped, causing parsing inconsistencies in BeautifulSoup.
[4] has been reported against lxml; once it is fixed, rebuilding this package should be enough to close this bug. In the meantime, skipping the affected tests based on the libxml2 version, as suggested in [2], should be safe.
[1] https:/
[2] https:/
[3] https:/
[4] https:/
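The version-based skip suggested in [2] could look roughly like the sketch below. This is not the actual upstream or Debian patch: the helper names (`should_skip_cdata_tests`, `detect_libxml_version`) and the placeholder test class are hypothetical, and the 2.9.11 threshold is taken from the description above. It relies on `lxml.etree.LIBXML_VERSION`, which lxml exposes as a tuple for the libxml2 version in use.

```python
import unittest

# Assumption: threshold taken from the bug description (libxml2 >= 2.9.11).
CDATA_CHANGE_VERSION = (2, 9, 11)


def should_skip_cdata_tests(libxml_version):
    """Return True if the given libxml2 version tuple is affected."""
    return tuple(libxml_version) >= CDATA_CHANGE_VERSION


def detect_libxml_version():
    """Return the libxml2 version tuple reported by lxml, or None without lxml."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return etree.LIBXML_VERSION


_version = detect_libxml_version()
SKIP_CDATA = _version is not None and should_skip_cdata_tests(_version)


class ExampleCdataTest(unittest.TestCase):
    """Hypothetical stand-in for the failing CDATA tests."""

    @unittest.skipIf(SKIP_CDATA, "libxml2 >= 2.9.11 no longer strips CDATA")
    def test_placeholder(self):
        # The real test body would run the CDATA selector check here.
        self.assertTrue(True)
```

Keeping the version comparison in a plain function makes the condition checkable without lxml installed; the decorator only consumes the resulting boolean.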
Failed tests report:
=======
__________________ TestSoupContain
self = <tests.
def test_contains_
"""Test contains CDATA in HTML5."""
markup = """
<body><div id="1">Testing that <span id="2">
"""
> self.assert_
markup,
'body *:-soup-
['1'],
)
tests/test_
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/util.py:122: in assert_selector
self.
E AssertionError: Lists differ: ['1', '2'] != ['1']
E
E First list contains 1 additional elements.
E First extra element 1:
E '2'
E
E - ['1', '2']
E + ['1']
-------
----Running Selector Test----
PATTERN: body *:-soup-
## PARSING: 'body *:-soup-
TOKEN: 'tag' --> 'body' at position 0
TOKEN: 'combine' --> ' ' at position 4
TOKEN: 'tag' --> '*' at position 5
TOKEN: 'pseudo_contains' --> ':-soup-
## END PARSING
====PARSER: html5lib
TAG: div
====PARSER: lxml
TAG: div
TAG: span
_______________ TestSoupContain
self = <tests.
def test_contains_
"""Test contains CDATA in HTML5."""
markup = """
<body><div id="1">Testing that <span id="2">
"""
> self.assert_
markup,
'body *:-soup-
['1'],
)
tests/test_
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/util.py:122: in assert_selector
self.
E AssertionError: Lists differ: ['1', '2'] != ['1']
E
E First list contains 1 additional elements.
E First extra element 1:
E '2'
E
E - ['1', '2']
E + ['1']
-------
----Running Selector Test----
PATTERN: body *:-soup-
## PARSING: 'body *:-soup-
TOKEN: 'tag' --> 'body' at position 0
TOKEN: 'combine' --> ' ' at position 4
TOKEN: 'tag' --> '*' at position 5
TOKEN: 'pseudo_contains' --> ':-soup-
## END PARSING
====PARSER: html5lib
TAG: div
====PARSER: lxml
TAG: div
TAG: span
=======
FAILED tests/test_
FAILED tests/test_
Changed in soupsieve (Debian):
status: Unknown → New
Changed in soupsieve (Debian):
status: New → Fix Released
Changed in soupsieve (Ubuntu):
assignee: nobody → Paride Legovini (paride)
tags: added: needs-sync removed: server-next
Changed in soupsieve (Ubuntu):
milestone: none → ubuntu-21.10
Fixed in Debian in version 2.2.1-2 (same upstream version).