This is a strange failure, since superficially similar tests like test_entities_converted_on_the_way_out don't have the problem.
Are you running this against Python 3.9.19, which was released yesterday? Are you also running it against any other Python releases, with or without this problem? I can't duplicate your setup exactly, but I created a fresh 3.9.19 environment that resembles yours, and ran the test suite successfully.
Here's a diagnostic I'd like you to run:
---
data = b"<p>\x91Foo\x92</p>"
from bs4.diagnose import diagnose
diagnose(data)
print("\nBEGINNING HTMLPARSER TRACE")
from bs4.diagnose import htmlparser_trace
from bs4.dammit import UnicodeDammit
u = UnicodeDammit(data).unicode_markup
print(u)
htmlparser_trace(u)
---
This will show what markup the html.parser parser is receiving and how it handles that markup.
This is a strange failure, since superficially similar tests like test_entities_ converted_ on_the_ way_out don't have the problem.
Are you running this against Python 3.9.19, which was released yesterday? Are you also running it against any other Python releases, with or without this problem? I can't duplicate your setup exactly, but I created a fresh 3.9.19 environment that resembles yours, and ran the test suite successfully.
Here's a diagnostic I'd like you to run:
--- x92</p> "
data = b"<p>\x91Foo\
from bs4.diagnose import diagnose
diagnose(data)
print("\nBEGINNING HTMLPARSER TRACE") data).unicode_ markup
from bs4.diagnose import htmlparser_trace
from bs4.dammit import UnicodeDammit
u = UnicodeDammit(
print(u)
htmlparser_trace(u)
---
This will show what markup the html.parser parser is receiving and how it handles that markup.