Some tests fail with python3
Bug #1681115 reported by Aloysius
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Beautiful Soup | Invalid | Undecided | Unassigned |
Bug Description
bs4 4.5.3
It seems to be unicode-related, but I'm not sure how it relates to prior bugs.
See attachment
Changed in beautifulsoup: status: Incomplete → Invalid
I can't duplicate this on Python 3.6.1 or Python 3.5.2, and I can't make sense of the test failures.
The fact that SNOWMAN is rendering as "â˜ƒ" in your terminal tells me that you may have your default output encoding set to Windows-1252, or that you otherwise have a Windows-like system. I can't square that with the UNIX paths in the tracebacks, but I think the root of the problem is that the test environment occasionally reaches for an encoding it thinks is UTF-8, and it gets Windows-1252 instead.
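One way to check this hypothesis on the affected machine is to ask Python which encodings it reports and to reproduce the garbling directly. This is only a sketch using the standard library; the helper name mojibake is made up for illustration and is not part of Beautiful Soup.

```python
import locale
import sys


def mojibake(text, wrong_encoding="windows-1252"):
    """Encode text as UTF-8, then decode those bytes with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


# Encodings Python reports for terminal output and for the locale default.
print("stdout encoding:   ", sys.stdout.encoding)
print("preferred encoding:", locale.getpreferredencoding(False))

# SNOWMAN (U+2603) is three bytes in UTF-8: E2 98 83.
# ascii() keeps the output printable even on a misconfigured terminal.
print(ascii(mojibake("\N{SNOWMAN}")))
```

If either reported encoding is something other than UTF-8, that would be consistent with the garbled SNOWMAN in the test output.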
However, even assuming that's the case, I can't make sense of the test_simple_html_substitution failure, which is the simplest one. Since \u2200 is not being converted to &forall;, there's a problem with the regular expression, which is generated from the code points listed in html.entities.codepoint2name.
The regular expression should be matching \u2200, but it isn't. This tells me that the regular expression was probably created out of garbled data--maybe Unicode strings encoded as UTF-8 and then decoded as Windows-1252. But I don't see a way that could happen. The Unicode strings are created with chr(), which takes a Unicode code point. Then they're immediately compiled into a regular expression.
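To make the reasoning concrete, here is a rough standard-library approximation of that approach: build a character class from html.entities.codepoint2name using chr(), compile it, and check that it matches \u2200. It is not Beautiful Soup's actual code; the names char_to_name, pattern and substitute are assumptions for this sketch.

```python
import re
from html.entities import codepoint2name

# Map each character to its entity name; chr() takes a Unicode code point.
char_to_name = {chr(cp): name for cp, name in codepoint2name.items()}

# One character class that matches any character with a named entity.
pattern = re.compile("[%s]" % "".join(re.escape(c) for c in char_to_name))


def substitute(text):
    """Replace every character that has a named entity with &name;."""
    return pattern.sub(lambda m: "&%s;" % char_to_name[m.group(0)], text)


# ascii() avoids depending on the terminal's output encoding.
print(ascii(substitute("\u2200x")))
print(pattern.search("\u2200") is not None)
```

If this sketch also fails to match \u2200 on the affected machine, the problem is probably in the Python installation or its environment rather than in Beautiful Soup.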
I am not denying your experience, but I can't duplicate or diagnose this issue, so I'm closing it.
Some things that might be useful:
* The tests that are failing here are very old. Try running the tests on earlier versions of Beautiful Soup and see if there's a point where the test failures start.
* Duplicate the problem with standalone Python code that uses EntitySubstitution.substitute_html.
* Duplicate the problem with Python 2 code on the same machine that gives you the problem with Python 3.
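A standalone reproduction along the lines of the last two suggestions might look like the sketch below. It assumes bs4 is installed and that EntitySubstitution is importable from bs4.dammit; the function name check_forall is made up for illustration. The u"" literal and single-argument print calls are there so the same file can be tried under both Python 2 and Python 3.

```python
import sys

try:
    from bs4.dammit import EntitySubstitution
except ImportError:
    # bs4 is not installed for this interpreter.
    EntitySubstitution = None


def check_forall():
    """Return the substitution result for U+2200, or None without bs4."""
    if EntitySubstitution is None:
        return None
    return EntitySubstitution.substitute_html(u"\u2200")


print(sys.version)
# repr() keeps the output ASCII-safe regardless of terminal encoding.
print(repr(check_forall()))
```

On a working setup the result should be the &forall; entity; anything else, run with each interpreter on the same machine, would narrow down where the garbling happens.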