Converting a tag to a string can exceed maximum recursion depth
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Beautiful Soup | Fix Released | Medium | Unassigned | |
Bug Description
When I do this:
>>> from bs4 import BeautifulSoup
>>> BeautifulSoup(
An exception is raised:
.....
File "/Users/
indent_
File "/Users/
formatter))
File "/Users/
indent_
File "/Users/
formatter))
File "/Users/
indent_
File "/Users/
formatter))
File "/Users/
indent_
File "/Users/
for c in self:
RuntimeError: maximum recursion depth exceeded while calling a Python object
This seems to happen because BeautifulSoup uses recursion to walk child elements. BeautifulSoup also seems to treat `<br>` as a tag that must be closed or self-closed, which is not necessarily true in HTML5. The same applies to `<img>` and unclosed `<a>` tags, and I assume to other tags as well.
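To illustrate the suspected mechanism, here is a minimal sketch that does not use Beautiful Soup at all. It assumes that many unclosed tags end up parsed as one tag nested inside the next, and that the string conversion descends one function call per nesting level. The `Node` class, `build_chain`, and both render methods are hypothetical names made up for this illustration; they are not Beautiful Soup's code.

```python
import sys


class Node:
    """Hypothetical stand-in for a parsed tag; not Beautiful Soup's Tag class."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def render_recursive(self):
        # Each nesting level adds Python stack frames, so the depth that can
        # be rendered is bounded by sys.getrecursionlimit().
        inner = "".join(child.render_recursive() for child in self.children)
        return "<%s>%s</%s>" % (self.name, inner, self.name)

    def render_iterative(self):
        # Explicit stack of (node, closing) pairs; no Python recursion.
        out = []
        stack = [(self, False)]
        while stack:
            node, closing = stack.pop()
            if closing:
                out.append("</%s>" % node.name)
                continue
            out.append("<%s>" % node.name)
            stack.append((node, True))
            # Push children in reverse so they are emitted in document order.
            for child in reversed(node.children):
                stack.append((child, False))
        return "".join(out)


def build_chain(depth):
    """Build a tree in which each node is the only child of the previous one."""
    root = Node("a")
    current = root
    for _ in range(depth - 1):
        child = Node("a")
        current.children.append(child)
        current = child
    return root


# Nest deeper than the interpreter's recursion limit allows.
deep = build_chain(sys.getrecursionlimit() * 2)

try:
    deep.render_recursive()
    recursive_failed = False
except RecursionError:  # a subclass of RuntimeError in Python 3
    recursive_failed = True

rendered = deep.render_iterative()
```

If that assumption holds, the recursive version fails on the deep chain while the explicit-stack version renders it. Raising the limit with `sys.setrecursionlimit` would only move the threshold, not remove it.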
Changed in beautifulsoup: | |
status: | New → Confirmed |
tags: | added: bug |
Changed in beautifulsoup: | |
importance: | Undecided → Medium |
summary: |
- Many unclosed tags result in RuntimeError: maximum recursion depth exceeded while calling a Python object
+ Converting a tag to a string can exceed maximum recursion depth |
Changed in beautifulsoup: | |
status: | Confirmed → In Progress |
Oh, also
> pip freeze
beautifulsoup4==4.4.0
wheel==0.24.0