Beautiful Soup fails to sanitize unquoted style tags
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Beautiful Soup | Fix Released | Undecided | Unassigned | |
Bug Description
This bug is manifesting itself in the program Sipie from http://
It attempts to parse the page here http://
In the source of this page is a tag <input type="password" name="password" style={
Traceback (most recent call last):
File "/usr/bin/
load_
File "/usr/lib/
for selectable in sipie.getStreams():
File "/usr/lib/
streams = self.tryGetStre
File "/usr/lib/
soup = BeautifulSoup(data)
File "/usr/lib/
BeautifulSt
File "/usr/lib/
self.
File "/usr/lib/
self.
File "/usr/lib/
self.goahead(0)
File "/usr/lib/
k = self.parse_
File "/usr/lib/
% (rawdata[
File "/usr/lib/
raise HTMLParseError(
HTMLParser.
I am currently running Arch Linux with beautiful-soup version 3.1.0.1, but there have been reports on the SourceForge page for Sipie that the problem is occurring on other platforms as well. Apparently 3.0.7 was able to sanitize this.
Any other info I can gather I would be glad to give, just ask.
Thank You
Kasuko
description: updated
Changed in beautifulsoup:
status: Fix Committed → Fix Released
The parsers used by BS4 handle this markup correctly.
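For anyone who wants to check this locally, here is a rough sketch using Python 3's built-in html.parser, which is one of the parsers BS4 can use. The sample tag is an assumption: the real tag from the page is cut off in the description above, so the style value here is invented purely for illustration. StartTagCollector and collect_start_tags are hypothetical helper names, not part of Beautiful Soup or the standard library.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Record every start tag and its attributes as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        self.tags.append((tag, attrs))


def collect_start_tags(markup):
    """Feed markup to the parser and return the start tags it saw."""
    parser = StartTagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


# Hypothetical markup: the unquoted style value is made up, since the
# original tag in the report is truncated after "style={".
SAMPLE = '<input type="password" name="password" style={color:red}>'

for tag, attrs in collect_start_tags(SAMPLE):
    print(tag, dict(attrs))
```

If the parser is as tolerant as expected, this should print the input tag with its attributes rather than raise a parse error. With Beautiful Soup 4 installed, the equivalent check would be BeautifulSoup(SAMPLE, "html.parser").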