Crash with html5lib 0.95 when creating BeautifulSoup object

Bug #943246 reported by armakuni on 2012-02-29
This bug affects 2 people
Affects: Beautiful Soup

Bug Description

The following code causes a crash:

from bs4 import BeautifulSoup

html = """
<!DOCTYPE html>
        <title> - </title>
"""

# Constructing the soup is what raises the TypeError.
soup = BeautifulSoup(html)

I stripped everything nonessential from the HTML, which should be valid HTML5. I am using Python 2.7.2 on Windows, Beautiful Soup 4.0.0b9, and html5lib 0.95. The same code works with html5lib 0.90.

The traceback is as follows:
  File "D:\bs\bs4\", line 168, in __init__
  File "D:\bs\bs4\", line 181, in _feed
  File "D:\bs\bs4\builder\", line 37, in feed
    doc = parser.parse(markup, encoding=self.user_specified_encoding)
  File "D:\bs\html5lib\", line 247, in parse
    parseMeta=parseMeta, useChardet=useChardet)
  File "D:\bs\html5lib\", line 115, in _parse
  File "D:\bs\html5lib\", line 209, in mainLoop
    new_token = phase.processStartTag(new_token)
  File "D:\bs\html5lib\", line 514, in processStartTag
    return self.startTagHandler[token["name"]](token)
  File "D:\bs\html5lib\", line 1151, in startTagFormatting
  File "D:\bs\html5lib\", line 1003, in addFormattingElement
    elif self.isMatchingFormattingElement(node, element):
  File "D:\bs\html5lib\", line 984, in isMatchingFormattingElement
    elif len(node1.attributes) != len(node2.attributes):
TypeError: object of type 'AttrList' has no len()
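
The last frame calls len() on the attributes of two nodes, and the error says that the AttrList object does not support it. As a minimal sketch of that failure mode (the classes below are hypothetical stand-ins, not the actual bs4 or html5lib code, and this is not necessarily how the fix was implemented), an attribute wrapper without __len__ fails the same comparison, while one that defines __len__ passes it:

```python
class AttrList(object):
    """Hypothetical stand-in for a wrapper around a tag's attributes."""

    def __init__(self, attrs):
        self.attrs = dict(attrs)

    def __iter__(self):
        # Iteration works, but len() does not: there is no __len__.
        return iter(self.attrs.items())


class SizedAttrList(AttrList):
    """Same hypothetical wrapper, with __len__ added."""

    def __len__(self):
        return len(self.attrs)


def same_attribute_count(attrs1, attrs2):
    # Mirrors the comparison shown in the last traceback frame.
    return len(attrs1) == len(attrs2)


error_message = None
try:
    same_attribute_count(AttrList({"href": "#"}), AttrList({"href": "#"}))
except TypeError as exc:
    error_message = str(exc)

sized_result = same_attribute_count(
    SizedAttrList({"href": "#"}), SizedAttrList({"href": "#"})
)
```

This would explain why the crash depends on the html5lib version: a comparison that calls len() on attributes only fails if the attribute object handed to it lacks __len__.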

Changed in beautifulsoup:
status: New → Fix Committed
Changed in beautifulsoup:
status: Fix Committed → Fix Released