[exchange] BeautifulSoup 3.1 is incompatible with xe.com

Bug #336443 reported by Stefano Rivera
Affects: Ibid
Status: Fix Released
Importance: High
Assigned to: Stefano Rivera

Bug Description

Query: currencies for South Africa
ERROR:scripts.ibid-plugin:Exception occured in Currency processor of lookup plugin
Traceback (most recent call last):
  File "../ibid-plugin-333213/scripts/ibid-plugin", line 81, in <module>
    processor.process(event)
  File "ibid/plugins/__init__.py", line 54, in process
    event = method(event, *match.groups()) or event
  File "/home/stefanor/projects/ibid/misc/ibid/plugins/lookup.py", line 195, in currency
    self._load_currencies()
  File "/home/stefanor/projects/ibid/misc/ibid/plugins/lookup.py", line 176, in _load_currencies
    for tr in soup.find('table', attrs={'class': 'tbl_main'}).table.findAll('tr'):
AttributeError: 'NoneType' object has no attribute 'table'
Response: Excuse me?

ii python-beautifulsoup 3.1.0.1-1 error-tolerant HTML parser for Python
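
The AttributeError above arises because `soup.find(...)` returns `None` when the parsed tree has no `table` with class `tbl_main`, so the chained `.table` lookup fails. A minimal defensive sketch follows. The helper name `currency_rows` and the `ScrapeError` exception are hypothetical and not taken from Ibid's code. The sketch only assumes a BeautifulSoup-style object with `find()` and `findAll()`:

```python
class ScrapeError(Exception):
    """Hypothetical error type: the page did not have the expected layout."""


def currency_rows(soup):
    """Return the <tr> rows of the table nested inside table.tbl_main.

    `soup` is any BeautifulSoup-style object offering find() and findAll().
    """
    outer = soup.find('table', attrs={'class': 'tbl_main'})
    if outer is None:
        # find() returns None when nothing matches; report that explicitly
        raise ScrapeError("no <table class='tbl_main'> in parsed page")
    inner = outer.table
    if inner is None:
        # attribute-style child lookup also yields None when there is no match
        raise ScrapeError("table.tbl_main contains no nested <table>")
    return inner.findAll('tr')
```

This does not make the page parse correctly under BeautifulSoup 3.1; it only turns the failure into a message that names the missing element.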

Stefano Rivera (stefanor) wrote :

Oh, and:
Query: exchange 100 zar to gbp
ERROR:scripts.ibid-plugin:Exception occured in Currency processor of lookup plugin
Traceback (most recent call last):
  File "../ibid-plugin-333213/scripts/ibid-plugin", line 81, in <module>
    processor.process(event)
  File "ibid/plugins/__init__.py", line 54, in process
    event = method(event, *match.groups()) or event
  File "/home/stefanor/projects/ibid/misc/ibid/plugins/lookup.py", line 190, in exchange
    event.addresponse(soup.findAll('span', attrs={'class': 'XEsmall'})[1].contents[0])
IndexError: list index out of range
Response: Excuse me?
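
The IndexError here has the same root cause: `findAll` returns fewer than two `span` elements with class `XEsmall`, so indexing `[1]` fails. A hedged sketch of a guarded lookup, where the helper name `second_xesmall_text` and its `default` parameter are assumptions made for illustration:

```python
def second_xesmall_text(soup, default=None):
    """Return the first child of the second span.XEsmall, or `default`.

    `soup` is any BeautifulSoup-style object offering findAll().
    """
    spans = soup.findAll('span', attrs={'class': 'XEsmall'})
    # Guard both the list index and the (possibly empty) contents list
    if len(spans) < 2 or not spans[1].contents:
        return default
    return spans[1].contents[0]
```

The caller can then respond with a clear "could not read the conversion result" message when the return value is `None`, rather than letting the exception reach the user.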

Changed in ibid:
importance: Undecided → High
milestone: none → 0.1
status: New → Triaged
Stefano Rivera (stefanor) wrote : Re: [exchange] BeautifulSoup version error?

The blame for this one looks to be squarely on the shoulders of BeautifulSoup >= 3.1:

Beautiful Soup is now based on HTMLParser rather than SGMLParser, which is gone in Python 3. There's some bad HTML that SGMLParser handled but HTMLParser doesn't, usually to do with attribute values that aren't closed or have brackets inside them:

  <a href="foo</a>, </a><a href="bar">baz</a>
  <a b="<a>">', '<a b="<a>"></a><a>"></a>

A later version of Beautiful Soup will allow you to plug in different parsers to make tradeoffs between speed and the ability to handle bad HTML.
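
Since the comment above attributes the breakage to BeautifulSoup >= 3.1, one way to surface the problem early is to compare the installed version string against that threshold. This is a sketch only: the function names are hypothetical, the threshold is the one stated in this report, and where the version string comes from (for example a package manager listing or the module's version attribute, if it has one) is left to the caller:

```python
import re


def version_tuple(version):
    """Convert a string such as '3.1.0.1-1' into a tuple of ints: (3, 1, 0, 1).

    Only the leading digits of each dot-separated piece are used, so
    suffixes such as 'a' or '-1' are ignored.
    """
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def reported_as_affected(version):
    """True if `version` is at or above 3.1, the threshold named in this report."""
    return version_tuple(version) >= (3, 1)
```

For instance, `reported_as_affected('3.1.0.1-1')` is true for the package version listed in the bug description.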

Changed in ibid:
status: Triaged → Confirmed
Changed in ibid:
assignee: nobody → stefanor
status: Confirmed → In Progress
Changed in ibid:
status: In Progress → Fix Released