bs4 DataLossWarning

Bug #727014 reported by Zach Williams
This bug affects 1 person
Affects: Beautiful Soup
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

I'm not sure if this is a bug or not -- maybe I'm doing something wrong? I keep receiving the following warnings the first time I use bs4 in a session.

Quick test case:

>>> import urllib2
>>> from bs4 import BeautifulSoup as bs
>>> site = 'http://www.crummy.com/'
>>> url = urllib2.urlopen(site)
>>> soup = bs(url.read())

Then I receive the following warnings:

/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/bs4/builder/_html5lib.py:60: DataLossWarning: namespaceHTMLElements not supported yet
  DataLossWarning)
/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/bs4/builder/_html5lib.py:77: DataLossWarning: BeautifulSoup cannot represent elements in any namespace
  warnings.warn("BeautifulSoup cannot represent elements in any namespace", DataLossWarning)

Should I be doing something other than passing url.read() into BeautifulSoup? Everything works, and the warnings never occur again in the same session, but they always pop up the first time I use BeautifulSoup.

Leonard Richardson (leonardr) wrote :

html5lib supports namespaced elements (like <namespace:tag>), and Beautiful Soup doesn't yet. These warnings are mostly a reminder to myself that I need to add namespace support. Unless you're actually parsing code that has namespaced tags, there won't be any real data loss.
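
A note on why the warnings show up only once: this is most likely Python's standard warning filter at work, not anything specific to Beautiful Soup. Under the "default" action, a given warning is printed only the first time it is raised from a particular source line. The sketch below illustrates that with the standard `warnings` module, assuming Python 3. The `DataLossWarning` class and `build_tree()` function here are hypothetical stand-ins defined for the example; they are not the real html5lib or bs4 code.

```python
import warnings


class DataLossWarning(UserWarning):
    """Hypothetical stand-in for the warning category named in the report."""


def build_tree():
    # Stand-in for the tree builder line that emits the warning.
    warnings.warn("BeautifulSoup cannot represent elements in any namespace",
                  DataLossWarning)


def count_warnings(action, calls=2):
    """Call build_tree() several times under one filter action and
    return how many warnings actually surface."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action, DataLossWarning)
        for _ in range(calls):
            build_tree()
    return len(caught)


shown_by_default = count_warnings("default")  # first occurrence per location only
shown_always = count_warnings("always")       # every occurrence
shown_ignored = count_warnings("ignore")      # silenced entirely
```

With two calls each, the "default" action surfaces one warning, "always" surfaces two, and "ignore" surfaces none. Anyone who wants to silence the messages in their own script could apply the "ignore" action to the real warning category before building the soup, though that also hides the warning in the cases where namespaced tags really are present.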

Changed in beautifulsoup:
status: New → Confirmed
Zach Williams (hey-zachwill) wrote :

Nice. Thanks for the quick reply, man.

Leonard Richardson (leonardr) wrote :

BS4 beta 8 supports namespaced elements and attributes in a very basic way, so I've removed the warning.

Changed in beautifulsoup:
status: Confirmed → Fix Released