bs4 DataLossWarning
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Beautiful Soup | Fix Released | Undecided | Unassigned | | |
Bug Description
I'm not sure if this is a bug or not -- maybe I'm doing something wrong? I keep receiving the following warning the first time I use bs4 in a session.
Quick test case:
>>> import urllib2
>>> from bs4 import BeautifulSoup as bs
>>> site = 'http://
>>> url = urllib2.
>>> soup = bs(url.read())
Then I receive the following warnings:
/Library/
DataLossWarning)
/Library/
warnings.
Should I be doing something different than passing the url.read() into BeautifulSoup? Everything works out -- and the warning never occurs again -- but it always pops up on the first instance I use BeautifulSoup.
html5lib supports namespaced elements (like <namespace:tag>), and Beautiful Soup doesn't yet. These warnings are mostly a reminder to myself that I need to add namespace support. Unless you're actually parsing code that has namespaced tags, there won't be any real data loss.
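If the warning is just noise for your input, one option is to silence it around the parse call with the standard `warnings` module. The following is a minimal sketch, not something Beautiful Soup provides: `parse_quietly` is a hypothetical helper name, and the import assumes the warning class is the one html5lib exposes as `html5lib.constants.DataLossWarning` (a local stand-in is defined if html5lib is not installed). That the message shows up only once is consistent with Python's default warning filter, which reports a given warning once per code location.

```python
import warnings

try:
    # Assumption: the warning in the report is html5lib's DataLossWarning.
    from html5lib.constants import DataLossWarning
except ImportError:
    class DataLossWarning(UserWarning):
        """Local stand-in so the sketch runs without html5lib installed."""


def parse_quietly(parse, markup):
    """Call parse(markup) with DataLossWarning suppressed.

    `parse` is any callable that takes markup, e.g. the BeautifulSoup
    constructor. The filter change is undone when the block exits.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=DataLossWarning)
        return parse(markup)
```

With the names from the test case above, this would be called as `soup = parse_quietly(bs, url.read())`. Only do this if you are sure the markup has no namespaced tags; otherwise the warning is telling you something real.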