Warning raised when using lxml.html.soupparser.fromstring

Bug #1752096 reported by Allan Hansen
This bug affects 1 person
Affects: lxml
Status: Triaged
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Calling lxml.html.soupparser.fromstring raises a UserWarning from BeautifulSoup, because no parser is passed explicitly:

In [1]: from lxml.html.soupparser import fromstring

In [2]: s = """<!DOCTYPE html>
   ...: <html>
   ...: <head>
   ...: <title>Hi!</title>
   ...: </head>
   ...: <body>
   ...: Foobar
   ...: </body>
   ...: </html>"""

In [3]: fromstring(s)
/Users/allan/homeInstalled/miniconda3/envs/py36/lib/python3.6/site-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 11 of the file /Users/allan/homeInstalled/miniconda3/envs/py36/bin/ipython. To get rid of this warning, change code that looks like this:

 BeautifulSoup(YOUR_MARKUP})

to this:

 BeautifulSoup(YOUR_MARKUP, "html.parser")

  markup_type=markup_type))

The same warning is raised when BeautifulSoup is called directly without naming a parser; passing the parser explicitly silences it:

In [5]: from bs4 import BeautifulSoup

In [6]: s = """<!DOCTYPE html>
   ...: <html>
   ...: <head>
   ...: <title>Hej</title>
   ...: </head>
   ...: <body>
   ...: Foobar
   ...: </body>
   ...: </html>"""

In [7]: BeautifulSoup(s)
/Users/allan/homeInstalled/miniconda3/envs/py36/lib/python3.6/site-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 11 of the file /Users/allan/homeInstalled/miniconda3/envs/py36/bin/ipython. To get rid of this warning, change code that looks like this:

 BeautifulSoup(YOUR_MARKUP})

to this:

 BeautifulSoup(YOUR_MARKUP, "lxml")

  markup_type=markup_type))
Out[7]:
<!DOCTYPE html>
<html>
<head>
<title>Hej</title>
</head>
<body>
Foobar
</body>
</html>

In [8]: BeautifulSoup(s, 'lxml') # no warning raised
Out[8]:
<!DOCTYPE html>
<html>
<head>
<title>Hej</title>
</head>
<body>
Foobar
</body>
</html>
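
To check for this warning in a script rather than by eye, the call can be wrapped with the standard library's warnings.catch_warnings. The sketch below is only an illustration: collect_user_warnings and parse_stub are hypothetical names, and parse_stub merely mimics the reported behaviour so the example runs without bs4 or lxml installed.

```python
import warnings


def collect_user_warnings(func, *args, **kwargs):
    """Call func(*args, **kwargs); return (result, UserWarning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones already shown once in this process.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    messages = [
        str(w.message) for w in caught if issubclass(w.category, UserWarning)
    ]
    return result, messages


def parse_stub(markup, features=None):
    """Hypothetical stand-in: warns, like the report, when no parser is named."""
    if features is None:
        warnings.warn("No parser was explicitly specified", UserWarning)
    return markup


_, without_parser = collect_user_warnings(parse_stub, "<p>Foobar</p>")
_, with_parser = collect_user_warnings(
    parse_stub, "<p>Foobar</p>", features="html.parser"
)
print(len(without_parser), len(with_parser))  # prints "1 0"
```

The same helper could be pointed at the real calls, for example collect_user_warnings(BeautifulSoup, s) versus collect_user_warnings(BeautifulSoup, s, 'lxml'). For soupparser, it is an assumption (not confirmed in this report) that fromstring forwards extra keyword arguments such as features="html.parser" to BeautifulSoup; if it does, that would be a workaround until the default is fixed in lxml.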

System information:

Python : sys.version_info(major=3, minor=6, micro=3, releaselevel='final', serial=0)
lxml.etree : (3, 8, 0, 0)
libxml used : (2, 9, 4)
libxml compiled : (2, 9, 4)
libxslt used : (1, 1, 29)
libxslt compiled : (1, 1, 29)

scoder (scoder) wrote :

PR welcome.

Changed in lxml:
status: New → Triaged