```
from bs4 import BeautifulSoup

markup = """
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://新2网址(www.ydsjyj.com)-时时彩平台,(www.xinyushishicai.com)-澳门赌场(www.amdc999.com)">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<title>时时彩娱乐-首页</title>
<meta content="时时彩娱乐,时时彩娱乐网址,时时彩娱乐平台,时时彩娱乐官网" name="keywords" />
<meta content="时时彩娱乐官网✅✅ 是全网最诚信,口碑最好的彩票平台!提款速度最快,赔率高达9.999 极力为您提供注册、登陆、下载、测速等服务.时时彩娱乐祝您玩的愉快开心。" name="description" />
<title>时时彩娱乐-首页</title>
</head>
<body>
<h1><a href="http://4b2s.com/">时时彩娱乐</a></h1>
</body>
</html>
"""

# Raises Exception TypeError: cannot use a bytes pattern on a string-like object
soup = BeautifulSoup(markup, features="lxml")

soup = BeautifulSoup(markup.encode("utf-8"), features="lxml", from_encoding="utf-8")
# Print empty string
print(str(soup))
```
The HTML markup above is a small portion of a larger HTML file.
System information
Uname Result: 5.0.0-23-generic #24~18.04.1-Ubuntu SMP Mon Jul 29 16:12:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Python Version: 3.6.8

Libraries
beautifulsoup4==4.7.1
lxml==4.3.3
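
In case it helps anyone trying to reproduce this, here is a minimal sketch of how the system and library details above could be gathered in one go. `collect_environment` is a hypothetical helper name, not part of bs4 or lxml, and it assumes Python 3.8+ (where `importlib.metadata` is available), so it would not run unchanged on the 3.6.8 interpreter reported here.

```python
import platform
from importlib import metadata  # assumption: Python 3.8 or newer


def collect_environment(packages=("beautifulsoup4", "lxml")):
    """Return a dict with OS details, Python version, and package versions.

    Hypothetical helper for bug reports; the default package names match
    the libraries listed in this report.
    """
    info = {
        "uname": " ".join(platform.uname()),
        "python": platform.python_version(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into a report gives the same kind of information as the sections above, without collecting each value by hand.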