Activity log for bug #1838877

Date Who What changed Old value New value Message
2019-08-04 13:01:39 Kamil Mahmood bug added bug
2019-08-04 13:02:11 Kamil Mahmood description
Old value:
```
from bs4 import BeautifulSoup

markup = """
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://新2网址(www.ydsjyj.com)-时时彩平台,(www.xinyushishicai.com)-澳门赌场(www.amdc999.com)">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
    <title>时时彩娱乐-首页</title>
    <meta content="时时彩娱乐,时时彩娱乐网址,时时彩娱乐平台,时时彩娱乐官网" name="keywords" />
    <meta content="时时彩娱乐官网✅✅ 是全网最诚信,口碑最好的彩票平台!提款速度最快,赔率高达9.999 极力为您提供注册、登陆、下载、测速等服务.时时彩娱乐祝您玩的愉快开心。" name="description" />
    <title>时时彩娱乐-首页</title>
</head>
<body>
    <h1><a href="http://4b2s.com/">时时彩娱乐</a></h1>
</body>
</html>
"""

# Raises Exception TypeError: cannot use a bytes pattern on a string-like object
soup = BeautifulSoup(markup, features="lxml")

soup = BeautifulSoup(markup.encode("utf-8"), features="lxml", from_encoding="utf-8")
# Print empty string
print(str(soup))
```

Above HTML markup is a small portion from large HTML file

System information
Uname Result: 5.0.0-23-generic #24~18.04.1-Ubuntu SMP Mon Jul 29 16:12:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Python Version: 3.6.8
Libraries
beautifulsoup4==4.7.1
lxml==4.3.3

New value:
from bs4 import BeautifulSoup

markup = """
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://新2网址(www.ydsjyj.com)-时时彩平台,(www.xinyushishicai.com)-澳门赌场(www.amdc999.com)">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
    <title>时时彩娱乐-首页</title>
    <meta content="时时彩娱乐,时时彩娱乐网址,时时彩娱乐平台,时时彩娱乐官网" name="keywords" />
    <meta content="时时彩娱乐官网✅✅ 是全网最诚信,口碑最好的彩票平台!提款速度最快,赔率高达9.999 极力为您提供注册、登陆、下载、测速等服务.时时彩娱乐祝您玩的愉快开心。" name="description" />
    <title>时时彩娱乐-首页</title>
</head>
<body>
    <h1><a href="http://4b2s.com/">时时彩娱乐</a></h1>
</body>
</html>
"""

# Raises Exception TypeError: cannot use a bytes pattern on a string-like object
soup = BeautifulSoup(markup, features="lxml")

soup = BeautifulSoup(markup.encode("utf-8"), features="lxml", from_encoding="utf-8")
# Print empty string
print(str(soup))

Above HTML markup is a small portion from large HTML file

System information
Uname Result: 5.0.0-23-generic #24~18.04.1-Ubuntu SMP Mon Jul 29 16:12:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Python Version: 3.6.8
Libraries
beautifulsoup4==4.7.1
lxml==4.3.3
2019-08-04 13:03:09 Kamil Mahmood description
Old value:
from bs4 import BeautifulSoup

markup = """
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://新2网址(www.ydsjyj.com)-时时彩平台,(www.xinyushishicai.com)-澳门赌场(www.amdc999.com)">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
    <title>时时彩娱乐-首页</title>
    <meta content="时时彩娱乐,时时彩娱乐网址,时时彩娱乐平台,时时彩娱乐官网" name="keywords" />
    <meta content="时时彩娱乐官网✅✅ 是全网最诚信,口碑最好的彩票平台!提款速度最快,赔率高达9.999 极力为您提供注册、登陆、下载、测速等服务.时时彩娱乐祝您玩的愉快开心。" name="description" />
    <title>时时彩娱乐-首页</title>
</head>
<body>
    <h1><a href="http://4b2s.com/">时时彩娱乐</a></h1>
</body>
</html>
"""

# Raises Exception TypeError: cannot use a bytes pattern on a string-like object
soup = BeautifulSoup(markup, features="lxml")

soup = BeautifulSoup(markup.encode("utf-8"), features="lxml", from_encoding="utf-8")
# Print empty string
print(str(soup))

Above HTML markup is a small portion from large HTML file

System information
Uname Result: 5.0.0-23-generic #24~18.04.1-Ubuntu SMP Mon Jul 29 16:12:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Python Version: 3.6.8
Libraries
beautifulsoup4==4.7.1
lxml==4.3.3

New value:
# Script Start
from bs4 import BeautifulSoup

markup = """
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://新2网址(www.ydsjyj.com)-时时彩平台,(www.xinyushishicai.com)-澳门赌场(www.amdc999.com)">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
    <title>时时彩娱乐-首页</title>
    <meta content="时时彩娱乐,时时彩娱乐网址,时时彩娱乐平台,时时彩娱乐官网" name="keywords" />
    <meta content="时时彩娱乐官网✅✅ 是全网最诚信,口碑最好的彩票平台!提款速度最快,赔率高达9.999 极力为您提供注册、登陆、下载、测速等服务.时时彩娱乐祝您玩的愉快开心。" name="description" />
    <title>时时彩娱乐-首页</title>
</head>
<body>
    <h1><a href="http://4b2s.com/">时时彩娱乐</a></h1>
</body>
</html>
"""

# Raises Exception TypeError: cannot use a bytes pattern on a string-like object
soup = BeautifulSoup(markup, features="lxml")

soup = BeautifulSoup(markup.encode("utf-8"), features="lxml", from_encoding="utf-8")
# Print empty string
print(str(soup))
# Script End

Above HTML markup is a small portion from large HTML file

System information
Uname Result: 5.0.0-23-generic #24~18.04.1-Ubuntu SMP Mon Jul 29 16:12:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Python Version: 3.6.8
Libraries
beautifulsoup4==4.7.1
lxml==4.3.3
2019-09-02 17:30:52 Leonard Richardson beautifulsoup: status New Fix Committed
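
Note on the TypeError quoted in the description: in Python 3, the standard `re` module raises this message when a pattern compiled from bytes is applied to a str. The sketch below reproduces only that message using the standard library. It is not the bs4 or lxml code path; where the mismatch actually occurs inside the libraries is not stated in this log. `BYTES_PATTERN` and `search_error` are hypothetical names chosen for illustration.

```python
import re

# Hypothetical pattern for illustration only; not taken from bs4 or lxml.
BYTES_PATTERN = re.compile(b"<!DOCTYPE")


def search_error(text):
    """Return the TypeError message raised by the search, or None if it succeeds."""
    try:
        BYTES_PATTERN.search(text)
    except TypeError as exc:
        return str(exc)
    return None


str_markup = "<!DOCTYPE html><html></html>"

# A bytes pattern applied to a str raises TypeError.
print(search_error(str_markup))

# The same pattern applied to the encoded bytes raises nothing.
print(search_error(str_markup.encode("utf-8")))
```

The first call reports the type mismatch and the second returns `None`, so a pattern and its input have to agree on being both str or both bytes.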