segmentation fault on parsing url
Bug #1026381 reported by nix
This bug report is a duplicate of:
Bug #984936: Segfault when target object defines doctype() and document contains invalid doctype.
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Beautiful Soup | New | Undecided | Unassigned | |
Bug Description
BeautifulSoup causes a segmentation fault in the Python interpreter when parsing the URL mentioned in the steps to reproduce below.
Python version : 2.7.1+
OS : Ubuntu (2.6 kernel)
Attached is the script that reproduces this behavior.
It happens only for a specific site, which is mentioned in the script.
In the Python shell, execute the following steps to reproduce:
import urllib2
from bs4 import BeautifulSoup
sp = BeautifulSoup(
This is a bug in lxml. You can work around it by parsing the markup with the html.parser or html5lib parser instead of lxml.
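A minimal sketch of that workaround, assuming the crash comes from Beautiful Soup picking lxml automatically when it is installed. The `SAMPLE_MARKUP` string and the `parse_without_lxml` helper are hypothetical stand-ins for illustration; the page from the report is not reproduced here.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup; substitute the page content that triggers the crash.
SAMPLE_MARKUP = (
    "<html><head><title>Example</title></head>"
    "<body><p>Hello</p></body></html>"
)


def parse_without_lxml(markup):
    """Parse markup with the standard-library parser instead of lxml."""
    # Naming the parser explicitly keeps Beautiful Soup from choosing lxml
    # on its own when lxml happens to be installed.
    return BeautifulSoup(markup, "html.parser")


soup = parse_without_lxml(SAMPLE_MARKUP)
print(soup.title.string)
```

Passing "html5lib" in place of "html.parser" should work the same way, provided the html5lib package is installed.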