segmentation fault on parsing url

Bug #1026381 reported by nix
This bug affects 1 person
Affects: Beautiful Soup
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

BeautifulSoup causes a segmentation fault in the Python interpreter when parsing the URL given in the steps to reproduce below.
Python version : 2.7.1+
OS : Ubuntu (2.6 kernel)

Attached is a script that reproduces this behavior.
The crash occurs only for the specific site named in the script.

In a Python shell, run the following steps to reproduce:
import urllib2
from bs4 import BeautifulSoup
sp = BeautifulSoup(urllib2.urlopen('http://keetsa.com/').read())

nix (nixdash) wrote :
Leonard Richardson (leonardr) wrote :

This is a bug in lxml. You can work around it by parsing the markup using html.parser or html5lib parser instead of lxml.
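
The workaround above might look something like the following sketch. When no parser is named, Beautiful Soup picks the best one installed, which is lxml if it is available; passing the parser name as the second argument avoids that. The helper name make_soup and the sample markup are hypothetical, used here in place of the fetched page so the example runs without network access.

```python
from bs4 import BeautifulSoup


def make_soup(markup, parser="html.parser"):
    """Parse markup with an explicitly named parser (hypothetical helper).

    Naming the parser keeps Beautiful Soup from selecting lxml on its own.
    "html.parser" ships with Python; "html5lib" must be installed separately.
    """
    return BeautifulSoup(markup, parser)


# Stand-in markup; in the report this would be the body returned by urlopen().
sample = "<html><head><title>Example</title></head><body><p>Hello</p></body></html>"

soup = make_soup(sample)
print(soup.title.string)
```

To try html5lib instead, call make_soup(sample, "html5lib") once that package is installed.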
