lxml.html.HTMLParser doesn't like html with frameset
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
lxml | Fix Released | Low | Stefan Seelmann | |
Bug Description
In [4]: print("%-20s: %s" % ('Python', sys.version_info))
Python : (2, 6, 4, 'final', 0)
In [5]: print("%-20s: %s" % ('lxml.etree', etree.LXML_
lxml.etree : (2, 2, 6, 0)
In [6]: print("%-20s: %s" % ('libxml used', etree.LIBXML_
libxml used : (2, 7, 7)
In [7]: print("%-20s: %s" % ('libxml compiled', etree.LIBXML_
libxml compiled : (2, 7, 6)
In [8]: print("%-20s: %s" % ('libxslt used', etree.LIBXSLT_
libxslt used : (1, 1, 26)
In [9]: print("%-20s: %s" % ('libxslt compiled', etree.LIBXSLT_
libxslt compiled : (1, 1, 26)
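For anyone who wants to report the same environment details, here is a small sketch that collects them in one go. It assumes the truncated calls above read the standard version constants on `lxml.etree` (`LXML_VERSION`, `LIBXML_VERSION`, `LIBXML_COMPILED_VERSION`, `LIBXSLT_VERSION`, `LIBXSLT_COMPILED_VERSION`); the helper name `collect_versions` is hypothetical, and the script falls back gracefully if lxml is not installed.

```python
import sys

# Assumption: these are the attributes the truncated session lines printed.
VERSION_ATTRS = [
    ("lxml.etree", "LXML_VERSION"),
    ("libxml used", "LIBXML_VERSION"),
    ("libxml compiled", "LIBXML_COMPILED_VERSION"),
    ("libxslt used", "LIBXSLT_VERSION"),
    ("libxslt compiled", "LIBXSLT_COMPILED_VERSION"),
]


def collect_versions():
    """Return (label, version) pairs for Python and, if available, lxml."""
    rows = [("Python", tuple(sys.version_info))]
    try:
        from lxml import etree
    except ImportError:
        # lxml is a third-party package; report its absence instead of failing.
        rows.append(("lxml.etree", "not installed"))
        return rows
    for label, attr in VERSION_ATTRS:
        rows.append((label, getattr(etree, attr, "unknown")))
    return rows


for label, value in collect_versions():
    print("%-20s: %s" % (label, value))
```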
import lxml.html
hparser = lxml.html.
content=
<frame src="main.php" name="srcpg"
id="srcpg"
marginheight="0">
</frameset>"""
etree_document = lxml.html.
TypeError Traceback (most recent call last)
/home/sergio/
/usr/lib/
634 other_head.
635 return doc
--> 636 if (len(body) == 1 and (not body.text or not body.text.strip())
637 and (not body[-1].tail or not body[-1]
638 # The body has just one element, so it was probably a single
TypeError: object of type 'NoneType' has no len()
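The traceback suggests the body lookup returned `None` for a document that contains only a frameset, so `len(body)` fails. Below is a minimal sketch of the kind of guard that avoids this. It uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml, the function name `unwrap_single_child` is hypothetical, and it is not necessarily what the actual fix does.

```python
import xml.etree.ElementTree as ET


def unwrap_single_child(doc):
    """Return the lone child of <body> if there is exactly one, else doc."""
    body = doc.find("body")
    # A frameset document has no <body>. Without this guard, len(body)
    # raises "object of type 'NoneType' has no len()".
    if body is None:
        return doc
    if (len(body) == 1
            and (not body.text or not body.text.strip())
            and (not body[-1].tail or not body[-1].tail.strip())):
        # The body holds a single element, so hand that element back.
        return body[0]
    return doc


# Illustrative inputs, written as well-formed XML so ElementTree can parse them.
frameset_doc = ET.fromstring(
    '<html><frameset>'
    '<frame src="main.php" name="srcpg" id="srcpg" marginheight="0"/>'
    '</frameset></html>'
)
body_doc = ET.fromstring("<html><body><p>hello</p></body></html>")

print(unwrap_single_child(frameset_doc).tag)  # html
print(unwrap_single_child(body_doc).tag)      # p
```

With the guard in place, the frameset document is returned unchanged instead of raising.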
Changed in lxml:
status: New → In Progress
Changed in lxml:
milestone: none → 3.2
Fixed here:
https://github.com/lxml/lxml/commit/7b7958e175f0218cea58d4f42644f8ee07437f2e