lxml raises a TypeError on strings without tags as input

Bug #1567526 reported by Allo on 2016-04-07
This bug affects 3 people
Affects   Status         Importance   Assigned to   Milestone
lxml      Fix Released   Low          scoder        3.9.0

Bug Description

>>> import sys
>>> from lxml import etree
>>> print("%-20s: %s" % ('Python', sys.version_info))
Python              : sys.version_info(major=2, minor=7, micro=9, releaselevel='final', serial=0)
>>> print("%-20s: %s" % ('lxml.etree', etree.LXML_VERSION))
lxml.etree          : (3, 6, 0, 0)
>>> print("%-20s: %s" % ('libxml used', etree.LIBXML_VERSION))
libxml used         : (2, 9, 1)
>>> print("%-20s: %s" % ('libxml compiled', etree.LIBXML_COMPILED_VERSION))
libxml compiled     : (2, 9, 1)
>>> print("%-20s: %s" % ('libxslt used', etree.LIBXSLT_VERSION))
libxslt used        : (1, 1, 28)
>>> print("%-20s: %s" % ('libxslt compiled', etree.LIBXSLT_COMPILED_VERSION))
libxslt compiled    : (1, 1, 28)
>>> ## -----

Bug is triggered by:
>>> from lxml.html.soupparser import fromstring
>>> fromstring("")

Or
>>> from lxml.html.soupparser import fromstring
>>> fromstring("foo")

Error:
    roots = beautiful_soup_tree.contents[first_element_idx:last_element_idx+1]
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
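
The traceback suggests that no element index was found, so last_element_idx is still None when the slice is built. Below is a minimal, stdlib-only sketch of that failure mode and one possible guard. It is not lxml's actual code: the helper names (find_element_bounds, select_roots_unguarded, select_roots_guarded, is_tag) and the tuple-based stand-in for tags are hypothetical, and only the slice expression is taken from the traceback above.

```python
def is_tag(node):
    # Hypothetical stand-in: tuples play the role of tags, strings are text nodes.
    return isinstance(node, tuple)


def find_element_bounds(contents, is_element):
    """Return (first_idx, last_idx) of the element nodes, or (None, None) if there are none."""
    first_idx = last_idx = None
    for i, node in enumerate(contents):
        if is_element(node):
            if first_idx is None:
                first_idx = i
            last_idx = i
    return first_idx, last_idx


def select_roots_unguarded(contents, is_element):
    first_idx, last_idx = find_element_bounds(contents, is_element)
    # Same shape as the slice in the traceback: fails when last_idx is None.
    return contents[first_idx:last_idx + 1]


def select_roots_guarded(contents, is_element):
    first_idx, last_idx = find_element_bounds(contents, is_element)
    if first_idx is None:
        # No tags at all (e.g. "" or "foo"): there are no root elements to slice out.
        return []
    return contents[first_idx:last_idx + 1]


if __name__ == "__main__":
    try:
        select_roots_unguarded(["foo"], is_tag)
    except TypeError as exc:
        print("unguarded:", exc)
    print("guarded:", select_roots_guarded(["foo"], is_tag))
```

With input that contains only text, the unguarded variant raises the same kind of TypeError as reported, while the guarded variant returns an empty list. How the real parser should treat tag-less input is a separate design decision that this sketch does not settle.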

Gert Burger (gertburger) wrote :

Still a problem in 3.6.6

Gert Burger (gertburger) wrote :

*3.6.4

scoder (scoder) wrote :
Changed in lxml:
assignee: nobody → scoder (scoder)
importance: Undecided → Low
status: New → Fix Committed
milestone: none → 3.9.0
scoder (scoder) on 2017-09-19
Changed in lxml:
status: Fix Committed → Fix Released