for_element doesn't work for fragments

Bug #1208731 reported by Radu Dan
This bug affects 1 person
Affects   Status      Importance   Assigned to   Milestone
lxml      Confirmed   Medium       Unassigned

Bug Description

Accessing the for_element property of a label in a tree created from an HTML fragment that does not contain a body tag raises an IndexError:

Traceback (most recent call last):
  File "B:\Workspace\LeadPages\tests\unit\test_parse_form.py", line 9, in setUp
    open("fixtures/input-aweb-form.html").read())
  File "B:\Workspace\LeadPages\leadpages\optins\html.py", line 24, in __init__
    print label.for_element
  File "C:\Program Files (x86)\Python27\lib\site-packages\lxml-3.2.3-py2.7-win32.egg\lxml\html\__init__.py", line 1470, in _for_element__get
    return self.body.get_element_by_id(id)
  File "C:\Program Files (x86)\Python27\lib\site-packages\lxml-3.2.3-py2.7-win32.egg\lxml\html\__init__.py", line 151, in body
    return self.xpath('//body|//x:body', namespaces={'x':XHTML_NAMESPACE})[0]
IndexError: list index out of range

This can be fixed by looking up the target element by id starting from the fragment root rather than from the (potentially missing) body tag.
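
Until that is changed in lxml itself, a workaround along these lines may help. This is only a sketch of the idea, not a patch against lxml: find_for_element is a hypothetical helper name, and the two-element fragment is made-up sample input. It reads the label's "for" attribute and searches the whole tree the label belongs to, so it never touches the body property.

```python
import lxml.html


def find_for_element(label):
    """Return the element a <label> refers to via its 'for' attribute, or None.

    Hypothetical helper (not part of lxml): it searches the whole tree the
    label lives in instead of going through the .body property.
    """
    target_id = label.get("for")
    if not target_id:
        return None
    # '//' starts at the root of the label's tree; $target_id is an XPath
    # variable bound through the keyword argument of the same name.
    matches = label.xpath("//*[@id=$target_id]", target_id=target_id)
    return matches[0] if matches else None


# Made-up sample input: a fragment with two top-level elements and no body tag.
fragment = lxml.html.fromstring(
    '<label for="email">Email</label><input id="email" type="text">'
)
label = fragment.xpath("//label")[0]
field = find_for_element(label)
print(field.tag, field.get("id"))
```

Returning None when nothing matches is a choice made for this sketch; raising KeyError, as get_element_by_id does when called without a default, would be just as reasonable.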

Python : sys.version_info(major=2, minor=7, micro=5, releaselevel='final', serial=0)
lxml.etree : (3, 2, 3, 0)
libxml used : (2, 9, 0)
libxml compiled : (2, 9, 0)
libxslt used : (1, 1, 28)
libxslt compiled : (1, 1, 28)

scoder (scoder) wrote :

Agreed that that would be better. Patches welcome.

Changed in lxml:
importance: Undecided → Medium
status: New → Confirmed