Activity log for bug #1789041

Date Who What changed Old value New value Message
2018-08-25 19:09:54 Tim Tisdall bug added bug
2018-08-25 19:10:27 Tim Tisdall description
Old value:
```
>>> from lxml.html import fromstring
>>> t = u"""\xef\xbb\xbf<!DOCTYPE html><html><head><title>test</title></head><body><h1>test</h1></body></html>"""
>>> tree = fromstring(t)
>>> print(tree)
<Element div at 0x7fdd6c9de940>
>>> tree.head
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/lxml/html/__init__.py", line 298, in head
    return self.xpath('//head|//x:head', namespaces={'x':XHTML_NAMESPACE})[0]
IndexError: list index out of range
>>>
```

According to Wikipedia the `EF BB BF` is the BOM for UTF-8

Python : sys.version_info(major=2, minor=7, micro=7, releaselevel='final', serial=0)
lxml.etree : (4, 2, 4, 0)
libxml used : (2, 9, 8)
libxml compiled : (2, 9, 8)
libxslt used : (1, 1, 32)
libxslt compiled : (1, 1, 32)

New value:
>>> from lxml.html import fromstring
>>> t = u"""\xef\xbb\xbf<!DOCTYPE html><html><head><title>test</title></head><body><h1>test</h1></body></html>"""
>>> tree = fromstring(t)
>>> print(tree)
<Element div at 0x7fdd6c9de940>
>>> tree.head
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/lxml/html/__init__.py", line 298, in head
    return self.xpath('//head|//x:head', namespaces={'x':XHTML_NAMESPACE})[0]
IndexError: list index out of range
>>>

According to Wikipedia the `EF BB BF` is the BOM for UTF-8

Python : sys.version_info(major=2, minor=7, micro=7, releaselevel='final', serial=0)
lxml.etree : (4, 2, 4, 0)
libxml used : (2, 9, 8)
libxml compiled : (2, 9, 8)
libxslt used : (1, 1, 32)
libxslt compiled : (1, 1, 32)
2018-08-27 18:51:28 Tim Tisdall lxml: status New Invalid
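
The log does not record why the status was changed to Invalid. One possible reading, offered here as an assumption rather than a stated reason, is that the literal `u"\xef\xbb\xbf"` in the report is a text string of three separate characters (U+00EF, U+00BB, U+00BF) and not the UTF-8 BOM, which exists only as the byte sequence `EF BB BF`. The sketch below uses only the Python standard library to show the difference; the variable names are illustrative and it does not call lxml.

```python
import codecs

# The UTF-8 BOM as a byte string: b'\xef\xbb\xbf'.
bom_bytes = codecs.BOM_UTF8

# The prefix as written in the report: a text string holding three
# separate code points (U+00EF, U+00BB, U+00BF), not a BOM.
report_prefix = u"\xef\xbb\xbf"

# Decoding the BOM bytes as UTF-8 gives a single code point, U+FEFF.
decoded_bom = bom_bytes.decode("utf-8")

# Hypothetical input: the report's markup as bytes with a real BOM in front.
html_bytes = bom_bytes + (
    b"<!DOCTYPE html><html><head><title>test</title></head>"
    b"<body><h1>test</h1></body></html>"
)

# The "utf-8-sig" codec strips a leading BOM while decoding.
html_text = html_bytes.decode("utf-8-sig")

print(len(report_prefix), len(decoded_bom))
print(html_text.startswith(u"<!DOCTYPE html>"))
```

Under this reading, the three stray characters ahead of the doctype come from escaping the BOM bytes inside a unicode literal; a byte string, or text decoded with `utf-8-sig`, would not carry them.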