Please support cssselect() in lxml.etree, not just lxml.html
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
lxml | Fix Released | Low | scoder |
Bug Description
Up to now, cssselect() works only on HTML elements.
It would be nice to have it for XML, too.
Can someone explain why the first call to `root.cssselect()` works, while the second fails?
from lxml.html import fromstring
from lxml import etree
html='<html><a href="http://
root = fromstring(html)
print 'via fromstring', repr(root) # via fromstring <Element html at 0x...>
print root.cssselect("a")
root2 = etree.HTML(html)
print 'via etree.HTML()', repr(root2) # via etree.HTML() <Element html at 0x...>
root2.
I get:
Traceback (most recent call last):
File "/home/
AttributeError: 'lxml.etree.
Version: `lxml==3.4.4`
##########
Python : sys.version_
lxml.etree : (3, 3, 3, 0)
libxml used : (2, 9, 1)
libxml compiled : (2, 9, 1)
libxslt used : (1, 1, 28)
libxslt compiled : (1, 1, 28)
summary:
- Please support cssselect() for xml, not just html
+ Please support cssselect() in lxml.etree, not just lxml.html
Since cssselect is an external dependency, I'd like to avoid adding it to more places. I consider the cssselect() method in lxml.html unnecessary myself and think that people should use the lxml.cssselect module directly. It's also much more efficient to use pre-compiled CSS selectors.