Please support cssselect() in lxml.etree, not just lxml.html

Bug #1490451 reported by Thomas Güttler
This bug affects 1 person
Affects: lxml
Status: Fix Released
Importance: Low
Assigned to: scoder

Bug Description

So far, the cssselect() method is only available on lxml.html elements.

It would be nice to have it on lxml.etree elements (e.g. for XML), too.

Text from http://stackoverflow.com/questions/32264533/lxml-cssselect-attributeerror-lxml-etree-element-object-has-no-attribute

Can someone explain why the first call to `root.cssselect()` works, while the second fails?

    from lxml.html import fromstring
    from lxml import etree

    html = '<html><a href="http://example.com">example</a></html>'
    root = fromstring(html)
    print 'via fromstring', repr(root) # via fromstring <Element html at 0x...>
    print root.cssselect("a")

    root2 = etree.HTML(html)
    print 'via etree.HTML()', repr(root2) # via etree.HTML() <Element html at 0x...>
    root2.cssselect("a") # --> Exception

I get:

    Traceback (most recent call last):
      File "/home/foo_eins_d/src/foo.py", line 11, in <module>
        root2.cssselect("a")
    AttributeError: 'lxml.etree._Element' object has no attribute 'cssselect'

Version: `lxml==3.4.4`

##########
Python : sys.version_info(major=2, minor=7, micro=6, releaselevel='final', serial=0)
lxml.etree : (3, 3, 3, 0)
libxml used : (2, 9, 1)
libxml compiled : (2, 9, 1)
libxslt used : (1, 1, 28)
libxslt compiled : (1, 1, 28)

scoder (scoder)
summary: - Please support cssselect() for xml, not just html
+ Please support cssselect() in lxml.etree, not just lxml.html

scoder (scoder) wrote :

Since cssselect is an external dependency, I'd like to avoid adding it to more places. I consider the cssselect() method in lxml.html unnecessary myself and think that people should use the lxml.cssselect module directly. It's also much more efficient to use pre-compiled CSS selectors.

Changed in lxml:
importance: Undecided → Low
status: New → Opinion

Thomas Güttler (hv-tbz-pariv) wrote :

Most developers are more familiar with CSS/jQuery selectors than with XPath.

For example, I was not aware that this is a valid XPath selector:

  //mytag[@myattr="myval"]

Are there newcomer-friendly docs for XPath? I don't want to read detailed specs;
I want to read some simple cookbook-style examples.

scoder (scoder) wrote :

Looks like you misunderstood what I said. I actually like cssselect a lot and am well aware of its merits.

Ok, I would accept a pull request that adds a cssselect() method to the lxml.etree._Element class to make it more visible.

I would also accept a pull request that improves the documentation to help people find it more easily by presenting lxml.cssselect.CSSSelector() more prominently in comparison to XPath(). Note: not the method, just the selector. Calling the method is too inefficient to encourage its use, but I'm sure some people will prefer simplicity over efficiency.

Changed in lxml:
status: Opinion → Confirmed

Thomas Güttler (hv-tbz-pariv) wrote :

Hi scoder,

I am new to lxml, which is why I have not fully understood you so far.

Could you please provide a source code example of your preferred way to use a CSS selector?

Thank you

scoder (scoder) wrote :
Changed in lxml:
assignee: nobody → scoder (scoder)
milestone: none → 3.5
status: Confirmed → Fix Committed

Thomas Güttler (hv-tbz-pariv) wrote :

Wow, great support.
Thank you very much!

scoder (scoder) wrote :

Released in lxml 3.5.0.

Changed in lxml:
status: Fix Committed → Fix Released

Thomas Güttler (hv-tbz-pariv) wrote :

Thank you :-)
