xpath selectors with position() return incorrect data in python

Bug #1795225 reported by Giacomo Debidda
Affects: lxml
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

I am using lxml with XPath selectors to get the content of a table on this page:

https://opensky-network.org/apidoc/rest.html#response

I need to get the fields of the second table.

If I open Chrome DevTools, I get the expected results with any of these XPath selectors:

1. $x('//div[@id="all-state-vectors"]/div[@id="response"]/div[position()=2]/table/tbody/tr')
2. $x('//div[@id="all-state-vectors"]/div[@id="response"]/div[2]/table/tbody/tr')
3. $x('//div[@id="all-state-vectors"]/div[@id="response"]/div[position()=last()]/table/tbody/tr')
4. $x('//div[@id="all-state-vectors"]//table/thead/tr/th[contains(text(), "Index")]/parent::*/parent::*/parent::*/tbody/tr')

However, in a Python script that uses lxml.html, only the 4th selector returns the expected results.

### Minimal reproducible example

```python
import requests
import lxml.html

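PAGE = 'https://opensky-network.org/apidoc/rest.html'
# Selector 4 from the list above, the only one that returns the expected rows with lxml.
XPATH_SELECTOR = '//div[@id="all-state-vectors"]//table/thead/tr/th[contains(text(), "Index")]/parent::*/parent::*/parent::*/tbody/tr'

# Fetch the page and parse the raw HTML response with lxml's HTML parser.
res = requests.get(PAGE)
root = lxml.html.fromstring(res.content)
fields = root.xpath(XPATH_SELECTOR)
# The second table should yield 17 rows.
assert len(fields) == 17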
```
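
One way to narrow this down is to check which element children lxml actually sees under the parent `div`, since `div[2]` and `div[position()=last()]` count only `div` siblings in the parsed tree. Chrome's `$x` runs against the live DOM, which is built by the browser's own HTML parser and may have been changed by scripts, so it can differ from the tree libxml2 builds from the raw response. The sketch below is only an illustration under that assumption: `SAMPLE` is hypothetical stand-in markup (not the real OpenSky page), and `element_children` and `count_matches` are helper names invented here.

```python
import lxml.html

# Hypothetical stand-in markup, NOT the real OpenSky page.
SAMPLE = """<html><body>
<div id="response">
  <p>intro</p>
  <div class="first"><table><tbody><tr><td>a</td></tr></tbody></table></div>
  <div class="second"><table><tbody>
    <tr><td>b</td></tr>
    <tr><td>c</td></tr>
  </tbody></table></div>
</div>
</body></html>"""

# The three positional forms from the report, adapted to the sample markup.
SELECTORS = [
    '//div[@id="response"]/div[position()=2]/table/tbody/tr',
    '//div[@id="response"]/div[2]/table/tbody/tr',
    '//div[@id="response"]/div[position()=last()]/table/tbody/tr',
]


def element_children(root, parent_xpath):
    """Return (tag, class) for each element child of the first matching parent."""
    parents = root.xpath(parent_xpath)
    if not parents:
        return []
    return [
        (child.tag, child.get("class"))
        for child in parents[0]
        if isinstance(child.tag, str)  # skip comments and processing instructions
    ]


def count_matches(root, selectors):
    """Map each selector to the number of nodes it matches."""
    return {selector: len(root.xpath(selector)) for selector in selectors}


root = lxml.html.fromstring(SAMPLE)
children = element_children(root, '//div[@id="response"]')
counts = count_matches(root, SELECTORS)

print(children)
for selector, count in counts.items():
    print(count, selector)
```

Running the same two helpers against the tree parsed from `res.content` (with `//div[@id="all-state-vectors"]/div[@id="response"]` as the parent) would show whether the second `div` child in lxml's tree is the one that Chrome reports.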

### My environment

```text
Python : sys.version_info(major=3, minor=6, micro=5, releaselevel='final', serial=0)
lxml.etree : (4, 2, 5, 0)
libxml used : (2, 9, 8)
libxml compiled : (2, 9, 8)
libxslt used : (1, 1, 32)
libxslt compiled : (1, 1, 32)
```
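
For anyone who wants to compare their own setup, a listing like the one above can be produced from the version constants that `lxml.etree` exposes. This is a minimal sketch, not necessarily the script used for this report; `collect_versions` is a name chosen here for illustration.

```python
import sys

from lxml import etree


def collect_versions():
    """Gather the Python, lxml, libxml2 and libxslt versions in use."""
    return {
        "Python": sys.version_info,
        "lxml.etree": etree.LXML_VERSION,
        "libxml used": etree.LIBXML_VERSION,
        "libxml compiled": etree.LIBXML_COMPILED_VERSION,
        "libxslt used": etree.LIBXSLT_VERSION,
        "libxslt compiled": etree.LIBXSLT_COMPILED_VERSION,
    }


versions = collect_versions()
for name, value in versions.items():
    print(f"{name} : {value}")
```

The "used" values are the library versions loaded at runtime, while the "compiled" values are the versions lxml was built against.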

scoder (scoder) wrote :

XPath is implemented in libxml2, not in lxml.

Changed in lxml:
status: New → Invalid