find_all() and select() behave differently on markup containing duplicate elements
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Beautiful Soup | Fix Released | Low | Unassigned |
Bug Description
find_all() returns every matching element, including duplicates, but select() does not.
I'm using Python 3.6.2 with BeautifulSoup 4.6.2 and the lxml parser.
The test snippet:
import os
from bs4 import BeautifulSoup
markup = '<span class="
soup = BeautifulSoup(
print('find_all:')
for element in soup.find_
    print(element)  # Prints 4 elements, including the duplicates
print('Length of "find_all": ' + str(len(
print('select:')
for element in soup.select(
    print(element)  # Prints only 3 elements
print('Length of "select": ' + str(len(
os.system('pause')
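The snippet above lost its markup and selector arguments when the page was captured, so it cannot be run as shown. For anyone who wants to check their own install, here is a minimal sketch of the same comparison. The markup and the `span` selector are stand-ins I made up, not the reporter's originals, and it uses the built-in `html.parser` rather than lxml to avoid the extra dependency. On a release that includes the fix, the two counts should match. I have not verified that this particular markup triggers the problem on 4.6.2.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (the reporter's original is truncated): four spans,
# two of which are character-for-character identical.
markup = (
    "<div>"
    '<span class="a">one</span>'
    '<span class="a">two</span>'
    '<span class="a">one</span>'
    '<span class="b">three</span>'
    "</div>"
)

# Assumption: html.parser instead of the lxml parser used in the report.
soup = BeautifulSoup(markup, "html.parser")

found = soup.find_all("span")   # tag-name search
selected = soup.select("span")  # CSS selector search

print("find_all:", len(found))
print("select:  ", len(selected))

# Tag equality (==) compares name, attributes and contents, so two identical
# spans compare equal. Use identity (is) to confirm both calls return the
# very same Tag objects in the same order.
same_objects = len(found) == len(selected) and all(
    a is b for a, b in zip(found, selected)
)
print("same objects:", same_objects)
```

If `select()` prints a smaller count than `find_all()` here, the installed version is probably still affected.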
tags: added: error
tags: added: css, removed: error
Changed in beautifulsoup: importance: Undecided → Low
Changed in beautifulsoup: status: Fix Committed → Fix Released
Thanks for reporting this bug and providing an easy way to duplicate it. The CSS selector system is contributed code and for my own sanity I only add to it when a patch and test are contributed. I'm going to leave this issue open in a 'confirmed' state and if someone provides a patch or pull request I'll merge it.