Matching an attribute against [] does not work as expected

Bug #2045469 reported by Chris Papademetrious
This bug affects 1 person
Affects: Beautiful Soup
Status: Fix Committed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Let's say I have the following HTML document, where only <c> has an ID attribute:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<a>text<b/>text<c id="c"/></a>', 'lxml')

If I match @id against None, it returns all tags with the attribute *undefined*:

>>> soup.find_all(id = None)
[<html><body><a>text<b></b>text<c id="c"></c></a></body></html>, <body><a>text<b></b>text<c id="c"></c></a></body>, <a>text<b></b>text<c id="c"></c></a>, <b></b>]

If I match @id against [], again it returns all tags with the attribute *undefined*:

>>> soup.find_all(id = [])
[<html><body><a>text<b></b>text<c id="c"></c></a></body></html>, <body><a>text<b></b>text<c id="c"></c></a></body>, <a>text<b></b>text<c id="c"></c></a>, <b></b>]

But I expected it to return all tags whose @id is *defined* and matches a value in the list. For an empty list, that is no tags.

Attribute matches against lists and Booleans have different conventions. We should interpret an empty list using the list convention, not the Boolean convention.
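
To make the two conventions concrete, here is a minimal sketch of the expected behaviour in plain Python. It does not use or reproduce Beautiful Soup's code: `attribute_matches`, `match_ids`, and the `TAG_IDS` table are hypothetical names made up for illustration, and the treatment of `True` and `None` reflects the conventions as described in this report.

```python
def attribute_matches(actual, criterion):
    """Hypothetical matcher illustrating the two conventions.

    actual:    the attribute value on a tag, or None if the attribute is absent.
    criterion: None, True, a string, or a list of strings.
    """
    # List convention: the attribute must be defined AND equal one of the
    # listed values. An empty list therefore matches nothing.
    if isinstance(criterion, (list, tuple, set)):
        return actual is not None and actual in criterion
    # Boolean-style convention: True means "attribute is defined",
    # None means "attribute is undefined".
    if criterion is True:
        return actual is not None
    if criterion is None:
        return actual is None
    # Plain string: exact match.
    return actual == criterion


# The @id of each tag in the example document after lxml wraps it in
# <html> and <body>; only <c> defines an id.
TAG_IDS = {"html": None, "body": None, "a": None, "b": None, "c": "c"}


def match_ids(criterion):
    """Return the names of the tags whose @id satisfies the criterion."""
    return [name for name, value in TAG_IDS.items()
            if attribute_matches(value, criterion)]


print(match_ids(None))   # tags with @id undefined
print(match_ids([]))     # empty list: no tags
print(match_ids(["c"]))  # tags whose @id is in the list
```

The point of the sketch is the order of the checks: the list test comes first, so an empty list is never handled as a falsy value equivalent to None.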

Leonard Richardson (leonardr) wrote :

This is already in the 4.13 branch. I fixed the bug without knowing about it, as a side effect of refactoring SoupStrainer. I added a unit test in revision ce2ba56.

Changed in beautifulsoup:
status: New → Fix Committed
Chris Papademetrious (chrispitude) wrote :

Thanks for this fix! I tested it in the 4.13 branch and it works great.
