_findAll() ignores "text" keyword argument
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Beautiful Soup | Fix Released | Undecided | Unassigned |
Bug Description
Version: 3.0.8
_findAll has two special cases that avoid creating a SoupStrainer: when tag=True and findAll(
To fix:
Line 340 of BeautifulSoup.py:
Change
elif not limit and name is True and not attrs and not kwargs:
To
elif not limit and name is True and not attrs and not kwargs \
and not text:
Similarly, on line 345:
Change
elif not limit and isinstance(name, basestring) and not attrs \
and not kwargs:
To
elif not limit and isinstance(name, basestring) and not attrs \
and not kwargs and not text:
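The effect of the proposed change can be sketched as a standalone function. This is only a model of the two guard conditions quoted above, not the actual BeautifulSoup.py code: the function name uses_fast_path and the patched flag are hypothetical, and str stands in for Python 2's basestring so the sketch runs on Python 3.

```python
def uses_fast_path(name, attrs=None, text=None, limit=None, patched=True, **kwargs):
    """Model of the two shortcut guards: True means no SoupStrainer is built.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    # Conditions shared by both elif branches in the report.
    no_filters = not limit and not attrs and not kwargs
    if patched:
        # The proposed fix: a text argument must also disable the shortcut.
        no_filters = no_filters and not text
    # str replaces Python 2's basestring in this Python 3 sketch.
    return no_filters and (name is True or isinstance(name, str))


print(uses_fast_path("b", text="foo", patched=False))  # shortcut taken, text ignored
print(uses_fast_path("b", text="foo"))                 # shortcut skipped after the fix
print(uses_fast_path(True))                            # plain findAll(True) still shortcuts
```

Without the extra check, a call that passes text still takes the shortcut, which is how the text argument ends up ignored.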
Good catch! This is a case where the user has made a mistake by using findAll('tag-name', text=...) or findAll(True, text=...). According to the API, text=... should override the tag searching.
I put a fix in a private branch. I also combined some of the checks in the two elif tests.
There was one possible issue with using 'not text': it is also true for an empty string (""), so a call that passes text="" would be treated as if no text argument had been given. That is fixed by using 'text is None' instead.
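The difference between the two tests is ordinary Python truthiness and can be checked in isolation. The helper names below are hypothetical and exist only for this illustration.

```python
def no_text_by_truthiness(text):
    # 'not text' treats None and "" alike: both count as "no text given".
    return not text


def no_text_by_identity(text):
    # 'text is None' only matches when the argument was left at its default.
    return text is None


for value in (None, "", "foo"):
    print(repr(value), no_text_by_truthiness(value), no_text_by_identity(value))
```

Only the empty string gets a different answer from the two checks, which is the case the 'text is None' form is meant to handle.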