Change find_all(text='something') to match .text instead of .string
Bug #1366856 reported by
Martin Häcker
This bug affects 5 people
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Beautiful Soup | Won't Fix | Undecided | Unassigned |
Bug Description
Right now it is very confusing that find_all matches against .string rather than .text when given a text argument. As a result, you cannot search for tags whose text sits inside a sub-tag, because .string just returns None in that case.
As an added benefit, this change would make BS4 behave more like jQuery, which would make it easier to learn.
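A minimal sketch of the behavior described above, together with one possible workaround. The sample markup and the helper name `has_text` are made up for illustration, and the snippet assumes a Beautiful Soup release recent enough to accept `string=` (older releases spell the same argument `text=`).

```python
from bs4 import BeautifulSoup

# Illustrative markup: the second <p> keeps part of its text inside a sub-tag.
HTML = "<div><p>hello</p><p>hello <b>world</b></p></div>"
soup = BeautifulSoup(HTML, "html.parser")

nested = soup.find_all("p")[1]
string_value = nested.string    # None, because this <p> has more than one child
text_value = nested.get_text()  # concatenated text of all descendants

# The string filter is compared against .string, so the nested <p> is skipped.
by_string = soup.find_all("p", string="hello world")


def has_text(tag, wanted):
    """Hypothetical helper: compare a tag's full text against `wanted`."""
    return tag.get_text() == wanted


# Workaround: pass a function to find_all and do the text comparison yourself.
by_text = soup.find_all(lambda tag: tag.name == "p" and has_text(tag, "hello world"))

print(string_value, repr(text_value), len(by_string), len(by_text))
```

Passing a function leaves the existing matching rules untouched, so it sidesteps the compatibility concern, at the cost of being more verbose than a keyword argument.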
Changed in beautifulsoup: status: Won't Fix → Confirmed
Changed in beautifulsoup: status: Fix Committed → Fix Released
Changed in beautifulsoup: status: Fix Released → Won't Fix
+1
The fix is simple from a code point of view. 'found.string' has to be changed to 'found.text' on this line: http://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/view/head:/bs4/element.py#L1520.
Ensuring that it doesn't break compatibility might be an issue since it's in the SoupStrainer which is used for basically everything to do with searching.
Still, it seems intuitive that 'find(text="blah")' should match the text property, not the string, especially since you could just use 'find(string="blah")' if you needed to match the string.