Add a way to make an object's generators generate the object itself at the start of the iteration
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Beautiful Soup | Triaged | Wishlist | Unassigned |
Bug Description
Split out from bug #2052936.
There's a request to make it possible for a PageElement's search methods to consider the PageElement itself as well as its siblings, descendants, etc.
There are three possible ways of doing this:
1. Add a boolean argument called something like 'consider_self' to all of the find* methods. Since the actual implementation happens pretty far down in the call stack, this probably means doing one of the other implementations as well.
find_parents() and find_parent() take an argument called 'include_self' which could be a model for this. I really don't like the name 'include_self', though, because it implies the element in question will _always_ be included in the results, rather than merely considered for inclusion.
2. Add new generator properties to PageElement. We already have two such properties: self_and_
3. Add a method PageElement.
I really don't want to go with #1 because this isn't a very common request and it would mean changing a whole lot of method signatures.
#3 doesn't look as good as I initially thought, because the method is _too_ flexible. You could pass in a generator that has nothing to do with Beautiful Soup.
Since there are already two self_and_* generators I'm okay with going route #2 and adding some more:
self_and_parents (already exists)
self_and_
self_and_
self_and_
self_and_
self_and_
This would be implemented with a private version of the self_and() method from strategy #3. At some future point we can look at usage of these methods and reevaluate strategy #1.
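As a rough sketch of how route #2 could sit on top of a private helper, here is a self-contained illustration. The `Node` class and the `_self_and` name are hypothetical stand-ins for illustration only. This is not Beautiful Soup's actual code, and the real implementation may differ.

```python
class Node:
    """Hypothetical stand-in for PageElement, used only for this sketch."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    @property
    def parents(self):
        # Walk upward, yielding each ancestor in turn.
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def _self_and(self, other):
        # Private helper: yield this node first, then everything
        # produced by the other generator.
        yield self
        yield from other

    @property
    def self_and_parents(self):
        # Public property built on the private helper.
        return self._self_and(self.parents)


html = Node("html")
body = Node("body", parent=html)
p = Node("p", parent=body)
names = [n.name for n in p.self_and_parents]
```

Keeping the helper private means callers only see the fixed set of self_and_* properties, which sidesteps the concern about passing in a generator that has nothing to do with the tree.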
Now that I've noticed it, I also want to deprecate 'include_self' and rename it 'consider_self'. Since that argument has been there for many years I won't remove support for it entirely.
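One way the rename could keep backward compatibility is to accept the old keyword as a deprecated alias. The sketch below uses a hypothetical function, `find_in_lineage`, that searches a plain list. It is not the real find_parents() signature. It also shows why 'consider_self' is the more accurate name: the element is only tested against the filter, not guaranteed to appear in the results.

```python
import warnings


def find_in_lineage(lineage, match, consider_self=False, include_self=None):
    """Hypothetical search over [self, parent, grandparent, ...].

    `lineage[0]` plays the role of the element itself.
    """
    if include_self is not None:
        # Old spelling still works, but warns and maps onto the new one.
        warnings.warn(
            "'include_self' is deprecated; use 'consider_self' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        consider_self = include_self
    candidates = lineage if consider_self else lineage[1:]
    # The element itself is only *considered*: it must still match.
    return [item for item in candidates if match(item)]


lineage = ["p", "div", "body"]
only_non_p = find_in_lineage(lineage, lambda n: n != "p", consider_self=True)
everything = find_in_lineage(lineage, lambda n: True, consider_self=True)
```

Here `only_non_p` leaves out "p" even though consider_self is true, because "p" fails the filter.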
Changed in beautifulsoup:
status: New → Triaged
importance: Undecided → Wishlist
It seems like you're exploring different approaches to enhance PageElement's search methods to include the element itself along with its siblings and descendants. Option #2, adding new generator properties like self_and_next_elements and self_and_next_siblings, alongside the existing self_and_descendants and self_and_parents, appears to be a pragmatic choice. This method could leverage the PageElement.filter() method effectively for comprehensive searches. Deprecating 'include_self' in favor of 'consider_self' reflects a thoughtful evolution in method naming, maintaining backward compatibility while aligning with your refined approach.