Activity log for bug #2044284

Date Who What changed Old value New value Message
2023-11-22 17:36:59 Chris Papademetrious bug added bug
2023-11-22 17:37:19 Chris Papademetrious description

Old value:

This is a wishlist item.

Beautiful Soup has a wrap() function that wraps a single element in a tag. Super!

There are various Beautiful Soup requests for wrapping all elements contained *inside* a parent element (wrapping the inside instead of the outside):

https://stackoverflow.com/questions/20789798/how-to-use-beautifulsoup-to-wrap-body-contents-with-div-container
https://stackoverflow.com/questions/22632355/wrap-the-contents-of-a-tag-with-beautifulsoup
https://stackoverflow.com/questions/26448605/how-to-wrap-multiple-tags-under-a-new-tag-in-beautifulsoup

There are even more requests to wrap sequences of elements that match given criteria in a parent element:

https://stackoverflow.com/questions/17605801/wrap-all-next-elements-in-beautifulsoup
https://stackoverflow.com/questions/73902333/wrap-groupings-of-tags-with-python-beautifulsoup
https://stackoverflow.com/questions/73913938/how-to-wrap-a-new-tag-around-multiple-tags-with-beautifulsoup
https://stackoverflow.com/questions/32274222/wrap-multiple-tags-with-beautifulsoup
https://stackoverflow.com/questions/59033884/wrap-multiple-list-items-in-a-new-tag-ul-ol-using-beautiful-soup
https://stackoverflow.com/questions/45009059/how-to-wrap-with-adjacent-tag-with-beautiful-soup

Most of the latter requests are about rebuilding hierarchical structure from flat HTML content using heading (<h1> through <h6>) elements:

####
html_doc = """
<body>
<h1>ABC Topic</h1>
<p/>
<h2>AB Subtopic</h2>
<p/>
<h3>AB Subsubtopic</h3>
<p/>
<h2>C Subtopic</h2>
<p/>
<h1>XYZ Topic</h1>
<p/>
<h2>XY Subtopic</h2>
<p/>
<h2>Z Subtopic</h2>
</body>
"""
####

It would be great if Beautiful Soup had some kind of clever wrap_children() method to wrap sequences of elements meeting some kind of criteria. To wrap all contents, the child element criteria would simply be True. For more complex cases, the criteria could be a tag list or a function -- the usual Soupy ways.

With this, you could build structured HTML from flat HTML using a simple bottom-up loop:

####
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')

# h6 sections start at h6, stop at not(h1-h6)
# h5 sections start at h5, stop at not(h1-h5)
# h4 sections start at h4, stop at not(h1-h4)
# ...etc...
for h in reversed(range(1, 6+1)):
    soup.body.wrap_children(***MAGIC***, 'article')

print(soup.prettify())
####

In addition to any user-specified arguments, the function would also somehow need (1) the current candidate object and (2) the current set of accumulated objects (if any), so that the proper decisions could be made. These could be passed to the function using a documented **kwargs convention ("candidate", "accumulated").

New value:

This is a wishlist item.

Beautiful Soup has a wrap() method that wraps a single element in a tag. Super!

There are various Beautiful Soup requests for wrapping all elements contained *inside* a parent element (wrapping the inside instead of the outside):

    https://stackoverflow.com/questions/20789798/how-to-use-beautifulsoup-to-wrap-body-contents-with-div-container
    https://stackoverflow.com/questions/22632355/wrap-the-contents-of-a-tag-with-beautifulsoup
    https://stackoverflow.com/questions/26448605/how-to-wrap-multiple-tags-under-a-new-tag-in-beautifulsoup

There are even more requests to wrap sequences of elements that match given criteria in a parent element:

    https://stackoverflow.com/questions/17605801/wrap-all-next-elements-in-beautifulsoup
    https://stackoverflow.com/questions/73902333/wrap-groupings-of-tags-with-python-beautifulsoup
    https://stackoverflow.com/questions/73913938/how-to-wrap-a-new-tag-around-multiple-tags-with-beautifulsoup
    https://stackoverflow.com/questions/32274222/wrap-multiple-tags-with-beautifulsoup
    https://stackoverflow.com/questions/59033884/wrap-multiple-list-items-in-a-new-tag-ul-ol-using-beautiful-soup
    https://stackoverflow.com/questions/45009059/how-to-wrap-with-adjacent-tag-with-beautiful-soup

Most of the latter requests are about rebuilding hierarchical structure from flat HTML content using heading (<h1> through <h6>) elements:

####
html_doc = """
<body>
  <h1>ABC Topic</h1>
  <p/>
  <h2>AB Subtopic</h2>
  <p/>
  <h3>AB Subsubtopic</h3>
  <p/>
  <h2>C Subtopic</h2>
  <p/>
  <h1>XYZ Topic</h1>
  <p/>
  <h2>XY Subtopic</h2>
  <p/>
  <h2>Z Subtopic</h2>
</body>
"""
####

It would be great if Beautiful Soup had some kind of clever wrap_children() method to wrap sequences of elements meeting some kind of criteria. To wrap all contents, the child element criteria would simply be True. For more complex cases, the criteria could be a tag list or a function -- the usual Soupy ways.

With this, you could build structured HTML from flat HTML using a simple bottom-up loop:

####
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')

# h6 sections start at h6, stop at not(h1-h6)
# h5 sections start at h5, stop at not(h1-h5)
# h4 sections start at h4, stop at not(h1-h4)
# ...etc...
for h in reversed(range(1, 6+1)):
    soup.body.wrap_children(***MAGIC***, 'article')

print(soup.prettify())
####

In addition to any user-specified arguments, the function would also somehow need (1) the current candidate object and (2) the current set of accumulated objects (if any), so that the proper decisions could be made. These could be passed to the function using a documented **kwargs convention ("candidate", "accumulated").
2024-02-13 17:18:24 Leonard Richardson beautifulsoup: importance Undecided Wishlist
2024-05-27 21:22:14 Leonard Richardson beautifulsoup: status New Triaged
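
Until something like the requested wrap_children() exists, the bottom-up loop from the description can be approximated with calls Beautiful Soup already provides (new_tag(), insert_before(), append()). The following is only a rough sketch under stated assumptions: wrap_heading_sections() is a hypothetical helper written for this illustration, not part of Beautiful Soup, and it hard-codes the heading criteria rather than accepting the general "candidate"/"accumulated" callback the report proposes. It also assumes the headings are direct children of the parent element.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

html_doc = """
<body>
<h1>ABC Topic</h1>
<p>a</p>
<h2>AB Subtopic</h2>
<p>b</p>
<h3>AB Subsubtopic</h3>
<p>c</p>
<h2>C Subtopic</h2>
<p>d</p>
<h1>XYZ Topic</h1>
<p>e</p>
<h2>XY Subtopic</h2>
<p>f</p>
<h2>Z Subtopic</h2>
</body>
"""


def wrap_heading_sections(soup, parent, level, wrapper_name="article"):
    """Hypothetical helper (not a Beautiful Soup API).

    Wraps each run of direct children of `parent` that starts at an
    <h{level}> heading and ends before the next heading of the same or
    a higher rank in a new <wrapper_name> element.
    """
    start_name = f"h{level}"
    stop_names = {f"h{n}" for n in range(1, level)}
    current = None  # wrapper currently accumulating children, if any

    # Iterate over a snapshot because the loop moves children around.
    for child in list(parent.contents):
        name = child.name if isinstance(child, Tag) else None
        if name == start_name:
            # A heading of this level opens a new section.
            current = soup.new_tag(wrapper_name)
            child.insert_before(current)
            current.append(child)  # append() moves the heading into the wrapper
        elif name in stop_names:
            # A higher-ranked heading closes the open section.
            current = None
        elif current is not None:
            # Everything else (text, paragraphs, deeper sections) joins it.
            current.append(child)


soup = BeautifulSoup(html_doc, "html.parser")

# Bottom-up: wrap the deepest heading level first, so that finished
# subsections are absorbed into their parent sections on later passes.
for level in reversed(range(1, 6 + 1)):
    wrap_heading_sections(soup, soup.body, level)

print(soup.prettify())
```

Processing levels from h6 down to h1 is what makes the nesting work: by the time h2 sections are built, every h3 section is already a single <article> element and is swept up like any other sibling.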