Beautiful Soup

Comment 1 for bug 1930164

Leonard Richardson (leonardr) wrote on 2021-05-30:

This could be good news. BS4 has been passing strip_cdata=False into lxml for a very long time (see https://bugs.launchpad.net/beautifulsoup/+bug/1275085), but CDATA blocks were always stripped anyway.

However, the way the data is passed from lxml to Beautiful Soup might make it impossible to recognize a CDATA block _as_ a CDATA block rather than regular markup.

Can you try replacing strip_cdata=False with strip_cdata=True in bs4/builder/_lxml.py and see if that makes a difference?
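
To illustrate the concern about how data reaches Beautiful Soup, here is a minimal sketch of a parser "target" that only receives text through a data() callback. It uses the standard library's xml.etree.ElementTree rather than lxml, on the assumption that the target interface is similar enough to show the point; TextCollector and collect_text are hypothetical names made up for this example, not part of bs4 or lxml.

```python
import xml.etree.ElementTree as ET


class TextCollector:
    """Hypothetical parser target that records every chunk passed to data()."""

    def __init__(self):
        self.chunks = []

    def start(self, tag, attrib):
        pass  # element starts are not relevant here

    def end(self, tag):
        pass  # element ends are not relevant here

    def data(self, text):
        # The callback gets a plain string; nothing says whether the
        # text came from a CDATA section or from ordinary character data.
        self.chunks.append(text)

    def close(self):
        # The parser may deliver text in several pieces, so join them.
        return "".join(self.chunks)


def collect_text(markup):
    """Feed markup to a target-based parser and return the collected text."""
    parser = ET.XMLParser(target=TextCollector())
    parser.feed(markup)
    return parser.close()


cdata_text = collect_text("<root><![CDATA[a < b]]></root>")
plain_text = collect_text("<root>a &lt; b</root>")

# Both inputs look the same from the target's point of view.
print(cdata_text == plain_text)
```

If lxml hands text to its target the same way, toggling strip_cdata alone may not be enough for the builder to tell a CDATA block apart from escaped text, which is why the experiment above is worth trying.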