Comment 1 for bug 1930164

Leonard Richardson (leonardr) wrote :

This could be good news. BS4 has been passing strip_cdata=False into lxml for a very long time (see https://bugs.launchpad.net/beautifulsoup/+bug/1275085), but CDATA blocks were always stripped anyway.

However, the way the data is passed from lxml to Beautiful Soup might make it impossible to recognize the CDATA _as_ a CDATA block rather than as regular markup.

Can you try replacing strip_cdata=False with strip_cdata=True in bs4/builder/_lxml.py and see if that makes a difference?
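
To illustrate the concern about how the data arrives, here is a minimal sketch that feeds a CDATA section through an lxml parser target with both settings. It assumes lxml is installed and uses XMLParser. TextCollector and collect_text are hypothetical names for this example only, not anything in bs4, and this is not the code in bs4/builder/_lxml.py. As far as I can tell, the target's data() method receives plain text in both cases, with nothing marking it as having come from a CDATA block.

```python
from lxml import etree

XML = "<root><![CDATA[a < b]]></root>"


class TextCollector:
    """Hypothetical parser target that records what lxml passes to data()."""

    def __init__(self):
        self.chunks = []

    def start(self, tag, attrib):
        pass

    def end(self, tag):
        pass

    def data(self, data):
        # Character data and CDATA content both arrive here as plain strings.
        self.chunks.append(data)

    def close(self):
        return "".join(self.chunks)


def collect_text(strip_cdata):
    """Parse XML through a target parser and return the text it reported."""
    parser = etree.XMLParser(target=TextCollector(), strip_cdata=strip_cdata)
    parser.feed(XML)
    return parser.close()


if __name__ == "__main__":
    for setting in (False, True):
        print(setting, repr(collect_text(setting)))
```

If both settings print the same string, the flag alone does not give a target-based builder a way to tell CDATA apart from ordinary text.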