lxml

Comment 3 for bug 1930224

Leonard Richardson (leonardr) wrote on 2021-05-31:

As the author of Beautiful Soup, let me say that I would probably prefer the new behavior. I haven't been able to get CDATA sections from lxml the way I can from html.parser and html5lib.

I've been using the strip_cdata=False argument mentioned here:
https://lxml.de/api.html#cdata

But in the context in which I'm using it, it's never worked:
https://bugs.launchpad.net/beautifulsoup/+bug/1275085
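
For context, here is a minimal sketch of what strip_cdata=False does in the plain tree-building case described on that lxml page. It assumes lxml is installed and uses etree.XMLParser with a made-up one-element document. It is not the setup Beautiful Soup uses, so it does not reproduce the failure in the linked bug.

```python
from lxml import etree

# Hypothetical sample document, chosen only for illustration.
XML = "<root><![CDATA[a < b]]></root>"


def parse_and_serialize(strip_cdata):
    """Parse XML with the given strip_cdata setting.

    Returns the element's text and the re-serialized document.
    """
    parser = etree.XMLParser(strip_cdata=strip_cdata)
    root = etree.fromstring(XML, parser)
    return root.text, etree.tostring(root).decode("ascii")


# Default behavior: the CDATA section is replaced by ordinary text.
default_text, default_xml = parse_and_serialize(True)

# strip_cdata=False: the CDATA section survives in the serialized tree.
kept_text, kept_xml = parse_and_serialize(False)

print(default_xml)
print(kept_xml)
```

In both cases the text content read from the element is the same string; the difference only shows up on serialization, where the second call keeps the CDATA wrapper and the first escapes the text instead.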

I say I'd _probably_ prefer the new behavior because the way the CDATA section is being sent over (as chunked data blocks) means I don't think I can recognize it as CDATA and create a special CData object on my side. But I'd definitely rather have the data than not.