Comment 1 for bug 2067707

scoder (scoder) wrote : Re: HTMLParser loses CDATA content

CDATA sections are an XML construct. They have no meaning in HTML:
https://developer.mozilla.org/en-US/docs/Web/API/CDATASection#Specifications

Try this with plain libxml2:
"""
$ echo '<html><head><title><![CDATA[title]]></title></head><body><![CDATA[body]]></body></html>' | xmllint --html -
-:1: HTML parser error : htmlParseStartTag: invalid element name
<html><head><title><![CDATA[title]]></title></head><body><![CDATA[body]]></body>
                    ^
-:1: HTML parser error : htmlParseStartTag: invalid element name
<html><head><title><![CDATA[title]]></title></head><body><![CDATA[body]]></body>
                                                          ^
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head><title></title></head>
<body></body>
</html>
"""
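
For comparison, the same markup is well-formed XML, and an XML parser keeps the CDATA content as ordinary text. Below is a minimal sketch using Python's standard-library `xml.etree.ElementTree`; that choice is an assumption made for illustration (the report itself concerns lxml's HTMLParser), and `DOC` and `cdata_texts` are hypothetical names. lxml's XML parser should behave the same way, but this was not checked here.

```python
import xml.etree.ElementTree as ET

# Same document as in the xmllint example above, written as one string.
DOC = (
    "<html><head><title><![CDATA[title]]></title></head>"
    "<body><![CDATA[body]]></body></html>"
)


def cdata_texts(markup):
    """Parse markup as XML and return the (title, body) text content."""
    root = ET.fromstring(markup)
    # An XML parser treats a CDATA section as plain character data,
    # so its content ends up in the element's text.
    return root.findtext("head/title"), root.findtext("body")


title_text, body_text = cdata_texts(DOC)
print(title_text, body_text)
```

So if the input is really XML or XHTML, parsing it as XML rather than HTML is the way to keep the CDATA content.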