Comment 9 for bug 1882606

scoder (scoder) wrote :

> it seems like your workaround doesn't work either

It works if the input has a single top-level element, rather than a top-level comment followed by an element:

>>> cleaner.clean_html('<div><!-- comment --><a href="asdf">test</a></div>')
'<div><!-- comment --><a href="asdf">test</a></div>'

So comments that precede the element being cleaned are still discarded. I'm not sure right now what the best way would be to make that work for you as well. It really feels like no one ever wanted comments to come out of the Cleaner. :)

> It's a bit surprising because it's not like *I* set remove_unknown_tags to True, it's the default

Right, I agree that that's surprising. I think changing the default of "remove_unknown_tags" to a sentinel value and checking whether it was passed at all would be better here. If users don't pass it, make the decision based on whether "allow_tags" was passed or not. It's good that passing both is currently an error, since that makes it safe to change the behaviour of that combination. And yes, we should do that in the constructor, not later.
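
Roughly what I have in mind, as a minimal sketch and not the actual Cleaner code: `SketchCleaner` and `_NOT_GIVEN` are hypothetical names, the class models only these two options, and treating "allow_tags was passed" as "allow_tags is not None" is an assumption.

```python
# Hypothetical sentinel: a unique object that no caller can pass by accident.
_NOT_GIVEN = object()


class SketchCleaner:
    """Toy stand-in that models only the option handling, not the cleaning."""

    def __init__(self, allow_tags=None, remove_unknown_tags=_NOT_GIVEN):
        if remove_unknown_tags is _NOT_GIVEN:
            # Option not passed: derive it from whether allow_tags was given.
            remove_unknown_tags = allow_tags is None
        elif remove_unknown_tags and allow_tags is not None:
            # Explicitly asking for both stays an error, checked up front.
            raise ValueError(
                "It does not make sense to pass in both "
                "allow_tags and remove_unknown_tags")
        self.allow_tags = allow_tags
        self.remove_unknown_tags = remove_unknown_tags
```

With this, `SketchCleaner(allow_tags=['a'])` no longer trips over a default the user never set, while `SketchCleaner(allow_tags=['a'], remove_unknown_tags=True)` still raises in the constructor.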

Care to provide a PR?