The first part is fixed by adding a new option "handle_failures" here:
https://github.com/lxml/lxml/commit/ab497930d74c7bcf4b725809508a1fefef453faa
The second part is trickier. The right fix would be to improve, in general, how encoding problems are handled in trees parsed from broken HTML.