Comment 1 for bug 1949271

James Addison (jaddison) wrote :

I've encountered similar behaviour on macOS 11.7 (Big Sur) when parsing an example UTF-8 encoded HTML file that contains at least two multibyte characters.

One detail I learned while attempting to narrow down the cause: the problem disappears when the 'lxml' dependency is installed from a binary wheel.

A near-minimal repro case is available at https://github.com/jayaddison/macos-lxml-issue-repro.git/
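
For readers who want a feel for the kind of input involved, here is a rough sketch. It is not taken from the linked repository, which remains the actual repro. The sample document and the helper names (SAMPLE_HTML, count_multibyte_chars, parse_with_lxml) are made up for illustration, and the script only prints what lxml returns so that the output can be compared between installs; it does not assert any particular failure.

```python
# Hypothetical sample document: UTF-8 encoded HTML containing exactly two
# multibyte characters (e-acute and i-diaeresis). Not from the linked repo.
SAMPLE_HTML = "<html><body><p>caf\u00e9 na\u00efve</p></body></html>".encode("utf-8")


def count_multibyte_chars(data: bytes) -> int:
    """Count characters that need more than one byte when encoded as UTF-8."""
    return sum(1 for ch in data.decode("utf-8") if len(ch.encode("utf-8")) > 1)


def parse_with_lxml(data: bytes):
    """Return the document's text content via lxml, or None if lxml is not installed."""
    try:
        from lxml import etree
    except ImportError:
        return None
    # Tell the parser explicitly that the input bytes are UTF-8.
    parser = etree.HTMLParser(encoding="utf-8")
    root = etree.fromstring(data, parser)
    return "".join(root.itertext())


if __name__ == "__main__":
    print("multibyte characters:", count_multibyte_chars(SAMPLE_HTML))
    print("lxml text:", repr(parse_with_lxml(SAMPLE_HTML)))
```

To compare the two kinds of install, pip's --only-binary and --no-binary options can be used (for example "pip install --only-binary lxml lxml" versus "pip install --no-binary lxml lxml"). That the failing install was a source build is my assumption here; the comment above only establishes that the binary wheel avoids the problem.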