Comment 4 for bug 898072

lilydjwg (lilydjwg) wrote : Re: lxml.html.parse treats encoding as Latin1 when reading from file-objects directly

@Stefan Behnel, I don't think lxml needs to know which encoding the file is in, because in Python 3.x, `open` handles this when opening in text mode. What lxml reads from that file object is encoding-independent: it's Unicode. lxml somehow encodes this into UTF-8, then takes it as Latin1.
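
To illustrate what I mean, here is a minimal sketch using only the standard library (no lxml involved). The sample string, the temporary file, and the variable names are made up for the demonstration. It shows that a text-mode file object hands back already-decoded `str`, and that encoding such a string to UTF-8 and then reading the bytes as Latin1 gives the kind of garbling described in this bug. It only mimics the symptom; it is not a claim about what lxml does internally.

```python
import os
import tempfile

# Hypothetical sample content: "café", with the non-ASCII char written as an escape.
text = "caf\u00e9"

# Store the sample on disk as UTF-8 bytes.
with tempfile.NamedTemporaryFile("wb", suffix=".html", delete=False) as tmp:
    tmp.write(text.encode("utf-8"))
    path = tmp.name

try:
    # In text mode, open() does the decoding: read() returns str, not bytes.
    with open(path, encoding="utf-8") as f:
        decoded = f.read()
finally:
    os.remove(path)

# Encoding the str to UTF-8 and misreading those bytes as Latin1
# turns the single character "é" into the two characters "Ã©".
garbled = decoded.encode("utf-8").decode("latin-1")

print(type(decoded).__name__, repr(decoded))
print(repr(garbled))
```

So by the time anything reads from the file object, the original file encoding is no longer relevant; only the second, wrong step can produce the garbled result.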