Comment 9 for bug 1780071

Dimitri John Ledkov (xnox) wrote :

Garbage in, garbage out. I think opening things and reading them as utf-8 is good. For added resilience, one also has to escape Unicode decoding errors, replacing the offending bytes with something readable. I.e. one may choose to read in binary and then decode as utf-8 with any errors replaced, e.g.:

b'somebinary string'.decode('utf-8', 'replace')
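To make that concrete for a file on disk, here is a minimal sketch of the same idea. The helper name read_text_lossy and the sample bytes are made up for illustration and are not from the code under discussion; the only real APIs used are bytes.decode() and the errors= argument of open().

```python
import os
import tempfile


def read_text_lossy(path):
    """Read a file as bytes and decode as UTF-8, never raising on bad input.

    Hypothetical helper: invalid byte sequences become U+FFFD
    (the Unicode replacement character) instead of UnicodeDecodeError.
    """
    with open(path, 'rb') as f:
        return f.read().decode('utf-8', 'replace')


# Demo with a throwaway file: valid UTF-8 ("café") followed by a stray 0xff byte.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'caf\xc3\xa9 \xff end')
    demo_path = tmp.name

text = read_text_lossy(demo_path)

# Text mode with errors='replace' gives the same result for this input.
with open(demo_path, encoding='utf-8', errors='replace') as f:
    same = f.read()

os.remove(demo_path)
```

Which of the two forms to use is mostly a matter of taste; the binary read keeps the raw bytes around in case they are needed for something else, while the text-mode variant is shorter.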