Comment 2 for bug 931667

Andrew Bogott (andrewbogott) wrote :

It looks like we are generally fast and loose with encoding in nova. I have a patch that fixes one particular case, but we need a general solution:

1) Adopt a standard encoding for all implicitly-encoded IO. (I would propose UTF-8 rather than ASCII.)

2) At all points where strings enter Python (e.g. command-line args), immediately decode them into unicode objects (which can unambiguously represent text from any 8-bit encoding).

3) At all points where unicode objects exit Python (being written to stdout, a log file, or a database), explicitly encode them as appropriate (generally UTF-8 again, especially if we're ever going to read them back in).
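
A rough sketch of what points 2 and 3 could look like in practice. This is not the patch mentioned above and not existing nova code: the helper names (to_text, to_bytes) and the ENCODING constant are made up for illustration, and it is written in Python 3 terms, where str plays the role of Python 2's unicode and bytes plays the role of an encoded str.

```python
import io

# Assumed project-wide standard from point 1 (hypothetical constant).
ENCODING = "utf-8"


def to_text(value, encoding=ENCODING):
    """Decode bytes at the input boundary; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding=ENCODING):
    """Encode text at the output boundary; pass bytes through unchanged."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


# Point 2: bytes arriving from outside (here a stand-in for a
# command-line argument) are decoded as soon as they enter the program.
raw_arg = "caf\u00e9".encode("utf-8")
name = to_text(raw_arg)

# Point 3: text is encoded explicitly when it leaves the program
# (here an in-memory stand-in for a log file or stdout).
sink = io.BytesIO()
sink.write(to_bytes(name))
```

Everything between the two boundaries then only ever handles decoded text, so the encoding question never comes up in the middle of the code.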