Comment 4 for bug 1045455

Eddie Sheffield (eddie-sheffield) wrote :

I tried as you suggested and hacked in a static '\xe2\x84\xa2' (added a property to the image just before it is turned into JSON in the v2 image serializer). Even using curl, this new property came through as '\u2122'. The confusion seems to be about where the content-type comes into play: it defines the encoding of the raw data on the wire between the server and the client, not how the client then interprets the decoded data. By the time JS (or curl, or whatever client) is using the data, the underlying libs have translated that UTF-8 data into whatever the internal string representation is (maybe utf-8, or utf-16, or ascii, or ...). At the application level you really shouldn't be seeing UTF-8; you want Unicode, because that's the general form of the characters. UTF-8 is largely an underlying implementation detail: as long as the server and client agree on its use as the wire format (via the content-type), the higher-level code will just see a string in whatever its native format happens to be.
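
To illustrate the point, here is a minimal sketch using Python 3's standard json module. It is not the actual v2 image serializer; the property name "name" and the variable names are made up for the example. It shows the same character written either as a \u2122 escape or as raw UTF-8 bytes on the wire, with both decoding to the same one-character Unicode string.

```python
import json

# UTF-8 bytes for the trade mark sign, as hacked into the property.
raw = b"\xe2\x84\xa2"

# At the application level this is a single Unicode code point, U+2122.
text = raw.decode("utf-8")

# By default json.dumps escapes non-ASCII characters as \uXXXX.
escaped = json.dumps({"name": text})

# With ensure_ascii=False the character is written literally; encoding
# the result as UTF-8 gives the bytes that would travel on the wire.
wire_bytes = json.dumps({"name": text}, ensure_ascii=False).encode("utf-8")

# Either form decodes to the same string on the client side.
decoded_from_escaped = json.loads(escaped)["name"]
decoded_from_wire = json.loads(wire_bytes.decode("utf-8"))["name"]

print(escaped)
print(wire_bytes)
print(decoded_from_escaped == decoded_from_wire == text)
```

Either way, the client ends up with the Unicode character itself, not with UTF-8 bytes.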

Also, according to http://www.json.org/, the '\xe2\x84\xa2' representation is not valid: \xHH is not a JSON escape sequence. The only numeric escape JSON defines is \uXXXX.
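
A quick way to check this, again sketched with Python 3's standard json module (the helper name parses_as_json is made up for the example):

```python
import json


def parses_as_json(document: str) -> bool:
    """Return True if the text is accepted by a JSON parser."""
    try:
        json.loads(document)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


# Raw strings, so the backslashes reach the JSON parser unchanged.
hex_escaped = r'"\xe2\x84\xa2"'
unicode_escaped = r'"\u2122"'

print(parses_as_json(hex_escaped))
print(parses_as_json(unicode_escaped))
```

The \xHH form is rejected as an invalid escape, while the \u2122 form parses to the single trade mark character.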