Comment 3 for bug 1287726

Mads Martin Jørgensen (mmj-mmj) wrote :

So the upstream json-c developers don't consider this a bug:

"Oh, I see. You want json-c to validate that the input being parsed actually is UTF-8. While that might seem reasonable to do at first glance, json-c has historically supported something closer to exact, uninterpreted bytes for strings rather than strict "characters", and unconditionally changing this now will be a significant change. Although not strictly to the spec, in many cases I see a value in being able to handle arbitrary data. This is also in conflict to efforts to support even less string-like data as mentioned in Issue#108.
Given this, and the fact that performing the additional validation will likely add more overhead to the parsing, any checks to ensure that strings only contain valid UTF-8 sequences would need to be explicitly requested, perhaps by setting the JSON_TOKENER_STRICT flag.

Do you happen to have a patch to cause this validation to be done?"
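
For reference, the check being asked about could look roughly like the sketch below. This is not a patch against json-c and not part of its API: is_valid_utf8 is a hypothetical standalone helper, written under the assumption that "valid UTF-8" means well-formed per RFC 3629 (no overlong encodings, no surrogates, nothing above U+10FFFF). Where it would be called from inside the tokener, and whether it would sit behind JSON_TOKENER_STRICT as upstream suggests, is left open.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of json-c.
 * Returns true if the len bytes at s are well-formed UTF-8 (RFC 3629). */
bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;        /* continuation bytes expected after the lead byte */
        unsigned long cp;   /* code point decoded so far */
        unsigned long min;  /* smallest code point allowed for this length */

        if (c < 0x80) {                 /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;               /* stray continuation or invalid lead byte */
        }

        if (need > len - i - 1)
            return false;               /* sequence truncated at end of input */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;           /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return false;               /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;               /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return false;               /* beyond the Unicode range */

        i += need + 1;
    }
    return true;
}
```

A caller would pass the raw string bytes and their length, e.g. is_valid_utf8((const unsigned char *)buf, strlen(buf)). A lone Latin-1 byte such as 0xF8 is rejected, while the two-byte sequence 0xC3 0xB8 is accepted. The check is one extra pass over each string, which is the overhead upstream mentions as the reason to make it opt-in.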