Comment 10 for bug 1287726

Revision history for this message
Ice (f-launchpad-deep-freeze-ca) wrote :

On a fundamental level, the provided replacement doesn’t match PHP Documentation about behaviour.
Json_decode() is described to return NULL on invalid UTF-8 sequence, and set an error message of JSON_ERROR_UTF8.
Current behaviour with json-c is returns JSON_OK, and returns an invalid string that cannot be used in any PHP Intl function, database, without throwing exceptions or generating invalid data that can’t be easily fixed in the database. (Or, it throws more exceptions)

Thus, properly coded applications that expect the result of any JSON data to be able be used throughout the application are now able to be exploited or to crash by sending invalid JSON data, whereas before a “if (json_last_error() !== JSON_OK)” or even “if (!$decodedJson) /// No Request ” would cleanly filter these degenerate strings.
This affects 100% of code that uses json_decode(). As it stands, attempting to deploy this out will generate too many errors, and possibly create attack vectors for allowing invalid UTF-8 sequences through.

All the code in ZF2, Apigility, and presumably others where the decoding of the HTTP POST Request expect invalid UTF-8 sequences to be caught by the json_decode, and thus the values are used internally as valid UTF-8 without any additional checking on the content before passing it to other methods, functions, databases, etc.

All PHP functions that deal with internationalization require a valid UTF-8 input string, any string that makes it past any API controller, thus generates many errors due to invalid UTF-8 sequences, whereas before it was caught by the decoded request being NULL => "".

At least in our case, the only viable solution was to recompile php-json from the PHP sources, and deploy that to our server-farms nationwide. However, I would assume that 95% of the end-users of Ubuntu will not realise that 100% of their user-input JSON cannot be used with any framework out of the box, and should be wrapped with a
  $userInput = json_decode($rawJson);
  $userInput = (json_encode($userInput) === false) ? NULL : $userInput; // Required to ensure compatibility with jsonc

And this MUST be done on every instance of a json-decoded variable otherwise hard-to-track bugs and potential attack-vectors are exposed.

To fix many frameworks, the workaround code will now have to check every response as JSON_OK no longer means “decode OK, valid UTF-8”.

In addition, requiring a valid JSON request to the API will no longer be providing clean filtered input to the controllers, as before the framework would generate a "Input required" (!empty($json)) or "Request contained invalid UTF-8 sequence" (JSON_ERROR_UTF8). But, now, all invalid input is passed back to the code that was coded so that these cases would never be an issue.

As we are a large ISP, we have test cases to test for invalid input, and attack vectors, but I would assume that the majority of users DO NOT have testing for these cases which are defined by PHP DOCUMENTATION as NOT POSSIBLE.
As, which sensible user will double check that JSON_OK is in fact an error condition?