Wrong evaluation whether json is valid or not

Bug #1287726 reported by Mads Martin Jørgensen
18
This bug affects 2 people
Affects Status Importance Assigned to Milestone
php-json (Debian)
Fix Released
Unknown
php-json (Ubuntu)
Fix Released
Medium
Nish Aravamudan
Trusty
Incomplete
Undecided
Unassigned
Wily
Incomplete
Undecided
Unassigned

Bug Description

# php -r 'var_dump(json_decode("\"\xff\""));'
string(1) "�"

Should be:

# php -r 'var_dump(json_decode("\"\xff\""));'
NULL

Since a large (and growing) number of services use JSON for communication, it's quite bad.

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: php5-json 1.3.2-2build1
ProcVersionSignature: Ubuntu 3.13.0-15.35-generic 3.13.5
Uname: Linux 3.13.0-15-generic x86_64
ApportVersion: 2.13.2-0ubuntu5
Architecture: amd64
Date: Tue Mar 4 14:34:43 2014
Ec2AMI: ami-8df705fa
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: eu-west-1b
Ec2InstanceType: c3.xlarge
Ec2Kernel: aki-52a34525
Ec2Ramdisk: unavailable
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: php-json
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Mads Martin Jørgensen (mmj-mmj) wrote :
Revision history for this message
Robie Basak (racb) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better.

Looks like this is upstream bug https://github.com/remicollet/pecl-json-c/issues/9, and it is suggested that the problem needs to be fixed in libjson-c which is at https://github.com/json-c/json-c

This bug will probably not make any progress in Ubuntu until it is fixed upstream. Any volunteers should seek to fix this upstream for all distributions at once.

Changed in php-json (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Mads Martin Jørgensen (mmj-mmj) wrote :

So the upstream json-c developers don't consider this a bug:

"Oh, I see. You want json-c to validate that the input being parsed actually is UTF-8. While that might seem reasonable to do at first glance, json-c has historically supported something closer to exact, uninterpreted bytes for strings rather than strict "characters", and unconditionally changing this now will be a significant change. Although not strictly to the spec, in many cases I see a value in being able to handle arbitrary data. This is also in conflict to efforts to support even less string-like data as mentioned in Issue#108.
Given this, and the fact that performing the additional validation will likely add more overhead to the parsing, any checks to ensure that strings only contain valid UTF-8 sequences would need to be explicitly requested, perhaps by setting the JSON_TOKENER_STRICT flag.

Do you happen to have a patch to cause this validation to be done?"

Revision history for this message
Mads Martin Jørgensen (mmj-mmj) wrote :

This also means, that unless the Ubuntu maintainer fixes it either way, then php-json has a rather important change in behaviour between 12.04 LTS and 14.04 LTS.

Revision history for this message
Robie Basak (racb) wrote :

Thank you for following this up.

14.04 is just round the corner now, so it seems unlikely that any further changes can be made now.

It sounds like json-c upstream are willing to have the behaviour in the presence of the flag. Does this mean that php-json could use the flag by default, and thus restore the original behaviour?

Though a change in behaviour is undesirable to fix in an SRU after release.

So it looks like we'll have to stick with what we have, unless anybody has a better suggestion? Mads, could you please explain the impact of this change to end-users, whether you think this should be mentioned in the release notes, and if so then suggest some appropriate text? Are there any security implications in this change?

It seems to me that this is an unfortunate fallout of upstream PHP refusing to ship a free software json module by default. So Debian had to switch to an alternate version, and some behavioural change is the consequence.

Revision history for this message
Mads Martin Jørgensen (mmj-mmj) wrote :

Problem is that the old php-json package conforms to the json standard, and the new one does not. If someone went through the effort (we do not have the resources, unfortunately) to make json-c conform to the json standard with said flag, then yes - you could simply compile the package with that flag enabled, and all would be good.

The impact of the bug? Json is used all over the place for many years, since it's a very good lightweight format for exchanging data. All of the services provided on a Ubuntu system will potentially break, since it does not conform to the json standard that everyone else does. We had to switch our servers that was running Ubuntu since 2010 to NetBSD to get a modern PHP with proper json handling. Our unit tests broke immediately when we switched to Ubuntu 13.10.

There's a good thread with proper comments about the thing here, which shows several aspects of the thing:

http://www.reddit.com/r/PHP/comments/1ksnzw/php_json_removed_in_php_55/

Do not forget the fact, that the reason it was removed was that it had this line together with its license: "The Software shall be used for Good, not Evil." which makes it 'non-free' software. Have a look at the entire license here, and see for yourself.

http://www.json.org/license.html

It's still a possibility to simply ship the php-json included in php, instead of potentially breaking thousands of server setups that rely on a php-json conforming to the json standard.

Revision history for this message
Robie Basak (racb) wrote : Re: [Bug 1287726] Re: Wrong evaluation whether json is valid or not

On Mon, Apr 14, 2014 at 06:10:31PM -0000, Mads Martin Jørgensen wrote:

[...]

> ...All of
> the services provided on a Ubuntu system will potentially break, since
> it does not conform to the json standard that everyone else does.

That doesn't really give me a helpful real world understanding of bug
impact. Instead of "it will potentially break", I'd prefer to hear
something like "it will break in the specific case X, which is widely
deployed because of popular framework Y, so Z% of users will see a
regression".

Without an understanding similar to this, I am not informed enough to
prioritise this at all.

[...]

> Do not forget the fact, that the reason it was removed was that it had
> this line together with its license: "The Software shall be used for
> Good, not Evil." which makes it 'non-free' software. Have a look at the
> entire license here, and see for yourself.
>
> http://www.json.org/license.html

I'm not interested in debating legals in this bug. Please take your
argument to Debian, or to ubuntu-devel if you want Ubuntu to differ from
Debian on this point. It would not be reasonable to unilaterally make a
potentially legal-impacting change on the basis of a few proponents in
this bug. We should not do it without consulting the wider community.

> It's still a possibility to simply ship the php-json included in php,
> instead of potentially breaking thousands of server setups that rely on
> a php-json conforming to the json standard.

Sorry for responding to this after Trusty's release. I had to take some
time off for personal reasons. But three days before release was not a
reasonable timeframe to achieve consensus and make the change anyway,
had I responded earlier.

The two options you have now are:

1) Convince the Debian or Ubuntu projects to take a different legal
stance on whether the json module embedded in upstream php is
redistributable. This bug is not the appropriate venue for this; please
take it to mailing lists.

2) Fix the replacement json module to match php upstream's behaviour.
This approach seems much easier to me. It sounds like upstream would be
happy to accept patches, and there will be no legal minefield or
flamewar to negotiate either. We can certainly include such a fix in the
next release, and I think a fix to Trusty may be reasonable, too.

Revision history for this message
Mads Martin Jørgensen (mmj-mmj) wrote :

I cannot specify it further, since it's our own API that communicates via JSON that broke, due to php-json in Ubuntu not conforming to the JSON standard. Time will tell how many will get hit by this bug. Might be zero, might be thousands. Fact of the matter is still that it accepts JSON input that is not valid JSON according to the JSON standard.

I see my two options, and I can certainly understand your stance. We're not able to fix this with the resources we have, and convincing Debian to take a different legal stance is a holy war that I cannot win. so my only option is:

3) Move all servers that use JSON away from Ubuntu and on to something with a JSON standard conforming php-json.

Thanks for your time!

P.S Will this bug stay open until it's maybe fixed some day? Then I know when I can consider using Ubuntu again.

Revision history for this message
Robie Basak (racb) wrote :

On Tue, Apr 22, 2014 at 11:31:36AM -0000, Mads Martin Jørgensen wrote:
> 3) Move all servers that use JSON away from Ubuntu and on to something
> with a JSON standard conforming php-json.

For other readers, I would point out that AIUI Fedora (and thus RHEL and
CentOS) and Debian (and thus Ubuntu) have all taken the same stance on
this. I do not know about SUSE.

> P.S Will this bug stay open until it's maybe fixed some day? Then I know
> when I can consider using Ubuntu again.

Sure. But it is also not uncommon that upstream fixes a bug, and we do
not notice it in the distribution while we carry the fix anyway.

Revision history for this message
Ice (f-launchpad-deep-freeze-ca) wrote :

On a fundamental level, the provided replacement doesn’t match PHP Documentation about behaviour.
Json_decode() is described to return NULL on invalid UTF-8 sequence, and set an error message of JSON_ERROR_UTF8.
Current behaviour with json-c is returns JSON_OK, and returns an invalid string that cannot be used in any PHP Intl function, database, without throwing exceptions or generating invalid data that can’t be easily fixed in the database. (Or, it throws more exceptions)

Thus, properly coded applications that expect the result of any JSON data to be able be used throughout the application are now able to be exploited or to crash by sending invalid JSON data, whereas before a “if (json_last_error() !== JSON_OK)” or even “if (!$decodedJson) /// No Request ” would cleanly filter these degenerate strings.
This affects 100% of code that uses json_decode(). As it stands, attempting to deploy this out will generate too many errors, and possibly create attack vectors for allowing invalid UTF-8 sequences through.

All the code in ZF2, Apigility, and presumably others where the decoding of the HTTP POST Request expect invalid UTF-8 sequences to be caught by the json_decode, and thus the values are used internally as valid UTF-8 without any additional checking on the content before passing it to other methods, functions, databases, etc.

All PHP functions that deal with internationalization require a valid UTF-8 input string, any string that makes it past any API controller, thus generates many errors due to invalid UTF-8 sequences, whereas before it was caught by the decoded request being NULL => "".

At least in our case, the only viable solution was to recompile php-json from the PHP sources, and deploy that to our server-farms nationwide. However, I would assume that 95% of the end-users of Ubuntu will not realise that 100% of their user-input JSON cannot be used with any framework out of the box, and should be wrapped with a
  $userInput = json_decode($rawJson);
  $userInput = (json_encode($userInput) === false) ? NULL : $userInput; // Required to ensure compatibility with jsonc

And this MUST be done on every instance of a json-decoded variable otherwise hard-to-track bugs and potential attack-vectors are exposed.

To fix many frameworks, the workaround code will now have to check every response as JSON_OK no longer means “decode OK, valid UTF-8”.

In addition, requiring a valid JSON request to the API will no longer be providing clean filtered input to the controllers, as before the framework would generate a "Input required" (!empty($json)) or "Request contained invalid UTF-8 sequence" (JSON_ERROR_UTF8). But, now, all invalid input is passed back to the code that was coded so that these cases would never be an issue.

As we are a large ISP, we have test cases to test for invalid input, and attack vectors, but I would assume that the majority of users DO NOT have testing for these cases which are defined by PHP DOCUMENTATION as NOT POSSIBLE.
As, which sensible user will double check that JSON_OK is in fact an error condition?

Revision history for this message
shankao (shankao) wrote :

This seems to have been recently fixed in upstream json-c: see last comments in https://github.com/remicollet/pecl-json-c/issues/9

Changed in php-json (Debian):
status: Unknown → New
Revision history for this message
Nish Aravamudan (nacc) wrote :

@shankoa Thank you for notifying us that the fix has occurred upstream!

@anyone affected: I have backported that fix and the test to Trusty and WIly versions of php-json, available at: https://launchpad.net/~nacc/+archive/ubuntu/lp1287726. Could you please test and confirm if these resolve the issue for you?

Note that Xenial has PHP7.0 only, so it is not affected by the same, as I understand it.

I will attach debdiffs to this bug shortly.

Changed in php-json (Ubuntu):
assignee: nobody → Nish Aravamudan (nacc)
Revision history for this message
Nish Aravamudan (nacc) wrote :
Revision history for this message
Nish Aravamudan (nacc) wrote :
Changed in php-json (Ubuntu):
status: Triaged → In Progress
Revision history for this message
Nish Aravamudan (nacc) wrote :

Note that in PHP7.0, the provided test behaves slightly differently, I would appreciate any insight into understanding if it is correct:

var_dump(bin2hex(json_decode($json, true, 2)));
 returns string(0) ""

$t = json_decode($json, true, 2); for the Index and Property cases returns NULL.

Mathew Hodson (mhodson)
tags: added: patch
Revision history for this message
Robie Basak (racb) wrote :

This is Fix Released in Xenial and Yakkety by virtue of PHP 7 not using php-json any more I presume?

Changed in php-json (Ubuntu Trusty):
status: New → Triaged
Changed in php-json (Ubuntu Wily):
status: New → Triaged
Changed in php-json (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Robie Basak (racb) wrote :

@Mathew,

I see that you subscribed ~ubuntu-sponsors. But I don't think Nish intended an upload yet. The SRU paperwork is not done yet, and there are outstanding questions as to whether his proposed fix works.

I think this is awaiting someone affected to report back, so I'm marking this Incomplete for Trusty and Wily for now, to make it clear that this bug will not make progress until a reporter gets back to us. Affected users: when you've reported success or failure, please change the bug status back to New.

Revision history for this message
Robie Basak (racb) wrote :

(and I'm unsubscribing ~ubuntu-sponsors for now)

Changed in php-json (Ubuntu Trusty):
status: Triaged → Incomplete
Changed in php-json (Ubuntu Wily):
status: Triaged → Incomplete
Changed in php-json (Debian):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.