htmlentities returns nothing

Bug #1058669 reported by Juan Montoya
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
php5 (Ubuntu) | Invalid | Undecided | Unassigned |

Bug Description

The htmlentities function returns an empty string (or possibly null) when it is given a latin1-encoded string containing non-ASCII characters.

<?php
print htmlentities('bye'); //outputs 'bye'
print htmlentities('adiós'); //outputs nothing when it's supposed to return 'adi&oacute;s'
?>

This happens in Ubuntu 12.10 Beta 2 (32-bit); it does not happen in Ubuntu 12.04.

For the moment, the workaround is to specify the encoding:
<?php print htmlentities('adiós', ENT_COMPAT, 'iso-8859-1'); //outputs 'adi&oacute;s' as expected

ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: php5 5.4.6-1ubuntu1
ProcVersionSignature: Ubuntu 3.5.0-15.23-generic 3.5.4
Uname: Linux 3.5.0-15-generic i686
NonfreeKernelModules: wl
ApportVersion: 2.5.3-0ubuntu1
Architecture: i386
Date: Sat Sep 29 10:05:39 2012
InstallationMedia: Ubuntu 12.10 "Quantal Quetzal" - Beta i386 (20120926)
PackageArchitecture: all
ProcEnviron:
 LANGUAGE=es_PE:es
 TERM=xterm
 PATH=(custom, no user)
 LANG=es_PE.UTF-8
 SHELL=/bin/bash
SourcePackage: php5
UpgradeStatus: No upgrade log present (probably fresh install)

Robie Basak (racb) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better.

From the manual at http://php.net/manual/en/function.htmlentities.php:

htmlentities() takes an optional third argument encoding which defines encoding used in conversion. If omitted, the default value for this argument is ISO-8859-1 in versions of PHP prior to 5.4.0, and UTF-8 from PHP 5.4.0 onwards. Although this argument is technically optional, you are highly encouraged to specify the correct value for your code.

As the default encoding is documented to have changed between 5.3.10 (12.04) and 5.4.6 (12.10), my understanding of the manual is that what you're seeing is expected, and if you want to convert ISO-8859-1 input then you are required to specify your encoding as you have described.
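
The effect of the encoding argument can be reproduced without relying on the default at all. The following is a minimal sketch, assuming the reporter's script was saved as ISO-8859-1, so that the ó is the single byte 0xF3; the variable names are illustrative only. The encoding is passed explicitly in both calls so the result does not depend on the PHP version or configuration.

```php
<?php
// "adiós" as ISO-8859-1 bytes: ó is the single byte 0xF3,
// which is not a valid UTF-8 sequence on its own.
$latin1 = "adi\xF3s";

// Treated as UTF-8 (the documented default from PHP 5.4.0):
// the input is invalid, so htmlentities() returns an empty string.
$asUtf8 = htmlentities($latin1, ENT_COMPAT, 'UTF-8');

// Treated as ISO-8859-1 (the documented default before PHP 5.4.0):
// the byte is recognised as ó and converted to its named entity.
$asLatin1 = htmlentities($latin1, ENT_COMPAT, 'ISO-8859-1');

var_dump($asUtf8);   // empty string
var_dump($asLatin1); // "adi&oacute;s"
```

Plain ASCII input such as 'bye' is valid in both encodings, which is consistent with the first call in the report working on both releases.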

Since this appears to be expected behaviour and not a bug, I'm marking this bug as Invalid. If this is wrong, please do comment and reopen.

Changed in php5 (Ubuntu):
status: New → Invalid
Juan Montoya (th3pr0ph3t) wrote :

Expected or not, it does not work as it did before, so it does not work as expected; therefore it is a bug, and a valid one.
I cannot imagine a case where the expected returned value of a function is an empty string. Maybe it should raise an error if the encoding was not specified.

I think this bug was hastily marked as invalid.
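
For scripts that would rather not receive a silent empty string, two options can be sketched in user code. This assumes PHP 5.4 or later: the ENT_SUBSTITUTE flag is part of PHP from that version, while strict_htmlentities() is a hypothetical helper written for this illustration, not a PHP function.

```php
<?php
$latin1 = "adi\xF3s"; // "adiós" as ISO-8859-1 bytes

// Option 1: ENT_SUBSTITUTE replaces invalid sequences with U+FFFD
// instead of making the whole result an empty string.
$substituted = htmlentities($latin1, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

// Option 2 (hypothetical helper): turn the silent empty result
// into an explicit error.
function strict_htmlentities($s, $encoding = 'UTF-8')
{
    $out = htmlentities($s, ENT_COMPAT, $encoding);
    if ($out === '' && $s !== '') {
        throw new InvalidArgumentException("Input is not valid $encoding");
    }
    return $out;
}

try {
    strict_htmlentities($latin1);
    $threw = false;
} catch (InvalidArgumentException $e) {
    $threw = true; // reached for the latin1 bytes above
}
```

Neither option recovers the intended ó; for that, the input still has to be declared as ISO-8859-1 or converted to UTF-8 first.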

Robie Basak (racb) wrote :

Sorry, I should have been clearer. I understand that the behaviour has changed and you did not expect this.

If you'd like to see this changed, the appropriate venue is the PHP project itself, rather than Ubuntu's packaging of PHP.

In Ubuntu, we can only really go with what the PHP project has decided. If we were to change the behaviour of PHP specifically for Ubuntu to work around this, it would cause even more confusion and bug reports, as people expect the same PHP script to behave in the same way for the same PHP version across distributions.

Thus this bug in the packaging of PHP *in Ubuntu* is Invalid, since we can't change the behaviour just in Ubuntu. But this doesn't stop you from taking this up with the PHP project itself. Based on their documentation, it sounds like they have already made the decision on this, but you are welcome to contact them about it. The PHP project's bug reporting page is at https://bugs.php.net/ and general support at http://www.php.net/support.php
