Launchpad crashes badly if the client says it doesn't accept 'utf-8'

Bug #40494 reported by Björn Tillenius
Affects: Launchpad itself
Status: Fix Released
Importance: High
Assigned to: Björn Tillenius
Milestone: (none listed)

Bug Description

Many of our pages require utf-8, and if a browser says that it doesn't accept utf-8, Launchpad will crash badly, producing an OOPS like OOPS-109C83.

I can think of three ways of solving this:

  * Return some HTTP response code, I can't remember which, but
    I'm quite sure there is a special one for encoding errors.

  * Encode the page in utf-8 anyway, leaving it to the client
    to deal with any resulting errors.

  * Encode the page in a charset the browser accepts, replacing
    all unencodable characters with '?' or something like that.

The latter is probably the most acceptable solution, since most clients do handle utf-8 even if they don't advertise it, and it saves them from seeing an error page.
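The three options can be compared side by side with a minimal sketch. Everything here is illustrative: `parse_accept_charset`, `accepts`, `encode_page` and the strategy names are hypothetical helpers, not Launchpad or Zope code. The sketch assumes that 406 (Not Acceptable) is the response code meant in the first option, and it ignores the HTTP/1.1 rule that ISO-8859-1 is implicitly acceptable when it is not listed.

```python
def parse_accept_charset(header):
    """Return a {charset: q} dict for an Accept-Charset header value."""
    prefs = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[name.strip().lower()] = q
    return prefs


def accepts(prefs, charset):
    """True if the client accepts `charset`, explicitly or via '*'."""
    charset = charset.lower()
    if charset in prefs:
        return prefs[charset] > 0
    if "*" in prefs:
        return prefs["*"] > 0
    return not prefs  # no header at all: anything is acceptable


def encode_page(text, accept_charset, strategy):
    """Return (status, charset, body) for one of the three strategies."""
    prefs = parse_accept_charset(accept_charset)
    if accepts(prefs, "utf-8"):
        return 200, "utf-8", text.encode("utf-8")
    if strategy == "reject":
        # Option 1: refuse with an error status (406 assumed).
        return 406, None, b""
    if strategy == "force-utf8":
        # Option 2: send utf-8 regardless of what the client said.
        return 200, "utf-8", text.encode("utf-8")
    # Option 3 ("replace"): best accepted charset, '?' for the rest.
    candidates = sorted(
        (c for c, q in prefs.items() if q > 0 and c != "*"),
        key=lambda c: -prefs[c],
    )
    for charset in candidates:
        try:
            return 200, charset, text.encode(charset, errors="replace")
        except LookupError:
            continue  # Python has no codec under this name
    return 406, None, b""
```

With `Accept-Charset: iso-8859-1`, the "replace" strategy keeps "ö" intact but turns any character outside ISO-8859-1 into "?", which is the data loss the third option accepts.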

We do have to consider the encoding of the returned data though. If we send the data as ISO-8859-1, the browser will most likely send POST data as ISO-8859-1 as well. What happens if the user tries to POST non-ISO-8859-1 characters? And in which encoding will the POST data be sent if the browser says utf-8 isn't acceptable, but we encode the page in utf-8 anyway?
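To make the POST concern concrete, here is a small sketch of what goes wrong when the browser and the server disagree on the form encoding. `post_and_decode` and `can_post` are hypothetical helpers, and the sketch assumes the browser simply encodes form data in the charset of the page it received.

```python
def can_post(text, browser_charset):
    """True if every character of `text` exists in `browser_charset`."""
    try:
        text.encode(browser_charset)
    except UnicodeEncodeError:
        return False
    return True


def post_and_decode(text, browser_charset, server_charset):
    """Simulate a browser encoding form data and the server decoding it."""
    raw = text.encode(browser_charset)
    # errors="replace" marks undecodable bytes with U+FFFD instead of raising.
    return raw.decode(server_charset, errors="replace")


# The browser posts ISO-8859-1 while the server expects utf-8:
# the "ö" arrives as an undecodable byte.
garbled = post_and_decode("Björn", "iso-8859-1", "utf-8")

# Characters outside ISO-8859-1 cannot be posted faithfully at all.
snowman_ok = can_post("\u2603", "iso-8859-1")
```

What a real browser substitutes for characters it cannot encode varies, so the sketch only reports that the value cannot be represented.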

This is a bug in Zope, and has been reported as http://www.zope.org/Collectors/Zope3-dev/588.

description: updated
James Henstridge (jamesh) wrote :

In the case of text/html, there is a better solution than replacing unconvertible characters with '?': use a numeric character reference. That should be valid anywhere a non-ASCII character is valid.
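As a sketch of the character-reference idea, Python's built-in `xmlcharrefreplace` error handler does this substitution during encoding. `encode_html` is a hypothetical helper name, and the sketch assumes `text` is already HTML, so emitting `&#NNNN;` inside it is safe.

```python
def encode_html(text, charset):
    """Encode HTML text, using numeric character references for
    characters that `charset` cannot represent."""
    return text.encode(charset, errors="xmlcharrefreplace")


latin1_body = encode_html("Björn \u2603", "iso-8859-1")
ascii_body = encode_html("Björn", "ascii")
```

Unlike '?' replacement, no information is lost: a browser rendering the page shows the original characters.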

I'd recommend sending the data in UTF-8 or not at all though. Otherwise we need to make sure we have accept-charset="utf-8" added to all our forms (and then still run the risk of bad browsers not honouring the attribute).

Björn Tillenius (bjornt) wrote : Re: [Bug 40494] Re: Launchpad crashes badly if the client says it doesn't accept 'utf-8'

On Fri, Apr 21, 2006 at 07:43:17AM -0000, James Henstridge wrote:
> I'd recommend sending the data in UTF-8 or not at all though.
> Otherwise we need to make sure we have accept-charset="utf-8" added to
> all our forms (and then still run the risk of bad browsers not
> honouring the attribute).

I agree. Maybe the correct thing to do would be to have Zope return the
appropriate HTTP response code if it can't encode a page. If we want
Zope always to encode the page as utf-8, we can override the
IUserPreferredCharsets adapter that chooses the encoding.
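A rough sketch of what such an override might look like follows. It is written as a plain class so that it runs without Zope installed. The class name `Utf8PreferredCharsets` and the `FakeRequest` stand-in are hypothetical, the method name `getPreferredCharsets` is assumed to match the one declared by `IUserPreferredCharsets` in zope.i18n, and the interface declaration and adapter registration a real deployment would need are omitted.

```python
class Utf8PreferredCharsets:
    """Adapter-style class that ignores the client's Accept-Charset
    header and always reports utf-8 as the only preferred charset."""

    def __init__(self, request):
        # Kept only to mirror the adapter calling convention;
        # the request is deliberately not consulted.
        self.request = request

    def getPreferredCharsets(self):
        """Return the charsets to try, most preferred first."""
        return ["utf-8"]


class FakeRequest(dict):
    """Hypothetical stand-in for a request holding header values."""


request = FakeRequest(HTTP_ACCEPT_CHARSET="iso-8859-1")
charsets = Utf8PreferredCharsets(request).getPreferredCharsets()
```

The design choice is that the publisher never sees a charset list without utf-8, so it can always encode the page, whatever the client claims.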

Changed in launchpad:
status: Unconfirmed → Confirmed
Changed in launchpad:
assignee: nobody → bjornt
Changed in launchpad:
importance: Medium → High
Björn Tillenius (bjornt) wrote :

A fix is in the review queue.

Changed in launchpad:
status: Confirmed → In Progress
Changed in launchpad:
status: In Progress → Fix Committed
Changed in launchpad:
status: Fix Committed → Fix Released