HTML entity code (&whatever;) in bug descriptions is repeatedly unescaped

Bug #6446 reported by Matthew Paul Thomas on 2006-01-05

Affects	Status	Importance	Assigned to	Milestone
Launchpad itself	Fix Released	Medium	Steve Alexander

Bug Description

<foo> &
<foo> &

Copy and paste those two lines into a bug report, and they should be displayed quite differently from each other. Currently they produce an identical result, because strings recognized as entities are being converted to their equivalent characters. This should not happen.

Each time the description is edited, the conversion is performed again. If the original description had &amp;amp;amp;, after being edited once it will have &amp;amp;, after being edited twice it will have &amp;, and so on.
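
The repeated conversion can be reproduced with a minimal sketch. It assumes the unescaping is done by Python's xml.sax.saxutils.unescape(), the function named later in this thread; the edit_once helper is hypothetical and only stands in for one edit-and-save cycle of the description.

```python
from xml.sax.saxutils import unescape


def edit_once(description):
    # Hypothetical stand-in for one edit cycle: the submitted text is
    # unescaped on the way in, which is the behaviour reported here.
    return unescape(description)


text = "&amp;amp;amp;"
history = [text]
for _ in range(3):
    text = edit_once(text)
    history.append(text)

# Each edit strips one level of escaping until only "&" is left.
for step, value in enumerate(history):
    print(step, value)
```

Each pass removes one level of escaping, so the stored text drifts further from what the reporter typed every time the description is saved.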

This problem appeared in bug 2021.

Matthew Paul Thomas (mpt) on 2006-01-05

description:	updated

Björn Tillenius (bjornt) wrote on 2006-01-05:

#1

This looks like it's http://www.zope.org/Collectors/Zope3-dev/468, which has been fixed in the latest zope3. I'm not sure the fix got backported to the version of zope3 we're going to update to, so we might have to backport the fix ourselves.

Changed in malone:
assignee:	nobody → stevea
status:	New → Accepted

James Henstridge (jamesh) wrote on 2006-01-05:

#2

I am pretty sure fmt:text-to-html is not to blame.

The Zope TextWidget probably has the current behaviour to work around problems with non-UTF-8 form submission: if I have an HTML form that will submit in latin1 (e.g. if the page is latin1), but I enter a non-latin1 character into a form field, it will be sent to the server entity escaped. The web server has no way to tell if the user entered the character or the entity itself.

Since our pages are served as UTF-8, we should never see the confusing behaviour, so the unescaping performed by Zope is always an error for us.
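
The ambiguity can be illustrated with a small sketch. It is an approximation only: it assumes the browser replaces characters that the form encoding cannot represent with numeric character references, and it imitates that with Python's xmlcharrefreplace error handler. The submit_as_latin1 and submit_as_utf8 helpers are hypothetical names.

```python
def submit_as_latin1(field_value):
    # Assumption: mimics a browser substituting numeric character
    # references for characters that latin1 cannot represent.
    return field_value.encode("latin-1", errors="xmlcharrefreplace")


def submit_as_utf8(field_value):
    # UTF-8 can represent every character, so nothing is substituted.
    return field_value.encode("utf-8")


typed_character = "\u20ac"  # the user typed a euro sign
typed_entity = "&#8364;"    # the user typed the reference literally

# In latin1 both inputs arrive as the same bytes.
print(submit_as_latin1(typed_character))
print(submit_as_latin1(typed_entity))

# In UTF-8 the two inputs stay distinguishable.
print(submit_as_utf8(typed_character))
print(submit_as_utf8(typed_entity))
```

With a UTF-8 form the server can always tell the two inputs apart, which is why unescaping on input is not needed in that case.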

Björn Tillenius (bjornt) wrote on 2006-01-05: Re: [Bug 6446] HTML entity code (&whatever; ) in bug descriptions is repeatedly unescaped

#3

On Thu, Jan 05, 2006 at 01:05:00PM -0000, James Henstridge wrote:
> The Zope TextWidget probably has the current behaviour to work around
> problems with non UTF-8 form submission: if I have an HTML form that
> will submit in latin1 (e.g. if the page is latin1), but I enter a non-
> latin1 character into a form field, it will be sent to the server entity
> escaped. The web server has no way to tell if the user entered the
> character or the entity itself.

Oh, didn't know that. It sounds like a valid use case for unescaping the
string. But that's not the reason it gets unescaped, since
xml.sax.saxutils.unescape() is used, which converts only a very small
subset of all HTML entities, so it's definitely a bug.

> Since our pages are served as UTF-8, we should never see the confusing
> behaviour, so the unescaping performed by Zope is always an error for
> us.

Actually, I think it's possible for a client to request another
encoding. Although it's quite safe to assume that we serve only UTF-8.
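
A quick check of how small that subset is, as a sketch. html.unescape() from the modern Python standard library is used here only for comparison; it is not something the Zope code is known to use.

```python
import html
from xml.sax.saxutils import unescape

samples = ["&lt;foo&gt; &amp;", "&eacute;", "&#8364;", "&quot;"]

# By default saxutils.unescape() handles only &amp;, &lt; and &gt;.
sax_results = [unescape(s) for s in samples]

# html.unescape() handles named and numeric references in general.
html_results = [html.unescape(s) for s in samples]

for s, sax, full in zip(samples, sax_results, html_results):
    print(repr(s), "->", repr(sax), "|", repr(full))
```

Only the first sample is changed by saxutils.unescape(), so it cannot be serving as a general decoder for entity-escaped form input.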

James Henstridge (jamesh) wrote on 2006-01-10:

#4

It is possible to force the encoding by adding an accept-charset="UTF-8" attribute to the <form> element, which is supported by the major browsers (even if the user switches encoding), but I don't think this is ever likely to be an issue in practice.

Björn Tillenius (bjornt) wrote on 2006-04-26:

#5

I'm quite sure this bug has been fixed now in the version of Zope3 we're using.

Christian Reis (kiko) wrote on 2006-04-27:

#6

Tested on staging using the string supplied and it works. Ensuring it is so:

<foo> &
<foo> &

Changed in malone:
status:	Confirmed → Fix Released
