GNU Mailman

edithtml.py saves en templates using html entity reference with raw iso-8859-1 character

Bug #1779445 reported by Yasuhito FUTATSUKI at POEM on 2018-06-30

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
GNU Mailman	Fix Released	Medium	Mark Sapiro	GNU Mailman 2.1.28

Bug Description

In Mailman's web administrative interface, the edithtml page saves en language templates with raw ISO-8859-1 characters when the template uses HTML entity references such as "&nbsp;".

For example, if the "General list information page" (templates/en/listinfo.html), which contains "&nbsp;", is saved without modification from the web UI, the list's template en/listinfo.html will contain a raw '\xa0' character. If " is added in the text area and the changes are submitted twice, it will turn into "".
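The round trip described above can be reproduced outside Mailman with a small sketch. It is not Mailman's code: `escape_once`, `escape_always` and `browser_round_trip` are hypothetical helpers, and the regular expression is only an assumption about what "not double escaping" an entity reference might look like. The browser is modeled as simply decoding entity references in the textarea markup before submitting the text.

```python
import html
import re

# Assumed pattern for something that already looks like an entity reference
# after escaping, e.g. "&amp;nbsp;" or "&amp;#160;".
_ESCAPED_ENTITY = re.compile(
    r'&amp;(#[0-9]+|#x[0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);')


def escape_once(text):
    """Escape HTML but leave existing entity references alone (hypothetical)."""
    escaped = html.escape(text, quote=True)
    return _ESCAPED_ENTITY.sub(r'&\1;', escaped)


def escape_always(text):
    """Escape every '&', including those that start an entity reference."""
    return html.escape(text, quote=True)


def browser_round_trip(markup):
    """Model of a browser: decode the textarea markup, submit the result."""
    return html.unescape(markup)


template = 'Name:&nbsp;value'

# Entity reference left unescaped: the browser decodes it to a raw character.
lossy = browser_round_trip(escape_once(template))

# Entity reference escaped: the browser hands back the original text.
safe = browser_round_trip(escape_always(template))

print(repr(lossy))
print(repr(safe))
```

In this model the template only survives a save without modification when the ampersand of the entity reference is itself escaped in the textarea markup.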

I'm not sure the attached patch is a good way to fix this, because I don't know whether these entity references always correspond to ISO-8859-1 characters, but I attach it for reference.

Yasuhito FUTATSUKI at POEM (futatuki) wrote on 2018-06-30:

#1

edithtml-save-as-ascii-patch.txt (690 bytes, text/plain)

Mark Sapiro (msapiro) wrote on 2018-07-03:

#2

Actually, this behavior was caused by rev. 1188. Unfortunately, I don't recall specifically why I made that change. I will attach a patch of what I have so far. Because the call to websafe comes from htmlformat.TextArea(), I need more testing to see if the other uses of TextArea are adversely impacted.

Changed in mailman:
assignee:	nobody → Mark Sapiro (msapiro)
importance:	Undecided → Medium
milestone:	none → 2.1.28
status:	New → In Progress

Mark Sapiro (msapiro) wrote on 2018-07-04:

#4

Possible fix. (1.5 KiB, text/plain)

Revised possible fix patch. I think the main reason for not double escaping HTML entities was to make HTML text displayed in the admindb interface more readable. This patch will avoid double escaping only in readonly TextArea.
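As a rough illustration of the idea in this comment, and not the contents of the attached patch, a text area renderer could choose its escaping based on a read-only flag. `render_textarea` and its parameters are hypothetical names, and the entity pattern is an assumption.

```python
import html
import re

# Assumed pattern for an already-escaped entity reference such as "&amp;nbsp;".
_ESCAPED_ENTITY = re.compile(
    r'&amp;(#[0-9]+|#x[0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);')


def render_textarea(name, text, readonly=False):
    """Render a <textarea>, un-double-escaping entities only when read-only.

    Editable areas escape everything so the submitted text matches what was
    stored. Read-only areas are never submitted back, so entity references
    can be left intact to keep the displayed text readable.
    """
    escaped = html.escape(text, quote=True)
    if readonly:
        escaped = _ESCAPED_ENTITY.sub(r'&\1;', escaped)
    attrs = ' readonly' if readonly else ''
    return '<textarea name="%s"%s>%s</textarea>' % (
        html.escape(name, quote=True), attrs, escaped)


editable = render_textarea('html_code', 'a&nbsp;b')
display = render_textarea('msg', 'a&nbsp;b', readonly=True)
print(editable)
print(display)
```

Markup characters such as `<` are still escaped in both modes; only the treatment of existing entity references differs.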

Yasuhito FUTATSUKI at POEM (futatuki) wrote on 2018-07-08:

#5

I understand that your fix preserves character entity references in the TextArea text across the POST, and I have confirmed that this is fixed in rev. 1788. Thank you.

I think there is one more problem, concerning the charset of query strings from Text or TextArea fields, which are not restricted to ASCII text for any language. If the text contains a raw non-ASCII character, its charset depends on the browser implementation, even though the HTML 4.01 specification says the default for the form's accept-charset attribute is "UNKNOWN", which means "User agents may interpret this value as the character encoding that was used to transmit the document containing this FORM element." (https://www.w3.org/TR/html401/interact/forms.html)

This does not seem to be a problem in most cases with current browsers that respect the specification, but it is still a problem in some cases. At least, when I put a non-breaking space character ('\xa0' in ISO-8859-1) into a Text field of a us-ascii form using Firefox 61 on FreeBSD, it was encoded as '%A0' in the query string, although Unicode characters are encoded as numeric character references. The code that handles this special case for 'us-ascii' is in Utils.canonstr(), so it may need to be used in some places, including the TextArea in edithtml.py. (Using non-ASCII characters in a us-ascii form is irregular, of course.)
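To make the '%A0' case concrete, here is a minimal sketch of decoding one submitted field value. It does not reproduce Utils.canonstr(); `decode_field` is a hypothetical helper, and falling back to ISO-8859-1 for bytes outside the declared charset is an assumption made only for illustration.

```python
import re
from urllib.parse import unquote_to_bytes

# Decimal numeric character references such as "&#8364;".
_NUMERIC_REF = re.compile(r'&#([0-9]{1,7});')


def _expand_ref(match):
    """Replace a numeric reference by its character when it is in range."""
    code = int(match.group(1))
    return chr(code) if code <= 0x10FFFF else match.group(0)


def decode_field(raw_value, form_charset='us-ascii'):
    """Decode one percent-encoded form value into text (hypothetical).

    Bytes that do not fit the declared charset are assumed to be
    ISO-8859-1, which covers the '%A0' case described above.
    """
    data = unquote_to_bytes(raw_value.replace('+', ' '))
    try:
        text = data.decode(form_charset)
    except UnicodeDecodeError:
        text = data.decode('iso-8859-1')  # assumption, see lead-in
    return _NUMERIC_REF.sub(_expand_ref, text)


nbsp = decode_field('%A0')            # raw byte outside us-ascii
euro = decode_field('%26%238364%3B')  # percent-encoded "&#8364;"
print(repr(nbsp), repr(euro))
```

Both inputs end up as ordinary text, which the caller could then store in whatever encoding the template uses.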

Mark Sapiro (msapiro) wrote on 2018-07-08:

#6

I think the issue in the original description is fixed and that described in comment #5 is a different issue. If you think this is a significant issue that needs to be fixed, please open a new bug for it.

Yasuhito FUTATSUKI at POEM (futatuki) wrote on 2018-07-08:

#7

I don't think it is significant, as I noted in the parenthesized last sentence of comment #5, so I won't open a bug for it. I'm sorry to have bothered you.

Mark Sapiro (msapiro) on 2018-07-23

Changed in mailman:
status:	In Progress → Fix Released
