UnicodeDecodeError in ActionInformation.py

Bug #267356 reported by Andreas Jung
Affects            Status     Importance  Assigned to  Milestone
Zope CMF buildout  Confirmed  Low         Charlie_X

Bug Description

Plone 3.1.5.1, CMF 2.1.1

We see the following error when the "Home" action within the portal_tabs category contains a string with umlauts inside the 'Title':
UnicodeDecodeError
Sorry, a site error occurred.

Traceback (innermost last):

    * Module ZPublisher.Publish, line 202, in publish_module_standard
    * Module ZPublisher.Publish, line 150, in publish
    * Module plone.app.linkintegrity.monkey, line 21, in zpublisher_exception_hook_wrapper
    * Module Zope2.App.startup, line 221, in zpublisher_exception_hook
    * Module ZPublisher.Publish, line 119, in publish
    * Module ZPublisher.mapply, line 88, in mapply
    * Module ZPublisher.Publish, line 42, in call_object
    * Module Shared.DC.Scripts.Bindings, line 313, in __call__
    * Module Shared.DC.Scripts.Bindings, line 350, in _bindAndExec
    * Module Products.CMFCore.FSPageTemplate, line 216, in _exec
    * Module Products.CMFCore.FSPageTemplate, line 155, in pt_render
    * Module Products.PageTemplates.PageTemplate, line 89, in pt_render
    * Module zope.pagetemplate.pagetemplate, line 117, in pt_render
    * Module zope.tal.talinterpreter, line 271, in __call__
    * Module zope.tal.talinterpreter, line 346, in interpret
    * Module zope.tal.talinterpreter, line 891, in do_useMacro
    * Module zope.tal.talinterpreter, line 346, in interpret
    * Module zope.tal.talinterpreter, line 536, in do_optTag_tal
    * Module zope.tal.talinterpreter, line 521, in do_optTag
    * Module zope.tal.talinterpreter, line 516, in no_tag
    * Module zope.tal.talinterpreter, line 346, in interpret
    * Module zope.tal.talinterpreter, line 891, in do_useMacro
    * Module zope.tal.talinterpreter, line 346, in interpret
    * Module zope.tal.talinterpreter, line 586, in do_setLocal_tal
    * Module zope.tales.tales, line 696, in evaluate
      URL: file:/home/develop/sandboxes/plone3.1/parts/plone/CMFPlone/skins/plone_templates/global_defines.pt
      Line 8, Column 0
      Expression: <PathExpr standard:u'plone_view/globalize'>
      Names:

      {'container': <PloneSite at /mm>,
       'context': <ATDocument at /mm/front-page>,
       'default': <object object at 0x2ba921100200>,
       'here': <ATDocument at /mm/front-page>,
       'loop': {},
       'nothing': None,
       'options': {'args': ()},
       'repeat': <Products.PageTemplates.Expressions.SafeMapping object at 0x844b248>,
       'request': <HTTPRequest, URL=http://b:8080/mm/front-page/document_view>,
       'root': <Application at >,
       'template': <FSPageTemplate at /mm/document_view used for /mm/front-page>,
       'traverse_subpath': [],
       'user': <PropertiedUser 'admin'>}

    * Module zope.tales.expressions, line 217, in __call__
    * Module Products.PageTemplates.Expressions, line 161, in _eval
    * Module Products.PageTemplates.Expressions, line 123, in render
    * Module Products.CMFPlone.browser.ploneview, line 67, in globalize
    * Module Products.CMFPlone.browser.ploneview, line 116, in _initializeData
    * Module plone.memoize.view, line 55, in memogetter
    * Module plone.app.layout.globals.context, line 197, in actions
    * Module Products.CMFPlone.ActionsTool, line 114, in listFilteredActionsFor
    * Module Products.CMFPlone.ActionsTool, line 55, in listActionInfos
    * Module Products.CMFCore.ActionInformation, line 184, in __init__
    * Module Products.CMFCore.ActionInformation, line 151, in getInfoData

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128) (Also, the following error occurred while attempting to render the standard error message, please see the event log for full details: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128))

> /home/develop/sandboxes/plone3.1/parts/plone/CMFCore/ActionInformation.py(155)getInfoData()
-> lazy_map[id] = val
(Pdb) args
self = <Action at index_html>
(Pdb) list
150                 elif self.i18n_domain and id in ('title', 'description'):
151                     try:
152                         val = Message(val, self.i18n_domain)
153                     except:
154                         import pdb; pdb.set_trace()
155  ->             lazy_map[id] = val
156
157             return (lazy_map, lazy_keys)
158
159     InitializeClass(Action)
160
(Pdb) print val
üöä
(Pdb) repr(val)
"'\\xc3\\xbc\\xc3\\xb6\\xc3\\xa4'"

The 'val' passed to Message() is a valid UTF-8 string (the German umlauts üöä). The Plone site encoding is UTF-8, and the browser also reports UTF-8 as the encoding for the ZMI pages.
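
The decoding failure can be reproduced outside Zope. The sketch below is written for Python 3 (an assumption, since the report itself is on Python 2), so the ASCII decode that Python 2 performs implicitly when a byte string is coerced to unicode is spelled out explicitly. It uses only the byte sequence shown in the pdb session above.

```python
# The exact bytes printed by repr(val) in the pdb session.
val = b'\xc3\xbc\xc3\xb6\xc3\xa4'

# Decoding as UTF-8 succeeds and yields the three umlauts.
decoded = val.decode('utf-8')

# Decoding as ASCII stands in for Python 2's implicit coercion of a
# non-ASCII byte string to unicode; it fails on the very first byte.
try:
    val.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

print(decoded)
print(ascii_error)
```

The bytes themselves are fine; what fails is the choice of codec used to turn them into text.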

Tags: bug cmfcore
yuppie (yuppie3) wrote :

Importance is low because usually message IDs are ascii strings.

Non-ascii properties are encoded using the default-zpublisher-encoding, which is iso-8859-15 by default and utf-8 in Plone. ZMI page encoding and site encoding are not relevant here.
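
Why the choice of encoding matters can be seen with the same title encoded both ways. This is a minimal Python 3 sketch (an assumption; the thread concerns Python 2), using only the two encodings named above.

```python
title = 'üöä'

utf8_bytes = title.encode('utf-8')           # b'\xc3\xbc\xc3\xb6\xc3\xa4'
latin9_bytes = title.encode('iso-8859-15')   # b'\xfc\xf6\xe4'

# UTF-8 bytes read as ISO-8859-15: no exception, but six wrong characters.
mojibake = utf8_bytes.decode('iso-8859-15')

# ISO-8859-15 bytes read as UTF-8: 0xfc is not a valid start byte.
try:
    latin9_bytes.decode('utf-8')
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

print(repr(mojibake), utf8_failed)
```

Guessing the wrong encoding therefore either raises or silently corrupts the text, depending on the direction of the mismatch.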

Changed in zope-cmf:
importance: Undecided → Low
status: New → Confirmed
Tres Seaver (tseaver) wrote : Re: [Bug 267356] Re: UnicodeDecodeError in ActionInformation.py


yuppie wrote:
> Importance is low because usually message IDs are ascii strings.
>
> Non-ascii properties are encoded using the default-zpublisher-encoding,
> which is iso-8859-15 by default and utf-8 in Plone. ZMI page encoding
> and site encoding are not relevant here.

The attached patch creates a testcase with the same encoded string for
the title: it doesn't fail for me when run against CMF 2.1.

Tres.

Andreas Jung (ajung) wrote :

> Non-ascii properties are encoded using the default-zpublisher-encoding, which is iso-8859-15 by default and utf-8 in Plone.
> ZMI page encoding and site encoding are not relevant here.

I think this approach is wrong. default-zpublisher-encoding is a global setting that does not reflect the actual local setting of the ZMI encoding. The local setting must be used, since it is what determines the encoding of the ZMI.

yuppie (yuppie3) wrote :

@ Tres: Your test doesn't fail because you don't set i18n_domain.

@ Andreas: What are the "actual local settings of the ZMI encoding"? AFAICS ZPublisher.HTTPRequest.default_encoding is always used if you change non-unicode properties through the ZMI.

Andreas Jung (ajung) wrote :

The ZMI respects the 'management_page_charset' property to determine the input/output encoding of the ZMI within a particular folder and below. This property is looked up through acquisition, so a Plone site uses 'management_page_charset' to define utf-8 for the site and everything below it.
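
The "folder and below" behaviour can be illustrated with a toy model. This is only a sketch in Python 3: `Node` and `acquire` are hypothetical names invented here, and the real lookup is done by Zope's acquisition machinery, not by code like this.

```python
class Node:
    """Toy container with a parent pointer (not Zope's Acquisition)."""

    def __init__(self, name, parent=None, **props):
        self.name = name
        self.parent = parent
        self.props = props


def acquire(node, prop, default=None):
    """Walk up the containment chain until some ancestor defines prop."""
    while node is not None:
        if prop in node.props:
            return node.props[prop]
        node = node.parent
    return default


root = Node('root')
site = Node('mm', parent=root, management_page_charset='utf-8')
actions = Node('portal_actions', parent=site)
tabs = Node('portal_tabs', parent=actions)
elsewhere = Node('elsewhere', parent=root)

print(acquire(tabs, 'management_page_charset'))
print(acquire(elsewhere, 'management_page_charset'))
```

Objects inside the site find the site's value, while objects outside it find nothing and would fall back to whatever default applies.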

Andreas Jung (ajung) wrote :

Here is the related code in Plone (from Portal.py):

    def _management_page_charset(self):
        """ Returns default_charset for management screens """
        properties = getToolByName(self, 'portal_properties', None)
        # Let's be a bit careful here because we don't want to break the ZMI
        # just because people screw up their Plone sites (however thoroughly).
        if properties is not None:
            site_properties = getattr(properties, 'site_properties', None)
            if site_properties is not None:
                getProperty = getattr(site_properties, 'getProperty', None)
                if getProperty is not None:
                    return getProperty('default_charset', 'utf-8')
        return 'utf-8'

    management_page_charset = ComputedAttribute(_management_page_charset, 1)

Mantas Zimnickas (sirex) wrote :

I just had the same error. Following this how-to: http://plone.org/documentation/how-to/changing-tabs I tried to change the top menu. First, in the ZMI (/site/portal_actions/portal_tabs/index_html), I tried to change „Home“ to the Lithuanian title „Pradžia“, and got the same UnicodeDecodeError.

Is there perhaps another way to manage the top menu?

Danilo Dellaquila (ddellaquila) wrote :

Plone 3.2.2 and CMFCore 2.1.2 reported the same error while using Spanish accented letters.

I fixed it by using the unicode() constructor to create a Unicode string. This is the patch:

--- eggs/Products.CMFCore-2.1.2-py2.4.egg/Products/CMFCore/ActionInformation.py.orig 2010-03-22 12:36:14.000000000 +0100
+++ eggs/Products.CMFCore-2.1.2-py2.4.egg/Products/CMFCore/ActionInformation.py 2010-03-22 12:37:37.000000000 +0100
@@ -148,7 +148,7 @@
             elif id == 'i18n_domain':
                 continue
             elif self.i18n_domain and id in ('title', 'description'):
-                val = Message(str(val), self.i18n_domain)
+                val = Message(unicode(val, encoding='utf-8'), self.i18n_domain)
             lazy_map[id] = val

         return (lazy_map, lazy_keys)

Danilo

yuppie (yuppie3) wrote :

@ Danilo: The point is that we can't be sure 'utf-8' is the right encoding. It depends on ZPublisher.HTTPRequest.default_encoding or management_page_charset.

BTW: Do you really want to use non-ascii message *IDs* or did you just forget to clear the i18n_domain property?

Charlie_X (charlie)
Changed in zope-cmf:
assignee: nobody → Charlie_X (charlie)
Charlie_X (charlie) wrote :

Reading the comments, I think the key issue is that it is not well known that an Action title is a message ID to be handled by the i18n machinery. People are conditioned to giving Zope objects ASCII ids but encoded titles, and for mono-lingual sites this is the right thing to do™. As long as the title is a message ID it should be ASCII only, but the user should be informed of this and maybe even pointed to how they can translate it.

On a more general note, can it be that Zope 2.12 defaults to UTF-8 for the default-zpublisher-encoding, or is this just some magic worked by plone.recipe.zope2instance?

yuppie (yuppie3) wrote :

Zope 2.12 defaults to iso-8859-15. mkzopeinstance creates a default instance.

The Action 'title' is only used as message ID *if* 'i18n_domain' is specified. If you want to use encoded strings instead of message IDs, just leave 'i18n_domain' empty.

Looks like some documentation is missing.
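
Until such documentation exists, the rule described in this comment can be modelled in a few lines. This is a simplified Python 3 sketch, not the CMFCore code: `resolve_title` is a hypothetical function, 'plone' is just an example domain, and the explicit ASCII decode stands in for the implicit coercion that raised the error in this report.

```python
def resolve_title(raw_title, i18n_domain):
    """Hypothetical model of the 'title' branch, not the CMFCore code."""
    if i18n_domain:
        # A message ID has to be text; the ASCII decode mimics the
        # implicit coercion and raises for non-ASCII bytes.
        return ('message_id', raw_title.decode('ascii'), i18n_domain)
    # No domain: the encoded string is passed through untouched.
    return ('encoded_string', raw_title, None)


umlauts = b'\xc3\xbc\xc3\xb6\xc3\xa4'

plain = resolve_title(umlauts, '')          # i18n_domain left empty
ascii_id = resolve_title(b'Home', 'plone')  # ASCII message ID with a domain

try:
    resolve_title(umlauts, 'plone')         # non-ASCII title plus a domain
    failed = False
except UnicodeDecodeError:
    failed = True

print(plain, ascii_id, failed)
```

Only the third combination, a non-ASCII title together with a non-empty i18n_domain, fails.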

Charlie_X (charlie) wrote :

You're right about the encoding.

I guess the problem is related to the Properties panel which does not enforce ASCII for string properties. Would setting the property to be a ustring solve the problem?

Better documentation always helps!

yuppie (yuppie3) wrote :

At the end of the pipe - in the templates and views - we already require unicode. In the long run all encoded strings should be replaced by unicode.

But first we have to make sure that all the code that handles Actions works with unicode. E.g. the GenericSetup handlers are not tested with ustring properties. And we need a migration strategy for the persistent strings.

The current code is not broken, it's just not obvious how to use it. Adding unicode support for tool settings is a complete new feature.

Charlie_X (charlie) wrote :

On 01.10.2010 at 11:58, yuppie <email address hidden> wrote:

> But first we have to make sure that all the code that handles Actions
> works with unicode. E.g. the GenericSetup handlers are not tested with
> ustring properties. And we need a migration strategy for the persistent
> strings.

> The current code is not broken, it's just not obvious how to use it.
> Adding unicode support for tool settings is a complete new feature.

Would it make sense to write a proposal or a blueprint for this?

I've experienced the problems of non-unicode support in GenericSetup with a
mono-lingual German site. Of course, it's natural to use "native" titles
for actions, etc.

Charlie

Simone Orsi (simone-orsi) wrote :

Zope2-2.12.18 + Plone 4.0.7 + Products.CMFCore-2.2.4

Using Italian accented letters ("è, à, ì, ò, ù", etc.) in an action's title breaks everything.

I solved it by using:

from Products.CMFPlone.utils import safe_unicode

val = Message(safe_unicode(val), self.i18n_domain)

Yes, I know it's not the right solution since it depends on CMFPlone, BUT why the hell are users not supposed to use accented letters in action titles from the ZMI??

Is there any progress on that?
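
For readers without CMFPlone at hand, the idea behind this workaround can be sketched as follows. `to_text` is a hypothetical Python 3 helper written for illustration; it is not the CMFPlone implementation of safe_unicode, and it still assumes an encoding (utf-8 by default), which is exactly the objection raised earlier in the thread.

```python
def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return text, decoding byte strings if needed."""
    if isinstance(value, bytes):
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            # Fall back to a lossy decode rather than raising.
            return value.decode('utf-8', 'replace')
    # Already text (or not a string at all): leave it alone.
    return value


print(to_text(b'\xc3\xa8'))   # UTF-8 bytes for an accented e
print(to_text('già'))         # text is returned unchanged
```

A helper like this hides the error, but a title stored in a different encoding would come out with replacement characters instead of the intended letters.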

yuppie (yuppie3) wrote :

Did you read comment #11? Are you sure you want to use non-ascii message *IDs*? Or does your site work if you remove the i18n_domain from your actions?

Simone Orsi (simone-orsi) wrote :

I'm a f***ng idiot :/ I missed that comment. Removing i18n_domain makes it work!
