Page Template: TALES string expr. unicode err.

Bug #246983 reported by Nils Jungclaus
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Zope 2
Fix Released
Medium
LeoRochael
2.12
Fix Released
Medium
LeoRochael
2.13
Fix Released
Medium
LeoRochael

Bug Description

Using results from python scripts in string-expressions fails with unicode error in Zope 2.10.x.

The following buglet show the problem:

In a page template do something like:

<div tal:define="va python: here.pscr('Änderung')"
     tal:content="string: Die ist eine ${va}"></div>

where pscr is a python script that simply returns the argument

Obviously, the string returned form the python script is non unicode (utf-8 in my case) and the string expression tries to insert the value into a unicode string.

Andreas Jung (ajung)
Changed in zope2:
assignee: nobody → ajung
Revision history for this message
Andreas Jung (ajung) wrote :

Exception raised:

  File "/opt/python-2.4.4/lib/python2.4/unittest.py", line 260, in run
    testMethod()
  File "/home/develop/sandboxes/Zope/Zope/lib/python/Products/PageTemplates/tests/testZopePageTemplate.py", line 230, in testBug246983
    zpt()
  File "/home/develop/sandboxes/Zope/Zope/lib/python/Shared/DC/Scripts/Bindings.py", line 321, in __call__
    return self._bindAndExec(args, kw, None)
  File "/home/develop/sandboxes/Zope/Zope/lib/python/Shared/DC/Scripts/Bindings.py", line 358, in _bindAndExec
    return self._exec(bound_data, args, kw)
  File "/home/develop/sandboxes/Zope/Zope/lib/python/Products/PageTemplates/ZopePageTemplate.py", line 330, in _exec
    result = self.pt_render(extra_context=bound_names)
  File "/home/develop/sandboxes/Zope/Zope/lib/python/Products/PageTemplates/ZopePageTemplate.py", line 427, in pt_render
    result = PageTemplate.pt_render(self, source, extra_context)
  File "/home/develop/sandboxes/Zope/Zope/lib/python/Products/PageTemplates/PageTemplate.py", line 78, in pt_render
    showtal=showtal)
  File "/home/develop/sandboxes/Zope/Zope/lib/python/zope/pagetemplate/pagetemplate.py", line 115, in pt_render
    strictinsert=0, sourceAnnotations=sourceAnnotations)()
  File "/home/develop/sandboxes/Zope/Zope/lib/python/zope/tal/talinterpreter.py", line 271, in __call__
    self.interpret(self.program)
  File "/home/develop/sandboxes/Zope/Zope/lib/python/zope/tal/talinterpreter.py", line 346, in interpret
    handlers[opcode](self, args)
  File "/home/develop/sandboxes/Zope/Zope/lib/python/zope/tal/talinterpreter.py", line 623, in do_insertText_tal
    text = self.engine.evaluateText(stuff[0])
  File "/home/develop/sandboxes/Zope/Zope/lib/python/Products/PageTemplates/Expressions.py", line 196, in evaluateText
    text = self.evaluate(expr)
  File "/home/develop/sandboxes/Zope/Zope/lib/python/zope/tales/tales.py", line 696, in evaluate
    return expression(self)
  File "/home/develop/sandboxes/Zope/Zope/lib/python/zope/tales/expressions.py", line 263, in __call__
    return self._expr % tuple(vvals)

Revision history for this message
Andreas Jung (ajung) wrote :

Unittest:

    def testBug246983(self):
        """ Test for LP #246983 """
        manage_addPageTemplate(self.app, 'test',
                               text="<div tal:define=\"va python: here.script('Änderung')\""
                                    " tal:content=\"string: Die ist eine ${va}\"></div>",
                               encoding='iso-8859-15')
        zpt = self.app['test']
        manage_addPythonScript(self.app, 'script')
        script = self.app['script']
        script.ZPythonScript_edit(params='dummy', body='return dummy')
        result = zpt.pt_render()

Revision history for this message
Andreas Jung (ajung) wrote :

The workaround is to use

here.script(u'Änderung').

After checking the codebase I don't see how to resolve this in a clean way.

Revision history for this message
Nils Jungclaus (nils-jungclaus) wrote :

The proposed workaround will work for the buglet, that's right. But in the real application, the script will be used in many other situations and will return some strings extracted from a postgreSQL DB.

So the script has to know the coding of the calling page template to determine whether to convert the result to unicode or not.

Wouldn't it be possible to use the unicedeconflictresolver for the result of the script if the tal string-expression fails with the unicode-error?

Im willing to try this but I needed a hint where to look for a proper place to catch the error and to integrate the conflictresolver.

Thanks

   Nils

Revision history for this message
Andreas Jung (ajung) wrote :

I don't get the point with this last comment.

Revision history for this message
LeoRochael (leorochael) wrote :

I can confirm this bug affects a lot of legacy code that I'm handling and porting and that originally counted on ZPTs being evaluated as 8-bit strings instead of Unicode. Now that ZPTs on Zope2 have taken up the responsibility of handling unicode differences with the unicode conflict resolver, this case should be handled as well.

Changed in zope2:
status: New → Confirmed
Revision history for this message
LeoRochael (leorochael) wrote :

There is a relatively simple fix for this problem.

The last segment in the traceback above is caused by this small zope.tales.expressions.StringExpr.__call__() method on zope.tales-3.4.0-py2.6.egg/zope/tales/expressions.py:

    258 def __call__(self, econtext):
    259 vvals = []
    260 for var in self._vars:
    261 v = var(econtext)
    262 vvals.append(v)
    263 return self._expr % tuple(vvals)

The problem is that self._expr is always a unicode string on recent versions of Zope2, whereas "self._vars" is a list of PathExpr objects, and "v = var(econtext)" will return whatever type of objects the path expressions resolve, even 8-bit strings. Which means the line "return self._expr % tuple(vvals)" will always cause a UnicodeDecodingError if any of the values is a string with 8-bit chars (save for Python default encoding tricks).

If line 261 above is replaced by:

  v = econtext.evaluateText(var)

In Zope2 this will return a Unicode object using the unicode conflict resolver registered with Zope2. In plain zope.tales, the zope.tales.tales.Context.evaluateText() method converts the object to unicode directly unless it is already a text-ish object (like a i18n message), which I believe is what makes sense in the context of String Expressions.

I can commit a fix with tests based on this solution.

Revision history for this message
Tres Seaver (tseaver) wrote : Re: [zope2-tracker] [Bug 246983] Re: Page Template: TALES string expr. unicode err.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

LeoRochael wrote:
> There is a relatively simple fix for this problem.
>
> The last segment in the traceback above is caused by this small
> zope.tales.expressions.StringExpr.__call__() method on
> zope.tales-3.4.0-py2.6.egg/zope/tales/expressions.py:
>
> 258 def __call__(self, econtext):
> 259 vvals = []
> 260 for var in self._vars:
> 261 v = var(econtext)
> 262 vvals.append(v)
> 263 return self._expr % tuple(vvals)
>
>
> The problem is that self._expr is always a unicode string on recent versions of Zope2, whereas "self._vars" is a list of PathExpr objects, and "v = var(econtext)" will return whatever type of objects the path expressions resolve, even 8-bit strings. Which means the line "return self._expr % tuple(vvals)" will always cause a UnicodeDecodingError if any of the values is a string with 8-bit chars (save for Python default encoding tricks).
>
> If line 261 above is replaced by:
>
> v = econtext.evaluateText(var)
>
> In Zope2 this will return a Unicode object using the unicode conflict
> resolver registered with Zope2. In plain zope.tales, the
> zope.tales.tales.Context.evaluateText() method converts the object to
> unicode directly unless it is already a text-ish object (like a i18n
> message), which I believe is what makes sense in the context of String
> Expressions.
>
> I can commit a fix with tests based on this solution.

Can you upload a patch showing those tests? I'm afraid my seritonin
level is too low to figure out what you are fixing without seeing the tests.

Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 <email address hidden>
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAks9E8QACgkQ+gerLs4ltQ75ggCfS5KEFzqcRoZDdFwgp+HenZKp
yl8AoJZi1OCG/eTWnDsI4kG0xrB3SMTu
=IM2x
-----END PGP SIGNATURE-----

Revision history for this message
LeoRochael (leorochael) wrote :

This patch adds a test showing the current behaviour for direct inclusion of encoded strings along with the expected behaviour for encoded strings in a "string:" expression. It fails on the second assertion.

Notice the patch is encoded in iso-8859-15 like the rest of the file where it applies.

Revision history for this message
LeoRochael (leorochael) wrote :

This other patch includes the test in the patch above plus more tests and a solution to this issue. The solution is a 15 line StringExpr subclass that conditionally coerces the values to unicode using the UnicodeConflictResolver.

Differently than I proposed, this patch is completely contained in Zope2, and does not touch zope.tales. I'll probably start a discussion later on zope-dev about the fact that the behaviour of string expressions with Unicode strings is not completely defined in that package.

If that discussion proves fruitful, the 15 line line fix above will no longer be needed. but meanwhile, I'll assume silence is consent and commit this patch to the Zope 2.12 branch and trunk if I don't receive any comments to the contrary in 24h, as it fixes the breakages I had in the body of code I'm maintaining.

Nils: I'd appreciate if you could check whether the fix in the patch above also works for you.

Revision history for this message
Tres Seaver (tseaver) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

LeoRochael wrote:
> This other patch includes the test in the patch above plus more tests
> and a solution to this issue. The solution is a 15 line StringExpr
> subclass that conditionally coerces the values to unicode using the
> UnicodeConflictResolver.
>
> Differently than I proposed, this patch is completely contained in
> Zope2, and does not touch zope.tales. I'll probably start a discussion
> later on zope-dev about the fact that the behaviour of string
> expressions with Unicode strings is not completely defined in that
> package.
>
> If that discussion proves fruitful, the 15 line line fix above will no
> longer be needed. but meanwhile, I'll assume silence is consent and
> commit this patch to the Zope 2.12 branch and trunk if I don't receive
> any comments to the contrary in 24h, as it fixes the breakages I had in
> the body of code I'm maintaining.

+1 from me -- the tests look fine.

Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 <email address hidden>
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAktDhQQACgkQ+gerLs4ltQ7eUwCgtc7hry5LrS5ICu+zqLg8sGj+
0GAAoLcWGilpot4P2VY7heEYhHm05Rsb
=uNhL
-----END PGP SIGNATURE-----

Revision history for this message
LeoRochael (leorochael) wrote :

fix is commited to trunk. Since trunk is never released by itself, setting status to fix-released

Revision history for this message
LeoRochael (leorochael) wrote :

Fix commited on 2.12 branch

Revision history for this message
Tres Seaver (tseaver) wrote :
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.