StructuredText and locale

Bug #143878 reported by Jesus Cea
Affects: Zope 2
Status: Invalid
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The "letter" concept in StructuredText is not affected by the "locale" directive in "zope.conf".

This worked fine in Zope 2.9.*

Steps to reproduce:

1. Set your system locale to "C".
2. Configure your "zope.conf" to use the "es_ES" locale, for example.
3. Create a StructuredText page with one emphasized line of ASCII-only text, and another emphasized line containing some Spanish/European characters.
4. The first line will be emphasized. The second one will not.
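
The steps above can be approximated outside Zope. This is only a minimal sketch and not StructuredText's actual code: the pattern EMPHASIS and the helper find_emphasis are hypothetical, and the letter class is hard-coded to ASCII to mimic what a "C" locale would provide.

```python
import re
import string

# Hypothetical simplification of an emphasis rule: *text*, where text may
# contain only "letters", digits and whitespace. The letters are fixed to
# ASCII here to imitate a "C" locale.
ASCII_LETTERS = string.ascii_letters
EMPHASIS = re.compile(r"\*([%s%s\s]+?)\*" % (ASCII_LETTERS, string.digits))


def find_emphasis(line):
    """Return the emphasized text in line, or None if the rule does not match."""
    match = EMPHASIS.search(line)
    return match.group(1) if match else None


print(find_emphasis("*hello world*"))  # ASCII-only text is recognized
print(find_emphasis("*señor*"))        # "ñ" is not in the letter class
```

With an ASCII-only letter class the second call finds nothing, which matches the behaviour described in step 4.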

Tags: bug zope
Andreas Jung (ajung) wrote :

Changes: submitter email, importance (critical => medium)

Charlie_X (charlie) wrote :

I've hit essentially the same problem with Zope 2.9. From my investigation, it looks like this is down to the fact that the comparison does not take encoding into account.

STletters propagates a constant "letters" which, while it will reflect the locale setting, may cause problems because of the encoding. The locale de_DE has an encoding of ISO8859-1. The search function in DocumentClass will not find any non-ASCII character that is encoded in UTF-8, which is why the search ends at the first word containing non-ASCII characters.

I've tested a workaround for the letters constant:
letters = string.letters.decode("latin-1").encode("utf-8")
Obviously these values should come from the environment, with the first encoding being locale.getlocale()[1]. I'm not sure where the second value can be obtained: default_zpublisher_encoding would be the likely candidate, but that is borked on my machine as zope.conf has defualt-zpublisher-encoding.
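
The workaround line above is Python 2. As a rough sketch of the same idea in Python 3 terms, assuming the letters constant is a byte string in the locale's encoding: the helpers transcode_letters and locale_encoding are hypothetical names, and the three-letter sample stands in for the real constant.

```python
import locale


def transcode_letters(letters, source_encoding, target_encoding="utf-8"):
    """Re-encode a byte string of letters from one encoding to another."""
    return letters.decode(source_encoding).encode(target_encoding)


def locale_encoding(default="latin-1"):
    """Return the current locale's encoding, or a default if none is reported.

    The fallback value is an assumption made for this sketch.
    """
    return locale.getlocale()[1] or default


# Latin-1 bytes for "a", "é" and "ñ", standing in for a locale-dependent
# letters constant.
latin1_letters = b"a\xe9\xf1"
utf8_letters = transcode_letters(latin1_letters, "latin-1")
print(utf8_letters)
```

In real use the source encoding would come from locale_encoding() rather than a literal. Note that after transcoding each non-ASCII letter occupies two bytes, so byte-by-byte comparisons remain fragile; decoding both the letters and the document text to Unicode before matching would probably be more robust.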

On the other hand, constants for the rest_input and rest_output encodings are available and can be defined in zope.conf, so it might make sense to use them instead.

However, back to the function in DocumentClass: isn't this excessively restrictive, being limited only to what string.letters returns? For example, the € symbol will always fail no matter what the locale setting, as will ® and ™ and presumably other characters that this simplistic scheme won't pick up. And isn't it in any case the same as \w, which would also respond to locale?
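
To illustrate the comparison with \w, here is a small check written for Python 3, where \w on str patterns is Unicode-aware by default (the Python 2 versions discussed here would need the re.LOCALE or re.UNICODE flag). The helper is_word_char is a hypothetical name used only for this sketch.

```python
import re

# For str patterns in Python 3, \w matches Unicode word characters.
WORD_CHAR = re.compile(r"\w")


def is_word_char(ch):
    """Return True if the single character ch is matched by \\w."""
    return WORD_CHAR.fullmatch(ch) is not None


for ch in ["a", "é", "ñ", "€", "®", "™"]:
    print(repr(ch), is_word_char(ch))
```

Accented letters are accepted, but €, ® and ™ are symbols rather than word characters, so switching to \w alone would not cover them; they would have to be added to the character class explicitly.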

Changed in zope2:
importance: Medium → Wishlist
status: New → Confirmed
Colin Watson (cjwatson) wrote :

The zope2 project on Launchpad has been archived at the request of the Zope developers (see https://answers.launchpad.net/launchpad/+question/683589 and https://answers.launchpad.net/launchpad/+question/685285). If this bug is still relevant, please refile it at https://github.com/zopefoundation/zope2.

Changed in zope2:
status: Confirmed → Invalid