Translation domains are assumed to be ASCII, but nothing enforces that restriction

Bug #214175 reported by Carlos Perelló Marín
Affects           Status   Importance  Assigned to  Milestone
Launchpad itself  Triaged  High        Unassigned

Bug Description

Right now, our language pack export code assumes that potemplate.translation_domain and pofile.language.code are both ASCII text. However, neither the database nor the forms used to edit those values impose any restriction that prevents us from introducing non-ASCII values.
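
One way to make the assumption explicit would be a check along the following lines. This is a minimal sketch, not Launchpad code: `is_ascii` and `check_export_names` are hypothetical helpers, and their arguments merely stand in for the values of potemplate.translation_domain and pofile.language.code.

```python
def is_ascii(value):
    """Return True if every character in value is in the ASCII range."""
    try:
        value.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def check_export_names(translation_domain, language_code):
    """Return a list of human-readable problems; empty if both are ASCII.

    Hypothetical helper: the parameters stand in for
    potemplate.translation_domain and pofile.language.code.
    """
    problems = []
    if not is_ascii(translation_domain):
        problems.append(
            "translation domain %r is not ASCII" % translation_domain)
    if not is_ascii(language_code):
        problems.append("language code %r is not ASCII" % language_code)
    return problems
```

A form validator or a pre-export sanity check could call `check_export_names` and refuse the value, or skip the template, when the returned list is not empty.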

Changed in rosetta:
importance: Undecided → High
status: New → Confirmed
Changed in rosetta:
milestone: none → 1.2.5
Changed in rosetta:
milestone: 1.2.5 → 1.2.6
Changed in rosetta:
milestone: 1.2.6 → none
Jeroen T. Vermeulen (jtv) wrote :

Lowering priority since language codes must be one of a limited, known, ASCII-only set, and translation domains are entered by admins only.

If we want to use the database's valid_name() function to check translation domains, we must be careful that there be no important use cases for underscores or other reasonable characters it disallows.

Changed in rosetta:
importance: High → Medium
visibility: private → public
Changed in rosetta:
importance: Medium → High
Данило Шеган (danilo) wrote :

Is there any reason we wouldn't fix language pack export code to support more than ASCII instead? For filesystem encoding, we can use UTF-8 and we'd be fine.
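
For illustration, producing UTF-8 file names on export might look like the sketch below. The `<language>/LC_MESSAGES/<domain>.po` layout and the `language_pack_path` helper are assumptions made for this example, not the actual language pack export code.

```python
import posixpath


def language_pack_path(language_code, translation_domain):
    """Build a relative export path and encode it as UTF-8 bytes.

    Hypothetical helper; the directory layout is assumed. posixpath is
    used so the separator is always "/", as in archive member names.
    """
    relative = posixpath.join(
        language_code, "LC_MESSAGES", translation_domain + ".po")
    # Encoding explicitly avoids depending on the process locale.
    return relative.encode("utf-8")
```

With this approach an ASCII-only domain yields the same bytes as before, so existing exports would be unaffected.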

Jeroen T. Vermeulen (jtv) wrote :

I have to dig pretty deep to come up with any. So far I have:

 * "A Windows project might expect its POTFILES.in or whatever to be little-endian UTF-16." But something tells me GNU gettext probably solves that by telling Windows users that it's really a proprietary binary format that just happens to look a bit like text. Which from what I understand is the kind of thing users of proprietary operating systems like to hear anyway.

 * "What if a name gets normalized somewhere along the way and ends up not comparing equal to an un-normalized version of the same name?" I have no idea how likely this would be. It could be annoying if it did happen.
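
The normalization concern can be reproduced with Python's standard unicodedata module. This is only a sketch of the failure mode; the example name and the `same_name` helper are made up for illustration.

```python
import unicodedata

# Two spellings of the same visible name (hypothetical domain name).
composed = "caf\u00e9"      # "é" as one precomposed code point
decomposed = "cafe\u0301"   # "e" followed by a combining acute accent


def same_name(a, b):
    """Compare two names after normalizing both to NFC."""
    return (unicodedata.normalize("NFC", a)
            == unicodedata.normalize("NFC", b))


# A plain comparison treats the two spellings as different strings,
# while comparing their NFC forms treats them as the same name.
plain_equal = composed == decomposed
normalized_equal = same_name(composed, decomposed)
```

Here `plain_equal` is False and `normalized_equal` is True, so any lookup that compares raw strings would miss a name that was normalized somewhere along the way.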

Where you say "support more than ASCII" in "language pack export code," I'd like to add that I'm thinking of translation domains. I was only joking when I suggested non-ASCII locale names. :-)
