Part of a translated substring is matched and translated again by a subsequent translation rule

Bug #1867069 reported by Guido Longoni
This bug affects 1 person
Affects: Gufw
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Symptoms:
In Italian, "OUT" is translated as "IN USCITA". The substring "IN" from that translation is then translated again as "IN ENTRATA". The resulting rule reads "IN ENTRATA USCITA", roughly "INCOMING OUTGOING", which is wrong and misleading, especially for an unsuspecting user.

Analysis:
File gufw/gufw/view/gufw.py
Lines 712-734

The string in the "translated_rule" variable is built by chained replace() calls: the output of one replace() becomes the input of the next, so text that has already been translated is also compared against the search string of every later replace(). This causes the bug.
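To make the failure concrete, here is a minimal sketch of the chained pattern. It is not the code from gufw.py: the function name, the table, and the padding of the rule with spaces are assumptions for illustration, and only the two Italian strings come from this report.

```python
# Hypothetical translation table; in gufw the strings come from the .mo catalogs.
ITALIAN = [
    (" OUT ", " IN USCITA "),
    (" IN ", " IN ENTRATA "),
]


def translate_chained(rule, table):
    """Apply each replace() to the output of the previous one (the fragile pattern)."""
    translated = " " + rule + " "  # assumption: pad so space-delimited keys can match
    for source, target in table:
        translated = translated.replace(source, target)
    return translated.strip()


# The "IN" produced by the first replacement is picked up by the second one.
print(translate_chained("ALLOW OUT", ITALIAN))  # ALLOW IN ENTRATA USCITA
```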
Initially, I experimented a bit and tried to replace all the translations at once, but some of them actually RELY on an overlap between the previously translated text and the next search string: each search string is surrounded by spaces (I suspect the intention is to match only whole words) and the replacement string contains the same spaces. When two adjacent words have to be translated separately, there is only one space between them, and both search strings need it.
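The shared-space problem can be reproduced with a naive single-pass replacement. This is only a sketch: the function name is hypothetical, and "CONSENTI" is a made-up stand-in translation for "ALLOW", not a string taken from the gufw catalogs.

```python
import re

# Hypothetical table with space-delimited keys, as described above.
TABLE = {" ALLOW ": " CONSENTI ", " IN ": " IN ENTRATA "}


def translate_single_pass_naive(rule, table):
    """Replace every key in one scan, still using space-delimited keys."""
    pattern = re.compile("|".join(re.escape(key) for key in table))
    padded = " " + rule + " "
    return pattern.sub(lambda m: table[m.group(0)], padded).strip()


# " ALLOW " consumes the only space before "IN", so " IN " can no longer match.
print(translate_single_pass_naive("ALLOW IN", TABLE))  # CONSENTI IN
```

With chained replace() calls the second word is translated only because the first replacement puts the space back, which is the accidental overlap mentioned above.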

Solution:
In the hope of doing something useful, I wrote a little patch.
It takes account of any overlapping of spaces and ensures the separation between the translated text and the text still to be translated.
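The attached patch is not reproduced in this report, so the following is not that patch. It is one possible sketch of the same idea, assuming whole-word matching is the goal: the text is scanned once, and lookarounds check for surrounding whitespace without consuming it, so adjacent words can share a space and translated output is never scanned again. The table contents and the function name are hypothetical.

```python
import re

# Hypothetical table; keys carry no spaces because the pattern checks boundaries.
TABLE = {"ALLOW": "CONSENTI", "OUT": "IN USCITA", "IN": "IN ENTRATA"}


def translate_once(rule, table):
    """Translate whole words in a single scan; replacements are never rescanned."""
    # Longest keys first, so a longer key is tried before any shorter prefix of it.
    keys = sorted(table, key=len, reverse=True)
    alternation = "|".join(re.escape(key) for key in keys)
    # (?<!\S) and (?!\S) require whitespace or a string edge on each side,
    # without consuming the space that a neighbouring word also needs.
    pattern = re.compile(r"(?<!\S)(?:" + alternation + r")(?!\S)")
    return pattern.sub(lambda m: table[m.group(0)], rule)


print(translate_once("ALLOW OUT", TABLE))  # CONSENTI IN USCITA
print(translate_once("ALLOW IN", TABLE))   # CONSENTI IN ENTRATA
```

Because re.sub() continues scanning after each match rather than inside the replacement text, the "IN" in "IN USCITA" is left alone.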

A little caveat:
This is my very first contribution to an open source project and I am not a Bazaar user, so I did my best to provide a solution along with the problem.

Guido Longoni (guidolongoni) wrote :
costales (costales) wrote :

Hi Guido,

Thanks for the patch, but I don't like to change code because one language has one issue.

Could you ask your translation team about fixing this by changing those strings? I think that would be the right solution.

Thanks in advance!

Changed in gui-ufw:
status: New → Opinion
Guido Longoni (guidolongoni) wrote :

Hello!
Mine was just an example. The code translates part of strings already translated and this makes it extremely brittle and also prone to breakage due to different translations.
I understand that most of the time everything works anyway, but it's by accident: changing the translation can break the code unexpectedly in any language.
Please, if you are not convinced about my patch don't use it, but seriously consider changing the translation procedure.

I'm not part of any team of translators, so I wouldn't even know how to ask for changes, but as a native Italian speaker I assure you that "IN" -> "IN" and "OUT" -> "OUT" are already good translations.

Thank you
Guido Longoni (a random user)

Guido Longoni (guidolongoni) wrote :

"
I assure you that "IN" -> "IN" and "OUT" -> "OUT"
"

Sorry, I meant "IN" -> "IN ENTRATA" and "OUT" -> "IN USCITA"

Changed in gui-ufw:
status: Opinion → New
Bib (bybeu) wrote :

Hi Guido & Costales. Digging into /usr/share/locale/fr/LC_MESSAGES/gufw.mo, I found that this real bug was worked around in at least one French translation: "Home" is translated as "Dossier personnel" (note the blank space), but the GUI shows "Dossier_personnel", and I don't know where the underscore comes from. (BTW, this is a bad translation that seems to have been made either automatically or by someone who didn't understand the context; it should be Maison/Domicile (Casa), and it is also too long for the embedded length check.) So for Italian it could be IN_ENTRATA and IN_USCITA.

However, for a project that is targeted at unskilled users AND is mainly a GUI AND addresses security concerns, I feel costales' opinion about not changing code because of an issue in a single language (I'd add "ATM") reveals a poor understanding of what languages are: for the sake of the targeted audience, the specifics of natural languages should always be respected, without assuming or forcing one source word to match one target word. Boundaries should be whole phrases, or the definition of a word should include any printable character plus white space. Your point that a translation should NOT break the code is the definitive reason. The funny thing is that there are many Italian names in this bug's subscriber list, and that the package maintainer is another Italian name, Devid Antonio Filoni, according to Synaptic in Trusty.
I found contact addresses of the translators in the .mo file; you could try to PM them:
" Alessandro Ghione https://launchpad.net/~alex81\n"
" Alessandro Menti https://launchpad.net/~elgaton\n"
" Aliak https://launchpad.net/~aliak-93\n"
" Andrea Luciano Damico https://launchpad.net/~lehti\n"
" Claudio Arseni https://launchpad.net/~claudio.arseni\n"
" Devid Antonio Filoni https://launchpad.net/~d.filoni\n"
" Edoardo Vanin https://launchpad.net/~edoardo-vanin\n"
" Gianluca https://launchpad.net/~albatrosslive\n"
" Gualtiero https://launchpad.net/~gualtiero-testa\n"
" Guybrush88 https://launchpad.net/~guybrush\n"
" Luca Ferretti https://launchpad.net/~elle.uca\n"
" Lvcio https://launchpad.net/~lvcio\n"
" Mario Gatti https://launchpad.net/~parismarioinformatique\n"
" Wonderfulheart https://launchpad.net/~wonderfulheart\n"
" costales https://launchpad.net/~costales\n"
" flux https://launchpad.net/~luigimarco\n"
" giulianom89 https://launchpad.net/~giulianom89\n"
" lang-it https://launchpad.net/~lang-it\n"
" mattia.b89 https://launchpad.net/~mattia-b89\n"
" rudy79 https://launchpad.net/~rudy79"

costales (costales) wrote :

Hi,
Sorry, but I will not fix this one. Translators should know the program they are translating, and one language having a conflict with a normal string is not a reason to change that code.
Best regards.

Changed in gui-ufw:
status: New → Won't Fix
Guido Longoni (guidolongoni) wrote :

Find and replace is not how you localize software.

Ah well, what can I say... congratulations. Let's ignore a fairly resounding bug because, yeah, translators need to know the software and its bugs and know how to get around them. Let's snub other people's languages (but someone will laugh when this problem comes up again in the project leader's native language) and let's say that the problem doesn't exist.
Let's throw away patches written by others because they are not secure enough (or is it maybe because those who read them don't know exactly what they do?) and continue to write SECURE CODE by translating with the "find and replace" method. Congratulations!
Are there really no other maintainers of this software who have something to say about it? Is everything okay the way it is?
I'm starting to worry... On the one hand, I can safely uninstall GUFW (as I will do) because, seeing how it is written and how it is managed, I don't trust such software to run on my PC. On the other hand, I'm worried because if all Ubuntu projects are managed and written like this, it's better to change distribution, and that is a waste of time and energy... SHAME
