Translation validation with plural forms ignores plural form expression.

Bug #548440 reported by Henning Eggers on 2010-03-25
Affects: Launchpad itself
Importance: Low
Assigned to: Unassigned

Bug Description

The following would be validated by 'msgfmt -c' for languages whose plural form 0 is only used for one value of n (e.g. n!=1):

msgid "English %d"
msgid_plural "Englishs %d"
msgstr[0] "Translation"
msgstr[1] "%d Translations"

But for languages that don't have such a plural form, the validation would fail because of the missing format specifier in msgstr[0]. This is possible because msgfmt reads and evaluates the plural form expression from the PO file header.

When these values are entered in the Launchpad web UI, they are accepted for any language. The validation code does not take the language into account and seems to assume n!=1 as the plural form expression. It should use the language's plural expression when validating.
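The rule described above can be sketched in Python. This is only an illustration of the behaviour the report attributes to 'msgfmt -c', not Launchpad's or gettext's actual code: the helper names, the simplified specifier regex, and the sample range 0..1000 are all assumptions, and the plural expression is passed in as a Python callable instead of being parsed from the PO header.

```python
import re

# Simplified matcher for unnamed C-style specifiers such as %d or %s.
# Real format strings have many more variants; this is an assumption.
SPEC = re.compile(r"%[ds]")


def single_value_forms(nplurals, plural, sample=range(0, 1001)):
    """Return plural form indexes selected by exactly one n in the sample."""
    counts = [0] * nplurals
    for n in sample:
        counts[plural(n)] += 1
    return {index for index, count in enumerate(counts) if count == 1}


def invalid_forms(msgid_plural, msgstrs, nplurals, plural):
    """Return indexes of msgstr entries whose specifier count is wrong."""
    expected = len(SPEC.findall(msgid_plural))
    may_omit = single_value_forms(nplurals, plural)
    bad = []
    for index, msgstr in enumerate(msgstrs):
        found = len(SPEC.findall(msgstr))
        if found == expected:
            continue
        if found == 0 and index in may_omit:
            continue  # the form itself already implies the number
        bad.append(index)
    return bad


msgstrs = ["Translation", "%d Translations"]
# plural=(n != 1): form 0 is used only for n == 1, so it may drop %d.
print(invalid_forms("Englishs %d", msgstrs, 2, lambda n: int(n != 1)))  # []
# plural=(n > 1): form 0 covers n == 0 and n == 1, so %d is required.
print(invalid_forms("Englishs %d", msgstrs, 2, lambda n: int(n > 1)))  # [0]
```

With this rule the same pair of translations passes for an n!=1 language and is rejected for a language whose form 0 covers more than one value of n.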

On Thu, 2010-03-25 at 23:23 +0000, Henning Eggers wrote:
> Public bug reported:
>
> The following would be validated by 'msgfmt -c' for languages whose
> plural form 0 is only used for one value of n (e.g. n!=1):
>
> msgid "English %d"
> msgid_plural "Englishs %d"
> msgstr[0] "Translation"
> msgstr[1] "%d Translations"
>

Note some things:

     1. If the message is a c-format string, this is a valid
        translation.
     2. If the message is a python-format string, this is _not_ a valid
        translation. The Python interpreter checks for unnamed arguments
        and the program will crash.
     3. 'msgfmt -c' will validate both of the above cases, although the
        Python one will lead to a crash. This should probably be
        reported as a bug in gettext upstream.
     4. Some languages, such as Arabic and Hebrew, need to be able to
        specify translations as msgstr[0] "Translation", so they rely on
        the feature of being able to discard arguments in C.
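
Point 2 is easy to reproduce in a Python interpreter. The sketch below applies printf-style formatting to both translations; `formats_cleanly` is a hypothetical helper name used only for this illustration.

```python
def formats_cleanly(template, n=1):
    """Return True if `template % n` succeeds with an unnamed argument."""
    try:
        template % n
    except TypeError:
        # Raised as "not all arguments converted during string formatting"
        # when the template consumes fewer unnamed arguments than supplied.
        return False
    return True


print(formats_cleanly("%d Translations"))  # True
print(formats_cleanly("Translation"))      # False

# Named arguments passed as a mapping are not checked for leftovers,
# so dropping a named placeholder does not raise.
print("Translation" % {"n": 1})  # Translation
```

The last line shows why the problem is specific to unnamed arguments: a mapping with unused keys is accepted silently.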

More details here:

https://bugs.launchpad.net/ubuntu/+source/language-selector/+bug/409785/comments/9

> But for languages that don't have such a plural form, the validation
> would fail because of the missing format specifier in msgstr[0]. This is
> possible because msgfmt reads and evaluates the plural form expression
> from the PO file header.
>

Are you sure validation would fail? When debugging the Arabic crashes in
the bug from the above comment, I remember that 'msgfmt -c' validated
the translations in all cases, even in Arabic, which has got a different
plural expression. But that was some time ago, so maybe I should
recheck.

> When entering these values in the Launchpad web UI, they will be
> validated in any language. The validation code does not take the
> language into account and seems to assume n!=1 for a plural form
> expression. It should use the plural expression when validating, though.
>

I do not know the validation code Launchpad uses, but it was my
understanding that it used msgfmt. If there is a change to be made, I'd
suggest simply tightening the check for python-format strings (point 2
above). I haven't checked all other *-format cases, though.

Launchpad uses libgettextpo, but only for validation, so it does not provide the library with the full PO file metadata (such as the Plural-Forms header). Cases that msgfmt handles correctly because it has that header are therefore not covered by Launchpad's validation code, hence this bug.
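
For reference, the missing metadata is the Plural-Forms field of the PO header. A minimal sketch of extracting it in Python follows; `parse_plural_forms` is a hypothetical helper, not part of Launchpad or libgettextpo, and it assumes the header text has already been unescaped into plain lines.

```python
import re

# Matches e.g. "Plural-Forms: nplurals=2; plural=(n != 1);"
PLURAL_FORMS = re.compile(
    r"Plural-Forms:\s*nplurals\s*=\s*(\d+)\s*;\s*plural\s*=\s*([^;\n]+)"
)


def parse_plural_forms(header):
    """Return (nplurals, plural_expression) or None if the field is absent."""
    match = PLURAL_FORMS.search(header)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


header = (
    "Content-Type: text/plain; charset=UTF-8\n"
    "Plural-Forms: nplurals=2; plural=(n != 1);\n"
)
print(parse_plural_forms(header))  # (2, '(n != 1)')
```

The expression string would still need to be evaluated per language before it could drive validation.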

Btw, the python-format issue definitely needs a new bug filed with gettext upstream.

Changed in rosetta:
status: New → Triaged
importance: Undecided → High
Robert Collins (lifeless) wrote:

This would be nice to handle; possibly extending the library interface to let us use more of it would do it. Patches gratefully accepted!

Changed in launchpad:
importance: High → Low