Comment 6 for bug 710148

Gunnar Hjalmarsson (gunnarhj) wrote :

Hi David,
You make a few points that would have been very valid if we were
loudly declaring the "main dialect" for every language in the world. But
we aren't. The file I pointed at is meant to be used only behind the
scenes, in order to help

1. prevent confusion and buggy behavior in some cases with respect to
   the locale name assigned to LC_MESSAGES, and

2. improve the accuracy of the lists of language options in
   language-selector and GDM.

The aspects of accuracy I have in mind are:

- Avoid showing e.g. both 'de' and 'de_DE', or both 'es' and 'es_ES',
  since users would consider them duplicates of effectively the same
  option.

- "Spanish (Spain)" is a clearer label than just "Spanish" for an option
  that means "Spanish, as spoken in Spain" as opposed to e.g. "Spanish,
  as spoken in Mexico".

- Avoid the risk, due to bug 700213, that certain msgid translations
  can't be easily accessed.
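
The de-duplication described in the first point can be sketched roughly
as follows. This is only an illustration, not the language-selector or
GDM code: the MAIN_LOCALE dict and the dedupe_options() name are
assumptions, and the real main-countries file may be formatted
differently and contain other entries.

```python
# Hypothetical excerpt of a language -> main-country locale map.
# The real main-countries file may differ in format and content.
MAIN_LOCALE = {"de": "de_DE", "es": "es_ES"}


def dedupe_options(codes):
    """Collapse a bare 'll' code and its main 'll_CC' into one option."""
    seen = set()
    result = []
    for code in codes:
        # A bare language code is represented by its main-country locale,
        # so 'de' and 'de_DE' end up as the same option.
        option = MAIN_LOCALE.get(code, code)
        if option not in seen:
            seen.add(option)
            result.append(option)
    return result


print(dedupe_options(["de", "de_DE", "es", "es_ES", "es_MX"]))
# → ['de_DE', 'es_ES', 'es_MX']
```

Other country variants, such as 'es_MX', pass through untouched, so
"Spanish (Spain)" and "Spanish (Mexico)" remain separate options.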

The map file isn't perfect; it will never be. I still believe it will
serve its intended purposes for 98% or so of Ubuntu users, without
messing it up for the other 2%. And if/when people report related bugs,
it will typically be easy to fix them with small code changes, since
there is a tool in place.

On 2011-02-01 09:34, David Planella wrote:
> What would you do in the case of English for example? What would be the
> main country, the UK or US?

The UK, of course. Or would you consider Mexico to be the main country
of Spanish because of its large population? ;-)

Seriously, I'm treating English as a special case, since 'en' is always
present in the LANGUAGE list by design (as the last item). For that
reason, 'en' should always be included in the UI lists of language options.

If you look at /usr/share/language-selector/data/main-countries, you see
the intentionally vague expression "main or origin country". For the
purpose of determining the LC_MESSAGES locale name, I included 'en' =>
'en_GB'. Americans are unlikely to object to the claim that English
originated in England, UK...
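
To illustrate how such a map could be used to pick the LC_MESSAGES
locale name, here is a minimal sketch. Only the 'en' => 'en_GB' entry is
taken from this comment; the other entry, the function name and the
".UTF-8" codeset suffix are assumptions made for the example.

```python
# 'en' => 'en_GB' is the mapping mentioned above; 'es' is an assumed entry.
MAIN_LOCALE = {"en": "en_GB", "es": "es_ES"}


def lc_messages_for(language_item, codeset="UTF-8"):
    """Return a full locale name for LC_MESSAGES from a language item."""
    if "_" in language_item:
        # The item already names a country, e.g. 'es_MX'; keep it.
        base = language_item
    else:
        # Expand a bare language code via the map, if it has an entry.
        base = MAIN_LOCALE.get(language_item, language_item)
    return f"{base}.{codeset}"


print(lc_messages_for("en"))     # → en_GB.UTF-8
print(lc_messages_for("es_MX"))  # → es_MX.UTF-8
```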

> There is no es_ES language, neither in /usr/share/locale nor
> in /usr/share/locale-langpack,

If you take the universe packages into account, there may well be
translations in /usr/share/locale/es_ES. I have a package installed with
translations in all of 'es_AR', 'es_CO', 'es_CR', 'es_ES' and 'es_MX'.

Initially I didn't think of the impact of universe, but it's now taken
into consideration due to the discussion at bug 693337.

> However, what I see is that there is always at least an 'll' code, i.e.
> there is no situation where there are only 'll_CC' codes for a language.

I know of one (or rather two): Chinese. For that reason, a bare 'zh'
should never be included in the lists of language options.
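
The two special cases mentioned in this comment (a bare 'zh' is never
offered, while 'en' is always offered) could be expressed along these
lines. The function name is made up for the example, and this is not the
attached GDM code.

```python
def apply_special_cases(codes):
    """Apply the Chinese and English special cases to an option list."""
    # A bare 'zh' is dropped; 'zh_CN', 'zh_TW' etc. are kept.
    options = [code for code in codes if code != "zh"]
    # 'en' is always present in the list of options.
    if "en" not in options:
        options.append("en")
    return options


print(apply_special_cases(["zh", "zh_CN", "zh_TW", "sv"]))
# → ['zh_CN', 'zh_TW', 'sv', 'en']
```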

> In any case, it's a tricky problem.

Yes it is, and I for one appreciate this conversation, through which we
may identify and consider some of the not so obvious pitfalls. Needless
to say, your expertise with respect to language related matters is of
great value to this design discussion.

Attached please find the file "language-options" with the proposed GDM
code for generating the option list. I don't know to what extent the
code itself is useful to you, but there are also several comments to
facilitate the understanding of what's happening when the code is run.

Hope we can reach a consensus on this topic.