Set formats related LC_* variables when applicable instead of LC_MESSAGES, LC_CTYPE and LC_COLLATE

Bug #926207 reported by Gunnar Hjalmarsson on 2012-02-03
This bug affects 1 person
Affects                      Status        Importance  Assigned to
Ubuntu Translations          Fix Released  Undecided   Unassigned
accountsservice (Ubuntu)     Fix Released  High        Martin Pitt
language-selector (Ubuntu)   Fix Released  High        Martin Pitt
localechooser (Ubuntu)       Fix Released  High        Colin Watson
ubiquity (Ubuntu)            Fix Released  High        Colin Watson

Bug Description

Up to Oneiric, the LANG environment variable in Ubuntu has been considered to represent the regional formats. Hence, in cases where a user wants to use different locales for language and formats, LC_MESSAGES, LC_CTYPE and LC_COLLATE have been set explicitly. Consequently the installer has set those three variables in /etc/default/locale when applicable.

In Precise we are making a conceptual change: LANG is now considered to represent the display language, and the format-related LC_* variables are set explicitly when needed to distinguish between language and formats. As mentioned at bug #590108, this is the rationale for the switch:

* It's how GNOME does it in g-c-c, and considering that we are moving
  towards replacing the language-selector UI with "Region and Language"
  in g-c-c, it would eliminate one of the current differences in
  approach between Ubuntu and GNOME.

* There seem to be quite a few desktop apps/tools, or parts of apps,
  that ignore both LANGUAGE and LC_MESSAGES for the display language,
  and let LANG solely determine the display language. (My LANG usually
  contains a Swedish locale, while my display language is English, and
  I often see Swedish translations in dialogs and menus.)

* Some distributions may prefer the simplistic approach of equating l10n
  with simply picking a locale name and assigning it to LANG. If we
  switched to letting LANG represent the language, the LANG variable
  would be used for language all over, which would reduce the risk of
  confusion with respect to locale/language settings.

Changes reflecting this switch have recently been uploaded to accountsservice and language-selector. Previous settings in /etc/default/locale and /etc/environment of LC_MESSAGES, LC_CTYPE and LC_COLLATE are deleted via accountsservice.postinst.
http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/precise/accountsservice/precise/view/head:/debian/accountsservice.postinst
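
The effect of that migration on a locale file can be sketched roughly as below. This is a Python illustration only, not the code in accountsservice.postinst; the function name strip_language_categories and the sample file content are made up for the example, and it assumes one plain NAME="value" assignment per line.

```python
# Illustrative sketch: drop the language-related categories from the text of
# a file such as /etc/default/locale. Not the actual postinst code.
OBSOLETE = ("LC_MESSAGES", "LC_CTYPE", "LC_COLLATE")


def strip_language_categories(text):
    """Return text without assignments to the variables named in OBSOLETE."""
    kept = []
    for line in text.splitlines():
        # The variable name is whatever precedes the first '='.
        name = line.split("=", 1)[0].strip()
        if name in OBSOLETE:
            continue
        kept.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(kept) + ("\n" if text.endswith("\n") else "")


before = (
    'LANG="zh_TW.UTF-8"\n'
    'LC_MESSAGES="zh_CN.UTF-8"\n'
    'LC_CTYPE="zh_CN.UTF-8"\n'
    'LC_COLLATE="zh_CN.UTF-8"\n'
)
print(strip_language_categories(before), end="")
```

Only the LANG line survives in this sample; any other lines in the file would be left untouched.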

For fresh installs that code is skipped, so it's important that the corresponding changes are made to the installer:

  If the user's choices at installation mean that the locale for the
  display language differs from the locale for regional formats, the
  installer should set LC_NUMERIC, LC_TIME, LC_MONETARY, LC_PAPER,
  LC_NAME, LC_ADDRESS, LC_TELEPHONE and LC_MEASUREMENT, and refrain
  from setting LC_MESSAGES, LC_CTYPE and LC_COLLATE.
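
As a rough sketch of the rule requested above (Python, for illustration only; the installer does not necessarily work this way, and the names locale_entries and render are hypothetical):

```python
# The eight format-related categories named in the bug description.
FORMAT_CATEGORIES = (
    "LC_NUMERIC", "LC_TIME", "LC_MONETARY", "LC_PAPER",
    "LC_NAME", "LC_ADDRESS", "LC_TELEPHONE", "LC_MEASUREMENT",
)


def locale_entries(language_locale, formats_locale):
    """Build the variables to write, with LANG meaning the display language."""
    entries = {"LANG": language_locale}
    # Format categories are only set when they differ from the language.
    if formats_locale != language_locale:
        for category in FORMAT_CATEGORIES:
            entries[category] = formats_locale
    return entries


def render(entries):
    """Format the entries the way they appear in /etc/default/locale."""
    return "".join(f'{name}="{value}"\n' for name, value in entries.items())


# Example: English display language, Swedish regional formats.
print(render(locale_entries("en_US.UTF-8", "sv_SE.UTF-8")), end="")
```

When both choices map to the same locale, only LANG is written, and LC_MESSAGES, LC_CTYPE and LC_COLLATE are never set in either case.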

Gunnar Hjalmarsson (gunnarhj) wrote :

The eight formats related LC_* variables above are currently written to /etc/default/locale and /etc/environment by language-selector, and to ~/.pam_environment by accountsservice. One question is whether it's appropriate to 'pollute' the environment with all of them.

In GNOME these environment variables are set via g-s-d:
  LC_TIME
  LC_NUMERIC
  LC_MONETARY
  LC_MEASUREMENT

This is the set of formats related variables that I personally find most important:
  LC_TIME
  LC_NUMERIC
  LC_PAPER

Or maybe it's unnecessary to exclude some of them, since the possibly redundant variables do no harm.

On 03/02/12 19:06, Gunnar Hjalmarsson wrote:
> The eight formats related LC_* variables above are currently written to
> /etc/default/locale and /etc/environment by language-selector, and to
> ~/.pam_environment by accountsservice. One question is whether it's
> appropriate to 'pollute' the environment with all of them.
>
> In GNOME these environment variables are set via g-s-d:
> LC_TIME
> LC_NUMERIC
> LC_MONETARY
> LC_MEASUREMENT
>

Quick question Gunnar: when you're saying GNOME here, I'm guessing this
applies to Ubuntu, or are you alternatively referring to the upstream
language plugin from the Control Center (currently replaced by language
selector in Ubuntu)?

> This is the set of formats related variables that I personally find most important:
> LC_TIME
> LC_NUMERIC
> LC_PAPER
>
> Or maybe it's unnecessary to exclude some of them, since the possibly
> redundant variables do no harm.
>

--
David Planella
Ubuntu Translations Coordinator
www.ubuntu.com / www.davidplanella.wordpress.com
www.identi.ca/dplanella / www.twitter.com/dplanella

Gunnar Hjalmarsson (gunnarhj) wrote :

On 2012-02-04 10:15, David Planella wrote:
> Quick question Gunnar: when you're saying GNOME here, I'm guessing this
> applies to Ubuntu, or are you alternatively referring to the upstream
> language plugin from the Control Center (currently replaced by language
> selector in Ubuntu)?

The latter is what I meant. In Ubuntu we currently (since yesterday) set all eight formats related LC_* variables. I still think that what GNOME upstream does is relevant considering the plan to use the g-c-c region module in Ubuntu later on.

Martin Pitt (pitti) on 2012-03-14
tags: added: rls-p-tracking
Colin Watson (cjwatson) wrote :

Nobody apparently ever attempted to address my concern in bug 590108. If the user says that they speak English and live in Switzerland, what is the correct value for LC_NUMERIC? Please justify your answer carefully and consider how the installer might make this judgement automatically without having to hardcode knowledge of many particular locales.

Changed in localechooser (Ubuntu):
status: New → Incomplete
Changed in ubiquity (Ubuntu):
status: New → Incomplete
Martin Pitt (pitti) wrote :

In the current ubiquity installation LANG represents the region, and LC_MESSAGES (and friends) the language. Thus LC_NUMERIC would default to $LANG, i. e. the region. My gut feeling is that it should have been set to the same value as LC_MESSAGES instead, as it's more related to the displayed language than the region. However, I think at this point we should perhaps not change the semantics.

If ubiquity now considers LANG as the language and then sets LC_MONETARY, LC_PAPER, and also LC_NUMERIC to the region, then the net effect is identical after installation as in previous releases. But it would be compatible with how language-selector and also gnome-control-center set the locale categories.

Colin Watson (cjwatson) wrote :

Martin, Gunnar is proposing changing the semantics, and says that accountsservice.postinst has (effectively) already been changed in a way that assumes that we have already done so.

Also, I chose my example carefully. What exact string value for a region-oriented locale variable (never mind exactly which one) do you propose for the example I gave?

Gunnar Hjalmarsson (gunnarhj) wrote :

To me it looks like different issues are unnecessarily mixed up here.

If I understand it correctly, the response to bug #590108 was to open a way to use two locale names in a few specified cases. For instance:

If a user selects simplified Chinese as the language, followed by Taiwan as the location, /etc/default/locale should contain:

  LANG="zh_TW.UTF-8"
  LC_MESSAGES="zh_CN.UTF-8"
  LC_CTYPE="zh_CN.UTF-8"
  LC_COLLATE="zh_CN.UTF-8"

The proposal in this bug report is to change that, so that the same choices result in these /etc/default/locale entries instead:

  LANG="zh_CN.UTF-8"
  LC_NUMERIC="zh_TW.UTF-8"
  LC_TIME="zh_TW.UTF-8"
  LC_MONETARY="zh_TW.UTF-8"
  LC_PAPER="zh_TW.UTF-8"
  LC_NAME="zh_TW.UTF-8"
  LC_ADDRESS="zh_TW.UTF-8"
  LC_TELEPHONE="zh_TW.UTF-8"
  LC_MEASUREMENT="zh_TW.UTF-8"

Both ways give us (almost) the same end result, and I don't think the change is of a semantic nature. Making the change is still important, since language-selector no longer sets LC_MESSAGES, LC_CTYPE and LC_COLLATE. Future attempts to change the system language from language-selector would otherwise fail.
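
The "(almost) the same end result" can be checked with a small Python sketch. It assumes the usual fallback where a category that is not set explicitly takes its value from LANG, and it ignores LC_ALL and LANGUAGE; the helper name effective is made up for the example.

```python
# The twelve glibc locale categories (LC_ALL is an override, not a category).
ALL_CATEGORIES = (
    "LC_CTYPE", "LC_NUMERIC", "LC_TIME", "LC_COLLATE", "LC_MONETARY",
    "LC_MESSAGES", "LC_PAPER", "LC_NAME", "LC_ADDRESS", "LC_TELEPHONE",
    "LC_MEASUREMENT", "LC_IDENTIFICATION",
)


def effective(settings, category):
    """Locale in effect for a category: the explicit value, else LANG."""
    return settings.get(category, settings["LANG"])


# Old scheme: LANG is the region, language categories are overridden.
old = {
    "LANG": "zh_TW.UTF-8",
    "LC_MESSAGES": "zh_CN.UTF-8",
    "LC_CTYPE": "zh_CN.UTF-8",
    "LC_COLLATE": "zh_CN.UTF-8",
}

# Proposed scheme: LANG is the language, the eight format categories are set.
new = {"LANG": "zh_CN.UTF-8"}
for name in ("LC_NUMERIC", "LC_TIME", "LC_MONETARY", "LC_PAPER", "LC_NAME",
             "LC_ADDRESS", "LC_TELEPHONE", "LC_MEASUREMENT"):
    new[name] = "zh_TW.UTF-8"

differing = [c for c in ALL_CATEGORIES if effective(old, c) != effective(new, c)]
print(differing)
```

Under these assumptions the two configurations differ in exactly one category, LC_IDENTIFICATION, which neither list sets explicitly and which therefore follows LANG.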

On 2012-03-19 21:58, Colin Watson wrote:
> If the user says that they speak English and live in Switzerland, what
> is the correct value for LC_NUMERIC?

Leaving Martin's remark aside for now (even if I agree it can be discussed), I suppose it's one of 'de_CH.UTF-8', 'fr_CH.UTF-8' or 'it_CH.UTF-8'. I think I understand the complexity you want to call our attention to, and I'd be happy to discuss it further. But IMO it's a separate matter, i.e. the installer's way in general to deal with localisation, and beyond the scope of this bug.
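
To make the ambiguity concrete, here is a small Python sketch that lists the locales matching a territory. The list of available locales is invented for the example and the function name candidates_for_territory is hypothetical; this is not how the installer chooses.

```python
def candidates_for_territory(available, territory):
    """Return the locale names whose territory part equals `territory`."""
    matches = []
    for name in available:
        # Strip the codeset (".UTF-8") and any modifier ("@euro").
        base = name.split(".", 1)[0].split("@", 1)[0]
        language, sep, region = base.partition("_")
        if sep and region == territory:
            matches.append(name)
    return matches


# Hypothetical set of locales known to the system.
available = [
    "de_CH.UTF-8", "de_DE.UTF-8", "en_GB.UTF-8", "en_US.UTF-8",
    "fr_CH.UTF-8", "fr_FR.UTF-8", "it_CH.UTF-8",
]

print(candidates_for_territory(available, "CH"))
```

With this sample list, Switzerland yields three equally plausible candidates and none of them is English, so a region alone does not determine a single locale name.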

Changed in localechooser (Ubuntu):
status: Incomplete → New
Changed in ubiquity (Ubuntu):
status: Incomplete → New
Martin Pitt (pitti) wrote :

> If the user says that they speak English and live in Switzerland, what is the correct value for LC_NUMERIC?

I'd say, whatever value the installer currently sets for LC_{MESSAGES,CTYPE,COLLATE}.

> Gunnar is proposing changing the semantics

That's what I'd like to avoid for precise at least (i. e. the bit whether numeric should be tied to the language or the region; it is tied to the region right now, so let's not rip this apart at this point in the release cycle).

The net effect should be the same, the change is just which half of the LC_* categories we define to differ from LANG in case the user-selected language doesn't match the language of the LANG locale. Right now we change the "language" part, and with the proposed change we'd change the "region" part.

I might misunderstand what you are saying, of course, but I don't see the semantics change?

Gunnar Hjalmarsson (gunnarhj) wrote :

On 2012-03-22 20:16, Martin Pitt wrote:
>> If the user says that they speak English and live in Switzerland, what
>> is the correct value for LC_NUMERIC?
>
> I'd say, whatever value the installer currently sets for
> LC_{MESSAGES,CTYPE,COLLATE}.

I think you typed too fast there, Martin. ;-)

>> Gunnar is proposing changing the semantics
>
> That's what I'd like to avoid for precise at least (i. e. the bit
> whether numeric should be tied to the language or the region; it is tied
> to the region right now, so let's not rip this apart at this point in
> the release cycle).
>
> The net effect should be the same, the change is just which half of the
> LC_* categories we define to differ from LANG in case the user-selected
> language doesn't match the language of the LANG locale. Right now we
> change the "language" part, and with the proposed change we'd change the
> "region" part.

Agreed 100%. That's precisely what this is about.

Colin Watson (cjwatson) wrote :

OK, fair enough. I think you missed LC_IDENTIFICATION in your list in comment 7; I'll add that too.

Changed in localechooser (Ubuntu):
status: New → Triaged
importance: Undecided → High
Changed in ubiquity (Ubuntu):
status: New → Triaged
importance: Undecided → High
Changed in localechooser (Ubuntu):
assignee: nobody → Colin Watson (cjwatson)
Changed in ubiquity (Ubuntu):
assignee: nobody → Colin Watson (cjwatson)
Martin Pitt (pitti) wrote :

Adding accountsservice task for me as a reminder to:

 * also add LC_IDENTIFICATION, i. e. sync up with what ubiquity does
 * Bump postinst version comparison to make sure we get a proper migration from previous installs.

Changed in accountsservice (Ubuntu):
assignee: nobody → Martin Pitt (pitti)
importance: Undecided → High
status: New → Triaged
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package localechooser - 2.39ubuntu2

---------------
localechooser (2.39ubuntu2) precise; urgency=low

  * Invert the set of locale categories set in case of a language/location
    conflict, so we now set LC_NUMERIC, LC_TIME, LC_MONETARY, LC_PAPER,
    LC_NAME, LC_ADDRESS, LC_TELEPHONE, LC_MEASUREMENT, and LC_IDENTIFICATION
    instead (LP: #926207).
 -- Colin Watson <email address hidden> Fri, 23 Mar 2012 16:32:43 +0000

Changed in localechooser (Ubuntu):
status: Triaged → Fix Released
Gunnar Hjalmarsson (gunnarhj) wrote :

I intentionally excluded LC_IDENTIFICATION in comment #7, since I don't see how that category is related to regional formats. OTOH I agree that it needs to be included to be consistent in not changing the semantics.

Including LC_IDENTIFICATION also affects language-selector, so I added yet another task.

Changed in language-selector (Ubuntu):
assignee: nobody → Gunnar Hjalmarsson (gunnarhj)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ubiquity - 2.10.2

---------------
ubiquity (2.10.2) precise; urgency=low

  [ Colin Watson ]
  * Fix test_misc.GrubDefaultTests.test_avoid_cdrom.
  * Invert the set of locale categories set in case of a language/location
    conflict, so we now set LC_NUMERIC, LC_TIME, LC_MONETARY, LC_PAPER,
    LC_NAME, LC_ADDRESS, LC_TELEPHONE, LC_MEASUREMENT, and LC_IDENTIFICATION
    instead (LP: #926207).
  * Automatic update of included source packages: localechooser 2.39ubuntu2.

  [ Brian Murray ]
  * UTF-8-encode debug messages which are written to stderr (LP: #960278)
 -- Colin Watson <email address hidden> Fri, 23 Mar 2012 18:00:36 +0000

Changed in ubiquity (Ubuntu):
status: Triaged → Fix Released
tags: removed: rls-p-tracking
Martin Pitt (pitti) on 2012-03-27
tags: added: rls-p-tracking
Martin Pitt (pitti) wrote :

accountsservice fix uploaded to unapproved.

Changed in accountsservice (Ubuntu):
status: Triaged → Fix Committed
Martin Pitt (pitti) wrote :

I updated language-selector as well in bzr, while I was at it.

Changed in language-selector (Ubuntu):
assignee: Gunnar Hjalmarsson (gunnarhj) → Martin Pitt (pitti)
status: New → Fix Committed
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package accountsservice - 0.6.15-2ubuntu9

---------------
accountsservice (0.6.15-2ubuntu9) precise; urgency=low

  * 0009-language-tools.patch: Also include LC_IDENTIFICATION, to comply to
    how Ubiquity sets the locale. (LP: #926207)
  * debian/accountsservice.postinst: Also migrate LC_IDENTIFICATION. Bump
    version comparison to this version to ensure the migration happens after
    Ubiquity got fixed, and include the new variable.
 -- Martin Pitt <email address hidden> Tue, 27 Mar 2012 07:31:53 +0200

Changed in accountsservice (Ubuntu):
status: Fix Committed → Fix Released
Gabor Kelemen (kelemeng) on 2012-03-27
Changed in ubuntu-translations:
status: New → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package language-selector - 0.77

---------------
language-selector (0.77) precise; urgency=low

  * dbus_backend/ls-dbus-backend: Set LC_IDENTIFICATION as well, to comply
    with what Ubiquity does. (LP: #926207)
  * tests/test_language_support_pkgs.py: Fix
    test_by_package_and_locale_noinstalled() for current pkg_depends: gedit
    does not need aspell any more, use abiword as trigger package.
  * language_support_pkgs.py, _expand_pkg_pattern(): Special-case "zh-han[st]"
    values for the locale. These are not actual locales, but e. g. Ubiquity
    assumes this works. So let these mean "zh_CN" and "zh_TW" respectively.
    Add a test case to tests/test_language_support_pkgs.py. (LP: #963460)
 -- Martin Pitt <email address hidden> Fri, 30 Mar 2012 12:42:15 +0200

Changed in language-selector (Ubuntu):
status: Fix Committed → Fix Released