2011-03-08 01:31:45 |
charlestu |
description |
Launchpad now uses two-letter ISO-639-1 codes where available and three-letter ISO-639-3 codes for other languages. This allows for some confusion where multiple scripts are available [i.e. nan vs zh_TW, both of which refer to minnan chinese but with different scripts] and where codes would overlap [i.e. zh_CH with cmn-CH, nan-CH, yue-CH, and wuu-CH].
Converting to IETF (2009) language tags [http://en.wikipedia.org/wiki/IETF_language_tag], which consist of an ISO 639-3 alpha-3 language code [http://en.wikipedia.org/wiki/ISO_639-3] plus an ISO 15924 alpha-4 script code [http://en.wikipedia.org/wiki/ISO_15924] plus an ISO 3166-1 alpha-2 country code [http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2], would resolve such ambiguities and would promote compliance with current standards.
Adopting standard language names with the script appended, where appropriate, and the country in parenthesis would avoid ambiguity (i.e. "Chinese (Hong Kong)" and "Chinese (Traditional)" are different languages which both use traditional characters).
Some examples of the proposed change:
current name - label -> proposed name - eitf tag
Arabic - ar -> Arabic (MSA: Egypt) - arb-Arab-EG
Bengali - bn -> Bengali (Bengladesh) - ben-Beng-BD
Brazilian Portuguese - pt-BR -> Portugese (Brazil) - por-Latn-BR
Chinese (Hong Kong) - zh_HK -> Yue Chinese: Traditional Han (Hong Kong) - yue-Hant-HK
Chinese (Simplified) zh_CN -> Mandarin Chinese: Simplified Han (China) - cmn-Hans-CH
Chinese (Traditional) - zh_TW -> Min Nan Chinese: Traditional Han (Taiwan) - nan-Hant-TW
English (Australia) - en_AU -> English (Australia) - eng-Latn-AU
English (Canada) - en_CA -> English (Canada) - eng-Latn-CA
English (United Kingdom) - en_GB -> English (United Kingdom) - eng-Latn-GB
Hindi - hi -> Hindi (India) - hin-Deva-IN
Japanese - ja -> Japanese (Japan) - jpn-Jpan-JP
Min Nan Chinese - nan -> Min Nan Chinese: Pe̍h-ōe-jī (Taiwan) - nan-Latn-TW
Moroccan Arabic - ary -> Arabic (Morocco) - ary-Arab-MA
Portuguese - pt -> Portugese (Portugal) - por-Latn-PT
Russian - ru -> Russian (Russia) - rus-Cyrl-RU
Spanish - es -> Spanish (Spain) - spa-Latn-ES |
Launchpad currently uses two-letter ISO 639-1 codes where available and three-letter ISO 639-3 codes for other languages. This causes confusion where multiple scripts are available [e.g. nan vs zh_TW, both of which refer to Min Nan Chinese but with different scripts] and where codes would overlap [e.g. zh_CN with cmn-CN, nan-CN, yue-CN, and wuu-CN].
Converting to IETF (2009) language tags [http://en.wikipedia.org/wiki/IETF_language_tag], which consist of an ISO 639-3 alpha-3 language code [http://en.wikipedia.org/wiki/ISO_639-3] plus an ISO 15924 alpha-4 script code [http://en.wikipedia.org/wiki/ISO_15924] plus an ISO 3166-1 alpha-2 country code [http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2], would resolve such ambiguities and would promote compliance with current standards.
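As a rough illustration of the three-part shape described above (language, script, country), here is a minimal Python sketch. The names `LanguageTag`, `parse_tag` and `format_tag` are hypothetical and are not part of Launchpad. The sketch only checks the shape of a tag; it does not validate subtags against the ISO registries, and it deliberately ignores the other forms that IETF language tags permit (for example two-letter language subtags or tags without a script).

```python
import re
from typing import NamedTuple


class LanguageTag(NamedTuple):
    """Hypothetical container for the three parts proposed in this report."""
    language: str  # ISO 639-3 alpha-3 code, e.g. "yue"
    script: str    # ISO 15924 alpha-4 code, e.g. "Hant"
    region: str    # ISO 3166-1 alpha-2 code, e.g. "HK"


# Shape check only: 3 letters, 4 letters, 2 letters, separated by hyphens.
_TAG_RE = re.compile(r"^([A-Za-z]{3})-([A-Za-z]{4})-([A-Za-z]{2})$")


def parse_tag(tag: str) -> LanguageTag:
    """Split a language-script-region tag and normalize its letter case."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a language-script-region tag: {tag!r}")
    language, script, region = match.groups()
    # Conventional casing: lowercase language, title-case script, uppercase region.
    return LanguageTag(language.lower(), script.title(), region.upper())


def format_tag(tag: LanguageTag) -> str:
    """Join the three parts back into a hyphen-separated tag."""
    return "-".join(tag)
```

With this sketch, `parse_tag("yue-Hant-HK")` yields the three separate parts, while a current-style label such as `zh_TW` is rejected because it carries no script code.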
Adopting standard language names with the script appended, where appropriate, and the country in parentheses would avoid ambiguity (e.g. "Chinese (Hong Kong)" and "Chinese (Traditional)" are different languages which both use traditional characters).
Some examples of the proposed change:
current name - label -> proposed name - IETF tag
Arabic - ar -> Arabic (MSA: Egypt) - arb-Arab-EG
Bengali - bn -> Bengali (Bangladesh) - ben-Beng-BD
Brazilian Portuguese - pt-BR -> Portuguese (Brazil) - por-Latn-BR
Chinese (Hong Kong) - zh_HK -> Yue Chinese: Traditional Han (Hong Kong) - yue-Hant-HK
Chinese (Simplified) - zh_CN -> Mandarin Chinese: Simplified Han (China) - cmn-Hans-CN
Chinese (Traditional) - zh_TW -> Min Nan Chinese: Traditional Han (Taiwan) - nan-Hant-TW
English (Australia) - en_AU -> English (Australia) - eng-Latn-AU
English (Canada) - en_CA -> English (Canada) - eng-Latn-CA
English (United Kingdom) - en_GB -> English (United Kingdom) - eng-Latn-GB
Hindi - hi -> Hindi (India) - hin-Deva-IN
Japanese - ja -> Japanese (Japan) - jpn-Jpan-JP
Min Nan Chinese - nan -> Min Nan Chinese: Pe̍h-ōe-jī (Taiwan) - nan-Latn-TW
Moroccan Arabic - ary -> Arabic (Morocco) - ary-Arab-MA
Portuguese - pt -> Portuguese (Portugal) - por-Latn-PT
Russian - ru -> Russian (Russia) - rus-Cyrl-RU
Spanish - es -> Spanish (Spain) - spa-Latn-ES |
|