Request for new language packages for Kurdish Sorani (ckb)

Bug #1388808 reported by Aras
This bug affects 3 people
Affects                    Status        Importance  Assigned to         Milestone
GLibC                      Fix Released  Wishlist
langpack-o-matic           Fix Released  Undecided   Gunnar Hjalmarsson
langpack-locales (Ubuntu)  Fix Released  Medium      Gunnar Hjalmarsson

Bug Description

Related Bug: 266975
The Sorani dialect team would like to add its locale to the library so that a Kurdish Sorani version finally becomes available.
Kurdish Sorani is a right-to-left language written in an Arabic-based script. It is spoken mainly in the Kurdistan regions of Iran and Iraq.

Revision history for this message
In , Aras (aras-noori) wrote :

Kurdish (Iraq) needs its own locale definition, since Kurdish has been an
official language in Iraq since 200x.

This request follows from:
https://bugs.launchpad.net/ubuntu/+source/langpack-locales/+bug/266975

The locale is defined at: http://www.zkurd.org/aras/ckb_IQ.txt

Revision history for this message
In , Aras (aras-noori) wrote :

Created attachment 3704
localedata for Kurdish Sorani (CKB)

the localedata for Kurdish Sorani (CKB)

Revision history for this message
In , Erdal Ronahi (erdalronahi) wrote :

Comment on attachment 3704
localedata for Kurdish Sorani (CKB)

comment_char %
escape_char /
% Kurdish (Sorani) language locale for Iraq and Iran.
% Contributed by Aras Noori <email address hidden> and
% Erdal Ronahi<email address hidden>.
% Contact: Aras
% Language: ku
% Date: 2009-01-29
% Distribution and use is free, also
% for commercial purposes.
% History:
%

LC_IDENTIFICATION
title "Kurdish language locale for Sorani dialects"
source ""
address ""
contact "Aras"
email "<email address hidden>, <email address hidden>"
tel ""
fax ""
language "Kurdish"
territory "Iraq"
revision "1.0"
date "2009-01-29"
%
category "ckb_IQ:2000";LC_IDENTIFICATION
category "ckb_IQ:2000";LC_CTYPE
category "ckb_IQ:2000";LC_COLLATE
category "ckb_IQ:2000";LC_TIME
category "ckb_IQ:2000";LC_NUMERIC
category "ckb_IQ:2000";LC_MONETARY
category "ckb_IQ:2000";LC_MESSAGES
category "ckb_IQ:2000";LC_PAPER
category "ckb_IQ:2000";LC_NAME
category "ckb_IQ:2000";LC_ADDRESS
category "ckb_IQ:2000";LC_TELEPHONE
category "ku_IQ:2000";LC_MEASUREMENT

END LC_IDENTIFICATION

LC_CTYPE
copy "i18n"
END LC_CTYPE

LC_COLLATE

% Copy the template from ISO/IEC 14651
copy "iso14651_t1"

END LC_COLLATE

LC_MONETARY
% This is the POSIX Locale definition the LC_MONETARY category.
% These are generated based on XML base Locale difintion file
% for IBM Class for Unicode/Java
%
int_curr_symbol "<U0049><U0051><U0044><U0020>"
currency_symbol "<U062F><U002E><U0639><U002E>"
mon_decimal_point "<U002E>"
mon_thousands_sep "<U002C>"
mon_grouping 3
positive_sign ""
negative_sign "<U002D>"
int_frac_digits 3
frac_digits 3
p_cs_precedes 1
p_sep_by_space 1
n_cs_precedes 1
n_sep_by_space 1
p_sign_posn 1
n_sign_posn 2
%
END LC_MONETARY

LC_NUMERIC
% This is the POSIX Locale definition for the LC_NUMERIC category.
%
decimal_point "<U002E>"
thousands_sep "<U002C>"
grouping 3
%
END LC_NUMERIC

LC_TIME
% This is the POSIX Locale definition for the LC_TIME category.
% These are generated based on XML base Locale difintion file
% for IBM Class for Unicode/Java
%
% Abbreviated weekday names (%a)
abday "<U062D>";"<U0646>";/
     "<U062B>";"<U0631>";/
     "<U062E>";"<U062C>";/
     "<U0633>"
%
% Full weekday names (%A)
day "<U06CC><U06D5><U0643><U0634><U06D5><U0645><U0645><U06D5>";/
     "<U062F><U0648><U0648><U0634><U06D5><U0645><U0645><U06D5>";/
     "<U0633><U06CE><U0634><U06D5><U0645><U0645><U06D5>";/
     "<U0686><U0648><U0624><U0631><U0634><U06D5><U0645><U0645><U06D5>";/
     "<U067E><U06CE><U0646><U062C><U0634><U06D5><U0645><U0645><U06D5>";/
     "<U0647><U06D5><U06CC><U0646><U06CC>";/
     "<U0634><U06D5><U0645><U0645><U06D5>";/
%
% Abbreviated month names (%b)
abmon "<U064A><U0646><U0627>";"<U0641><U0628><U0631>";/
     "<U0645><U0627><U0631>";"<U0623><U0628><U0631>";/
     "<U0645><U0627><U064A>";"<U064A><U0648><U0646>";/
     "<U064A><U0648><U0644>";"<U0623><U063A><U0633>";/
     "<U0633><U0628><U062A>";"<U0623><U0643><U062A>";/
     "<U0646><U0648><U0641>";"<U062F><U064A><U0633>"
%
% Full month names (%B)
mon "<U064A><U0646><U0627><U064A><U0631>";/
  ...

Revision history for this message
In , Erdal Ronahi (erdalronahi) wrote :

The corresponding bug in the Launchpad bug tracker for Ubuntu is
https://bugs.launchpad.net/ubuntu/+source/langpack-locales/+bug/266975

Revision history for this message
In , Erdal Ronahi (erdalronahi) wrote :

Aras is the reporter, not the assignee.

Revision history for this message
In , Drepper-fsp (drepper-fsp) wrote :

The file is ill-formed. The file is named ckb_IQ and you use it in "copy" in
various categories? You also have copy and definitions in categories like
LC_PAPER. You have to fix it up.

Revision history for this message
In , Aras (aras-noori) wrote :

Comment on attachment 3704
localedata for Kurdish Sorani (CKB)

escape_char /
comment_char %
% Kurdish (Sorani) language locale for Iraq and Iran.
% Contributed by Aras Noori <email address hidden> and
% Erdal Ronahi<email address hidden>.
% Contact: Aras Noori
% Language: ku
% Date: 2009-04-14
% Distribution and use is free, also
% for commercial purposes.
% History:
% January 2009: Defining CKB locale
% March 2009: Adding rule for CKB
%

LC_IDENTIFICATION
title "Kurdish language locale for Sorani dialects - Central Kurdish"
source ""
address ""
contact "Aras"
email "<email address hidden>"
tel ""
fax ""
language "Kurdish"
territory "Iraq"
revision "1.1"
date "2009-04-15"
%

category "ckb_IQ:2000";LC_IDENTIFICATION
category "ckb_IQ:2000";LC_CTYPE
category "ckb_IQ:2000";LC_COLLATE
category "ckb_IQ:2000";LC_TIME
category "ckb_IQ:2000";LC_NUMERIC
category "ckb_IQ:2000";LC_MONETARY
category "ckb_IQ:2000";LC_MESSAGES
category "ckb_IQ:2000";LC_PAPER
category "ckb_IQ:2000";LC_NAME
category "ckb_IQ:2000";LC_ADDRESS
category "ckb_IQ:2000";LC_TELEPHONE
category "ckb_IQ:2000";LC_MEASUREMENT

END LC_IDENTIFICATION

LC_CTYPE
copy "i18n"
END LC_CTYPE

LC_COLLATE
% The Sorani Kurdish dialect is mainly written using a modified Arabic-based
alphabet with 33 letters.
% Unlike the regular Arabic alphabet, which is an abjad, Sorani is an alphabet
in which vowels are mandatory, making the script easy to read.
%
% The CKB (Sorani) alphabet order is:
% in Latin: a, b, c, ç, d, e, ê, f, g, h, i, î, j, k, l, ll, m, n, o, p, q,
r, rr, s, sh, t, u, uu, v, w, x, y, z
% ئ، ب، پ، ت، ج، چ، ح، خ، د، ر، ڕ، ز، ژ، س، ش،
ف، ڤ، ق، ع، غ، ك، گ، ل، ڵ، م، ن، و، وو، ۆ، هـ،
ی، ێ
% vowels: A, E, I, O, U, UU
% پیتەبزوێنەكان ئەمانەن: ئ، ا، ە، و، وو، ۆ،
ی، ێ،
%
% Copy the template from ISO/IEC 14651
copy "iso14651_t1"

collating-element <ئا> from <U0626><U0627>
collating-element <وو> from <U0648><U0648>
collating-element <لا> from <U0644><U0627>

collating-symbol <U0628>
collating-symbol <U062C>
collating-symbol <U0631>
collating-symbol <U0632>
collating-symbol <U0641>
collating-symbol <U0643>
collating-symbol <U0644>
collating-symbol <U0648>
collating-symbol <U06CC>

reorder-after <U0628> <U067E>
reorder-after <U062C><U0686>
reorder-after <U0631><U0695>
reorder-after <U0632><U0698>
reorder-after <U0641><U06A4>
reorder-after <U0643><U06AF>
reorder-after <U0644><U06B5>
reorder-after <U0648><U06C6>
reorder-after <U06CC><U06CE>

% Kurdish digits same as Arabic ones: they are the basic forms.
reorder-after <U0660>
<U0660> <0>;<PCL>;<MIN>;IGNORE
<U0661> <1>;<PCL>;<MIN>;IGNORE
<U0662> <2>;<PCL>;<MIN>;IGNORE
<U0663> <3>;<PCL>;<MIN>;IGNORE
<U0664> <4>;<PCL>;<MIN>;IGNORE
<U0665> <5>;<PCL>;<MIN>;IGNORE
<U0666> <6>;<PCL>;<MIN>;IGNORE
<U0667> <7>;<PCL>;<MIN>;IGNORE
<U0668> <8>;<PCL>;<MIN>;IGNORE
<U0669> <9>;<PCL>;<MIN>;IGNORE

reorder-end

END LC_COLLATE

LC_MONETARY
% This is the POSIX Locale definition the LC_MONETARY category.
% These are generated based on XML base Locale difintion file
% for IBM Class for Unicode/Java
%
int_curr_symbol "<U0049><U0051><U0044><U0020>"
currency_symbol "<U062F><U002E><U0639><U002E>"
mon_decimal_point "<U00...

Revision history for this message
In , Aras (aras-noori) wrote :

I updated the file. Please check whether it is still ill-formed.

Revision history for this message
In , Aras (aras-noori) wrote :

Subject: Re: Please add Kurdish locale for Kurdish Sorani (CKB)

Hi,
I updated the locale a couple of weeks ago. Would you please check it again?

http://www.sourceware.org/ml/libc-locales/2009-q2/msg00021.html

I would be glad to hear your feedback.

Regards
Aras

On Sat, Feb 7, 2009 at 5:53 AM, drepper at redhat dot
com<email address hidden> wrote:
>
> ------- Additional Comments From drepper at redhat dot com  2009-02-07 03:53 -------
> The file is ill-formed.  The file is named ckb_IQ and you use it in "copy" in
> various categories?  You also have copy and definitions in categories like
> LC_PAPER.  You have to fix it up.
>
> --
>           What    |Removed                     |Added
> ----------------------------------------------------------------------------
>             Status|NEW                         |WAITING
>
>
> http://sourceware.org/bugzilla/show_bug.cgi?id=9809
>

Revision history for this message
In , Martin Pitt (pitti) wrote :

Can you please attach the current version instead of adding it as a comment? The
latter destroys all the non-ASCII characters.

Revision history for this message
In , Aras (aras-noori) wrote :

Created attachment 4013
new Locale info for CKB

Here is the new locale info for CKB.

Revision history for this message
In , Aras (aras-noori) wrote :

Created attachment 4168
new fixed Locale info for CKB

fixed bugs in Message category

Revision history for this message
In , Aras (aras-noori) wrote :

Created attachment 4228
Locale info for CKB

Revision history for this message
In , Martin Pitt (pitti) wrote :

Info is provided, changing back to NEW

Revision history for this message
In , Martin Pitt (pitti) wrote :

I get a lot of errors when I try to build this locale:

locales/ckb_IQ:62: LC_COLLATE: syntax error
locales/ckb_IQ:63: LC_COLLATE: syntax error
locales/ckb_IQ:64: LC_COLLATE: syntax error
locales/ckb_IQ:66: LC_COLLATE: syntax error
locales/ckb_IQ:67: LC_COLLATE: syntax error
locales/ckb_IQ:68: LC_COLLATE: syntax error
locales/ckb_IQ:69: LC_COLLATE: syntax error
locales/ckb_IQ:70: LC_COLLATE: syntax error
locales/ckb_IQ:71: LC_COLLATE: syntax error
locales/ckb_IQ:72: LC_COLLATE: syntax error
locales/ckb_IQ:73: LC_COLLATE: syntax error
locales/ckb_IQ:74: LC_COLLATE: syntax error
locales/ckb_IQ:76: trailing garbage at end of line
locales/ckb_IQ:77: trailing garbage at end of line
locales/ckb_IQ:78: trailing garbage at end of line
locales/ckb_IQ:79: trailing garbage at end of line
locales/ckb_IQ:80: trailing garbage at end of line
locales/ckb_IQ:81: trailing garbage at end of line
locales/ckb_IQ:82: trailing garbage at end of line
locales/ckb_IQ:83: trailing garbage at end of line
locales/ckb_IQ:84: LC_COLLATE: cannot reorder after U000006CC: symbol not known
locales/ckb_IQ:155: extra trailing semicolon
LC_NAME: invalid escape sequence in field `name_fmt'
LC_ADDRESS: invalid escape `%I' sequence in field `postal_fmt'
LC_ADDRESS: `lang_ab' value does not match `lang_term' value
LC_ADDRESS: `lang_lib' value does not match `lang_term' value
LC_ADDRESS: `country_ab2' value does not match `country_num' value
LC_ADDRESS: `country_ab3' value does not match `country_num' value

Revision history for this message
In , Aras (aras-noori) wrote :

I am analyzing the errors now and will try to fix them as soon as I can.

Thanks & Regards
Aras

Revision history for this message
In , Aras (aras-noori) wrote :

Created attachment 4357
CKB locale - updated

Hi,
I fixed some of the reported errors. How can I test the file myself
before posting it to Bugzilla?

Regards

Revision history for this message
In , Aras (aras-noori) wrote :

Any progress?

(In reply to comment #16)
> Created an attachment (id=4357)
> CKB locale - updated
>
> Hi,
> I fixed some errors due to the occured Errors. How can I test it by myself
> before release to bugzilla?
>
> Regards
>

Revision history for this message
In , Petr Baudis (pasky) wrote :

Use localedef to compile your locale, and $LOCPATH if you don't want to install
it system-wide in order to test it.
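
The advice above can be sketched in Python as a thin wrapper around the two mechanisms it names: the `localedef` compiler and the `LOCPATH` environment variable. This is only an illustration under assumptions: the helper names (`localedef_argv`, `compile_locale`, `env_for_locale`), the `./locales` directory and the `ckb_IQ.UTF-8` output name are hypothetical and do not come from the thread.

```python
import os
import shutil
import subprocess


def localedef_argv(source, out_dir, charmap="UTF-8"):
    """Build the localedef command line: -i names the locale source file,
    -f names the charmap, and the last argument is the output path."""
    return ["localedef", "-i", source, "-f", charmap, out_dir]


def compile_locale(source, locpath, name="ckb_IQ.UTF-8"):
    """Compile `source` into `locpath/name` without touching the system locales.

    Returns None when localedef is not installed, otherwise the finished
    process, whose stderr holds the compiler's diagnostics.
    """
    if shutil.which("localedef") is None:
        return None
    os.makedirs(locpath, exist_ok=True)
    argv = localedef_argv(source, os.path.join(locpath, name))
    return subprocess.run(argv, capture_output=True, text=True)


def env_for_locale(locpath, name="ckb_IQ.UTF-8"):
    """Environment in which glibc programs look for locales under `locpath`."""
    return dict(os.environ, LOCPATH=locpath, LC_ALL=name)
```

A possible use would be `compile_locale("./ckb_IQ", "./locales")` followed by `subprocess.run(["date"], env=env_for_locale("./locales"))` to see the weekday and month names from the freshly compiled locale.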

Revision history for this message
In , Drepper-fsp (drepper-fsp) wrote :

The file isn't usable as-is, there are many problems when compiling it.

First, it must be in UTF-8.

Second, the collation rules seem all pretty bogus since there already are rules for all the characters defined. If needed, you have to redefine the relocation.

Third, there are many syntax errors.

Fourth, all the values for the fields must use the <U....> notation, not real strings.

Fifth, the values for some fields are plain wrong. localedef will tell you.

I did add the language code to localedef now.

Just run localedef like

   localedef -i ./YOURFILE -f UTF-8 ./SOMEDIR
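
The fourth point, writing field values in `<U....>` notation instead of as literal strings, is mechanical and can be scripted. The following is a minimal sketch, not part of glibc; `to_ucs_notation` is a hypothetical helper name, and the eight-digit form for characters beyond U+FFFF is an assumption based on how existing glibc locale sources spell such characters.

```python
def to_ucs_notation(text):
    """Spell every character of `text` as a <UXXXX> symbolic name."""
    parts = []
    for ch in text:
        cp = ord(ch)
        # Four hex digits cover the Basic Multilingual Plane; characters
        # beyond it are written with eight digits (assumption, see lead-in).
        width = 4 if cp <= 0xFFFF else 8
        parts.append("<U{:0{w}X}>".format(cp, w=width))
    return "".join(parts)


# The Sorani word for "yes", written with escapes so this file stays ASCII.
print(to_ucs_notation("\u0628\u06d5\u06b5\u06ce"))
```

For the input shown, the result is the same string that the locale file in this report uses for its "yes" value.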

Revision history for this message
In , Aras (aras-noori) wrote :

Created attachment 5727
CKB-IQ locale info (Kurdish Sorani)

CKB-IQ locale info (Kurdish Sorani)

Revision history for this message
In , Aras (aras-noori) wrote :

Hi,
I updated the file. It was full of syntax errors; I repaired most of them and hope it works now.
I noticed that many locale files use a slash (/) while others use a backslash (\), and that the compiler distinguishes between them; I learned that it should not be so. The file is saved as UTF-8 with Unix line endings.

best regards
Aras

Revision history for this message
In , Drepper-fsp (drepper-fsp) wrote :

You haven't fixed the collation information. You have to use the generic collation data (include it, do not copy it) and then, if necessary at all, define modifications using reorder_after etc.

Revision history for this message
In , Aras (aras-noori) wrote :

(In reply to comment #22)
> You haven't fixed the collation information. You have to use the generic
> collation data (include it, do not copy it) and then, if necessary at all,
> define modifications using reorder_after etc.

Did you mean
% collating-element <LAM WITH SMALL V-ALEF> from <U06B5><U0627>

They are already commented out.

regards
Aras

Revision history for this message
In , Erdal Ronahi (erdalronahi) wrote :

Hello Aras,

good to see that you are working on the file again.

Do you know what "collation" is and what it is for? It is about the
order used for alphabetical sorting.

Kind regards,
Erdal

On 16 May 2011 17:24, aras.noori at gmail dot com <
<email address hidden>> wrote:

> http://sourceware.org/bugzilla/show_bug.cgi?id=9809
>
> --- Comment #23 from Aras Noori <aras.noori at gmail dot com> 2011-05-16
> 15:23:50 UTC ---
> (In reply to comment #22)
> > You haven't fixed the collation information. You have to use the generic
> > collation data (include it, do not copy it) and then, if necessary at
> all,
> > define modifications using reorder_after etc.
>
> did you mean
> % collating-element <LAM WITH SMALL V-ALEF> from <U06B5><U0627>
>
> they are already commented.
>
> regards
> Aras

Revision history for this message
In , Erdal Ronahi (erdalronahi) wrote :

Sorry for posting in German here, wasn't aware that replies go directly to
the bugmail.

Revision history for this message
In , Drepper-fsp (drepper-fsp) wrote :

(In reply to comment #23)
> (In reply to comment #22)
> > You haven't fixed the collation information. You have to use the generic
> > collation data (include it, do not copy it) and then, if necessary at all,
> > define modifications using reorder_after etc.
>
> did you mean
> % collating-element <LAM WITH SMALL V-ALEF> from <U06B5><U0627>
>
> they are already commented.

No. Look at the other files. There are some broken ones but those I caught in time are using

copy "iso14651_t1"

and then use if necessary reorder_after.

Revision history for this message
In , Aras (aras-noori) wrote :

Yes, Erdal. I also defined 3 collations, and they had syntax errors, as Mr. Ulrich Drepper says. I will upload the new version later. Thanks for your efforts.

Revision history for this message
In , Petr Baudis (pasky) wrote :

WAITING for almost a year now. Please reopen the bug when you have a new patch.

Revision history for this message
In , Petr Baudis (pasky) wrote :

Sorry for the clicko; this was not fixed.

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

Can you post the problems?

Revision history for this message
In , Jackie-rosen (jackie-rosen) wrote :

*** Bug 260998 has been marked as a duplicate of this bug. ***

Revision history for this message
Aras (aras-noori) wrote :
description: updated
Revision history for this message
Gunnar Hjalmarsson (gunnarhj) wrote :

Hi Aras, and thanks for the locale definition file. I could successfully compile it.

However, just like last time (https://launchpad.net/bugs/266975) we would like you to submit it upstream too before we add it to Ubuntu. So can you please file an upstream bug report and post the URL to it here.

Changed in langpack-locales (Ubuntu):
status: New → Incomplete
Revision history for this message
In , Aras (aras-noori) wrote :

Created attachment 7887
CKB-IQ locale info (Kurdish Sorani)

Revision history for this message
In , Aras (aras-noori) wrote :

Hi All,
I attached the new, bug-free version.

I also filed a new bug on Launchpad at:
https://bugs.launchpad.net/ubuntu/+source/langpack-locales/+bug/1388808

Many thanks for all your efforts in the past.

Regards
Aras

Revision history for this message
Aras (aras-noori) wrote :

Hello Gunnar,
Sure, I updated the bug (https://sourceware.org/bugzilla/show_bug.cgi?id=9809) and attached the new file.

Thank you
Aras

Revision history for this message
In , Gunnar Hjalmarsson (gunnarhj) wrote :

Please see comment #33 by Aras Noori.

Revision history for this message
Gunnar Hjalmarsson (gunnarhj) wrote :
tags: added: patch
Changed in langpack-locales (Ubuntu):
assignee: nobody → Gunnar Hjalmarsson (gunnarhj)
importance: Undecided → Medium
status: Incomplete → In Progress
Changed in langpack-o-matic:
assignee: nobody → Gunnar Hjalmarsson (gunnarhj)
status: New → In Progress
Revision history for this message
Martin Pitt (pitti) wrote :

Uploaded the locale, thanks!

Changed in langpack-locales (Ubuntu):
status: In Progress → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package langpack-locales - 2.13+git20120306-18

---------------
langpack-locales (2.13+git20120306-18) vivid; urgency=low

  * debian/patches/ubuntu-ckb_IQ-new_locale.patch:
    Addition of the ckb_IQ locale (LP: #1388808).
 -- Gunnar Hjalmarsson <email address hidden> Tue, 04 Nov 2014 02:45:00 +0100

Changed in langpack-locales (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
Martin Pitt (pitti) wrote :

Merged and rolled out langpack-o-matic. Thanks!

Changed in langpack-o-matic:
status: In Progress → Fix Released
Revision history for this message
In , Mike Frysinger (vapier) wrote :

first, please update the header of the file to match all the current locales. the first ~10 lines should be the same (e.g. as en_US). you should sync to latest git as the last release is out of date already.

> title "Kurdish language locale based on Arabic letters"

this should be:
Central Kurdish language locale for Iraq

> tel "+49 17629857380"

leave this field blank

> language "Kurdish"

change to "Central Kurdish"

> territory "Iraq, Iran"

drop Iran. this locale is only for Iraq.

> category "ckb_IQ:2000";LC_IDENTIFICATION

you'll need to fix all these category fields. copy them from en_US for their correct values.

> LC_COLLATE

please rebase this to start with:
copy "iso14651_t1"

> % This is the POSIX Locale definition for the LC_NUMERIC category.

delete these old comments from the LC_NUMERIC, LC_TIME, LC_NAME, and LC_ADDRESS categories

> LC_TIME

are you sure about the day/abday/mon/abmon translations ? CLDR says they're different.

make sure day/abday start on Sunday

> am_pm

does Iraq really use am/pm notation ? if not, leave these fields blank.

> first_workday 7

change this to 2 and add this line:
week 7;19971130;1

> yesexpr "<U0628><U06D5><U06B5><U06CE>"
> noexpr "<U0646><U06D5><U062E><U06CE><U0631>"

these need to be updated. these should be regular expressions to match a yes/no answer. see the current en_US value as an example.

please also provide yesstr/nostr translations

> LC_PAPER
> LC_MEASUREMENT

change both of these categories to simply:
copy "ar_IQ"

> name_gen "<U002D><U0073><U0061><U006E>"

this is "-san". is that correct ?

> country_car "<U0049><U0051>"

shouldn't this be "IRQ" instead of "IQ" ?

> LC_ADDRESS

please define country_name (localized translation for Iraq)

> tel_int_fmt "+%c ;%a ;%l"

pretty sure this should be:
  +%c %a%t%l

> tel_dom_fmt "<U202A><U0025><U0041><U2012><U0025><U006C><U202C>"

are you sure this is correct ?

Changed in glibc:
importance: Unknown → Wishlist
status: Unknown → Incomplete
Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

Created attachment 12173
Localedata file for ckb_IQ

Here is the new version of the localedata file for ckb. Thanks.

Revision history for this message
In , Aras (aras-noori) wrote :

(In reply to Mike Frysinger from comment #35)

Thank you for your tips; @Jwtiayr and I fixed the bugs.

Revision history for this message
In , Aras (aras-noori) wrote :

(In reply to Jwtiayr Nariman from comment #36)
> Created attachment 12173
> Localedata file for ckb_IQ
>
> Here is new version of localedata file for ckb.thanks

Great Work.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

> LC_COLLATE
> % The Kurdish Sorani, Bahdini, and others dialects is mainly written using a modified (Arabic-based alphabet) with 33 letters.
> % Unlike the regular Arabic alphabet, which is an abjad, kurdish is an alphabet in which vowels are mandatory, making the script easy to read.
> %
> % The kurdish alphabet order is:
> % in Latin: a, b, c, ç, d, e, ê, f, g, h, i, î, j, k, l, ll, m, n, o, p, q, r, rr, s, sh, t, u, uu, v, w, x, y, z
> % vowels: A, E, I, O, U, UU
> %
> % Copy the template from ISO/IEC 14651
>
> order_start forward; forward
> %
> % Kurdish numeric characters.
> %
> <U0660> <U0660>

You still did not base the collation on iso14651_t1.

Your LC_COLLATE section should start like this:

    LC_COLLATE
    copy "iso14651_t1"

and then you should only reorder the characters which are not correctly
ordered already, i.e. you should only make modifications to the default
collation order coming from "iso14651_t1", *not* write everything from
scratch.

I can try to help you with that and try to rewrite your LC_COLLATE.

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

(In reply to Mike FABIAN from comment #39)

Thank you, Mike. It is a little complicated, and I don't think I understand your point.
But if you can do it, that would be really appreciated.

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

Created attachment 12190
attachment-64689-0.html

What do we do now, dear Mike?

On Wed, Jan 8, 2020, 20:35 maiku.fabian at gmail dot com <
<email address hidden>> wrote:

> https://sourceware.org/bugzilla/show_bug.cgi?id=9809

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

You have

<U0640> IGNORE

in your sort order.

U+0640 ARABIC TATWEEL

Why IGNORE?

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

  %
  %
  % Other control characters etc. upto order_end
  %

Why do you sort control characters? These have nothing to do with
the Kurdish Sorani language.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Created attachment 12192
0001-Add-ckb_IQ-locale.patch

That is your original locale file as a patch

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Created attachment 12193
0002-Fix-ckb_IQ-Add-ckb_IQ-to-SUPPORTED-file-Add-ckb_IQ.U.patch

My suggested changes.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

    LC_MONETARY
   -int_curr_symbol "<U0049><U0051><U0044><U0020>"
   +int_curr_symbol "IQD "
    currency_symbol "<U062F><U002E><U0639>"
   -mon_decimal_point "<U002E>"
   -mon_thousands_sep "<U002C>"
   +mon_decimal_point "."
   +mon_thousands_sep ","
    mon_grouping 3
    positive_sign ""
   -negative_sign "<U002D>"
   +negative_sign "-"
    int_frac_digits 3
    frac_digits 3
    p_cs_precedes 1

For everything which is ASCII, it is allowed (and preferred) to write
the ASCII directly and not the code points.

I.e. it is better (because more readable) to write "-" instead of "<U002D>".

I hope in future this will be allowed also for non-ASCII characters,
at the moment it is only allowed for ASCII.
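
The rewrite shown in the diff above (ASCII code points written literally, everything else left in `<U....>` form) can be approximated with a small script. This is a sketch under assumptions: `simplify_ascii` and `KEEP_ESCAPED` are hypothetical names, and the decision to leave `"`, `<`, `>` and `/` encoded is a conservative guess made here because those characters have a syntactic role in locale sources that use `/` as escape character. It is not a rule stated in the thread.

```python
import re

# Matches one symbolic name such as <U002D> or <U0001F600>.
_UCS_NAME = re.compile(r"<U([0-9A-Fa-f]{4,8})>")

# Characters kept in <U....> form even though they are ASCII (assumption).
KEEP_ESCAPED = '"<>/'


def simplify_ascii(value):
    """Replace <UXXXX> names of printable ASCII characters by the characters."""
    def replace(match):
        cp = int(match.group(1), 16)
        if 0x20 <= cp < 0x7F and chr(cp) not in KEEP_ESCAPED:
            return chr(cp)
        return match.group(0)  # non-ASCII or reserved: leave untouched
    return _UCS_NAME.sub(replace, value)


print(simplify_ascii("<U0049><U0051><U0044><U0020>"))
print(simplify_ascii("<U062F><U002E><U0639>"))
```

The first call reproduces the `int_curr_symbol` change from the diff; the second shows that Arabic-script code points stay in symbolic form while the ASCII full stop between them becomes literal.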

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

     LC_MESSAGES
    -yesexpr "<U0628><U06D5><U06B5><U06CE>"
    -noexpr "<U0646><U06D5><U062E><U06CE><U0631>"
    +yesexpr "^[+1yY<U0628>]"
    +noexpr "^[-0nN<U0646>]"
     yesstr "<U0628><U06D5><U06B5><U06CE>"
     nostr "<U0646><U06D5><U062E><U06CE><U0631>"
     END LC_MESSAGES

"yesstr" and "nostr" are the words for "yes" and "no" in your language.

"yesexpr" should *not* be the same as "yesstr".

"yesexpr" should be a regular expression matching single letters
which could be typed as the response for "yes" when you get a prompt asking something like:

    "Do you want ...? (y/n)"

and when you type "y" in English, this means yes.

In *all* glibc locales we include +1yY in the "yesexpr" as long as this does not conflict with the language of that locale.
If "y" would suggest "no" in that language, we cannot add it to "yesexpr", but in all other cases we add it.

The same applies to "noexpr".
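
To make the difference between the expressions and the words concrete, here is a rough sketch of how a program might classify a typed answer with the two patterns from the diff above. It uses Python's `re` module as a stand-in for the POSIX regular expressions glibc actually works with; for these simple bracket expressions the two should agree, but that equivalence is an assumption. `classify_answer` is a hypothetical helper name.

```python
import re

# Patterns from the suggested LC_MESSAGES section, with <U0628> and <U0646>
# written as Python escapes.
YESEXPR = "^[+1yY\u0628]"
NOEXPR = "^[-0nN\u0646]"


def classify_answer(answer):
    """Return "yes", "no", or None, judging only by the first character."""
    if re.search(YESEXPR, answer):
        return "yes"
    if re.search(NOEXPR, answer):
        return "no"
    return None


# "y", the full Sorani word for "yes" and the full word for "no".
for reply in ["y", "\u0628\u06d5\u06b5\u06ce", "\u0646\u06d5\u062e\u06ce\u0631"]:
    print(classify_answer(reply))
```

Because the patterns are anchored and match a single leading character, both the one-letter reply and the full word are accepted, which is why "yesexpr" must not simply repeat "yesstr".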

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

     LC_ADDRESS
     postal_fmt "%z%c%T%s%b%e%r"
    -country_name "Iraq"
    -country_ab2 "<U0049><U0051>"
    -country_ab3 "<U0049><U0052><U0051>"
    -country_post "<U0049><U0052><U0051>"
    +country_name "<U0639><U06CE><U0631><U0627><U0642>"
    +country_ab2 "IQ"
    +country_ab3 "IRQ"
    +country_post "IRQ"
     country_num 368
    -country_car "<U0049><U0051>"
    +country_car "IQ"
    +lang_name "<U06A9><U0648><U0631><U062F><U06CC><U06CC> <U0646><U0627><U0648><U06D5><U0646><U
    +lang_term "ckb"
    +lang_lib "ckb"
     %
     END LC_ADDRESS

country_name should be the name of the country in your language (Sorani), *not* in English.

The English name is already in:

    territory "Iraq"

lang_name should be the name of your language in your language.

The English name is already in:

    language "Central Kurdish"

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

I rewrote the LC_COLLATE section to contain only the absolutely necessary stuff. Now it looks like this:

   LC_COLLATE
   % The Kurdish Sorani, Bahdini, and others dialects is mainly written using a modified (Arabic-based alphabet) with 33 letters.
   % Unlike the regular Arabic alphabet, which is an abjad, kurdish is an alphabet in which vowels are mandatory, making the script easy to read.
   %
   % The kurdish alphabet order is:
   % in Latin: a, b, c, ç, d, e, ê, f, g, h, i, î, j, k, l, ll, m, n, o, p, q, r, rr, s, sh, t, u, uu, v, w, x, y, z
   % vowels: A, E, I, O, U, UU
   %

   % Copy the template from ISO/IEC 14651
   copy "iso14651_t1"

   reorder-after <S0631> % ر
   <S0695> % ڕ

   reorder-after <S0646> % ن
   <S0648> % و
   <S06C6> % ۆ

   END LC_COLLATE

I.e. this sorts U+0695, U+0648, and U+06C6 differently from the default sort order.

The default sort order comes from

    copy "iso14651_t1"

You use this line to copy the default sort order and then add changes needed for your language.

According to what you wrote in your locale, the 3 characters U+0695, U+0648, and U+06C6 sort
differently than in the default sort order for Arabic characters; all the rest sort the same
as in the default sort order.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

If you do *not* use

   copy "iso14651_t1"

this is bad because then almost all Unicode characters which you do not cover by your own sort order will sort incorrectly. You want a reasonable default and apply the changes for your language to that default.

Of course your locale should sort Kurdish Sorani correctly, but it should not sort other characters (Cyrillic, Devanagari, ... whatever) completely silly.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Your locale also sorted many control characters and ASCII punctuation characters.

I think there is no reason to deviate from the default for these characters, therefore I removed them.

If you have a good reason why some of these need to be sorted differently for Kurdish, please tell me.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Your locale sorted the Kurdish numbers at the top, i.e. before the
Western numbers. The default order (as you can see in the ckb_IQ.UTF-8.in sorting test file in my patch) sorts these in between the Western numbers. Like this:

    0
    ٠
    1
    ١
    2
    ٢
    3
    ٣
    4
    ٤
    5
    ٥
    6
    ٦
    7
    ٧
    8
    ٨
    9
    ٩

That is reasonably good, isn’t it?

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Your locale also resorted all the ASCII letters to make upper case letters come first.

I.e.

A
a

instead of

a
A

Lower case first is what comes from

    copy "iso14651_t1"

When using CLDR for sorting, one can use an option
[caseFirst upper], see for example:

https://github.com/unicode-org/cldr/blob/master/common/collation/da.xml

glibc has no easy option to do that at the moment.

It is *possible* to sort A-Za-z differently in your locale *but*
if you do that you will get a weird order for all Latin characters you forget.
I.e. if you do not include äÄ in your sort order as well, they would still sort
lower case first. It is a lot of work to do this correctly for *all* Latin characters without a convenient option like CLDR’s [caseFirst upper],
I would recommend not doing that if it is not absolutely required.
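The problem with tailoring only A-Za-z can be illustrated with a toy model in Python. This is only a sketch: the three-part sort key (base letter, accent, case) is an assumption that loosely mimics multi-level collation, not glibc's actual algorithm, and both key functions are invented for the example.

```python
import unicodedata


def base_and_accent(ch):
    """Split a letter into its lower-cased base letter and an 'accented' flag."""
    decomposed = unicodedata.normalize("NFD", ch)
    return decomposed[0].lower(), len(decomposed) > 1


def default_key(ch):
    """Toy default order: lower case sorts before upper case for every letter."""
    base, accented = base_and_accent(ch)
    return (base, accented, ch.isupper())


def ascii_only_upper_first_key(ch):
    """Toy tailoring that flips the case order for ASCII letters only."""
    base, accented = base_and_accent(ch)
    if ch.isascii():
        case_weight = not ch.isupper()  # tailored: upper case first
    else:
        case_weight = ch.isupper()      # forgotten letters keep lower case first
    return (base, accented, case_weight)


# "\u00c4" is Ä and "\u00e4" is ä.
letters = ["\u00c4", "a", "\u00e4", "A"]
default_order = sorted(letters, key=default_key)
tailored_order = sorted(letters, key=ascii_only_upper_first_key)

print(default_order)   # a, A, ä, Ä
print(tailored_order)  # A, a, ä, Ä
```

In the tailored result the ASCII pair is upper case first while the forgotten ä/Ä pair is still lower case first, which is the inconsistent order the comment warns about.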

Revision history for this message
In , Aras (aras-noori) wrote :

(In reply to Mike FABIAN from comment #53)
> Your locale also resorted all the ASCII letters to make upper case letters
> come first.
>
> I.e.
>
> A
> a
>
> instead of
>
> a
> A
>
> Lower case first is what comes from
>
> copy "iso14651_t1"
>
> When using CLDR for sorting, one can use an option
> [caseFirst upper], see for example:
>
> https://github.com/unicode-org/cldr/blob/master/common/collation/da.xml
>
> glibc has no easy option to do that at the moment.
>
> It is *possible* to sort A-Za-z differently in your locale *but*
> if you do that you will get a weird order for all Latin characters you
> forget.
> I.e. if you do not include äÄ in your sort order as well, they would still
> sort
> lower case first. It is a lot of work to do this correctly for *all* Latin
> characters without a convenient option like CLDR’s [caseFirst upper],
> I would recommend not doing that if it is not absolutely required.

Hello Fabian,
Thanks for your suggestions and remarks. You are right about the sorting of (aA) and of the numbers; this should be modified.
The kurdish alphabet order is:

U+0626 ئ
U+0627 ا
U+0628 ب
U+067E پ
U+062A ت
U+062C ج
U+0686 چ
U+062D ح
U+062E خ
U+062F د
U+0631 ر
U+0695 ڕ
U+0632 ز
U+0698 ژ
U+0633 س
U+0634 ش
U+0639 ع
U+063A غ
U+0641 ف
U+06A4 ڤ
U+0642 ق
U+06A9 ک
U+06AF گ
U+0644 ل
U+06B5 ڵ
U+0645 م
U+0646 ن
U+0648 و
U+06C6 ۆ
U+0647 ھ
U+06D5 ە
U+06CC ی
U+06CE ێ

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

Thank you, Mike, your help is really appreciated.
Here are my answers to your questions and suggestions about our locale:

1. For the positive and negative signs I agree with you; let them be + and -.
2. For the regular expressions I did not know how to type them in my language; I hope you can help me solve this.
We have "ب" for Y in English and "ن" for N in English.
3. You are right, we now write Iraq in Kurdish (Sorani); this has been changed.
4. We have the Kurdish alphabet as Aras Noori wrote before my reply. I looked at iso14651_t1, and all characters used in Kurdish exist there; the characters that you added are from the Arabic language, not Kurdish.

Can you send the .dat file with your last changes?

Best Regards

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

> Thanks for your suggestions and remarks. You are right about the sorting of
> (aA) and of the numbers; this should be modified.

So sorting

a
A

and

0
٠
1
١
...

is OK? I hope so ...

> The kurdish alphabet order is:

To achieve that order, this is enough:

   copy "iso14651_t1"

   reorder-after <S0631> % ر
   <S0695> % ڕ

   reorder-after <S0646> % ن
   <S0648> % و
   <S06C6> % ۆ

I added the test file ckb_IQ.UTF-8.in in my patch, this file is sorted
using the rules of my patched ckb_IQ locale, the sorted result should
be the same as the original file, otherwise the test fails.

As the test passes, the above collation rules work and achieve the
order as in the ckb_IQ.UTF-8.in test file.

I’ll paste this test file here again for your easy reference:

0
٠
1
١
2
٢
3
٣
4
٤
5
٥
6
٦
7
٧
8
٨
9
٩
a
A
b
B
c
C
d
D
e
E
f
F
g
G
h
H
i
I
j
J
k
K
l
L
m
M
n
N
o
O
p
P
q
Q
r
R
s
S
t
T
u
U
v
V
w
W
x
X
y
Y
z
Z
ئ
ا
ب
پ
ت
ج
چ
ح
خ
د
ر
ڕ
ز
ژ
س
ش
ع
غ
ف
ڤ
ق
ک
گ
ل
ڵ
م
ن
و
ۆ
ه
ە
ی
ێ

Other characters not in this test file are sorted according to the defaults from

    copy "iso14651_t1"

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

(In reply to Jwtiyar Nariman from comment #56)
> Thank you, Mike, your help is really appreciated.
> Here are my answers to your questions and suggestions about our locale:
>
> 1. For the positive and negative signs I agree with you; let them be + and -.

Your original locale had the positive sign empty.
Probably a mistake. So I’ll make it + now.

> 2. For the regular expressions I did not know how to type them in my language;
> I hope you can help me solve this.
> We have "ب" for Y in English and "ن" for N in English.

That is what I used:

yesexpr "^[+1yY<U0628>]"
noexpr "^[-0nN<U0646>]"

So these regular expressions accept +, 1, y, Y, and ب as a yes answer,
and -, 0, n, N, and ن as a no answer.
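As a rough illustration of how a program could apply these two expressions to a typed response, here is a small Python sketch. glibc treats `yesexpr` and `noexpr` as POSIX extended regular expressions; Python's `re` module is used here only as a stand-in, which is fine for simple bracket expressions like these. The function name `classify` is made up for the example.

```python
import re

# The two expressions from the locale, with <U0628> and <U0646> written
# as the characters they stand for.
YESEXPR = re.compile("^[+1yY\u0628]")  # \u0628 is ب
NOEXPR = re.compile("^[-0nN\u0646]")   # \u0646 is ن


def classify(response):
    """Return 'yes', 'no', or 'unknown' based on the first character typed."""
    if YESEXPR.search(response):
        return "yes"
    if NOEXPR.search(response):
        return "no"
    return "unknown"


for answer in ["y", "\u0628", "N", "\u0646", "maybe"]:
    print(repr(answer), "->", classify(answer))
```

Only the first character matters because both expressions are anchored with `^` and match a single character from the bracket list.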

> 3. You are right, we now write Iraq in Kurdish (Sorani); this has been changed.
> 4. We have the Kurdish alphabet as Aras Noori wrote before my reply. I looked at
> iso14651_t1, and all characters used in Kurdish exist there; the characters
> that you added are from the Arabic language, not Kurdish.

I don’t understand. Most of these characters are used both in Arabic *and* Kurdish.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Created attachment 12194
ckb_IQ

> Can you send the .dat file with your last changes?

Here is the latest file with the changes I made.
I just added the + as the positive_sign.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Created attachment 12195
0001-Add-ckb_IQ-locale.patch

Updated patch.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Created attachment 12196
0002-Fix-ckb_IQ-Add-ckb_IQ-to-SUPPORTED-file-Add-ckb_IQ.U.patch

Updated patch.

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

(In reply to Mike FABIAN from comment #57)
> > Thanks for your suggestions and remarks. You are right about the sorting of
> > (aA) and of the numbers; this should be modified.
>
> So sorting
>
> a
> A
>
> and
>
> 0
> ٠
> 1
> ١
> ...
>
> is OK? I hope so ...
>
> > The kurdish alphabet order is:
>
> To achieve that order, this is enough:
>
> copy "iso14651_t1"
>
> reorder-after <S0631> % ر
> <S0695> % ڕ
>
> reorder-after <S0646> % ن
> <S0648> % و
> <S06C6> % ۆ
>
> I added the test file ckb_IQ.UTF-8.in in my patch, this file is sorted
> using the rules of my patched ckb_IQ locale, the sorted result should
> be the same as the original file, otherwise the test fails.
>
> As the test passes, the above collation rules work and achieve the
> order as in the ckb_IQ.UTF-8.in test file.
>
> I’ll paste this test file here again for your easy reference:
>
> 0
> ٠
> 1
> ١
> 2
> ٢
> 3
> ٣
> 4
> ٤
> 5
> ٥
> 6
> ٦
> 7
> ٧
> 8
> ٨
> 9
> ٩
> a
> A
> b
> B
> c
> C
> d
> D
> e
> E
> f
> F
> g
> G
> h
> H
> i
> I
> j
> J
> k
> K
> l
> L
> m
> M
> n
> N
> o
> O
> p
> P
> q
> Q
> r
> R
> s
> S
> t
> T
> u
> U
> v
> V
> w
> W
> x
> X
> y
> Y
> z
> Z
> ئ
> ا
> ب
> پ
> ت
> ج
> چ
> ح
> خ
> د
> ر
> ڕ
> ز
> ژ
> س
> ش
> ع
> غ
> ف
> ڤ
> ق
> ک
> گ
> ل
> ڵ
> م
> ن
> و
> ۆ
> ه
> ە
> ی
> ێ
>
> Other characters not in this test file are sorted according to the defaults
> from
>
> copy "iso14651_t1"

Sorting is good now, but I do not understand the added rules:

reorder-after <S0631> % ر
<S0695> % ڕ

reorder-after <S0646> % ن
<S0648> % و
<S06C6> % ۆ

For example, how does " <S0695> % ڕ " get ordered?

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

(In reply to Jwtiyar Nariman from comment #62)

> > Other characters not in this test file are sorted according to the defaults
> > from
> >
> > copy "iso14651_t1"
>
> Sorting is good now, but I do not understand the added rules:
>
> reorder-after <S0631> % ر
> <S0695> % ڕ
>
> reorder-after <S0646> % ن
> <S0648> % و
> <S06C6> % ۆ
>
> For example, how does " <S0695> % ڕ " get ordered?

copy "iso14651_t1"

contains

copy "iso14651_t1_common"

and some modifications which affect only Chinese and Japanese.

So we look into the iso14651_t1_common file to see what the default sort order is.

We find for example:

...
<S0631> % ARABIC LETTER REH
<S0632> % ARABIC LETTER ZAIN
<S0691> % ARABIC LETTER RREH
<S0692> % ARABIC LETTER REH WITH SMALL V
<S0693> % ARABIC LETTER REH WITH RING
<S0694> % ARABIC LETTER REH WITH DOT BELOW
<S0695> % ARABIC LETTER REH WITH SMALL V BELOW
<S0696> % ARABIC LETTER REH WITH DOT BELOW AND DOT ABOVE
...

Looking at this you see that ڕ U+0695 ARABIC LETTER REH WITH SMALL V BELOW
is sorted right after ڔ U+0694 ARABIC LETTER REH WITH DOT BELOW by default.
That is not what you want for Kurdish. For Kurdish, you want
ڕ U+0695 ARABIC LETTER REH WITH SMALL V BELOW to be sorted right after
ر U+0631 ARABIC LETTER REH.

This is achieved by the rule:

reorder-after <S0631> % ر
<S0695> % ڕ

Which removes U+0695 from its default position in the sort order
and inserts it again after U+0631.

reorder-after <S0646> % ن
<S0648> % و
<S06C6> % ۆ

does a similar thing to change the sorting of U+0648 and U+06C6.
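The "remove from the default position, insert again after the anchor" behaviour can be modelled with a plain list in Python. This is a simplified sketch: real collation symbols carry weights on several levels, and the function `reorder_after` below is invented for illustration, not taken from glibc.

```python
def reorder_after(order, anchor, moved):
    """Return a new list in which the `moved` symbols follow `anchor` directly."""
    remaining = [symbol for symbol in order if symbol not in moved]
    position = remaining.index(anchor) + 1
    return remaining[:position] + list(moved) + remaining[position:]


# Default order, copied from the iso14651_t1_common excerpt above.
default_order = [
    "S0631",  # ARABIC LETTER REH
    "S0632",  # ARABIC LETTER ZAIN
    "S0691",  # ARABIC LETTER RREH
    "S0692",  # ARABIC LETTER REH WITH SMALL V
    "S0693",  # ARABIC LETTER REH WITH RING
    "S0694",  # ARABIC LETTER REH WITH DOT BELOW
    "S0695",  # ARABIC LETTER REH WITH SMALL V BELOW
    "S0696",  # ARABIC LETTER REH WITH DOT BELOW AND DOT ABOVE
]

# Model of: reorder-after <S0631> followed by <S0695>
tailored_order = reorder_after(default_order, "S0631", ["S0695"])
print(tailored_order)
```

After the call, `S0695` sits directly after `S0631` and before `S0632`, and every other symbol keeps its relative position.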

To find out which of these rules I need, I created the ckb_IQ.UTF-8.in
test file first and wrote the Kurdish characters in the order you wanted
into that file.

Then I ran a test sort using a ckb_IQ locale which had *only*

LC_COLLATE
copy "iso14651_t1"
END LC_COLLATE

and *nothing* else.

The test sort showed that only U+0695, U+0648, and U+06C6 were sorted incorrectly.
All other characters from your list of Kurdish characters were sorted correctly
already. So I needed only to add rules to fix the sort order for these 3 characters.

You can see the same by just reading the iso14651_t1_common and find out which
of the Kurdish characters are already in the correct order in that file and which are not.
You have to do nothing for the characters which are already in correct order.
For the characters which are in a wrong position in iso14651_t1_common, you add
rules like

reorder-after <... collating-symbol after which to reorder ...>
<... the collating-symbol which should be reordered ...>

I found writing the test file and checking which characters are sorted
wrongly by default easier than staring at iso14651_t1_common. And it
is a good idea to have the test file anyway to make sure that the
Kurdish sort order always stays correct when something is changed in
glibc. If we have the test file, we will notice when some change causes a problem.
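The idea of the test file (an already sorted file must come out of the sort unchanged) can be sketched in Python as follows. This is not glibc's own test harness; the function names are invented, and `check_file` assumes that a `ckb_IQ.UTF-8` locale is installed on the system, which `locale.setlocale` reports with `locale.Error` if it is not.

```python
import locale


def first_misplaced(lines, key):
    """Return the index of the first line that sorts before its predecessor,
    or None if the lines are already in sorted order under `key`."""
    for index in range(1, len(lines)):
        if key(lines[index - 1]) > key(lines[index]):
            return index
    return None


def collation_key(locale_name):
    """Switch LC_COLLATE to `locale_name` and return its sort-key function.

    Raises locale.Error when the locale is not installed.
    """
    locale.setlocale(locale.LC_COLLATE, locale_name)
    return locale.strxfrm


def check_file(path, locale_name="ckb_IQ.UTF-8"):
    """Check that the file at `path` is sorted under the given locale."""
    with open(path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
    return first_misplaced(lines, collation_key(locale_name))
```

Calling `check_file("ckb_IQ.UTF-8.in")` on a system with the locale installed would return `None` when the file is in the expected order, and otherwise the index of the first line that is out of place.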

Changed in glibc:
status: Incomplete → In Progress
Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

Thank you very much, dear Mike, I understand it now. You did a great job, thanks again.
So now everything is ready to be accepted into glibc.

Best Regards

(In reply to Mike FABIAN from comment #63)
> (In reply to Jwtiyar Nariman from comment #62)
>
> > > Other characters not in this test file are sorted according to the defaults
> > > from
> > >
> > > copy "iso14651_t1"
> >
> > Sorting is good now, but I do not understand the added rules:
> >
> > reorder-after <S0631> % ر
> > <S0695> % ڕ
> >
> > reorder-after <S0646> % ن
> > <S0648> % و
> > <S06C6> % ۆ
> >
> > For example, how does " <S0695> % ڕ " get ordered?
>
> copy "iso14651_t1"
>
> contains
>
> copy "iso14651_t1_common"
>
> and some modifications which affect only Chinese and Japanese.
>
> So we look into the iso14651_t1_common file to see what the default sort
> order is.
>
> We find for example:
>
> ...
> <S0631> % ARABIC LETTER REH
> <S0632> % ARABIC LETTER ZAIN
> <S0691> % ARABIC LETTER RREH
> <S0692> % ARABIC LETTER REH WITH SMALL V
> <S0693> % ARABIC LETTER REH WITH RING
> <S0694> % ARABIC LETTER REH WITH DOT BELOW
> <S0695> % ARABIC LETTER REH WITH SMALL V BELOW
> <S0696> % ARABIC LETTER REH WITH DOT BELOW AND DOT ABOVE
> ...
>
> Looking at this you see that ڕ U+0695 ARABIC LETTER REH WITH SMALL V BELOW
> is sorted right after ڔ U+0694 ARABIC LETTER REH WITH DOT BELOW by default.
> That is not what you want for Kurdish. For Kurdish, you want
> ڕ U+0695 ARABIC LETTER REH WITH SMALL V BELOW to be sorted right after
> ر U+0631 ARABIC LETTER REH.
>
> This is achieved by the rule:
>
> reorder-after <S0631> % ر
> <S0695> % ڕ
>
> Which removes U+0695 from its default position in the sort order
> and inserts it again after U+0631.
>
> reorder-after <S0646> % ن
> <S0648> % و
> <S06C6> % ۆ
>
> does a similar thing to change the sorting of U+0648 and U+06C6.
>
> To find out which of these rules I need, I created the ckb_IQ.UTF-8.in
> test file first and wrote the Kurdish characters in the order you wanted
> into that file.
>
> Then I ran a test sort using a ckb_IQ locale which had *only*
>
> LC_COLLATE
> copy "iso14651_t1"
> END LC_COLLATE
>
> and *nothing* else.
>
> The test sort showed that only U+0695, U+0648, and U+06C6 were sorted
> incorrectly.
> All other characters from your list of Kurdish characters were sorted
> correctly
> already. So I needed only to add rules to fix the sort order for these 3
> characters.
>
> You can see the same by just reading the iso14651_t1_common and find out
> which
> of the Kurdish characters are already in the correct order in that file and
> which are not.
> You have to do nothing for the characters which are already in correct order.
> For the characters which are in a wrong position in iso14651_t1_common, you
> add
> rules like
>
> reorder-after <... collating-symbol after which to reorder ...>
> <... the collating-symbol which should be reordered ...>
>
> I found writing the test file and checking which characters are sorted
> wrongly by default easier than staring at iso14651_t1_common. And it
> is a good idea to have the test file anyway to make sure that the
> Kurdish sort order always stays c...

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

https://github.com/mike-fabian/langtable/releases/tag/0.0.51

I added ckb_IQ.UTF-8 to langtable to make it usable for installation on Fedora as soon as it is included in glibc.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

By the way, how do you input Kurdish Sorani? Do you use a keyboard layout? Or do you need an input method?

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

(In reply to Mike FABIAN from comment #67)
> By the way, how do you input Kurdish Sorani? Do you use a keyboard layout?
> Or do you need an input method?

Yes, we have one, and it is available.

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

(In reply to Mike FABIAN from comment #66)
> https://github.com/mike-fabian/langtable/releases/tag/0.0.51
>
> I added ckb_IQ.UTF-8 to langtable to make it usable for installation on
> Fedora as soon as it is included in glibc.

Our focus is now on Ubuntu because we have a lot of users on Ubuntu.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Created attachment 12212
0001-Add-new-locale-ckb_IQ-Kurdish-Sorani-spoken-in-Iraq-.patch

git log message changed according to Rafał Lużyński’s review.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Created attachment 12213
0002-Fix-ckb_IQ-BZ-9809.patch

Fixed according to Rafał Lużyński’s review.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

(In reply to Mike FABIAN from comment #71)
> Created attachment 12213 [details]
> 0002-Fix-ckb_IQ-BZ-9809.patch
>
> Fixed according to Rafał Lużyński’s review.

Changed to this according to Rafał Lużyński’s suggestion:

d_t_fmt "%A %d %b %Y, %I:%M:%S %p"

date_fmt "%A %d %B %Y, %Z %I:%M:%S %p"

All other changes are just whitespace and formatting.
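To see what these two format strings produce, here is a small Python sketch. It uses Python's default "C" time locale, so the day and month names come out in English; with the ckb_IQ locale active they would be the Kurdish names instead. The sample timestamp is arbitrary.

```python
from datetime import datetime, timezone

# Format strings copied from the comment above.
D_T_FMT = "%A %d %b %Y, %I:%M:%S %p"
DATE_FMT = "%A %d %B %Y, %Z %I:%M:%S %p"

# An arbitrary sample moment, timezone-aware so that %Z has a value.
moment = datetime(2020, 1, 13, 10, 6, 6, tzinfo=timezone.utc)

d_t_text = moment.strftime(D_T_FMT)
date_text = moment.strftime(DATE_FMT)

print(d_t_text)   # Monday 13 Jan 2020, 10:06:06 AM
print(date_text)  # Monday 13 January 2020, UTC 10:06:06 AM
```

The only differences between the two are the full month name (`%B` instead of `%b`) and the time zone (`%Z`) in `date_fmt`.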

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :
Revision history for this message
In , Aras (aras-noori) wrote :

(In reply to Mike FABIAN from comment #73)
> Rafał Lużyński’s review:
>
> https://sourceware.org/ml/libc-alpha/2020-01/msg00281.html

Thanks to your efforts, the locale is now ready to be added to the library.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

We have to wait until the release of glibc 2.31:

https://www.gnu.org/software/libc/
The current development version of glibc is 2.31, releasing on or around February 1st, 2020.

Revision history for this message
In , Cvs-commit (cvs-commit) wrote :

The master branch has been updated by Mike Fabian <email address hidden>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4267522f5e0309f7606a8d1da5d436a166a719e2

commit 4267522f5e0309f7606a8d1da5d436a166a719e2
Author: Jwtiyar Nariman <email address hidden>
Date: Mon Jan 13 10:06:06 2020 +0100

    Add new locale: ckb_IQ (Kurdish/Sorani spoken in Iraq) [BZ #9809]

Revision history for this message
In , Cvs-commit (cvs-commit) wrote :

The master branch has been updated by Mike Fabian <email address hidden>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ae199e7d6423ed3bd0c8669381966ca4c58f4f49

commit ae199e7d6423ed3bd0c8669381966ca4c58f4f49
Author: Mike FABIAN <email address hidden>
Date: Mon Jan 13 10:12:07 2020 +0100

    Fix ckb_IQ [BZ #9809]

    Add ckb_IQ to SUPPORTED file.
    Add ckb_IQ.UTF-8.in collation test file.
    Mention new ckb_IQ locale in NEWS.

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

Hey dear Mike
I have downloaded the new glibc 2.31 release but couldn't find the ckb_IQ localedata there. Was it not planned to be there?

best regards.

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

Created attachment 12261
fixed typo in wednesday name in kurdish

Just a typo now fixed, replaced U+0624 with U+0627 in the name of Wednesday in kurdish.

Best Regards.

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

Created attachment 12262
Added reorder-end command which missing

I added reorder-end because I could not compile the locale while this error was present.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

(In reply to Jwtiyar Nariman from comment #78)
> Hey dear Mike
> I have downloaded the new glibc 2.31 release but couldn't find the ckb_IQ
> localedata there. Was it not planned to be there?

Yes, of course, that’s what I wrote in

https://sourceware.org/bugzilla/show_bug.cgi?id=9809#c75

2.31 was already in code freeze, so I could push this only *after* 2.31 was released. Therefore, the target milestone of this bug is set to 2.32.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

(In reply to Jwtiyar Nariman from comment #80)
> Created attachment 12262 [details]
> Added reorder-end command which missing
>
> I added reorder-end because I could not compile the locale while this error was present.

Not needed anymore, I rewrote the whole LC_COLLATE section, see

https://sourceware.org/bugzilla/show_bug.cgi?id=9809#c49

If you want to do further changes, please look at what is in current
git master:

https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=localedata/locales/ckb_IQ;hb=refs/heads/master

And then send a *patch*, not the complete new file.

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

(In reply to Jwtiyar Nariman from comment #79)
> Created attachment 12261 [details]
> fixed typo in wednesday name in kurdish
>
> Just a typo now fixed, replaced U+0624 with U+0627 in the name of Wednesday
> in kurdish.
>
> Best Regards.

$ git diff
diff --git a/localedata/locales/ckb_IQ b/localedata/locales/ckb_IQ
index a18ff69cb7..238c381edf 100644
--- a/localedata/locales/ckb_IQ
+++ b/localedata/locales/ckb_IQ
@@ -124,7 +124,7 @@ abday "<U0634><U06D5><U0645>";/
 day "<U06CC><U06D5><U0643><U0634><U06D5><U0645><U0645><U06D5>";/
     "<U062F><U0648><U0648><U0634><U06D5><U0645><U0645><U06D5>";/
     "<U0633><U06CE><U0634><U06D5><U0645><U0645><U06D5>";/
-    "<U0686><U0648><U0624><U0631><U0634><U06D5><U0645><U0645><U06D5>";/
+    "<U0686><U0648><U0627><U0631><U0634><U06D5><U0645><U0645><U06D5>";/
     "<U067E><U06CE><U0646><U062C><U0634><U06D5><U0645><U0645><U06D5>";/
     "<U0647><U06D5><U06CC><U0646><U06CC>";/
     "<U0634><U06D5><U0645><U0645><U06D5>"
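The one-character difference in the diff above is easy to miss in `<UXXXX>` notation. A short Python sketch that compares the old and new day names code point by code point is given below; the helper name `differing_code_points` is invented for the example.

```python
import unicodedata

# Old and new names for Wednesday, copied from the diff above.
old_name = "\u0686\u0648\u0624\u0631\u0634\u06d5\u0645\u0645\u06d5"
new_name = "\u0686\u0648\u0627\u0631\u0634\u06d5\u0645\u0645\u06d5"


def differing_code_points(before, after):
    """List (index, old code point, new code point) for each changed position."""
    return [
        (index, f"U+{ord(old):04X}", f"U+{ord(new):04X}")
        for index, (old, new) in enumerate(zip(before, after))
        if old != new
    ]


changes = differing_code_points(old_name, new_name)
for index, before, after in changes:
    print(index, before, unicodedata.name(old_name[index]))
    print(index, after, unicodedata.name(new_name[index]))
```

The only change is at index 2, where U+0624 (ARABIC LETTER WAW WITH HAMZA ABOVE) is replaced by U+0627 (ARABIC LETTER ALEF).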

Revision history for this message
In , Cvs-commit (cvs-commit) wrote :

The master branch has been updated by Mike Fabian <email address hidden>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=eb948facd894e66429e2e170043b7d36fe445a8d

commit eb948facd894e66429e2e170043b7d36fe445a8d
Author: Mike FABIAN <email address hidden>
Date: Tue Feb 11 10:17:12 2020 +0100

    Fix typo in the name for Wednesday in Kurdish [BZ #9809]

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

Fixed in current master.

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

(In reply to Mike FABIAN from comment #85)
> Fixed in current master.

Thank you, dear Mike, for everything; your help is really appreciated.

Best Regards.

Changed in glibc:
status: In Progress → Fix Released
Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

Hey, dear Mike,
Will ckb_IQ be available in 2.32?
I think there is a problem with the reorder rules, as I mentioned before, due to this commit:
https://sourceware.org/git/?p=glibc.git;a=commit;h=3404def00a1b332080fa51044733f6ead0eae5f3

Best Regards

Revision history for this message
In , Mike FABIAN (mike-fabian) wrote :

(In reply to Jwtiyar Nariman from comment #87)
> Hey, dear Mike,
> Will ckb_IQ be available in 2.32?

Yes.

> I think there is a problem with the reorder rules, as I mentioned before, due
> to this commit:
> https://sourceware.org/git/?p=glibc.git;a=commit;
> h=3404def00a1b332080fa51044733f6ead0eae5f3
>
> Best Regards

This is fixed by the mentioned commit.

And even before the commit, it worked, it just printed a warning at build time.

Revision history for this message
In , Jwtiyar Nariman (jwtiyar) wrote :

Hey
2.32 is released and ckb is included. Thank you, everyone, especially Gunnar and Mike. I would like to know how Ubuntu will update to the latest 2.32?
