Comment 16 for bug 670758

Denis Moyogo Jacquerye (moyogo) wrote :

Bruno,
The characters listed come from decrees setting national alphabets (Benin, Burkina Faso, Chad, Mali, Nigeria, Senegal) and from orthography standards set by national linguists' organizations (Cameroon, Congo-Kinshasa) or by pan-African linguists (African Alphabet and African Reference Alphabet). Other sources include Hartell's Alphabets of Africa, many SIL alphabetization books, dictionaries, language learning books, and proposals to encode characters in Unicode.

Unicode blocks are not character sets; things like MES-1, MES-2 and MES-3B are. Character sets are subsets of Unicode and very often include characters from more than one Unicode block. The fact that the uppercase and lowercase forms of the same letter can be in different blocks should make that obvious. However, it is true that some character sets have been encoded in Unicode as blocks.
I'm not arguing that one shouldn't work by block; I'm just arguing that it's not the most practical approach from a language coverage point of view.
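
To illustrate the point about case pairs, here is a minimal Python sketch. Python's standard library does not expose block names, so the BLOCKS table is a hand-written assumption covering only the ranges needed here, and block_of and case_pair_blocks are hypothetical helpers, not part of any library.

```python
# Hand-written subset of Unicode block ranges (an assumption for this sketch;
# the authoritative list is Blocks.txt in the Unicode Character Database).
BLOCKS = [
    (0x0250, 0x02AF, "IPA Extensions"),
    (0x2C60, 0x2C7F, "Latin Extended-C"),
    (0xA720, 0xA7FF, "Latin Extended-D"),
]

def block_of(ch):
    """Return the block name for a single character, or 'other'."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "other"

def case_pair_blocks(upper):
    """Return (block of the uppercase letter, block of its lowercase form)."""
    return block_of(upper), block_of(upper.lower())

# Capital letters mentioned in this comment whose lowercase lives elsewhere.
for upper in ("\u2C64", "\u2C6D", "\uA78D"):
    up_block, low_block = case_pair_blocks(upper)
    lower = upper.lower()
    print(f"U+{ord(upper):04X} ({up_block}) -> U+{ord(lower):04X} ({low_block})")
```

With a reasonably recent Python, each of these capitals should map to a lowercase letter in the IPA Extensions block, which is why a block-by-block approach splits a single letter of an alphabet in two.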

Paul,
Yes, most non-composite characters are in the IPA Extensions block (27), and most diacritics are in the Combining Diacritical Marks block (16).
But you can also find, in the Latin Extended-C block:
U+2C64 LATIN CAPITAL LETTER R WITH TAIL used in Sudan
U+2C6D LATIN CAPITAL LETTER ALPHA used in Cameroon
U+2C72 LATIN CAPITAL LETTER W WITH HOOK and U+2C73 LATIN SMALL LETTER W WITH HOOK used in Burkina Faso

In the Latin Extended-D block:
U+A789 MODIFIER LETTER COLON used in Congo-Kinshasa, Kenya and Côte d’Ivoire
U+A78A MODIFIER LETTER SHORT EQUALS SIGN used in Congo-Kinshasa
U+A78D LATIN CAPITAL LETTER TURNED H used in Liberia

In the Combining Diacritical Marks Supplement block:
U+1DC6 COMBINING MACRON-GRAVE and U+1DC7 COMBINING ACUTE-MACRON used in Nigeria

In the Spacing Modifier Letters block:
U+02D7 MODIFIER LETTER MINUS SIGN and U+02EE MODIFIER LETTER DOUBLE APOSTROPHE used in Côte d’Ivoire

In the Latin Extended Additional block:
The 60 precomposed characters used in Nigeria, South Africa and other countries.
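
As a rough cross-check of the list above, the following Python sketch groups the individually named code points by block and prints their names from the standard unicodedata module. The BLOCKS table is again a hand-written assumption (unicodedata has no block lookup), and LISTED, block_of and group_by_block are hypothetical names used only for illustration.

```python
import unicodedata
from collections import defaultdict

# Hand-written subset of block ranges (assumption; see Blocks.txt in the UCD).
BLOCKS = [
    (0x02B0, 0x02FF, "Spacing Modifier Letters"),
    (0x1DC0, 0x1DFF, "Combining Diacritical Marks Supplement"),
    (0x2C60, 0x2C7F, "Latin Extended-C"),
    (0xA720, 0xA7FF, "Latin Extended-D"),
]

# Code points named individually in this comment.
LISTED = [
    0x2C64, 0x2C6D, 0x2C72, 0x2C73,
    0xA789, 0xA78A, 0xA78D,
    0x1DC6, 0x1DC7,
    0x02D7, 0x02EE,
]

def block_of(cp):
    """Return the block name for a code point, or 'other'."""
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "other"

def group_by_block(codepoints):
    """Map each block name to the listed code points that fall inside it."""
    groups = defaultdict(list)
    for cp in codepoints:
        groups[block_of(cp)].append(cp)
    return dict(groups)

for block, cps in group_by_block(LISTED).items():
    print(block)
    for cp in cps:
        print(f"  U+{cp:04X} {unicodedata.name(chr(cp))}")
```

A language-coverage check would start from a per-language character list like LISTED rather than from the block ranges, which is the approach argued for here.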