Style: т, п, г Bulgarian Cyrillic letters look unfamiliar in bg_BG locale

Bug #708578 reported by lokster on 2011-01-27
This bug affects 2 people
Affects: Ubuntu Font Family
Importance: Undecided
Assigned to: Unassigned

Bug Description

The Ubuntu Font Family has a number of Cyrillic letter forms; upright and italic glyphs differ considerably for some codepoints. In addition, the Ubuntu Font Family tries to support localised Cyrillic alphabets.

In Bulgarian, some of the forms look unfamiliar.

Original report:
I am not sure when this bug appears - sometimes the letter is correct, and sometimes - not.
I think it appears when using italic style, but not always...
The attached screenshot is taken from my forum, using the Ubuntu font.

lokster (lokiisyourmaster) wrote :
summary: - Cyrillic small letter "т" (latin: t) sometimes displays incorrect
+ Some Cyrillic letters are sometimes displayed incorrect - т, п, г

It appears that your software is requesting the Serbian/Macedonian
italic variants rather than standard Cyrillic italics. Fonts can alter
their behaviour through language and script specific OpenType features,
and that is what's happening here. Whether this is correct or not will
depend on the locale settings, and what the software believes the
correct language is.

Dave

This can be easily reproduced with Synaptic when a package is installed/upgraded - see attached screenshot.

I believe that my system is properly configured for Bulgarian language environment, but nevertheless the problem occurs - see attached screenshot.

I forgot to attach the screenshot in the previous post, sorry...

lokster (lokiisyourmaster) wrote :

My locale is correct too:
LANG=bg_BG.UTF-8
LC_CTYPE="bg_BG.UTF-8"
LC_NUMERIC="bg_BG.UTF-8"
LC_TIME="bg_BG.UTF-8"
LC_COLLATE="bg_BG.UTF-8"
LC_MONETARY="bg_BG.UTF-8"
LC_MESSAGES="bg_BG.UTF-8"
LC_PAPER="bg_BG.UTF-8"
LC_NAME="bg_BG.UTF-8"
LC_ADDRESS="bg_BG.UTF-8"
LC_TELEPHONE="bg_BG.UTF-8"
LC_MEASUREMENT="bg_BG.UTF-8"
LC_IDENTIFICATION="bg_BG.UTF-8"
LC_ALL=
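
For readers following along: POSIX resolves the effective locale for a category by checking LC_ALL first, then the category variable, then LANG. The empty LC_ALL above therefore means LC_CTYPE decides. A minimal sketch of that lookup, reduced to the bare language code; the function name effective_lang is made up for illustration, and the assumption that a given application derives its text language from LC_CTYPE (rather than, say, LC_MESSAGES) is exactly that, an assumption:

```shell
# Sketch only: resolve the language code from the locale environment.
# POSIX precedence: LC_ALL, then LC_CTYPE, then LANG; fall back to "C".
# effective_lang is a hypothetical helper, not part of any tool.
effective_lang() {
    loc=${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}
    loc=${loc%%.*}                # drop the codeset, e.g. ".UTF-8"
    loc=${loc%%@*}                # drop any "@modifier"
    printf '%s\n' "${loc%%_*}"    # drop the territory, e.g. "_BG"
}

# With the settings listed above, run in a subshell so the
# assignments do not leak into the calling shell:
( LC_ALL=; LC_CTYPE="bg_BG.UTF-8"; LANG="bg_BG.UTF-8"; effective_lang )
```

If this reports "bg" on the affected system, the environment is asking for Bulgarian, which points the question at the font's Bulgarian-specific substitutions rather than at the locale setup.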

adoa (adoa) wrote :

Well actually this is not really incorrect, according to Wikipedia. On your system the font seems to use the Serbian/Macedonian glyphs. I do not understand how or why the font chooses the correct or incorrect glyphs, but the glyphs in your screenshot are considered italic Cyrillic glyphs for Serbian/Macedonian. Other countries that use Cyrillic use different italic glyphs.

May I ask which language you actually speak/write and which language you use on your system?

See also
https://secure.wikimedia.org/wikipedia/en/wiki/Cyrillic_alphabet#Letterforms_and_typography

lokster (lokiisyourmaster) wrote :

We speak/write Bulgarian, and we use Bulgarian locale.

My locale output and speaking/writing language are identical to lokster's.

adoa (adoa) wrote :

Just an idea: Could the font be considering these glyphs to be correct for Bulgarian?

From time to time I use/read/write Russian texts with Cyrillic, even though my locale is German “de_DE.UTF-8”. I never saw those Serbian/Macedonian italic glyphs.

Hmmm. All of our references say that the Serbian/Macedonian style
italics are also used by Bulgarian, which is why the font is behaving
this way. But if you're telling us that all of our references are wrong,
we'll have to undo the OpenType features.

Dave

Maybe your sources are wrong. Unless my primary school teacher was wrong (who was actually my mother, so she can't be wrong) :)
Some Bulgarians use these symbols when handwriting, but as far as I know, they are not officially accepted.

This is clearly wrong. For comparison, see how the wrong letters from the Synaptic screenshot look in LibreOffice with the same font and style. This is the correct rendering of Bulgarian "г", "п", and "т".

For the Wikipedia article, the statement that Bulgarian has the same letterforms as Serbian and Macedonian comes from:
- 23 April 2009, first adding Bulgarian-Macedonian alongside Serbian
  http://en.wikipedia.org/w/index.php?title=Cyrillic_alphabet&diff=prev&oldid=285745744
- 20 September 2009 then splitting Bulgarian-Macedonian into Bulgarian and Macedonian
  http://en.wikipedia.org/w/index.php?title=Cyrillic_alphabet&diff=prev&oldid=315023109

Both edits are by Anonymous and without cited references.

This John Daggett post http://hacks.mozilla.org/2010/11/firefox-4-font-feature-support/ of November 2010 reiterates the statement, not sure if it's from another source.

In 1999 a thread on the Unicode mailing list covered the topic http://www.unicode.org/mail-arch/unicode-ml/Archives-Old/UML020/thread.html#804 but nobody mentioned Bulgarian having those Serbian/Macedonian letterforms.
The thread is summarized on http://jankojs.tripod.com/SerbianCyr.htm

This is very puzzling, as it's not just Wikipedia which makes the claims
about Bulgarian - I've even seen reference photographs of Bulgarian
street signs which use them for uprights. I'm wondering if, as the
earlier comment about cursive script confirmed, the confusion is coming
from written versus printed forms? We then face the issue of how closely
italic should follow cursive.

Either way, we need to investigate further, and come to a firm conclusion.

Dave

Well if the uprights are different in Bulgarian, it's another set of substitutions.
I guess http://typophile.com/node/32397 has some good examples.
However, it doesn't seem these variants are standard but rather font specific (see the different Macdonald's).

lokster (lokiisyourmaster) wrote :

Well, as I said, my mother is a primary school teacher. In Bulgaria. She teaches the kids how to write Bulgarian (for almost 30 years now), and it is not the way Wikipedia claims.
Even in handwritten text, these symbols are wrong, according to her textbooks.
Actually, only the fancy symbol for "т" is used sometimes. The strange symbols for "г" and "п" are NEVER used. I haven't seen them in Bulgarian text (especially the "г" symbol). Only in the Ubuntu font.
I don't know why some people use this "т" symbol when handwriting - it's not officially accepted nor widely used.
Maybe because in handwritten text it doesn't look so bad, and it's not a very big deal if someone is using it.
But on screen, books, magazines it makes the text unreadable and unnatural for Bulgarians.
By the way Macedonian language is actually Bulgarian with strong Serbian influence.
If I'm not wrong, it's not even accepted by our government :) Just saying...

Paul Sladen (sladen) on 2011-01-27
summary: - Some Cyrillic letters are sometimes displayed incorrect - т, п, г
+ Style: т, п, г Bulgarian Cyrillic letters look unfamiliar in bg_BG
+ locale
description: updated
Changed in ubuntu-font-family:
status: New → Incomplete
tags: added: uff-bulgarian uff-cyrillic uff-style
Paul Sladen (sladen) wrote :

Lokster: Thank you for raising this (and please also pass on thanks to your mother for her experience!). I'm trying to get some high-level context into this bug report, and also some definite hard facts about which codepoints we're discussing.

If we can narrow down the question/focus it'll be easier to get input from the Ubuntu Bulgarian Loco team (although I don't think there is that much traffic on the 'ubuntu-bg' list).

lokster (lokiisyourmaster) wrote :

"although I don't think there is that much traffic on the 'ubuntu-bg' list"
There isn't :)
I will post photos from my mom's textbooks tomorrow. Exactly the pages which show how these symbols are written in Bulgarian - for reference :)
I hope it will help at least a little to solve this, because I really like the way Ubuntu font looks.

See attached photo of a primer used in the elementary school in Bulgaria.

Paul Sladen (sladen) wrote :

for l in bg_BG ru_RU ; do for f in Ubuntu{" "Italic,}" 60" ; do pango-view --margin=50 --language $l --font="$f" --text "т п г | $l $f" & done ; done

results in the following four images. Are the standard Russian (cursive, italics) forms what you're used to, and expecting to see? Which of the four images are correct, and which are wrong?
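
The one-liner above relies on bash brace expansion: `Ubuntu{" "Italic,}" 60"` expands to the two font strings `Ubuntu Italic 60` and `Ubuntu 60`. A more readable sketch of the same loop follows. It only prints the four pango-view invocations instead of running them, so it works without Pango installed; the function name list_samples is hypothetical, and the flags are copied from the command above.

```shell
# Sketch only: spell out the four language/font combinations that the
# brace-expansion one-liner produces. list_samples is a made-up name.
list_samples() {
    for l in bg_BG ru_RU; do
        for f in "Ubuntu Italic 60" "Ubuntu 60"; do
            # Same options as the original command, printed rather than run.
            printf 'pango-view --margin=50 --language %s --font="%s" --text "т п г | %s %s"\n' \
                "$l" "$f" "$l" "$f"
        done
    done
}

list_samples
```

Because the original runs each pango-view in the background with `&`, the order in which the four windows appear is not guaranteed to match the loop order; the text rendered in each image (`$l $f`) is what identifies it.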

The first three are correct, the fourth is not.

Paul Sladen (sladen) wrote :

Svetlisashkov: thank you for the clear answer! Next question: do you /recognise/ the fourth forms in any capacity? If you do immediately recognise them, do you perceive them as being any of:

  * old forms, going out of fashion
  * super new forms (eg. not yet adopted)
  * imported forms (neighbouring countries/languages)
  * something else?

Paul Sladen: Unofficially and only on rare occasions some people draw the "т" letter as in the fourth example. But this is uncommon and not widely accepted. The other two letters as shown in the fourth example are plain alien and confusing to any Bulgarian user, especially the "г" letter.

Paul Sladen (sladen) wrote :

This PDF is the same layout as the school primer posted above, to make for easier comparison:

  http://launchpadlibrarian.net/62975117/IMG_7470.JPG (school handwriting primer)

and it does appear that the Russian cursive forms for т, п, г are closer than those currently set in the locl table for Bulgarian. Per Dave in comment #11 above, looking at Serbian cursive hands, they look similar to the above primer:

  http://en.wikipedia.org/wiki/File:Serbian_Cyrillic_cursive2.png
  http://en.wikipedia.org/wiki/File:Serbian_Cyrillic_cursive.png

Off-topic: I'm not particularly a fan of the 'locl' tables since they introduce this sort of variance and debugging, but if there's a case for using them, then there might be some scope for some further tweaks; eg. doing Г→D to match the current г→g cursive. Perhaps Ж and Т→m. Those who are native Bulgarian readers, what are your thoughts on б looking more like a 6 or a δ too?

Paul Sladen (sladen) wrote :

Source for above, generated with:

  pango-view --markup --output bg-school-primer-layout.{pdf,txt}

lokster (lokiisyourmaster) wrote :

@Paul Sladen if written correctly, the "б" symbol does not look like "6" (six). At least for Bulgarians :)
See the attached example.
Our "б" symbol must be closer to the Greek "delta" than to "6" (six).

Paul Sladen (sladen) wrote :

lokster: okay, brilliant. I believe that that is the case with 'б' already (excellent reassurance to know that it is at least part correct!):

  pango-view --markup --text '<span font="Ubuntu"><span lang="ru">Russian б</span> <span lang="bg">Bulgarian б</span></span>' --output be-ru-bg.pdf

Ah, so Bulgarian uses some of the variants, but not all - that at least
goes towards explaining the origin of this widely-reported "myth". It's
always good to know where bad information came from. It should be
trivial to adjust the fonts to match cultural expectations.

Dave

Will this be fixed by the Natty release date?

Shiraaz Gabru (shiraaz) on 2011-02-16
Changed in ubuntu-font-family:
milestone: none → 0.71
Paul Sladen (sladen) wrote :

Shiraaz, David: are we at the stage of having enough information available to actually implement this? (Swapping the alternates around is very easy compared with knowing /which/ should be the default.)

One of my old contacts at the University of Nottingham Slavonic Studies department suggested talking to John A Dunn at Glasgow as a starting point.

Lokster, Svetoslav: are you happy if these three get tweaked in this round, and then we review it again and see if there are any alternatives that can be improved the next time around too? (The aim is to get it correct, even if it takes a few goes!)

I think it's fair to conclude that our sources are wrong in their
assumptions about Bulgarian forms - right that there are differences,
wrong as to what they are. Unless any other Bulgarian speakers wish to
object to the plan at this point, I would regard it as safe to implement.

Dave

lokster (lokiisyourmaster) wrote :

I think there will be no objections to this change. At least, from Bulgarian-speakers.

Shiraaz Gabru (shiraaz) on 2011-02-16
Changed in ubuntu-font-family:
status: Incomplete → In Progress
Paul Sladen (sladen) wrote :

David: yup, question is, *what's the plan*? :-)

If we document "the plan" before we do it, we can hopefully do it. In this case, that means which mappings we're going to change, and what to.

lokster: it should be sooner, hopefully this month... but we'll have to wait and see.

Paul Sladen (sladen) on 2011-02-23
tags: added: uff-locl
Paul Sladen (sladen) wrote :

Screengrab of current (0.71) locale tables for Russian/Serbian/Bulgarian.

Please could everyone double-check that the locale mappings currently shipping in 0.71 (see screenshot) are correct.

Paul Sladen (sladen) wrote :

Pango source for above:

  pango-view --markup uff-cyrillic-locale-0.71.pango.txt
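
The attached Pango source itself is not reproduced in this thread. As a rough sketch of how a comparison file of that kind could be generated: the helper name make_markup, the choice of letters and the `sr` language tag are illustrative assumptions, while the `font` and `lang` span attributes are the ones already used in the pango-view example earlier in this report.

```shell
# Sketch only: emit one Pango markup line per language so the same
# letters can be compared under each language's localised forms.
# make_markup is a hypothetical helper; the letters are passed through
# unescaped, so they must not contain markup characters like < or &.
make_markup() {
    letters=$1
    shift
    for lang in "$@"; do
        printf '<span font="Ubuntu Italic" lang="%s">%s: %s</span>\n' \
            "$lang" "$lang" "$letters"
    done
}

make_markup 'б в г д п т' ru sr bg

# To render it, assuming pango-view is installed:
#   make_markup 'б в г д п т' ru sr bg > cyrillic-compare.txt
#   pango-view --markup cyrillic-compare.txt
```

Keeping the letters identical on every line means any visible difference between rows comes from the font's language-specific substitutions, not from the text.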

lokster (lokiisyourmaster) wrote :

I can confirm that at least the Bulgarian letters look OK now.

Paul Sladen (sladen) wrote :

This is a new upstream version of the Ubuntu Font Family. In addition to
the extensive bug fixes it doubles the number of .ttf files, from four
to eight with the inclusion of Light, Medium and italics.

Upstream changelog:

2011-03-08 (Paul Sladen) Ubuntu Font Family version 0.71.2

* (Production) Adjust Medium WeightClass to 500 (Md, MdIt) (LP: #730912)

2011-03-07 (Paul Sladen) Ubuntu Font Family version 0.71.1

* (Design) Add Capitalised version of glyphs and kern. (Lt, LtIt,
  Md, MdIt) DM (LP: #677446)
* (Design) Re-space and tighten Regular and Italic by amount specified
  by Mark Shuttleworth (minus 4 FUnits). (Rg, It) (LP: #677149)
* (Design) Design: Latin (U+0192) made straight more like l/c f with
  tail (LP: #670768)
* (Design) (U+01B3) should have hook on right, as the lowercase
  (U+01B4) (LP: #681026)
* (Design) Tail of Light Italic germandbls, longs and lowercase 'f'
  to match Italic/BoldItalic (LP: #623925)
* (Production) Update <case> feature (Lt, LtIt, Md, MdIt). DM
  (LP: #676538, #676539)
* (Production) Remove Bulgarian locl feature for Italics. (LP: #708578)
* (Production) Update Description information with new string:
  "The Ubuntu Font Family are libre fonts funded by Canonical Ltd
  on behalf of the Ubuntu project. The font design work and
  technical implementation is being undertaken by Dalton Maag. The
  typeface is sans-serif, uses OpenType features and is manually
  hinted for clarity on desktop and mobile computing screens. The
  scope of the Ubuntu Font Family includes all the languages used
  by the various Ubuntu users around the world in tune with
  Ubuntu's philosophy which states that every user should be able
  to use their software in the language of their choice. The
  project is ongoing, and we expect the family will be extended to
  cover many written languages in the coming years."
  (Rg, It, Bd, BdIt, Lt, LtIt, Md, MdIt) (LP: #690590)
* (Production) Pixel per em indicator added at U+F000 (Lt, LtIt, Md,
  MdIt) (LP: #615787)
* (Production) Version number indicator added at U+EFFD (Lt, LtIt, Md,
  MdIt) (LP: #640623)
* (Production) fstype bit set to 0 - Editable (Lt, LtIt, Md, MdIt)
  (LP: #648406)
* (Production) Localisation of name table has been removed because
  of problems with Mac OS/X interpretation of localisation. DM
  (LP: #730785)
* (Hinting) Regular '?' dot non-circular (has incorrect control
  value). (LP: #654336)
* (Hinting) Too much space after latin capital 'G' in 13pt
  regular. Now reduced. (LP: #683437)
* (Hinting) Balance Indian Rupee at 18,19pt (LP: #662177)
* (Hinting) Make Regular '£' less ambiguous at 13-15 ppem (LP: #685562)
* (Hinting) Regular capital 'W' made symmetrical at 31 ppem (LP: #686168)

Changed in ubuntu-font-family:
status: In Progress → Fix Released
lokster (lokiisyourmaster) wrote :

Correction. Not everything is OK. The Bulgarian "б" letter on the screenshot must look like the Russian "б", and not like the Serbian/Macedonian. It must not look like the Greek "delta" - I explained this in my previous posts.
It's not a very big difference, and most Bulgarians will not notice it (especially when using small font size), but it's wrong to write it that way nevertheless.
For reference see the photo in https://bugs.launchpad.net/ubuntu-font-family/+bug/708578/comments/20 and https://secure.wikimedia.org/wikipedia/en/wiki/File:Special_Cyrillics.png ("Standard" Cyrillic, not Macedonian/Serbian).

adoa (adoa) wrote :

I lived in Russia for some time but Russian is not my mother tongue, so I can only speak for the Russian forms:
I know that in Russia both variants (g and ∂) of the д are used as cursive forms. I cannot tell which one is more often used or even is expected. The cursive forms of the other letters are definitely correct for Russian language in the png-file posted by Paul.

Paul Sladen (sladen) wrote :

adoa: just to confirm (same thing, but phrased differently) "the Bulgarian and Russian forms should be identical for all letters, and in every way. The Serbian forms are not used at all" ?

Paul Sladen (sladen) wrote :

s/adoa:/lokster:/

lokster (lokiisyourmaster) wrote :

Actually, the Bulgarian cursive form must look like the Serbian, and the normal form must look like the Russian (if it's possible).
According to the school primer.
It makes sense to me, so I guess this is the right way.
But it's not a big deal actually. The most important part was that with the "т, п, г" letters.

Paul Sladen (sladen) wrote :

lokster: I've just been looking around for clear support that the style of Bulgarian "б" wants to be Russian-form for the Upright and Serbian-form for the Italic. eg. I've been looking through the following thread again:

  http://typophile.com/node/32397

Do you have any contacts in the academic world (study of language/type) or the design business, or would you be able to gather eg. photographs (in addition to the one sampler in comment #20)? Looking at the Typophile thread, the last post notes a tendency to raise 'в' to ascender height (and to add ascenders to other letters). Does that description fit?

lokster (lokiisyourmaster) wrote :

No, I don't have this kind of contacts, but I will try to gather some more photographic evidence.
However, bear in mind that the road signs, billboards etc. in Bulgaria are not the best source of typographic or linguistic information - they are known to have many (funny) errors which may go unnoticed by non-Bulgarians :)
Anyway, I want to say that you've done great work with ubuntu font family.

Paul Sladen (sladen) wrote :

Well we do collectively want it to be right for as many people and languages in the end as possible.

For the Bulgarian б, I'm hoping that we can collect a bunch of support "evidence", attach it to a (new) bug report just about the Bulgarian б upright/cursive and then it's clear for anyone who comes back, as to why the change was made (so hopefully avoiding switching back and forth too much while people are still gathering input).

Could you open the new bug report for that:

  http://launchpad.net/ubuntu-font-family/+filebug?field.title=Locale:+Bulgarian+lowercase+be+should+match+Russian+in+upright+and+Serbian+in+italic

Paul Sladen (sladen) wrote :

I've just been emailed about this one. The current status is that the language-specific variants have been adjusted as best as we can based on the feedback available and suggestions made (much of which was contradictory, with some people strongly saying it should be one form, and others strongly saying it should be the other form):

  http://design.canonical.com/2011/12/ubuntu-mono-cyrillic-fixes/

This isn't shipped yet, but hopefully in the next few months we can ship that and check if everyone is happy with the result.
