Diacritics: Glyphs for cedilla characters U+015E, U+015F, U+0162, U+0163 (ŞşŢţ) use commas

Bug #615565 reported by Mihai Capotă
Affects: Ubuntu Font Family
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

Rendered in 24pt Regular

Description:

Glyphs for U+015E, U+015F, U+0162, U+0163 are wrong. They use a comma instead of the cedilla that their Unicode names specify. The comma-below characters have their own code points, U+0218 to U+021B, and are in fact rendered properly.
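
The character names referred to in the description can be checked against the Unicode Character Database. A minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is only illustrative:

```python
import unicodedata

# Code points named in the report: cedilla forms and comma-below forms.
CEDILLA = [0x015E, 0x015F, 0x0162, 0x0163]
COMMA_BELOW = [0x0218, 0x0219, 0x021A, 0x021B]

def describe(code_point: int) -> str:
    """Return 'U+XXXX <char> <Unicode name>' for one code point."""
    char = chr(code_point)
    return f"U+{code_point:04X} {char} {unicodedata.name(char)}"

for cp in CEDILLA + COMMA_BELOW:
    print(describe(cp))
```

The first four names end in "WITH CEDILLA" and the last four in "WITH COMMA BELOW", which is the distinction the report expects the glyphs to follow.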

I don't know if this happens because of glyph definition or because of the use of locl or ccmp, but it should not happen as it causes confusion.

UA String:

Mozilla/5.0 (X11; U; Linux i686; ro; rv:1.9.2.8) Gecko/20100723 Ubuntu/10.04 (lucid) Firefox/3.6.8

Revision history for this message
Paul Sladen (sladen) wrote : Re: Diacritics: Cedilla glyphs (ŖŗŢţ) for U+015E, U+015F, U+0162, U+0163 use commas

I can confirm this, although not consistently. Sometimes ('Ŗŗ') are shown with commas, and sometimes ('Ţţ').

visibility: private → public
summary: - Glyphs for U+015E, U+015F, U+0162, U+0163 are wrong
+ Diacritics: Cedilla glyphs (ŖŗŢţ) for U+015E, U+015F, U+0162, U+0163 use
+ commas
Changed in ubuntufontbetatesting:
status: New → Confirmed
Revision history for this message
Paul Sladen (sladen) wrote :
Changed in ubuntufontbetatesting:
importance: Undecided → Medium
Revision history for this message
Mihai Capotă (mihaic) wrote :

My concern is about the Romanian characters.

I don't know about U+0156, U+0157 (Ŗŗ), it's not Romanian. Although I do see it with a cedilla as I'm typing this in the default Ubuntu 10.10 Firefox font.

summary: - Diacritics: Cedilla glyphs (ŖŗŢţ) for U+015E, U+015F, U+0162, U+0163 use
- commas
+ Diacritics: Glyphs for cedilla characters U+015E, U+015F, U+0162, U+0163
+ (ŞşŢţ) use commas
Revision history for this message
Malcolm Wooden (malcolm-daltonmaag) wrote :

The Ubuntu font does have comma accents for Romanian Rr & Tt.

Changed in ubuntu-font-family:
status: Confirmed → Invalid
Revision history for this message
Mihai Capotă (mihaic) wrote :

@Malcolm Wooden:

Thank you for the comment. However, the bug is about characters that should have cedillas and instead wrongly have commas.

Changed in ubuntu-font-family:
status: Invalid → Confirmed
Revision history for this message
Malcolm Wooden (malcolm-daltonmaag) wrote :

Mihai, Sorry for the confusion regarding R's.

The Unicode glyphs that you have queried (U+015E, U+015F, U+0162, U+0163) should have cedilla accents as defined in the Unicode Standard. In Ubuntu Beta U+015E, U+015F do have cedillas but U+0162, U+0163 have commas. We will change this.

Also according to the Unicode Standard, Romanians prefer commas to cedillas and define U+0218, U+0219, U+021A, U+021B as being for Romanian. To enforce this we include a local table that replaces U+015E, U+015F, U+0162, U+0163 with U+0218, U+0219, U+021A, U+021B in Romanian and Moldavian, but this may not work in all applications.

Changed in ubuntu-font-family:
status: Confirmed → Fix Committed
Revision history for this message
Mihai Capotă (mihaic) wrote :

Malcolm, thanks for fixing U+0162 and U+0163.

Regarding the local table substitution, I think it causes many problems because it makes documents *look* identical while they remain different. For example, search doesn't work.

The source of the problem is that many people still input text using the old cedilla characters. This is mainly on Windows XP which doesn't support the new comma characters.

I know that other fonts also use Romanian local table glyph substitution, but I think this is a bad practice that only generates confusion. Please disable it.
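
The search problem described above can be illustrated outside any font. The sketch below assumes a sample word, "știință", typed once with the legacy cedilla characters and once with the comma-below characters. It shows that a plain substring search fails, and that Unicode normalization does not merge the two spellings either, because the cedilla and comma-below letters decompose to different combining marks. The names `LEGACY`, `MODERN` and `matches` are illustrative only.

```python
import unicodedata
from typing import Optional

# The same word typed two ways; escapes keep the code points explicit.
LEGACY = "\u015ftiin\u0163\u0103"  # s/t with cedilla: U+015F, U+0163
MODERN = "\u0219tiin\u021b\u0103"  # s/t with comma below: U+0219, U+021B

def matches(needle: str, haystack: str, form: Optional[str] = None) -> bool:
    """Substring search, optionally after normalizing both strings."""
    if form is not None:
        needle = unicodedata.normalize(form, needle)
        haystack = unicodedata.normalize(form, haystack)
    return needle in haystack

for form in (None, "NFC", "NFD", "NFKC", "NFKD"):
    print(form, matches(MODERN, LEGACY, form))
```

Every line prints `False`: with glyph substitution the two strings can look identical on screen, yet they never compare equal.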

Revision history for this message
Mark Shuttleworth (sabdfl) wrote : Re: [Bug 615565] Re: Diacritics: Glyphs for cedilla characters U+015E, U+015F, U+0162, U+0163 (ŞşŢţ) use commas

 On 22/08/10 08:41, Mihai Capotă wrote:
> The source of the problem is that many people still input text using the
> old cedilla characters. This is mainly on Windows XP which doesn't
> support the new comma characters.
Is that problem declining as XP is phased out? Do modern OSes handle it
correctly?

Revision history for this message
Cristian Secară (secarica) wrote :

I support Mihai Capotă's opinion: please leave the above-mentioned characters with their individual behaviour and do not mix their appearance (glyphs).

Mixing them may cause some problems:
- confusion among users who know which is which
- problems if a text uses Turkish; although rarely expected, this *is* a situation to consider: the official Romanian keyboard layout standard explicitly supports minority languages (Turkish included) through dead keys, so that text in those languages, such as names of persons, places or streets, can be entered in official Romanian documents
- it will discourage use of the correct characters by people who will say "hey, my old legacy characters look fine anyway, why should I bother with the correct ones?"; we need to encourage the *use* of the correct characters, not merely their correct appearance

Please note that the Romanian language is Latin based and that all glyphs should resemble the Unicode abstract description of characters (1:1 correspondence).

Cristi

Revision history for this message
Mihai Capotă (mihaic) wrote :

On Sun, Aug 22, 2010 at 11:56, Mark Shuttleworth
<email address hidden> wrote:
>  On 22/08/10 08:41, Mihai Capotă wrote:
>> The source of the problem is that many people still input text using the
>> old cedilla characters. This is mainly on Windows XP which doesn't
>> support the new comma characters.
> Is that problem declining as XP is phased out? Do modern OSes handle it
> correctly?

Yes, Ubuntu handles it correctly and so do Windows Vista and 7, and Mac OS X.

But new text creation is only part of the problem. The cedilla
documents will exist for the foreseeable future. Replacing the actual
characters (as opposed to the glyphs, their aspect), is a very
difficult thing which may never happen.

Mihai
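
For text that is known to be Romanian, replacing the actual characters is mechanically simple; the difficulty described above lies in finding and converting every existing document. A minimal sketch under that assumption follows. The names `CEDILLA_TO_COMMA` and `to_comma_below` are hypothetical. Such a mapping must not be applied to Turkish text, where Ş and ş with cedilla are the correct letters.

```python
# Map the legacy cedilla characters to their comma-below counterparts.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # S with cedilla -> S with comma below
    "\u015F": "\u0219",  # s with cedilla -> s with comma below
    "\u0162": "\u021A",  # T with cedilla -> T with comma below
    "\u0163": "\u021B",  # t with cedilla -> t with comma below
})

def to_comma_below(text: str) -> str:
    """Convert legacy cedilla s/t to comma-below s/t (Romanian text only)."""
    return text.translate(CEDILLA_TO_COMMA)

legacy = "\u015ftiin\u0163\u0103"  # typed with cedilla characters
print(to_comma_below(legacy) == "\u0219tiin\u021b\u0103")
```

Unlike glyph substitution in the font, this changes the stored characters, so search and comparison work afterwards.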

Revision history for this message
Paul Sladen (sladen) wrote :

As I understand it, this whole mess stems from [TtSs]commaaccent being missed out of earlier Unicode standards, and [TtSs]cedilla being used in the interim: the locl table fixes being an attempt to smooth things over. Reading material:

  http://partners.adobe.com/public/developer/en/opentype/aglfn13.txt (first two 2003 changelogs)
  http://forum.fontlab.com/fontlab-studio-tips-and-tricks/handling-romanian-glyphs-in-opentype-fonts-updated-as-of-2009-t337.0.html (suggested best practice + example howto for Fontlab)

Assuming that we're doing that (ignoring the original issue with the wrong glyphs), then we're at least doing the same as everyone else and being consistent. ...It could well be that other parts of the stack need fixing in the long-run (keyboard input, toolkit, applications, ...) but that's not a reason to stunt deploying the advanced metadata tables now. The suggested localisation mapping on that page is:

  languagesystem latn dflt;
  languagesystem latn ROM;
  languagesystem latn MOL;

  feature locl { # Localized Forms
    # Latin
    language ROM exclude_dflt; # Romanian
    lookup locl_ROM {
      sub [Scedilla scedilla] by [uni0218 uni0219];
      sub [uni0162 uni0163] by [uni021A uni021B];
    } locl_ROM;
    language MOL exclude_dflt; # Moldavian
    lookup locl_ROM;
  } locl;
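
To make the effect of this mapping concrete, here is a toy Python model of it. It is not a real shaping engine and does not read any font. The glyph names are taken from the feature code above, while `DEFAULT_GLYPHS`, `LOCL_ROMANIAN`, `LOCL_LANGUAGES` and `glyph_names` are hypothetical names used only for illustration.

```python
# Default character-to-glyph mapping for the eight code points discussed.
DEFAULT_GLYPHS = {
    0x015E: "Scedilla", 0x015F: "scedilla",
    0x0162: "uni0162", 0x0163: "uni0163",
    0x0218: "uni0218", 0x0219: "uni0219",
    0x021A: "uni021A", 0x021B: "uni021B",
}

# Substitutions from the locl lookup, applied only for ROM and MOL.
LOCL_ROMANIAN = {
    "Scedilla": "uni0218", "scedilla": "uni0219",
    "uni0162": "uni021A", "uni0163": "uni021B",
}
LOCL_LANGUAGES = {"ROM", "MOL"}

def glyph_names(text: str, language: str = "dflt") -> list:
    """Return the glyph names a font with this locl lookup would display."""
    names = []
    for char in text:
        name = DEFAULT_GLYPHS.get(ord(char), f"uni{ord(char):04X}")
        if language in LOCL_LANGUAGES:
            name = LOCL_ROMANIAN.get(name, name)
        names.append(name)
    return names

cedilla_text = "\u015f\u0163"  # s and t with cedilla
print(glyph_names(cedilla_text))         # default language system
print(glyph_names(cedilla_text, "ROM"))  # Romanian language system
```

In this model the stored characters never change; only the displayed glyphs do, and only when the application tags the text as Romanian or Moldavian.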

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

 On 22/08/10 13:18, Paul Sladen wrote:
> As I understand it, this whole mess stems from [TtSs]commaaccent being
> missed out of earlier Unicode standards, and [TtSs]cedilla being used in
> the interim: the locl table fixes being an attempt to smooth things
> over. Reading material:
>
> http://partners.adobe.com/public/developer/en/opentype/aglfn13.txt (first two 2003 changelogs)
> http://forum.fontlab.com/fontlab-studio-tips-and-tricks/handling-romanian-glyphs-in-opentype-fonts-updated-as-of-2009-t337.0.html (suggested best practice + example howto for Fontlab)
>
> Assuming that we're doing that (ignoring the original issue with the
> wrong glyphs), then we're at least doing the same as everyone else and
> being consistent. ...It could well be that other parts of the stack
> need fixing in the long-run (keyboard input, toolkit, applications, ...)
> but that's not a reason to stunt deploying the advanced metadata tables
> now. The suggested localisation mapping on that page is:
>
> languagesystem latn dflt;
> languagesystem latn ROM;
> languagesystem latn MOL;
>
> feature locl { # Localized Forms
> # Latin
> language ROM exclude_dflt; # Romanian
> lookup locl_ROM {
> sub [Scedilla scedilla] by [uni0218 uni0219];
> sub [uni0162 uni0163] by [uni021A uni021B];
> } locl_ROM;
> language MOL exclude_dflt; # Moldavian
> lookup locl_ROM;
> } locl;

This looks like the best approach, assuming (a) that FontLab recommended
best practice is being more widely embraced and is the trajectory for OS
X, Win 7 etc, and (b) that's what we are currently doing or close to it.

Anyone from Dalton Maag able to comment? If we're in line with (a) and
(b) then I'm happy to close the bug.

Mark

Revision history for this message
Mihai Capotă (mihaic) wrote :

On Sun, Aug 22, 2010 at 16:04, Mark Shuttleworth
<email address hidden> wrote:
> This looks like the best approach, assuming (a) that FontLab recommended
> best practice is being more widely embraced and is the trajectory for OS
> X, Win 7 etc, and (b) that's what we are currently doing or close to it.

I can see how Dalton Maag may be reluctant to go against the current
font creation best practice, but:

(1) The best practice was wrong before.

Previously, the recommendation was to always use commas for U+0162,
U+0163 (Ţţ). This is what Dalton Maag initially did and fixed as a
response to this bug report. That meant that one of the Romanian
letters had a cedilla and the other had a comma. How was that good?

(2) There is no benefit for Ubuntu users, it only creates confusion.

Ubuntu already uses the correct characters for input. Not even print
(paper, plastic) designers would benefit from having the locl feature
since they already input the new characters.

(3) Cristian Secară, the creator of the only keyboard driver allowing
correct Romanian usage on Windows XP, backs me up. He's even
referenced on the matter of correct Romanian usage by Microsoft:

http://www.microsoft.com/romania/Diacritice.aspx

Outside of this bug, there is no need for extensive stack fixing in
Ubuntu. Everything else already works as it should.

Using glyph substitution does not "smooth things over". Why should
Romanians see letters in a different way than all other people? There
is no need to use it, other than complying with a broken best practice.

Mihai

Revision history for this message
Paul Sladen (sladen) wrote :

I can certainly see the advantage in not carrying unnecessary baggage, or variations that cause issues with debugging.

Mihai: could you pragmatically list what would need doing to get to /your/ ideal situation. (Just to delete the locl table overrides?)

Revision history for this message
Mihai Capotă (mihaic) wrote :

On Sun, Aug 22, 2010 at 19:45, Paul Sladen <email address hidden> wrote:
> Mihai: could you pragmatically list what would need doing to get to
> /your/ ideal situation.  (Just to delete the locl table overrides?)

Indeed. Just removing locl overrides for Romanian.

Apparently, there were two problems:

(1) U+0162, U+0163 (cedilla "T"s) were always using commas. Malcolm
Wooden already said in comment #6 that it will be fixed.

(2) U+015E, U+015F, U+0162, U+0163 (cedilla "S"es and "T"s) glyphs are
being replaced with U+0218, U+0219, U+021A, U+021B (comma) glyphs
through locl *only* when application language is Romanian.

Mihai

Revision history for this message
Cristian Secară (secarica) wrote :

On 2010-08-22, Paul Sladen wrote:

> As I understand it, this whole mess stems from [TtSs]commaaccent
> being missed out of earlier Unicode standards, and [TtSs]cedilla
> being used in the interim: the locl table fixes being an attempt to
> smooth things over. Reading material:
> http://partners.adobe.com/public/developer/en/opentype/aglfn13.txt
> (first two 2003 changelogs)

With this stupidity Adobe has done a great disservice to our culture. Along with other mistakes (some of which were on the Romanian side, such as a lack of standardization and a certain degree of incompetence), Adobe's "contribution" has delayed correct Romanian text by several years (I say ~10 years). After a lot of work on our part, together with a few Microsoft representatives, Windows Vista finally restored cedillas on ş and ţ (and added commas for ș and ț) at the glyph level [*], but Vista only came out in 2007 or so.

Cristi

[*] in most (not sure if in all) core fonts

Revision history for this message
Paul Sladen (sladen) wrote :

ubuntu-private-fonts (0.1.10~ppa1) maverick; urgency=low

  * New upstream release.
    - Regular:
      + Misplaced accent above lowercase y corrected. (LP: #619951)
      + Hyphen made 1 pixel heavier in sizes 32-36ppm.
      + Respaced hyphen in 8-16 ppm. (LP: #608758)
      + Respace s in 12-18ppm to look less like a 5.
      + Period made 1 pixel lighter at 27 ppm.
      + Lowercase w made wider at 22ppm.
      + Small size (8-9ppm) greyscale improved. (LP: #621456)
      + Accent positions improved where possible. (LP: #622385)
    - Regular & Italic:
      + x height increased 1 pixel at 20,22,24 ppm.
    - Bold Italic:
      + Some Greek Poly glyphs adjusted spacing.
    - All weights:
      + U+0162, U+0163 now have cedilla accents. (LP: #615565)

 -- Daniel Holbach <email address hidden> Wed, 25 Aug 2010 13:26:32 +0200

Changed in ubuntu-font-family:
status: Fix Committed → Fix Released
Revision history for this message
Paul Sladen (sladen) wrote :

Cristian, Mihai: can you check over the fonts in the 0.1.10~ppa1 release.

Revision history for this message
Mihai Capotă (mihaic) wrote :

I don't have access to the testing PPA. I'm checking using the website http://fonttest.design.canonical.com.

I can see the change advertised in the change list. U+0162, U+0163 do have cedilla accents, but *not* when the language is Romanian. Exactly as Malcolm Wooden wrote in comment 6.

This is a step towards the fix. The removal of the locl substitutions for Romanian would now fix the bug.

Revision history for this message
Paul Sladen (sladen) wrote :

Mihai: you would be most welcome to apply to join the 'ubuntu-typeface-interest' team. Full instructions on:

  https://wiki.ubuntu.com/Ubuntu%20Font%20Family#Howto

Then I think we should open a /new/ separate bug specifically about the locl substitution removal, so we can have the satisfaction of at least having managed to close this one, and greater clarity about what is needed.

Revision history for this message
Mihai Capotă (mihaic) wrote :

I applied to join the ubuntu-typeface-interest team.

I tried opening a new bug specifically about the glyph substitution, but I get "We're sorry, but something went wrong" at http://fonttest.design.canonical.com/submit and "Not allowed here" at https://bugs.launchpad.net/ubuntu-font-family/+filebug. Any suggestions? Should I report a bug against launchpad? :)

Revision history for this message
Mihai Capotă (mihaic) wrote :

After being accepted as a member of the ubuntu-typeface-interest team, I could submit the new bug report as Paul Sladen suggested. It's bug #635615.

Thanks for fixing U+0162/U+0163.

Paul Sladen (sladen)
Changed in ubuntu-font-family:
milestone: none → 0.009