Chinese characters unexpectedly switch fonts in WebKit-GTK

Reported by Jimhu on 2010-01-03
This bug affects 3 people
Affects / Status / Importance / Assigned to / Milestone
Fontconfig: Confirmed, Medium
Ubuntu Software Center: Undecided, Unassigned
Ubuntu Translations: Medium, Unassigned
webkit (Ubuntu): Undecided, Unassigned
Nominated for Karmic by Aron Xu
Nominated for Lucid by Aron Xu

Bug Description

Ubuntu Software Center 1.1.7, Epiphany 2.28.0, Midori 0.1.9 (Ubuntu 9.10)

In WebKit-GTK, Chinese text varies fonts unexpectedly.

For example, in Ubuntu Software Center's main screen, where the Office department is labelled "办公", "办" is in the WQY font while "公" is in the Ukai font. And where the Universal Access department is labelled "全局访问", the "全局" and "访问" are in different fonts.
    http://launchpadlibrarian.net/37382124/Lesson07_images_033.png

You can see the same problem in the Epiphany and Midori browsers, which also use WebKit-GTK, by copying and pasting this into their address field:
    data:text/html,<meta%20http-equiv="Content-Type"%20content="text/html;%20charset=utf-8">办公%20全局访问

Firefox, AbiWord, and OpenOffice.org do not have the same problem. Interestingly, neither does Chromium 4.0.293.0 (35769, from the Chromium daily PPA), which suggests that it may be a WebKit-GTK bug that has been fixed since the version packaged in Ubuntu 9.10.


Created attachment 24321
mixture of Han glyphs from Japanese and Chinese fonts under en locale

The font order in 65-nonlatin.conf in fontconfig has many issues and is causing more and more trouble for CJK language support. Here is a summary of the problems I found:

1. mixing proprietary fonts with free fonts

In this file, "MS Gothic", "SimSun", "PMingLiu", "HanyiSong" and "MS 明朝" are proprietary fonts. As far as I know, no Linux distro has received permission from the copyright owners to use these fonts. Giving a higher priority to proprietary fonts increases users' dependence on them, encourages font piracy, and reduces user feedback for FLOSS font development.

In addition, "SimSun" was a popular "pirate" Chinese font among Chinese Linux users about 5 years ago because of its embedded bitmaps, but over the past 4 years the WenQuanYi project has developed high-quality open-source bitmap fonts and sans-serif Chinese fonts, which are becoming far more popular than SimSun and most other proprietary Chinese fonts.

Also, AFAIK, "ZYSong18030" was only licensed to Redhat 9, from Zhong Yi Beijing Inc., and this font has no embedded bitmaps. Therefore, the user group of this font is quite small.

2. sans-serif and serif used the same font order

CJK fonts have styles such as Song (Mincho, Ming, or Batang) and Hei (Gothic or Dotum), which correspond to sans-serif and serif fonts in the Latin world. "Kai" is a style that more or less corresponds to italic or script. However, these fonts are ordered the same way in both the serif and sans-serif blocks of 65-nonlatin.conf. The proper order would be

for serif:
  bitmap Chinese fonts (style independent) > Song/Ming > Mincho/Batang > Hei > Gothic/Dotum > Kai > system fallback (GNU Unifont exp.)

for sans-serif:
  bitmap Chinese fonts (style independent) > Hei > Gothic/Dotum > Song/Ming > Mincho/Batang > Kai > system fallback (GNU Unifont exp.)
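
As a rough sketch of what the two orders could look like in fontconfig terms (the family names below are only examples picked for each style, not a final list), the serif and sans-serif aliases would carry different prefer lists:

```xml
<!-- Illustrative only: one example family per style, not a proposed final list. -->
<alias>
  <family>serif</family>
  <prefer>
    <family>WenQuanYi Bitmap Song</family> <!-- bitmap, style independent -->
    <family>AR PL UMing CN</family>        <!-- Song/Ming -->
    <family>Sazanami Mincho</family>       <!-- Mincho -->
    <family>WenQuanYi Zen Hei</family>     <!-- Hei -->
    <family>Sazanami Gothic</family>       <!-- Gothic -->
    <family>AR PL UKai CN</family>         <!-- Kai -->
  </prefer>
</alias>
<alias>
  <family>sans-serif</family>
  <prefer>
    <family>WenQuanYi Bitmap Song</family> <!-- bitmap, style independent -->
    <family>WenQuanYi Zen Hei</family>     <!-- Hei -->
    <family>Sazanami Gothic</family>       <!-- Gothic -->
    <family>AR PL UMing CN</family>        <!-- Song/Ming -->
    <family>Sazanami Mincho</family>       <!-- Mincho -->
    <family>AR PL UKai CN</family>         <!-- Kai -->
  </prefer>
</alias>
```

The only difference between the two blocks is which style group comes first after the bitmap font; the system fallback would follow at the end of each list.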

I will explain the order for CJK fonts below.

3. fonts with lower unicode coverage and low quality were placed in front of more complete and polished ones

Japanese and Korean fonts usually contain only 6000 Han glyphs, while Chinese fonts typically cover 20000. Because 65-nonlatin puts many Japanese fonts in front of Chinese fonts, a block of text with Han glyphs is often rendered as a mixture of Gothic, Mincho, Song and Kai glyphs, which looks horrible. See the attached screen capture.

I suggest putting Chinese fonts in front of Japanese/Korean fonts. When Pango fails to identify the text as Chinese (which happens when rendering Han text under non-CJK locales), at least the text is rendered with a consistent font (despite the z-variant differences). If Pango can determine the language, language-specific fontconfig rules can then set the font order (such as the language-selector-xx files in Ubuntu).

4. order the font based on readability

The readability of Chinese fonts is a very complex problem. It is both technology (screen resolution, hinting techniques etc) and fashion (font styles from MS and Mac strongly influences Linux users) dependent. Therefore, it is constantly changing. More "modern" Chinese us...


A correction:

Song (Mincho, Ming, or Batang), or Hei (Gothic or Dotum) correspond to serif and sans-serif fonts, respectively. My original post has the reversed order.


(In reply to comment #0)
> Created an attachment (id=24321) [details]
> mixture of Han glyphs from Japanese and Chinese fonts under en locale

Right, this problem is well known.

> In this file, "MS Gothic","SimSun","PMingLiu", "HanyiSong" and "MS 明朝"
> are proprietary fonts. As far as I know, none of the Linux distros received
> permission to use these fonts from the copyright owners of the fonts. Giving a
> higher priority to proprietary fonts will increase the user's dependency to
> them, encouraging font piracy and reduce user feedbacks for FLOSS font
> development.
>
> In addition, "SimSun" used to be a popular "pirate" Chinese fonts for Chinese
> Linux users about 5 years ago due to the embedded bitmaps, but in the past 4
> years, WenQuanYi project has developed high quality open-source bitmap fonts
> and sans-serif Chinese fonts, and getting far more popular than SimSun and most
> other proprietary Chinese fonts.
>
> Also, AFAIK, "ZYSong18030" was only licensed to Redhat 9, from Zhong Yi Beijing
> Inc., and this font has no embedded bitmaps. Therefore, the user group of this
> font is quite small.

Agreed. I think the proprietary fonts should at least be moved to a non-free .conf file,
which should have lower priority than the free ones. This would be a good opportunity
to clean up 65-nonlatin.conf.

> 2. sans-serif and serif used the same font order

Agree on the idea of correcting this.

> for serif:
> bitmap Chinese fonts (style independent) > Song/Ming > Mincho/Batang > Hei >
> Gothic/Dotum > Kai > system fallback (GNU Unifont exp.)
>
> for sans-serif:
> bitmap Chinese fonts (style independent) > Hei > Gothic/Dotum > Song/Ming >
> Mincho/Batang > Kai > system fallback (GNU Unifont exp.)

I think using bitmap before outline is a bad idea for JK anyway.

I suggest having a separate switch to turn on bitmap in the fontconfig rules perhaps.
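
A minimal sketch of such a switch, assuming it lives in its own file that can be enabled or disabled by adding or removing its symlink in conf.d (as far as I know this is the same mechanism the stock 70-no-bitmaps.conf uses):

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <!-- Reject every font that is not scalable, i.e. all bitmap fonts. -->
  <selectfont>
    <rejectfont>
      <pattern>
        <patelt name="scalable"><bool>false</bool></patelt>
      </pattern>
    </rejectfont>
  </selectfont>
</fontconfig>
```

With this file active, bitmap fonts drop out of matching entirely, so their position in the 65-nonlatin order stops mattering; users who want bitmaps simply leave the file disabled.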

> 3. fonts with lower unicode coverage and low quality were placed in front of
> more complete and polished ones

("Quality" may be subjective - anyway CJK respective styles are too different to
allow a common shared font.)

> Japanese and Korean fonts usually contains only 6000 Han glyphs, while Chinese
> fonts, the typically charset is typically 20000. Because 65-nonlatin puts many
> Japanese fonts in front of Chinese fonts, when rendering a block of text with
> Han glyphs, one often see a mixture of Gothic, Mincho, Song and Kai glyphs,
> which looks horrible.

Nod

> I suggest to put Chinese fonts in front of Japanese/Korean fonts. When Pango
> fail to determine the Chinese text (which happens when rendering Han text under
> non-CJK locales), at least we can render the text with a consistent font
> (despite the z-variant differences). If Pango can determine the language, then,
> use language specific fontconfig rules to set the font order later

Sounds reasonable enough.

> 4. order the font based on readability
>
> The readability of Chinese fonts is a very complex problem. It is both
> technology (screen resolution, hinting techniques etc) and fashion (font styles
> from MS and Mac strongly influences Linux users) dependent. Therefore, it is
> constantly changing. More "modern" Chinese us...


Created attachment 24952
proposed 65-nonlatin.conf

Please find in the attachment the proposed font orders for CJK languages.

A few comments about this file:

1. the original file was taken from Behdad's branch at
http://cgit.freedesktop.org/~behdad/fontconfig/tree/conf.d/65-nonlatin.conf

2. I only touched CJK fonts. As I know nothing about the non-CJK languages, I replaced the old CJK list with my new font list and kept everything else the same.

3. the fonts were ordered pretty much based on my previous comment, in short:
    A. free > non-free
    B. screen (CJK bitmap) fonts > print fonts
    C. Larger coverage > smaller coverage (CJK Unifonts > CJK specific fonts)
    D. for monospace, sans > serif, where sans has better readability
    E. for the same language, fonts with better "quality" are preferred

4. for "E" in (3), I would like to hear more input from CJK users.

5. I strongly suggest removing all the non-free fonts, or at least moving them to a separate file (besides, they date from the XP era and are very outdated).

6. This file describes the general fallback path and should not make any assumptions about the desktop locale. Language-specific font config files, such as the language-selector files in Ubuntu, are therefore the preferred place to fine-tune the font order.

Here is the CJK block I extracted from the serif block, as an example to my suggested changes. For each family, a comment line with the format of
{license, coverage, type, intended use, aliases, major lang-tag} is listed above the font name.

*GB18030(27514 glyphs)=CJK unified ideographs+CJK Ext A
*GBK(20932)=CJK unified ideographs
*GB2312(6763)=simplified Chinese minimum charset
*Big5(~13000)=traditional Chinese minimum charset
*HKSCS(vary)=HK Han glyphs scattered in CJK, CJK Ext.A and Ext.B

------------------------------------------------

<!-- ### block 1: Screen fonts ### -->
  <!-- free, GB18030, bitmap, screen font, sans/serif, zh-cn,zh-tw -->
 <family>WenQuanYi Bitmap Song</family> <!-- han (zh-cn,zh-tw) -->

<!-- ### block 2: Song/Mincho/Batang print fonts ### -->
  <!-- free, GB2312+Big5+HKSCS, vector, print font, serif, zh-cn,zh-tw -->
 <family>AR PL ShanHeiSun Uni</family> <!-- han (zh-cn,zh-tw) -->
 <family>AR PL UMing CN</family> <!-- han (zh-cn,zh-tw) -->
 <family>AR PL New Sung</family> <!-- han (zh-cn,zh-tw) -->
  <!-- free, GB2312, vector, print font, serif, zh-cn -->
 <family>AR PL SungtiL GB</family>
  <!-- free, Big5, vector, print font, serif, zh-tw -->
 <family>AR PL Mingti2L Big5</family>
  <!-- free, GB2312+Big5+HKSCS, vector, print font, serif/cursive, zh-cn,zh-tw -->
 <family>AR PL Zenkai Uni</family>
  <!-- free, JIS, vector, print/screen font, serif, ja -->
 <family>IPAMonaPMincho</family>
 <family>IPAPMincho</family>
  <!-- free, JIS, vector, print/screen font, serif, ja -->
 <family>Sazanami Mincho</family>
  <!-- free, JIS, vector, print/screen font, serif, ja -->
 <family>Kochi Mincho</family>
  <!-- free, KR, vector, print/screen font, serif, ko -->
 <family>Baekmuk Batang</family> <!-- han (ko) -->
  <!-- free, KR, vector, print/screen font, serif, ko -->
 <family>UnBatang</family> <!-- han (ko) -->

<!-- ### block 3: Hei/Gothic/Dotum p...


(In reply to comment #3)
> B. screen (CJK bitmap) fonts > print fonts

Should screen fonts really be bitmap - I would prefer to separate out bitmap fonts.

> <!-- ### block 1: Screen fonts ### -->
> <!-- free, GB18030, bitmap, screen font, sans/serif, zh-cn,zh-tw -->
> <family>WenQuanYi Bitmap Song</family> <!-- han (zh-cn,zh-tw) -->

I personally don't think a bitmap font should be preferred for CJK.
Though it should be made easy for users to use bitmaps if they want to.

> <!-- ### block 2: Song/Micho/Batang print fonts ### -->
> <!-- free, GB2312+Big5+HKSCS, vector, print font, serif, zh-cn,zh-tw -->
> <family>AR PL ShanHeiSun Uni</family> <!-- han (zh-cn,zh-tw) -->
> <family>AR PL UMing CN</family> <!-- han (zh-cn,zh-tw) -->
> <family>AR PL New Sung</family> <!-- han (zh-cn,zh-tw) -->

Isn't UMing newer than ShanHeiSun?

(In reply to comment #4)

> Should screen fonts really be bitmap - I would prefer to separate out bitmap
> fonts.
>
> I personally don't think a bitmap font should be preferred for CJK.
> Though it should be made easy for users to use bitmaps if they want to.
>

The sole purpose of those bitmaps is screen use (obviously, they are not good for print). If you put them at the back, these fonts will basically never be used even when they are installed, unless the font boosts itself to the front with its own config file, as wqy-bitmap-fonts does. But that just makes the font swamp messier. Do you really want to go that way?

>
> Isn't UMing newer than ShanHeiSun?
>

that's true.

I like the general direction this bug is heading. Quick comments:

  - Time to put CJK stuff in its own file?

  - Helps immensely if you also tag each font in the comments whether it has Latin / Arabic / any non-CJK glyphs or not.

  - I think non-free fonts should have higher priority than free fonts. If a user installs non-free fonts, chances are they want to use it. For all other users though, it's a non-issue since they only have free fonts so the order doesn't matter.

(In reply to comment #6)
> - I think non-free fonts should have higher priority than free fonts. If a
> user installs non-free fonts, chances are they want to use it. For all other
> users though, it's a non-issue since they only have free fonts so the order
> doesn't matter.

How about having non-free fonts in a separate file though?

(In reply to comment #7)
> (In reply to comment #6)
> > - I think non-free fonts should have higher priority than free fonts. If a
> > user installs non-free fonts, chances are they want to use it. For all other
> > users though, it's a non-issue since they only have free fonts so the order
> > doesn't matter.
>
> How about having non-free fonts in a separate file though?

How would that help?

> How would that help?

Well it would make clear which fonts are free and which not...
if it does not make the priority numbering more complicated.

It would also allow people to turn non-free fonts "on" and "off" from fontconfig.

(Though I agree it is not the main issue here.)
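
To make the idea concrete, here is a sketch of what such a separate file could look like. The file name is hypothetical and the two families are just taken from the list quoted above; `<accept>` places them after the generic name, so anything in a `<prefer>` list still wins:

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- Hypothetical file name: 66-nonfree-cjk.conf -->
<fontconfig>
  <alias>
    <family>serif</family>
    <accept>
      <family>SimSun</family>
      <family>PMingLiu</family>
    </accept>
  </alias>
</fontconfig>
```

Removing the file (or its symlink in conf.d) turns the non-free entries "off" without touching the main CJK list. Switching `<accept>` to `<prefer>` would instead put them in front of the generic name.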

(In reply to comment #6)
> I like the general direction this bug is heading. Quick comments:
>
> - Time to put CJK stuff in its own file?

that seems to be fine, something like 65-cjk.conf

>
> - Helps immensely if you also tag each font in the comments whether it has
> Latin / Arabic / any non-CJK glyphs or not.

almost all of them have Latin (basic), but rarely have Arabic. I will add more
comment when I get chance this weekend.

> - I think non-free fonts should have higher priority than free fonts. If a
> user installs non-free fonts, chances are they want to use it. For all other
> users though, it's a non-issue since they only have free fonts so the order
> doesn't matter.

As I said, I personally haven't heard of any official license from these
font makers to use their fonts on a Linux desktop. If anyone installs and uses
these fonts, it is very likely illegal. In other words, putting them in the
conf files simply makes unlicensed use of commercial fonts easier, and of
course, OSS font development projects will potentially lose users and feedback.

In the long run, the Linux desktop needs more high-quality CJK fonts, and these
fonts are less likely to come from the commercial font makers than from the
active OSS font projects. So, helping the commercial font makers promote their
fonts in the OSS community will eventually hurt the Linux desktop (by binding
more and more users to the proprietary fonts).

Plus, the current OSS CJK fonts are really on par in quality with the
commercial ones: WenQuanYi's bitmaps are of similar quality to commercial
bitmaps, and more complete; "Droid Sans Fallback" from Google is really a
professionally developed font bought from some Chinese company. WenQuanYi Zen
Hei also performs very well on the Linux desktop and improves every day with
user feedback. As we now have plenty of choices among OSS fonts, I don't think
making the commercial fonts work out of the box will buy us any benefit.

If CJK needs to give default support for commercial fonts, there are tons of
commercial Latin fonts (like Arial, Helvetica ...) on the market; should we
also pre-configure fontconfig for them as well...?

(In reply to comment #10)
> As I said, I personally haven't heard any official licenses given from these
> font makers to use their fonts on a Linux desktop. If anyone install and use
> these fonts, it is very likely illegal. In another word, putting them in the
> conf files simply makes unlicensed use of commercial fonts easier, and of
> course, OSS font development projects will potentially lose users and feedback.
>
> In the long-run, Linux desktop needs more high-quality CJK fonts, and these
> fonts are less likely come from the commercial font makers, but the active OSS
> font projects. So, helping the commercial font makers to promote their fonts in
> the OSS community will eventually hurt linux desktop (by binding more and more
> users to the proprietary fonts).
>
> Plus, the current OSS CJK fonts are really on-par in quality with the
> commercial ones: WenQuanYi's bitmaps are of similar quality to commercial
> bitmaps, and more complete; "Droid Sans Fallback" from Google is really a
> professionally developed font bought from some Chinese company. WenQuanYi Zen
> Hei also performs very well on Linux desktop and progresses everyday with users
> feedback. As we now have plenty of choices with OSS fonts, I don't think making
> the commercial fonts use out-of-box will buy us any benefit.
>
> If CJK needs to give default support for commercial fonts, there are tons of
> commercial Latin fonts (like Arial, Helvetica ...) in the market, should we
> also pre-configue fontconfig for them as well...?

I still don't think making it unnecessarily hard for people who install fonts makes any sense. If there was *any* advantage, sure, but so far I fail to see one.

(In reply to comment #11)
> I still don't think making it unnecessarily hard for people who install fonts
> makes any sense.

But what you said is not consistent with the Latin font settings in fontconfig: I don't see Tahoma, Calibri, Segoe etc. in the Latin configuration files; Arial is also set to a lower priority than DejaVu/Bitstream. Why should CJK use the reverse order? Because CJK people like piracy? (joking of course)

> If there was *any* advantage, sure, but so far I fail to see one.

I thought I made it clear:

1. Attract more users and feedback for CJK OSS font development. Since all OSS software can benefit from the release-early-release-often model to evolve, why can't OSS fonts benefit from it?

2. Discourage unlicensed use of fonts, because it is simply wrong. If nobody respects font copyright, nobody will spend time developing fonts and making them better.

(In reply to comment #12)
> (In reply to comment #11)
> > I still don't think making it unnecessarily hard for people who install fonts
> > makes any sense.
>
> but what you said is not consistent with Latin font settings in fontconfig: I
> don't see Tahoma, Calibri, Segeo etc in the latin configuration files; Arial is
> also set to a lower priority than Dejavu/Bitstream. Why should CJK use a
> reversed support order? because CJK people like piracy? (joking of course)

If they are not mentioned, they are not. I'm just talking about the case that they are.

> > If there was *any* advantage, sure, but so far I fail to see one.
>
> I thought I said it clear:
>
> 1. attract more users and feedback for CJK OSS font development. Since all OSS
> software can benefit from release-often-release-early model to evolve, why OSS
> fonts can not benefit from this model?
>
> 2. discourage unlicensed use of fonts, because it is simply wrong. if nobody
> respect font copyright, nobody will spend time to develop them and make them
> better.

I don't have any strong opinion here as long as the approach taken minimizes incoming bug reports in the future :).

(In reply to comment #13)
> If they are not mentioned, they are not. I'm just talking about the case that
> they are.

If what you mean is to focus on the proprietary CJK fonts that have already been included, I can tell you they are all past their season (some of them have never been a popular choice at all, such as HanyiSong and ZYSong18030). The new favorite commercial Chinese fonts are MS YaHei/Jhenghei (with good hinting) from Windows Vista, and ST Hei/LiHeiPro from MacOS X. These fonts can now be spotted in over 80% of the proprietary-fonts-only screenshots in Chinese Linux forum posts. So, keeping the "mentioned" fonts probably won't make these users happier.

(In reply to comment #13)
> If they are not mentioned, they are not. I'm just talking about the case that
> they are.
>
> I don't have any strong opinion here as long as the approach taken minimizing
> incoming bug reports in the future :).
>

Anyway, these are just my suggestions. It might be better to get a second opinion from other CJK developers/users.

Also, I forgot to include GNU Unifont. Last year, Paul Hardy incorporated WenQuanYi's Han glyphs (WenQuanYi Unibit) into this font; the latest version of GNU Unifont now covers the entire BMP (probably the only font to do so so far)
   http://unifoundry.com/unifont.html
I know most of you prefer vector fonts and disable bitmaps in fontconfig, but I think it does not hurt to add it as the system fallback, in case people want more glyphs and don't care about bitmaps.

There is also a 69-unifont.conf; the differences between these unifonts and the CJK fonts are probably not that big.

If fontconfig supported an additional lang-tag match when setting a <prefer> list, that would make CJK settings a lot easier. Something like

  <alias>
    <test name="lang" compare="contains">
       <string>ja</string>
    </test>

    <family>sans-serif</family>
    <prefer>
      <family>DejaVu Sans</family>
      <family>Bitstream Vera Sans</family>
      <family>VL PGothic</family>
    </prefer>
  </alias>

I guess it is probably equivalent to the following (?)

 <match>
  <test name="lang" compare="contains">
   <string>ja</string>
  </test>
  <edit name="family" mode="prepend_first" binding="strong">
   <string>VL PGothic</string>
  </edit>
  <edit name="family" mode="prepend_first" binding="strong">
   <string>Bitstream Vera Sans</string>
  </edit>
  <edit name="family" mode="prepend_first" binding="strong">
   <string>DejaVu Sans</string>
  </edit>
 </match>

but I know a lot of people just don't like "strong" binding.
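
One difference I notice between the two: the `<match>` version above has no test on the family, so it would prepend these fonts for every Japanese request, not only for sans-serif. A closer sketch, which also tests the requested family and leaves the binding at its default instead of "strong", might be:

```xml
<match target="pattern">
  <test name="lang" compare="contains">
    <string>ja</string>
  </test>
  <!-- Only act when the request asks for the generic sans-serif family. -->
  <test qual="any" name="family">
    <string>sans-serif</string>
  </test>
  <!-- Insert the preferred families in front of the matched "sans-serif". -->
  <edit name="family" mode="prepend">
    <string>DejaVu Sans</string>
    <string>Bitstream Vera Sans</string>
    <string>VL PGothic</string>
  </edit>
</match>
```

Listing all three families in a single edit keeps them in the written order, so there is no need to reverse them as with repeated prepend_first edits.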

I suggest splitting 65-nonlatin into the following files.

- 65-nonlatin-zh (or 65-chinese)
- 65-nonlatin-ja (or 65-japanese)
- 65-nonlatin-ko (or 65-korean)
- 65-nonlatin

I'll check the Japanese font order this week.

btw, should we also check 25-unhint-nonlatin and 40-nonlatin?

Created attachment 27259
25-unhint-nonlatin.conf

(In reply to comment #17)
> I suggest to split 65-nonlatin to following files.
>
> - 65-nonlatin-zh (or 65-chinese)
> - 65-nonlatin-ja (or 65-japanese)
> - 65-nonlatin-ko (or 65-korean)
> - 65-nonlatin
>
> I'll check out Japanese fonts order in this week.
>
> btw, Should we also check 25-unhint-nonlatin and 40-nonlatin?
>

I have proposed a set of CJK language specific fontconfig settings at
https://bugzilla.redhat.com/show_bug.cgi?id=499902
These files were tested with the above-mentioned 65-nonlatin file and appeared to eliminate some of the CJK font conflicts in Fedora. These language-specific files are numbered before 65.

For the unhint file, I think you may want to turn off "autohint", not "hinting". When autohint is on, hintstyle=hintmedium or hintfull will give poor rendering for CJK characters. "hintslight" may be acceptable for some users.
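
In other words, each rule in the unhint file would change along these lines (a sketch only; the family name is just an example of a font that might be listed there):

```xml
<match target="font">
  <test name="family" compare="eq">
    <string>Kochi Mincho</string>
  </test>
  <!-- Disable only the autohinter; "hinting" itself is left untouched. -->
  <edit name="autohint" mode="assign">
    <bool>false</bool>
  </edit>
</match>
```

Fonts that carry their own hinting instructions would then still be hinted, while fonts without them would not be run through the autohinter.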

Also, we recently released a high-quality vector font, WenQuanYi Micro Hei (Mono), built upon Google's Droid font family. We incorporated the hinting instructions from Droid Sans into this font and have achieved good screen quality for both Latin and non-Latin glyphs.

see more details about this font:
http://wenq.org/enindex.cgi?MicroHei(en)
http://wenq.org/enindex.cgi?MicroHei_BigBang_README
http://wenq.org/enindex.cgi?MicroHei_BigBang_ChangeLog
http://packages.debian.org/unstable/x11/ttf-wqy-microhei

If hinting is turned off globally, I am afraid I will have to add settings to enable it for this font.
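
Such a per-font override would presumably look something like this (a sketch, assuming it is loaded after the global unhint rules so that it takes effect last):

```xml
<match target="font">
  <test name="family" compare="eq">
    <string>WenQuanYi Micro Hei</string>
  </test>
  <!-- Use the font's own hinting instructions, not the autohinter. -->
  <edit name="hinting" mode="assign">
    <bool>true</bool>
  </edit>
  <edit name="autohint" mode="assign">
    <bool>false</bool>
  </edit>
</match>
```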

Created attachment 27340
25-unhint-nonlatin.conf

changed from hinting to autohint.
I tried wqy-microhei. It seems that even hintstyle=hintslight gives poor rendering for CJK.

How to check:
$ pango-view --font='WenQuanYi Micro Hei' --text='微' --waterfall

(In reply to comment #20)
> I tried wqy-microhei. It seems that even hintstyle=hintslight gives poor
> rendering for CJK.
>
> How to check:
> $ pango-view --font='WenQuanYi Micro Hei' --text='微' --waterfall
>

I am not surprised at all. Autohinting for CJK glyphs is not usable.

I suggest the following Japanese font order.

old serif order:
<family>MS Gothic</family>
<family>UmePlus P Gothic</family>
<family>Sazanami Mincho</family>
<family>IPAMonaMincho</family>
<family>IPAMincho</family>
<family>Kochi Mincho</family>
<family>AR PL ShanHeiSun Uni</family>
<family>MS 明朝</family>

new serif order:
<family>MS PMincho</family> <!-- proprietary -->
<family>IPAPMincho</family>
<family>IPAMonaPMincho</family>
<family>Sazanami Mincho</family>
<family>Kochi Mincho</family>

'MS Gothic' and 'UmePlus P Gothic' are sans-serif fonts, not serif.
'MS PMincho' is same as 'MS P明朝'.
'AR PL ShanHeiSun Uni' does not have Japanese glyph shapes.

old sans-serif order:
<family>MS Gothic</family>
<family>UmePlus P Gothic</family>
<family>AR PL ShanHeiSun Uni</family>
<family>VL Gothic</family>
<family>IPAMonaGothic</family>
<family>IPAGothic</family>
<family>Sazanami Gothic</family>
<family>Kochi Gothic</family>

new sans-serif order:
<family>MS PGothic</family> <!-- proprietary -->
<family>UmePlus P Gothic</family>
<family>VL PGothic</family>
<family>IPAPGothic</family>
<family>IPAMonaPGothic</family>
<family>Sazanami Gothic</family>
<family>Kochi Gothic</family>

Added 'P' to the name. P means proprietary.
Sazanami and Kochi has no proprietary fonts.

old monospace order:
<family>MS Gothic</family>
<family>UmePlus Gothic</family>
<family>VL Gothic</family>
<family>IPAMonaGothic</family>
<family>IPAGothic</family>
<family>Sazanami Gothic</family>
<family>Kochi Gothic</family>
<family>AR PL ShanHeiSun Uni</family>
<family>MS ゴシック</family>

new monospace order:
<family>MS Gothic</family> <!-- proprietary -->
<family>UmePlus Gothic</family>
<family>VL Gothic</family>
<family>IPAGothic</family>
<family>IPAMonaGothic</family>
<family>Sazanami Gothic</family>
<family>Kochi Gothic</family>

'MS Gothic' is same as 'MS ゴシック'.
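
Put into the form used by 65-nonlatin.conf, the new monospace order would read roughly as follows (a sketch of the Japanese entries only; the other scripts in the real file are left out):

```xml
<alias>
  <family>monospace</family>
  <prefer>
    <family>MS Gothic</family> <!-- proprietary -->
    <family>UmePlus Gothic</family>
    <family>VL Gothic</family>
    <family>IPAGothic</family>
    <family>IPAMonaGothic</family>
    <family>Sazanami Gothic</family>
    <family>Kochi Gothic</family>
  </prefer>
</alias>
```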

cf.
UmePlus http://www.geocities.jp/ep3797/modified_fonts_01.html
VL Gothic http://dicey.org/vlgothic/
IPAfont http://ossipedia.ipa.go.jp/ipafont/
IPA Mona font http://www.geocities.jp/ipa_mona/
sazanami font http://wiki.fdiary.net/font/?sazanami
Kochi font http://wiki.fdiary.net/font/?kochi-alternative

(In reply to comment #22)
> I suggest these Japanese order.

Yep, looks reasonable enough to me. :-)

Just wonder: why list the MS fonts twice (beginning and end)?

sorry for delay.

(In reply to comment #23)
> Just wonder: why list the MS fonts twice (beginning and end)?

I don't know why the old 65-nonlatin lists them twice. :)

(In reply to comment #22)
> Added 'P' to the name. P means proprietary.
> Sazanami and Kochi has no proprietary fonts.

P means proportional, not proprietary.

(In reply to comment #2)
> > I suggest to put Chinese fonts in front of Japanese/Korean fonts. When Pango
> > fail to determine the Chinese text (which happens when rendering Han text under
> > non-CJK locales), at least we can render the text with a consistent font
> > (despite the z-variant differences). If Pango can determine the language, then,
> > use language specific fontconfig rules to set the font order later
>
> Sounds reasonable enough.
>

Nak.

If Chinese fonts are put in front of Japanese/Korean fonts, Japanese/Korean text will have the same issue, because each language uses a different letter shape for the same Unicode code point. That's why I suggest splitting the C/J/K configuration files.

ex.
pango-view --waterfall --text='与返骨直' --language=zh_CN
pango-view --waterfall --text='与返骨直' --language=zh_TW
pango-view --waterfall --text='与返骨直' --language=ja

cf. http://d.hatena.ne.jp/mashabow/20090514/1242292024
'Arial Unicode MS' has multiple letter shapes. It's great.


IPA Mona fonts should be placed before the plain IPA fonts, because the original fonts contain quirks that make them work on Microsoft systems but not with anything else.

Created attachment 31663
65-nonlatin test suite svn rev23

I uploaded the test suite to WQY's svn and hopefully that can make revision process easier.

I attached the tarball for rev 23, but you can check it out with
 svn co https://wqy.svn.sf.net/svnroot/wqy/trunk/65nonlatin_test_suite \
  65nonlatin_test_suite

A summary for the changes I've made so far:

1. I split the original 41-* files into 41- and 65-, with 41-* for family->generic (per comment#70)
2. I updated 41-language-ja by attachment 31610 proposed by Hideki
3. I moved IPA Mona in front of IPA for 65-language-ja, per comment#76 by Baybal

With this version, I tested the following command:

  LANG=<zh_CN,en_US,ja_JP>.UTF-8 pango-view --markup --text 'English: 英文中的汉字,<span lang="zh">您好中文汉字</span>,<span lang="ja">您好日文</span>(Note: "您" is a Chinese-specific Hanzi)' --font="sans"

If you run this command in a non-CJK locale, I expect the font preferences in 65-nonlatin to control the rendering of the text outside the <span></span>, while the text inside the <span> will be controlled by 65-language-*.

If you run this command in a CJK locale, the text inside the <span> will still be controlled by the 65-language files, and the text outside will be controlled by 65-language-<current locale>.

With this numbering, users can override the default settings in their ~/.fonts.conf.
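
For example, a user who wants a particular font first for sans-serif could put something like the following in ~/.fonts.conf. This is only a sketch with an example family; it relies on the user file being read before the 65-* files (via 50-user.conf, as far as I know), so its entries end up ahead of the defaults:

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <alias>
    <family>sans-serif</family>
    <prefer>
      <!-- Example choice; any installed family can go here. -->
      <family>WenQuanYi Zen Hei</family>
    </prefer>
  </alias>
</fontconfig>
```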

If you make any change, please run the above test command and upload either a "svn diff" output or the tarball of the full package.

Created attachment 31684
split fontconfig files

Here is a counter-proposal with the fontconfig files properly split per font (i.e. I want a setup that any individual packager can trivially copy when he needs to package a new CJK font, not something that leaves him scratching his head and feeling he'll get nowhere without going through the fontconfig packager)

Also provided are the scripts used to generate them from an easily changed csv file
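
To illustrate the per-font idea, one such file might look roughly like this. The file name and the font are placeholders chosen for the example; the real generated files are in the attachment:

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- Hypothetical per-font file, e.g. 65-wqy-zenhei.conf, shipped by the font package -->
<fontconfig>
  <alias>
    <family>sans-serif</family>
    <prefer>
      <family>WenQuanYi Zen Hei</family>
    </prefer>
  </alias>
</fontconfig>
```

A packager adding a new CJK font would copy this file, change the family name and the generic, and pick a file name that sorts where the font should rank.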

I intentionally didn't touch the proposed CJK stacks, though I feel it is highly abusive to register the same font in multiple generics. We have lots of Latin fonts that could be classified in multiple generics (trivial example: DejaVu Sans Mono), but we don't put them everywhere anyway

<email address hidden> wrote:
> http://bugs.freedesktop.org/show_bug.cgi?id=20911
>
> --- Comment #78 from Nicolas Mailhot <email address hidden> 2009-12-02 14:09:01 PST ---
> Created an attachment (id=31684)
> --> (http://bugs.freedesktop.org/attachment.cgi?id=31684)
> split fontconfig files
>
> Here is a conter-proposal with the fontconfig files properly split per font (ie
> I want a setup that any individual packager can trivially copy when he needs to
> package a new cjk font, not something that lefts him scratching his head and
> feel he'll go nowhere without going through the fontconfig packager)
>

I truly don't understand why this has to be done in a per-font format.
Why can Latin fonts be listed in a preferred list in 60-latin.conf, but
CJK fonts cannot?

The split files only increase the maintenance complexity, reduce
the readability and gain very little (if anything).

If you don't like what was proposed in the svn, please give me one
solid example to show it is problematic.

> Also provided are the scripts used to generate them from an easily changed csv
> file
>
> I intentionally didn't touch the proposed CJK stacks, though I feel it is
> highly abusive to register the same font in multiple generics. We have lots of
> Latin fonts that could be classified in multiple generics (trivial example:
> DejaVu Sans Mono), but we don't put them everywhere anyway
>

But what's wrong with having a longer list of fallback fonts? I just don't
want fontconfig to randomly pick one of the other CJK fonts that we
know is not appropriate, or to display nothing at all.

The scheme makes this setup robust because 1) we give plenty of
choices, and 2) we rank them from good to bad for each category.
Even if the best font is not installed, we still get the next-best choice,
and so on. This is exactly what we ask for: give CJK people the
best your system can provide out of the box.
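
As a rough illustration of the ranked fallback described above, a generic->family preference list in fontconfig's XML syntax could look like the sketch below. The two family names and their order are assumptions chosen only for illustration (they are not taken from the svn proposal); the point is that the entries in <prefer> are ranked, so a missing first choice falls through to the next one.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <!-- Ranked fallback for the sans-serif generic.
       Family names and their order are illustrative assumptions. -->
  <alias>
    <family>sans-serif</family>
    <prefer>
      <!-- first choice: a sans-serif CJK font -->
      <family>WenQuanYi Zen Hei</family>
      <!-- last resort: a non-sans font of the same language,
           listed so that fontconfig does not pick an arbitrary font -->
      <family>AR PL UKai CN</family>
    </prefer>
  </alias>
</fontconfig>
```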

On Wed, Dec 2, 2009 at 5:09 PM, <email address hidden> wrote:

>
> I intentionally didn't touch the proposed CJK stacks, though I feel it is
> highly abusive to register the same font in multiple generics.
>
>
In addition, this is NOT registering the same font in multiple generics; it is
expanding the fallback. I agree that in the family->generic mapping, it makes
sense to map every font to only one generic alias (which I did in 41-*).
However, the generic->family fallback is a different story. Even if I don't put
a font name there, fontconfig will still pick it up. The only difference is that
at least I control the rest of the order.
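
For readers unfamiliar with the two directions mentioned here, a family->generic mapping is usually written with <default> inside an <alias>. The fragment below is a minimal sketch of that direction only, not the content of the proposed 41-* files; the family name is an assumed example.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <!-- family -> generic: when this font is requested,
       add sans-serif as its generic fallback class.
       The family name is an illustrative assumption. -->
  <alias>
    <family>WenQuanYi Zen Hei</family>
    <default>
      <family>sans-serif</family>
    </default>
  </alias>
</fontconfig>
```

The generic->family direction is the reverse: the generic name goes in the outer <family> and the concrete fonts are listed inside <prefer>.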

In an extreme case, if no sans CJK fonts are installed, most CJK users
would rather see a serif font of their own language than an alien
sans-serif (or a cocktail) picked by fontconfig.

(In reply to comment #80)
> On Wed, Dec 2, 2009 at 5:09 PM, <email address hidden> wrote:
>
> >
> > I intentionally didn't touch the proposed CJK stacks, though I feel it is
> > highly abusive to register the same font in multiple generics.
> >
> >
> In addition, this is NOT registering the same font in multiple generics; it is
> expanding the fallback. I agree that in the family->generic mapping, it makes
> sense to map every font to only one generic alias (which I did in 41-*).
> However, the generic->family fallback is a different story. Even if I don't put
> a font name there, fontconfig will still pick it up. The only difference is that
> at least I control the rest of the order.

IMHO this is broken logic. With that logic you end up putting every font in every generic stack just in case.

You'd be better served by opening a fontconfig bug asking it to look into the other generic stacks by default before falling back to fonts that were not specified explicitly in any of them.

(In reply to comment #79)
> <email address hidden> wrote:

> I truly don't understand why this has to be done in a per-font format.
> Why can Latin fonts be listed in a preferred list in 60-latin.conf, but
> CJK fonts cannot?

I don't like 60-latin.conf either but on a Fedora system we've made it mostly irrelevant. (I suppose we could also make your new files irrelevant, probably not what you want)

> The split files only increase the maintenance complexity, reduce
> the readability and gain very little (if there is any).

As I've stated (many times) my primary constraint is to have a smooth packaging flow where any font can be picked up by a packager and packaged quickly and independently. And the result is full-featured, not "it almost works but the remaining bits need integration by the fontconfig maintainer in its master files". We have one fontconfig maintainer who is awesome but also real busy working on all the text stack, whereas we have one packager per font package, and spreading the work as much as possible is basic common sense.

In such a model you have only one font family per package, not huge collections of unrelated fonts (as I see very often in Debian).

This is how Fedora has been working in the past two years.

Anything that requires editing a file shared by multiple packages, instead of letting each package drop its own file, is contrary to this workflow.

Anything that implies it is ok to change font rules for fonts in other packages, such as mixing multiple font names in a single file, is a recipe for packager conflicts.

Anything that implies you need to edit a file shared by multiple packages, such as the files you propose integrating, is an impediment to this workflow because people wait for the central authority to move before doing anything.

Besides, from a support POV, it is a lot easier to ask users to rename one symlink and re-test than to tell them to edit a font list in XML format. So this config style also helps after packaging.

> The scheme makes this setup robust because

The scheme is not robust. The more times you repeat a font name, the more times you introduce room for human mistakes and someone editing one instance of your declarations but not the other.

It is a good idea to fall back on fonts declared in other generic stacks before trying any random undeclared font. It is a bad idea to do it via explicit multiple font declarations.

I don't think either of us has all day to fight over this. I think it is time for fontconfig's maintainer or developers to make a choice.

Behdad, please let me and Nicolas know which way you prefer (from a maintenance perspective). Whatever your choice, I really hope this can be incorporated in the next release of fontconfig.

(In reply to comment #83)
> I don't think either of us has all day to fight over this.

ok, it's not "I don't think", it is "I am sure".

Just for my two cents,

(In reply to comment #82)
> I don't like 60-latin.conf either but on a Fedora system we've made it mostly
> irrelevant. (I suppose we could also make your new files irrelevant, probably
> not what you want)

+1. I don't like either of the rule sets that contain this kind of priority list. From the POV of distributors or packagers, it is harmful for tuning, and it sometimes has unexpected effects. Getting rid of them would make things much better, as long as you have well-tuned separate config files.

From the POV of upstream, I suppose providing an easy-to-use configuration is important, though that should be kept as just an example IMHO.

> As I've stated (many times) my primary constraint is to have a smooth packaging
> flow where any font can be picked up by a packager and packaged quickly and
> independently. And the result is full-featured, not "it almost works but the
> remaining bits need integration by the fontconfig maintainer in its master
> files". We have one fontconfig maintainer who is awesome but also real busy
> working on all the text stack, whereas we have one packager per font package,
> and spreading the work as much as possible is basic common sense.

I agree with you. Plus, this issue is closer to a matter of preference. Since fontconfig supports separate config files, people can do that on their own machine or in their distribution. Deciding the default configuration shipped in fontconfig according to a few people's preferences, or discussions among a few people, makes no sense. Differing configurations are more likely for fontconfig than for other software. Ideally the font upstreams should ship fontconfig config files for their fonts, rather than fontconfig shipping rules for specific fonts.

(In reply to comment #78)
> Created an attachment (id=31684) [details]
> split fontconfig files
>
> Here is a counter-proposal with the fontconfig files properly split per font (i.e.
> I want a setup that any individual packager can trivially copy when he needs to
> package a new CJK font, not something that leaves him scratching his head and
> feeling he'll go nowhere without going through the fontconfig packager)

I also agree that separate .conf looks clean and they should be in the
fonts packages themselves as much as possible. The current
blob of fontconfig rules seems quite unmaintainable.

> Also provided are the scripts used to generate them from an easily changed csv
> file

Have you thought of including something like it in Fedora's fontpackages package? :)

I dunno if we still need some bits of 65-nonlatin.conf around for now until distros stop depending on it completely?

So what about my proposal to make embedded bitmaps separate fonts?

I am now trying to change the default Chinese font in Fedora
to WQY ZenHei, but it seems I can't do this without dropping
ZenHei from 65-nonlatin.conf; otherwise it overrides
Japanese on the Fedora desktop.

I think I would like to propose just dropping 65-nonlatin.conf
completely from conf.d and recommend distros to provide font .conf
for each font they install as needed. This is basically
what we are doing in Fedora today already and it works well.

>Here is a counter-proposal with the fontconfig files properly split per font (i.e.
>I want a setup that any individual packager can trivially copy when he needs to
>package a new CJK font, not something that leaves him scratching his head and
>feeling he'll go nowhere without going through the fontconfig packager)

First, even so, repository maintainers are still free to choose the optimal way of rendering for their fonts.
Nothing, this solution included, restrains people from continuing to do what they were doing before.
Second, per-font config would be unable to provide a good out-of-the-box experience anyway on most typical CJK setups after a few custom font manipulations, which you should consider common today. The main purpose of the centralised 41-zh,jp,ko... files is to provide adequate fallbacks, rather than a complete solution for everything zh, jp, ko.

>I think I would like to propose just dropping 65-nonlatin.conf
>completely from conf.d and recommend distros to provide font .conf
>for each font they install as needed. This is basically
>what we are doing in Fedora today already and it works well.

Per-language fallbacks, which we are already discussing here, would do exactly what you are trying to do now, but in an even more quirk-free and non-intrusive manner, and yes, they drop 65-nonlatin.

I PROPOSE US TO HAVE A JOINT BRAINSTORM ON IRC.

Jimhu (huyiwei) wrote :
Aron Xu (happyaron) on 2010-01-03
Changed in ubuntu-translations:
importance: Undecided → Medium
status: New → Confirmed
Changed in software-center (Ubuntu):
status: New → Confirmed
YunQiang Su (wzssyqa) wrote :

This bug does not occur in Lucid.

Jimhu (huyiwei) on 2010-01-03
description: updated
description: updated
nihui (nihui) wrote :

从 openoffice.org 官方网站下载直接安装的 openoffice,里面的帮助文档也是这种难看的字体效果。
后来我从 xp 里面复制了一个 simsun 字体到系统里就正常了。

David Planella (dpm) wrote :

Hi JimHu,

Could you be more specific, especially for those of us not proficient in Chinese?

* In which way is the display abnormal, i.e. what do you expect and what is the result?
* Does this only affect Software Center, or also other applications?
* Which language pack are you using, Simplified Chinese or Traditional Chinese?

@nihui: please remember that the language we use for communication in the international Ubuntu community for handling bugs is English. Do you think you could translate your comment to English? Alternatively, could someone else do it?

Thanks!

Aron Xu (happyaron) wrote :

Here is a translation of nihui's comment:

In the OpenOffice that I downloaded from the official openoffice.org site and installed directly, the help documentation also has this ugly font rendering.
Later I copied a SimSun font from XP into the system, and then it looked normal.

Aron Xu wrote:
> Here is a translation of nihui's comment:
>
> In the OpenOffice that I downloaded from the official openoffice.org site and installed directly, the help documentation also has this ugly font rendering.
> Later I copied a SimSun font from XP into the system, and then it looked normal.
>

This looks like a font configuration issue.
1. Which locale does the reporter use?
2. Please provide the output of 'ls -l /etc/fonts/conf.d/'

Thanks
Arne

Hi David,

The problem is that characters within a single word are rendered in different fonts in software-center. Looking at the attachment JimHu added in his first comment, the second icon on the first row is the entry "Universal Access" in English, which we translate as "全局访问" in Chinese (Simplified). You can see in the screenshot that "全局" and "访问" are in different fonts, when they should be in the same font.

So far I have only seen this problem in software-center, but it may affect more applications.

In the picture JimHu appears to be using Chinese (Simplified) as I do. I haven't tried Chinese (Traditional) yet, but I believe it has the same problem.

When I switched to an English locale just now to see the string "Universal Access", I found that software-center picks its display language from $LANGUAGES rather than from $LC_* or $LANG.

Regards,
Aron Xu

Aron Xu (happyaron) wrote :

Hi Arne,
 I guess JimHu has the same setting as me, zh_CN.UTF-8, and here is the output of the `ls' command you requested, on my machine:

lrwxrwxrwx 1 root root 31 2009-12-28 23:43 10-antialias.conf -> ../conf.avail/10-antialias.conf
lrwxrwxrwx 1 root root 29 2009-12-28 23:43 10-hinting.conf -> ../conf.avail/10-hinting.conf
lrwxrwxrwx 1 root root 36 2009-12-28 23:43 10-hinting-slight.conf -> ../conf.avail/10-hinting-slight.conf
lrwxrwxrwx 1 root root 43 2009-12-28 23:43 11-lcd-filter-lcddefault.conf -> ../conf.avail/11-lcd-filter-lcddefault.conf
lrwxrwxrwx 1 root root 39 2009-12-28 23:43 20-fix-globaladvance.conf -> ../conf.avail/20-fix-globaladvance.conf
lrwxrwxrwx 1 root root 39 2009-12-28 23:43 20-unhint-small-vera.conf -> ../conf.avail/20-unhint-small-vera.conf
lrwxrwxrwx 1 root root 39 2009-12-28 23:43 30-defoma.conf -> /var/lib/defoma/fontconfig.d/fonts.conf
lrwxrwxrwx 1 root root 36 2009-12-28 23:43 30-metric-aliases.conf -> ../conf.avail/30-metric-aliases.conf
lrwxrwxrwx 1 root root 33 2009-12-28 23:43 30-urw-aliases.conf -> ../conf.avail/30-urw-aliases.conf
lrwxrwxrwx 1 root root 30 2009-12-28 23:43 40-nonlatin.conf -> ../conf.avail/40-nonlatin.conf
lrwxrwxrwx 1 root root 32 2009-12-28 23:43 44-wqy-zenhei.conf -> ../conf.avail/44-wqy-zenhei.conf
lrwxrwxrwx 1 root root 27 2009-12-28 23:43 45-latin.conf -> ../conf.avail/45-latin.conf
lrwxrwxrwx 1 root root 31 2009-12-28 23:43 49-sansserif.conf -> ../conf.avail/49-sansserif.conf
lrwxrwxrwx 1 root root 26 2009-12-28 23:43 50-user.conf -> ../conf.avail/50-user.conf
lrwxrwxrwx 1 root root 27 2009-12-28 23:43 51-local.conf -> ../conf.avail/51-local.conf
lrwxrwxrwx 1 root root 42 2009-12-28 23:43 53-monospace-lcd-filter.conf -> ../conf.avail/53-monospace-lcd-filter.conf
lrwxrwxrwx 1 root root 27 2009-12-28 23:43 60-latin.conf -> ../conf.avail/60-latin.conf
lrwxrwxrwx 1 root root 35 2009-12-28 23:43 65-fonts-persian.conf -> ../conf.avail/65-fonts-persian.conf
lrwxrwxrwx 1 root root 30 2009-12-28 23:43 65-nonlatin.conf -> ../conf.avail/65-nonlatin.conf
lrwxrwxrwx 1 root root 38 2009-12-28 23:43 66-wqy-zenhei-sharp.conf -> ../conf.avail/66-wqy-zenhei-sharp.conf
lrwxrwxrwx 1 root root 53 2009-12-28 23:51 69-language-selector-zh-cn.conf -> /etc/fonts/conf.avail/69-language-selector-zh-cn.conf
lrwxrwxrwx 1 root root 29 2009-12-28 23:43 69-unifont.conf -> ../conf.avail/69-unifont.conf
lrwxrwxrwx 1 root root 32 2009-12-28 23:43 70-no-bitmaps.conf -> ../conf.avail/70-no-bitmaps.conf
lrwxrwxrwx 1 root root 31 2009-12-28 23:43 80-delicious.conf -> ../conf.avail/80-delicious.conf
lrwxrwxrwx 1 root root 31 2009-12-28 23:43 90-synthetic.conf -> ../conf.avail/90-synthetic.conf
lrwxrwxrwx 1 root root 50 2009-12-28 23:51 99-language-selector-zh.conf -> /etc/fonts/conf.avail/99-language-selector-zh.conf
-rw-r--r-- 1 root root 959 2009-03-19 22:03 README

Changed in software-center:
status: New → Invalid
Matthew Paul Thomas (mpt) wrote :

This looks like a WebKit-GTK problem. (Ubuntu Software Center uses WebKit-GTK for icon-view listings.) I get the same problem -- two different fonts -- when I copy and paste this into the URL field of the Epiphany or Midori browsers, both of which use WebKit-GTK:
   data:text/html,<meta%20http-equiv="Content-Type"%20content="text/html;%20charset="utf-8">办公%20全局访问

I do not get the same problem in either Chromium (which I think uses its own copy of WebKit) or Firefox (which uses Gecko).

affects: software-center (Ubuntu) → webkit (Ubuntu)
summary: - Chinese displays abnormal in Software Center
+ Chinese characters unexpectedly switch fonts in WebKit-GTK
description: updated
description: updated
Changed in ubuntu-translations:
status: Confirmed → Invalid
Aron Xu (happyaron) on 2010-01-08
Changed in ubuntu-translations:
status: Invalid → Confirmed
ZhengPeng Hou (zhengpeng-hou) wrote :

This is how the Chinese characters are rendered in Chromium, which is a WebKit-based browser.


(In reply to comment #89)
> Second, per font config would be unable to provide good out of the box
> experience by default anyway on most of typical cjk setups after few custom
> fonts manipulations which you should think as common for today.

Perhaps you are referring to Ubuntu's override system that was discussed earlier?

> >I think I would like to propose just dropping 65-nonlatin.conf
> >completely from conf.d and recommend distros to provide font .conf
> >for each font they install as needed. This is basically
> >what we are doing in Fedora today already and it works well.
>
> Per language fallbacks what we are already discussing there would be doing
> exactly what are you trying to do now but in even more quirkless and
> non-intrusive manner, and yes they drops 65-nonlatin

> I PROPOSE US TO HAVE A JOINT BRAINSTORM ON IRC.

We could talk on ##fonts, though it may be hard to find
a time that suits everyone. Go ahead and suggest one
if you like.

So anyone object to removing 65-nonlatin.conf from conf.d/ ?
(If some distro still wants it they could symlink it from avail.d/.)

Otherwise at minimum I think we need lang tags in 65-nonlatin.conf.

(In reply to comment #90)
> So anyone object to removing 65-nonlatin.conf from conf.d/ ?
> (If some distro still wants it they could symlink it from avail.d/.)
>
> Otherwise at minimum I think we need lang tags in 65-nonlatin.conf.
>

I think it is a bad idea. As I said several times previously, 65-nonlatin.conf is not for specific languages. It only provides sufficient font fallback (to prevent fontconfig randomly picking up low quality fonts) for rendering CJK char under a non-CJK specific environment (such as en_US, fr etc).

For CJK-specific configs, they should go to the 65-language-ja.conf or 65-language-zh.conf file as in my "65-nonlatin test suite".

What we need is a BETTER 65-nonlatin; removing it from conf.d or adding lang tags misses the point completely.

So how to solve https://bugzilla.redhat.com/show_bug.cgi?id=476459 ?

(In reply to comment #91)
> I think it is a bad idea. As I said several times previously, 65-nonlatin.conf
> is not for specific languages. It only provides sufficient font fallback (to
> prevent fontconfig randomly picking up low quality fonts) for rendering CJK
> char under a non-CJK specific environment (such as en_US, fr etc).

If there is proper font config in place we shouldn't need any fallbacks.

> What we need is a BETTER 65-nonlatin; removing it from conf.d or adding lang
> tags misses the point completely.

Why is adding lang tags a problem, especially for CJK?

On 1/13/2010 7:30 PM, <email address hidden> wrote:
> --- Comment #92 from Jens Petersen <email address hidden> 2010-01-13 16:30:20 PST ---
> So how to solve https://bugzilla.redhat.com/show_bug.cgi?id=476459 ?
>

there are two ways:

1. add XX-vlgothic.conf (XX<65) and set a prefer list for lang=ja
2. remove 44-wqy-zenhei and 66-vlgothic, then download and use my 65-language-{ja,zh}.conf files
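
Option 1 could be sketched roughly as follows. The file name 64-vlgothic.conf and the choice of VL Gothic for sans-serif are hypothetical; the fragment only shows the shape of a rule that applies when the requested language is Japanese.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- Hypothetical file: /etc/fonts/conf.d/64-vlgothic.conf -->
<fontconfig>
  <match target="pattern">
    <!-- only apply when the pattern asks for Japanese -->
    <test name="lang" compare="contains">
      <string>ja</string>
    </test>
    <!-- and only when sans-serif is requested -->
    <test name="family">
      <string>sans-serif</string>
    </test>
    <!-- insert the preferred font before the matched sans-serif entry -->
    <edit name="family" mode="prepend">
      <string>VL Gothic</string>
    </edit>
  </match>
</fontconfig>
```

Because the prefix is below 65, this file is read before 65-nonlatin.conf, so its font should end up ahead of the families that file adds.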

> If there is proper font config in place we shouldn't need any fallbacks.

what proper font config are you referring to?

> Why is adding lang tags a problem, especially for CJK?

First of all, this is already done in 65-language-{zh,ja}.conf in my
proposal; we don't need two files to do the same thing!

Second, fontconfig does need a config file to set the font order for
non-CJK locales, such as en_US. In my opinion, that is the exact purpose of
65-nonlatin. Adding lang-tag matching only limits the rules to CJK locales,
and leaves the font order in non-CJK locales unspecified.

(In reply to comment #91)
> I think it is a bad idea. As I said several times previously, 65-nonlatin.conf
> is not for specific languages. It only provides sufficient font fallback (to
> prevent fontconfig randomly picking up low quality fonts) for rendering CJK
> char under a non-CJK specific environment (such as en_US, fr etc).

How is it bad? And how is it useful without the fonts? That looks like you missed the point. Certain config files should be provided by the font upstream. That is the point above, and it is what Fedora is trying to do. Unnecessary built-in rules are worse than nothing.

(In reply to comment #94)
> How is it bad? and how is it useful without the fonts?

I think the meaning of "fallback" is that when something does not exist, something else fills its place. That is EXACTLY what fontconfig is designed for: when a font is not installed, some other designated font will be used as an alternative.

> that looks like you missed the point.
> the certain config files should be provided by the font upstream.

I agree, but that does not justify why "font preference orders" should be provided by the font itself. In fact, fontconfig has been doing this job since day 1, and it is doing ok. The only thing is to refine it.

> it's the above point and what Fedora is trying. the unnecessary
> built-in rules are worse than nothing.
>

how are these rules unnecessary? tell me.

I don't want to see another discussion wasted and derailed by debating a completely different design. The problem in fontconfig is real; every minute we delay a solution wastes thousands of minutes for frustrated users. So let's solve it first.

I would like to give a suggestion for any further discussions on this issue:

Let's separate what you want to achieve (i.e. using per-font preferences) from what this bug is trying to solve (i.e. amending 65-nonlatin to build sufficient CJK fallback). The assumption of my proposal is that the fontconfig maintainers are happy with the current structure. If you want to propose something beyond that scope, please file a separate bug. I would be glad to join the discussion there.

For those of you who want to help, PLEASE, please download my svn files and test with it by yourself, and list any specific issues or submit your patch wrt these files, and these files only!

We've already spent a lot of time here, we really want to hear what the maintainers say, so, Behdad or Keith, please tell us what you guys think.

Created attachment 32959
sample config for fallback with the separate files

(In reply to comment #95)
> (In reply to comment #94)
> > How is it bad? and how is it useful without the fonts?
>
> I think the meaning of "fallback" is that when something does not exist,
> something else fill in the place. That is EXACTLY what fontconfig is designed
> for: when a font is not installed, some other designated font will be used as
> alternative.
>
> > that looks like you missed the point.
> > the certain config files should be provided by the font upstream.
>
> I agree, but that does not justify why "font preference orders" should be
> provided by the font itself. In fact, fontconfig has been doing this job since
> day 1, and it is doing ok. The only thing is to refine it.

You are misunderstanding the point, then. I'm not saying that the order should be provided by the font upstream. If each font upstream provides its own separate rule, it is easier for the distro or the users to change the order via, say, the priority prefix. As I said to you on Red Hat Bugzilla too, that is why none of the fontconfig config files should contain any other font's name. Since this is a matter of preference, the order should just be left to the distro or the users. That's why I think having a minimal set of rules in fontconfig upstream should be sufficient.

>
> > it's the above point and what Fedora is trying. the unnecessary
> > built-in rules are worse than nothing.
> >
>
> how are these rules unnecessary? tell me.
>

My reason for wanting to get rid of these files upstream (65-nonlatin.conf and the similar files in your proposed rules) is that they:

- prevent having a different order through additional rules.

- mix several font names in one file, so users need certain knowledge and skills to modify it.

- require modifying at least two files to change the order: 65-nonlatin.conf (or similar) and the priority prefix of the separate config file from the font package.

I just want to update the priority prefix in the config filename to change the order. That would work well enough without, say, 65-nonlatin.conf, and it is easy enough.
Aside from that, speaking of fallback, in vlgothic-{,p}-fonts in Fedora I set up fallback for Japanese with separate files, like sans-serif->VL PGothic->VL Gothic.

% ls /etc/fonts/conf.d/*vlgothic*
/etc/fonts/conf.d/65-vlgothic-pgothic.conf@ /etc/fonts/conf.d/66-vlgothic-gothic.conf@

See the attached files for the details of the config files.
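
The attached files are not reproduced here. As a hedged sketch of what such a per-font file might contain (an assumption, not the actual Fedora file), 65-vlgothic-pgothic.conf could hold a single alias like this:

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- Assumed contents of 65-vlgothic-pgothic.conf; the real file may differ -->
<fontconfig>
  <!-- one font per file: only this package's own family is named -->
  <alias>
    <family>sans-serif</family>
    <prefer>
      <family>VL PGothic</family>
    </prefer>
  </alias>
</fontconfig>
```

Under the same assumption, 66-vlgothic-gothic.conf would be identical apart from naming VL Gothic. Since fontconfig reads the conf.d files in order of their numeric prefix, renaming a file (65 versus 66) is what should change the relative order of the two fonts, without editing any XML.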


<email address hidden> wrote:
>
>
> You are misunderstanding the point then. I'm not saying that the order should
> be provided by the font upstream. providing the separate rule by font upstream
> should be easier to change the order by distro or the users with the prefix
> priority say. as I said to you on Red Hat Bugzilla too, thus all of the
> fontconfig config files shouldn't contains any other font names in it. Since
> this is a kinda preference, the order should just leave to the distro or the
> users. that's why I think having the minimal sets of the rule in fontconfig
> upstream should be sufficient.
>

Again, I think we are on different pages.
What you propose is to change the basic scheme of fontconfig's
config files; what I want is to renovate and fine-tune it.

As I said previously, it would be more efficient if you
submit another bug to discuss the new proposal.

I personally don't think your "other-font-names-free-rule" is
sufficient to handle the complex CJK situations. In addition,
using the basic rules I proposed does not conflict with
what you want to do. The only difference is that fontconfig
has some basic memory about good and bad fonts, whereas your
approach erases all of fontconfig's memory, and
font packagers make all the decisions by manipulating the
priority numbers.

Also, if the packager for Font A thinks it is better
than Font B, and the packager for B thinks the opposite,
how would you solve it? Let them fight by competing
over the priority numbers?

>
>>> it's the above point and what Fedora is trying. the unnecessary
>>> built-in rules are worse than nothing.
>>>
>>>
>> how are these rules unnecessary? tell me.
>>
>>
>
> My objection to get rid of these (65-nonlatin.conf and similar for your
> proposed rules) files in upstream because they:
>
> - prevents to have different order with additional rules.
>

It doesn't. Just give your file a priority prefix lower than 65.
If you give it a prefix higher than 65, use prepend_first in your rules.
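
A rule in a file numbered above 65 could use prepend_first roughly as sketched below. The file name 70-my-cjk.conf and the font choice are hypothetical. With mode="prepend_first" the family is inserted at the very front of the family list, rather than next to the value matched by the test, which is why it still wins even though the file is read after 65-nonlatin.conf.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- Hypothetical file: /etc/fonts/conf.d/70-my-cjk.conf -->
<fontconfig>
  <match target="pattern">
    <test name="family">
      <string>sans-serif</string>
    </test>
    <!-- put this family at the head of the whole family list -->
    <edit name="family" mode="prepend_first">
      <string>WenQuanYi Zen Hei</string>
    </edit>
  </match>
</fontconfig>
```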

> - which mixing up several fontnames in one file requires the certain knowledges
> and skills to modify it for users.
>

On the contrary, because it is centralized, it is easier
for users to modify. The most frustrating thing about
using fontconfig is that when I modify one place to set
the font order, the rules never work because multiple other
config files override it. It is impossible for ordinary
users to trace which one is actually in effect. The
approach you propose is very likely to increase
frustrations of this kind.

> - plus, need to modify two files to change the order at least. 65-nonlatin.conf
> (or similar) and the prefix priority in separate config file from the font
> package.
>

no

>
> I want to just update the prefix priority in the config filename to change the
> order. it would works enough without 65-nonlatin.conf say, and easy enough.
> Aside from that, speaking of the fallback, I did in vlgothic-{,p}-fonts in
> Fedora to behave some fallback with separate files for Japanese like
> sans-serif->VL PGothic->VL Gothic.
>
> % ls /etc/fonts/conf.d/*vlgothic*
> /etc/fonts/conf.d/65-vlgothic-pgothic.conf@
> /etc/fonts/con...


(In reply to comment #98)
> again, I think we are talking on different pages.
> What you want to propose is to change the fontconfig config file
> basic schemes, and what I want is to renovate it and fine-tune.

Sure. Then my counter-proposal is to do that in your favorite distros; it's not something that should be done upstream IMHO. I could start discussing this on another bug, but it's pretty much the opposite of this proposal, because once it gets approved, it eventually gets rid of your efforts too.

> I personally don't think your "other-font-names-free-rule" is
> sufficient to handle the complex CJK situations. In addition,
> using the basic rules I proposed does not conflict to
> what you want to do. The only difference is that fontconfig
> has some basic memory about good and bad fonts, and your
> approach erase all the memories of fontconfig, and
> font packagers make all the decisions by manipulating the
> priority numbers.

Right, because 65-nonlatin.conf prevents the separate-config-file idea from working sanely, which means it actually conflicts with it. Otherwise we wouldn't even need to get rid of it, right?

> Also, if the packager for Font A think it is better
> than Font B, and the packager for B think opposite.
> How would you solve it? let them fight by competing
> the priority numbers?

The decision is up to the users or the distros. That's why I don't like putting this kind of rule upstream, and it's not something upstream should worry about.

> it doesn't. just name your file with a priority less than 65.
> if you name it bigger than 65, then use prepend_first in your rules.

Once you start using prepend_first, and someone wants to change the order on top of it, all fonts will eventually depend on prepend_first. It's not the right solution; it's a kind of hack.

> on the opposite, because it is centralized, it is easier
> for users to modify.

I meant syntax-wise, etc. Changing the priority prefix in the filename is much easier for that purpose.

> The most frustrating thing
> using fontconfig is that when I modify one place to set
> font orders, the rules never work because multiple other
> config files overwrite it. It is impossible for ordinary
> users to trace which one is actually functioning. The
> approach you proposed is very likely leading to increasing
> frustrations of such kind.

Not really. If we have a simple rule for one font per file, it should be easy to keep track of it with the debugging messages, because no other changes to that font happen after it. Having many rules in different files would instead make it more complex to find out where the effect really comes from.

>
> > - plus, need to modify two files to change the order at least. 65-nonlatin.conf
> > (or similar) and the prefix priority in separate config file from the font
> > package.
> >
>
> no

With prepend_first? I was assuming the situation in Fedora, but anyway.

<email address hidden> wrote:
> Sure. then my counter-proposal is to do that in your favorite distros. it's not
> something should be done upstream IMHO. I could start discussing this on
> another bug. but it's pretty opposite proposal to this, because once it gets
> approved, it eventually gets rid of your efforts too.
>

I don't think we can convince each other on this matter,
and unfortunately, the maintainers do not seem to care.
so, I am getting bored ...

> Right. because 65-nonlatin.conf prevents sane working on the separate config
> file idea. which means actually conflicting on it. otherwise we don't even need
> to get rid of it right.
>
>> it doesn't. just name your file with a priority less than 65.
>> if you name it bigger than 65, then use prepend_first in your rules.
>>
> Once starting to use prepend_first, and if one wants to modify the order over
> it, all of fonts eventually will depends on prepend_first. it's not the right
> solution. it's a kind of a hack.
>

It looks like you just chose to ignore my first suggestion,
i.e. giving your own rules a lower prefix so that they override 65-nonlatin.
As a result, your conclusion that 65-nonlatin conflicts with
per-font config, and your criticism below, are flawed.

Just rename your own rules to 64-xxx and do a "FC_DEBUG=1029 fc-match ...",
you will see how it works.
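
For readers unfamiliar with the value 1029: FC_DEBUG is a bitmask, and the sketch below decodes it. The flag names and values follow the table in the fontconfig user documentation and should be treated as an assumption to verify against the installed version; `decode_fc_debug` is a hypothetical helper, not part of fontconfig.

```python
# Bit values as listed in the fontconfig user documentation
# (assumption: check them against your installed version).
FC_DEBUG_FLAGS = {
    1: "MATCH",      # brief information about font matching
    2: "MATCHV",     # extensive font matching information
    4: "EDIT",       # monitor match/test/edit execution
    8: "FONTSET",
    16: "CACHE",
    32: "CACHEV",
    64: "PARSE",
    128: "SCAN",
    256: "SCANV",
    512: "MEMORY",
    1024: "CONFIG",  # monitor which config files are loaded
    2048: "LANGSET",
    4096: "OBJTYPES",
}


def decode_fc_debug(value):
    """Return the names of the flags set in an FC_DEBUG value, lowest bit first."""
    return [name for bit, name in sorted(FC_DEBUG_FLAGS.items()) if value & bit]


print(decode_fc_debug(1029))  # 1029 = 1024 + 4 + 1
```

Under that table, 1029 asks for match information, rule-edit tracing, and the list of loaded config files, which is what is needed to see whether a 64-xxx file takes effect before 65-nonlatin.conf.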

> Not really. if we have simple rule for the font per a file, it should be easy
> to keep it on track with the debugging message, because any other changes for
> the font won't happens after that. having many rules in the different files
> would rather makes more complex to find out where it's really affected.
>

As I said, setting 65-nonlatin DOES NOT prevent you from doing
what you want to do as a distro packager. It is important to have
some sane default rules from fontconfig upstream because not all
distros (such as some mini-system derived from LSB) have knowledgeable
maintainers for CJK fonts.

(In reply to comment #100)
> looks like you just choose to ignore my first suggestion,

I did, because my idea always works against 65-nonlatin.conf, so all of the config files have to be placed before 65-nonlatin.conf. Playing within such a narrow number space won't make things any better.

Okay, a good compromise may be to extend the priority prefix to a wider namespace and lay out the sections like this:

000-100: minimal sets of the config files from upstream.
200-300: users preference
400-500: distros preference
900-: upstream recommendation and fallbacks

The ranges might be improved later, but this would resolve both your issue and mine, provided that putting rules ahead of upstream's resolves the issue. We wouldn't need to worry about 65-nonlatin.conf (realigned to somewhere after 600) anymore, and you could then work on it upstream. How does that sound to you?
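
The proposed layout can be sketched as a small lookup, purely for illustration. The ranges are copied from the list above; `classify_prefix` and the example file names are hypothetical, and numbers that fall in the gaps between ranges are reported as unassigned because the proposal does not cover them.

```python
# Ranges exactly as listed in the proposal; None means "no upper bound".
PROPOSED_RANGES = [
    (0, 100, "upstream minimal config"),
    (200, 300, "user preference"),
    (400, 500, "distro preference"),
    (900, None, "upstream recommendations and fallbacks"),
]


def classify_prefix(filename):
    """Map a config file name to its section under the proposed numbering.

    Returns None when the name has no numeric prefix or the number falls
    in a gap the proposal leaves unassigned.
    """
    digits = ""
    for ch in filename:
        if ch in "0123456789":
            digits += ch
        else:
            break
    if not digits:
        return None
    number = int(digits)
    for low, high, label in PROPOSED_RANGES:
        if number >= low and (high is None or number <= high):
            return label
    return None


# Hypothetical file names, only to exercise the lookup.
for name in ["250-my-fonts.conf", "450-distro-cjk.conf", "965-nonlatin.conf"]:
    print(name, "->", classify_prefix(name))
```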

> As I said, setting 65-nonlatin DOES NOT prevent you from doing
> what you want to do as a distro packager. It is important to have
> some sane default rules from fontconfig upstream because not all
> distros (such as some mini-system derived from LSB) have knowledgeable
> maintainers for CJK fonts.

Since all of the necessary configuration could be done in one file, assuming it comes from the font's upstream, they just need to adjust the priority order to what they want. Eventually they won't need to create any rules themselves.
Sorry for leaving out some of my assumptions. But it's possible to configure fontconfig just by dropping in a file anyway.

(In reply to comment #101)
> putting any rules prior to upstream's resolves the issue. we don't need to
> worry about 65-nonlatin.conf (realigned to somewhere after 600) anymore, and
> you can work on it upstream then. how does it sound for you?

Sorry I meant "after 900".

<email address hidden> wrote:
> I did because my idea is always against 65-nonlatin.conf. so all of
> the config
> files has to be put before 65-nonlatin.conf. playing with the narrow
> spaces
> won't make any better.

I am very glad that we have finally reached some common ground
and are starting to understand each other. That's good.

I knew you were pushed by the (artificially
determined) narrow prefix range for nonlatin
config files in Fedora [1]. I should have
pointed that out earlier.

> Okay, this may be a good settlement to extend the priority prefix to have more
> wider namespaces and align the section like this:
>
> 000-100: minimal sets of the config files from upstream.
> 200-300: users preference
> 400-500: distros preference
> 900-: upstream recommendation and fallbacks
>
> the range might be improved later but this would resolves your and my issues if
> putting any rules prior to upstream's resolves the issue. we don't need to
> worry about 65-nonlatin.conf (realigned to somewhere after 600) anymore, and
> you can work on it upstream then. how does it sound for you?
>

I think this is now a Fedora matter, as the rules in [1]
are only followed by Fedora packagers. I prefer to define
51~64 for non-latin distro preference, as it still allows
users to use ~/.fonts.conf to overwrite.

Maybe file a bug on Fedora's bugzilla and ask Nicolas
to consider this adjustment?

[1]
http://git.fedorahosted.org/git/fontpackages.git?p=fontpackages.git;a=blob;f=fontconfig-templates/fontconfig-priorities.txt

(In reply to comment #103)
> I think this is now a Fedora matter, as the rules in [1]
> are only followed by Fedora packagers. I prefer to define
> 51~64 for non-latin distro preference, as it still allows
> users to use ~/.fonts.conf to overwrite.

Is it? Since the numbering comes from upstream, this improvement should appear upstream no matter who follows that rule.

<email address hidden> wrote:
> Is it? since the numbering is came from upstream, this improvement should
> appears in upstream no matter who follows that rule.
>

then I guess you want to look at this
http://cgit.freedesktop.org/fontconfig/tree/conf.d/README

(In reply to comment #105)
> then I guess you want to look at this
> http://cgit.freedesktop.org/fontconfig/tree/conf.d/README

And then? Can you elaborate? How does it explain why widening the priority numbering range is a Fedora matter?

You understand that Fedora's priority scheme is based on that, right?

<email address hidden> wrote:
>> then I guess you want to look at this
>> http://cgit.freedesktop.org/fontconfig/tree/conf.d/README
>>
>
> And then? can you talk more? how does it explain why growing the range of the
> numbering for the priority order is a Fedora matter?
>
> You understand Fedora's priority thing is based on that right?
>

I should have completed my sentence. What I wanted you to do
is to compare
http://cgit.freedesktop.org/fontconfig/tree/conf.d/README
 with
http://git.fedorahosted.org/git/fontpackages.git?p=fontpackages.git;a=blob;f=fontconfig-templates/fontconfig-priorities.txt

The first one is what is suggested in fontconfig, and the second
one is what is suggested in Fedora. See the difference?

The limitation that non-Latin shall not go below 65 is
only a Fedora limitation. As long as you match the lang tag as
the enclosing block in your config file, I don't think it
matters which number you choose for Latin or non-Latin
fonts if 50<n<65.

<rant start>
Following rules is fine, but do not turn it into dogmatism.
Rules are meant to help, not meant to hinder.
</rant end>

<email address hidden> wrote:
> And then? can you talk more? how does it explain why growing the range of the
> numbering for the priority order is a Fedora matter?
>
> You understand Fedora's priority thing is based on that right?
>
FYI, a bug was submitted to Fedora to clarify the
prefix number range for non-latin config files:

https://bugzilla.redhat.com/show_bug.cgi?id=561246

I won't add any further comments after this. I'm no longer interested in improving 65-nonlatin.conf, and we now have a way to avoid its bad effects. But to correct one misunderstanding:

(In reply to comment #107)
> <email address hidden> wrote:
> >> then I guess you want to look at this
> >> http://cgit.freedesktop.org/fontconfig/tree/conf.d/README
> >>
> >
> > And then? can you talk more? how does it explain why growing the range of the
> > numbering for the priority order is a Fedora matter?
> >
> > You understand Fedora's priority thing is based on that right?
> >
>
> I should have completed my sentence. What I wanted you to do
> is to compare
> http://cgit.freedesktop.org/fontconfig/tree/conf.d/README
> with
> http://git.fedorahosted.org/git/fontpackages.git?p=fontpackages.git;a=blob;f=fontconfig-templates/fontconfig-priorities.txt
>
> the first one is what suggested in fontconfig, and the second
> one is what suggested in Fedora. see the difference?
>
> the limitation that non-latin shall not go below 65 is
> only a Fedora limitation. As long as you match lang tag as
> the enclosing block in your config file, I don't think it
> matters which number you choose for Latin or non-latin
> fonts if 50<n<65.

It looks like you are talking about a different point. I did say that 65-nonlatin.conf badly affects the separate-config idea, but my proposal in comment #101 isn't for Fedora; otherwise I wouldn't have submitted it here. The documented structure of the priority numbering is a good idea, and adopting it in Fedora is also good IMHO, but the assignment in Fedora was bad. You are misunderstanding the point. Since this kind of configuration is purely a matter of preference and should be customizable at various levels, such as the user side and the distro side, the scope of the customization should be defined upstream. Making improvements beyond the current Fedora policy may work after that, but it may introduce inconsistencies and other side effects. That's not a solution but still a hack.
That's why I want to see areas reserved for distros and so on in the upstream definition of the priority numbering, but anyway.

I'll keep an eye on the other bug to see how it progresses.

Why did you close the bug? This is not fixed. The current version of 65-nonlatin still carries all the issues I mentioned in the original report.

(In reply to comment #110)
> why did you close the bug? this is not fixed. The current version of
> 65-nonlatin is still carrying all the issues I mentioned in the original
> report.
>

Oops, that was not my intention at all. Sorry about that.

Matthew Paul Thomas (mpt) wrote :

As I said in the description, Chromium uses its own copy of WebKit, "which suggests that it may be a WebKit-GTK bug that has been fixed since the version packaged in Ubuntu 9.10."

jack (jacksjy) wrote :

Copied from bug 467979 (which Matthew has marked as a duplicate).

This may be useful.

QianQian Fang's comment:

This is the typical scenario where Japanese fonts take precedence over Chinese fonts. The default fontconfig settings put Japanese fonts ahead of Chinese fonts. Because Japanese fonts cover only a small fraction of Kanji, when a block of simplified or traditional Chinese text is rendered, the characters they lack are rendered by full-coverage Chinese fonts instead, so glyphs from different fonts get mixed. I had proposed solutions to this issue, but the fontconfig developers have been slow to respond:
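
The mechanism described here can be sketched as per-character fallback over an ordered font list. This is a deliberately simplified model, not how fontconfig or WebKit actually score fonts; the font names and coverage sets are hypothetical and chosen only to reproduce the "全局访问" split from the bug description.

```python
def pick_fonts(text, ordered_fonts):
    """For each character, pick the first font in the list that covers it.

    ordered_fonts is a list of (name, coverage_set) pairs in priority order.
    Returns a list of (character, font_name_or_None) pairs.
    """
    result = []
    for ch in text:
        chosen = None
        for name, coverage in ordered_fonts:
            if ch in coverage:
                chosen = name
                break
        result.append((ch, chosen))
    return result


# Hypothetical coverage: the Japanese font has only the characters shared
# with Japanese, the Chinese font has all of them.
japanese_font = ("Hypothetical Japanese Font", set("公全局"))
chinese_font = ("Hypothetical Chinese Font", set("办公全局访问"))

# Japanese font listed first: the label is split across two fonts.
print(pick_fonts("全局访问", [japanese_font, chinese_font]))

# Chinese font listed first: every character comes from the same font.
print(pick_fonts("全局访问", [chinese_font, japanese_font]))
```

In this model, changing only the order of the list is enough to make the whole label render in one font, which is the effect the proposed fontconfig changes aim for.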

https://bugzilla.redhat.com/show_bug.cgi?id=499902
http://bugs.freedesktop.org/show_bug.cgi?id=20911

If Firefox and your other programs handle Chinese properly, then it must be that WebKit-based programs ignore some of the additional CJK-specific fontconfig settings.

Changed in fontconfig:
status: Unknown → Confirmed

Hi,

Have any of these updates been picked up by fontconfig, or packaged by any distribution for official inclusion?

Any idea on the status of Fedora 13, as far as this issue is concerned?

Thanks!

Regards,
Ilyes Gouta

(In reply to comment #112)
> Hi,
>
> Have any bits of these updates were picked up by fontconfig or by packaged by
> any other distribution for official inclusion?
>
> Any idea on the status of Fedora 13, as far as this issue is concerned?
>
> Thanks!
>
> Regards,
> Ilyes Gouta

You should bring up any distro-specific matters on the fonts list or in Bugzilla if you have any issues. That said, we have a workaround in F13 to avoid being affected by 65-nonlatin.conf, so I believe it should work and be much improved over F12.

Hi Akira,

> have any issues. though we have a workaround to prevent affecting
> 65-nonlatin.conf in f13. so it should works and improved much more than f12 I

Could you tell me more about this workaround in f13?

From what I've seen, f13 ships fontconfig 2.8.0 almost unmodified (http://cvs.fedoraproject.org/viewvc/rpms/fontconfig/F-13, 1 patch 25-no-bitmap-fedora.conf). How are CJK fonts (better) handled in f13?

Thanks,

-Ilyes Gouta

> (In reply to comment #112)
> > Hi,
> >
> > Have any bits of these updates were picked up by fontconfig or by packaged by
> > any other distribution for official inclusion?
> >
> > Any idea on the status of Fedora 13, as far as this issue is concerned?
> >
> > Thanks!
> >
> > Regards,
> > Ilyes Gouta
>
> You should bring any distro specific things up on fonts list or bugzilla if you
> have any issues. though we have a workaround to prevent affecting
> 65-nonlatin.conf in f13. so it should works and improved much more than f12 I
> believe.

(In reply to comment #114)
> Could you tell me more about this workaround in f13?
>
> From what I've seen, f13 ships fontconfig 2.8.0 almost unmodified
> (http://cvs.fedoraproject.org/viewvc/rpms/fontconfig/F-13, 1 patch
> 25-no-bitmap-fedora.conf). How are CJK fonts (better) handled in f13?

That has been done in the fontconfig config files shipped in each CJK font package. You can see some files with, say, a 65-0- prefix, which are supposed to be evaluated before 65-nonlatin.conf.
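
Why a 65-0- prefix lands ahead of 65-nonlatin.conf can be seen from a small sketch of the ordering. It assumes the behavior described in the fontconfig documentation for included directories (files whose names start with an ASCII digit and end in ".conf" are processed in sorted order); `load_order` and the "hypothetical" file names are illustrative only.

```python
def load_order(filenames):
    """Return the config files of a conf.d-style directory in processing order.

    Assumption: only names starting with an ASCII digit and ending in
    ".conf" are read, and they are processed in plain sorted order.
    """
    digits = "0123456789"
    eligible = [
        name
        for name in filenames
        if len(name) > 0 and name[0] in digits and name.endswith(".conf")
    ]
    return sorted(eligible)


files = [
    "65-nonlatin.conf",
    "65-0-hypothetical-cjk.conf",
    "64-hypothetical.conf",
    "README",
    "90-synthetic.conf",
]
for name in load_order(files):
    print(name)
```

After the shared "65-" the comparison is between "0" and "n", and digits sort before letters in ASCII, so the 65-0- file is read first. A 64- prefix achieves the same thing one step earlier.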


Do you still experience this problem? Please respond if you do.

Changed in webkit (Ubuntu):
status: Confirmed → Incomplete
Changed in ubuntu-translations:
status: Confirmed → Incomplete
Aron Xu (happyaron) wrote :

The bug is still valid for 9.10, but has never been present in 10.04.

David Planella (dpm) wrote :

Could someone from the bug squad update the "Nominate for release" status according to Aron's last comment? Thanks!

Changed in fontconfig:
importance: Unknown → Medium
Changed in fontconfig:
importance: Medium → Unknown
Changed in fontconfig:
importance: Unknown → Medium
Changed in webkit (Ubuntu):
status: Incomplete → Fix Released
Changed in ubuntu-translations:
status: Incomplete → Fix Released