Chinese characters in PDFs without embedded fonts are shown as squares

Bug #659280 reported by poloshiao
This bug affects 5 people
Affects                       Status         Importance   Assigned to   Milestone
Ubuntu Translations           Fix Released   Medium       Unassigned
fontconfig (Ubuntu)           Invalid        Undecided    Unassigned
langpack-locales (Ubuntu)     Confirmed      Undecided    Unassigned
language-selector (Ubuntu)    Invalid        Undecided    Unassigned

Bug Description

Chinese characters in PDFs without embedded fonts are shown as squares. For more analysis, please read comment #14 [1]. Samples are attached.

(The bug description was changed so that people can more easily jump in and understand the problem. Thanks to the people from ubuntu-tw.org who reported this bug.)

[1]https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/14

Aron Xu (happyaron) wrote :

If you just want to show characters in PDF files properly, you really should install the poppler-data package instead of suggesting such big changes.

poloshiao (poloshiao) wrote :

Attn: Mr. Aron Xu

We guess you can probably read Traditional Chinese characters, so you are welcome to visit this page,
http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?post_id=149374#forumpost149374
supplied by xenomorph0525.

This problem affects all applications, not only Evince, that rely on the settings file /etc/fonts/conf.avail/69-language-selector-zh-tw.conf.
Installing the poppler-data package only solves the PDF reading problem, and even for PDFs it does not always succeed.

The following screenshots show the differences with and without installing poppler-data and editing 69-language-selector-zh-tw.conf.

There are 2 independent PDF documents; the left one is A, the right one is B.

picture 1. http://img257.imageshack.us/img257/4631/screenshotcox.png
Screenshot taken without installing poppler-data and without editing 69-language-selector-zh-tw.conf.
A is normal, but B, on the right, appears almost blank.

picture 2. http://img821.imageshack.us/img821/8851/screenshot1sy.png
Screenshot taken with poppler-data installed but without editing 69-language-selector-zh-tw.conf.
A is still normal, but B, on the right, shows square characters all over the page.

picture 3. http://img231.imageshack.us/img231/7329/screenshot2c.png
Screenshot taken with poppler-data installed and with 69-language-selector-zh-tw.conf edited.
A is still normal, and B, on the right, now shows normal characters all over the page.

Conclusion:
Document A appears normal regardless of whether poppler-data is installed or 69-language-selector-zh-tw.conf is edited.
Document B appears normal only with poppler-data installed and with 69-language-selector-zh-tw.conf edited.

It is apparent that editing 69-language-selector-zh-tw.conf is the only way to reliably turn those square characters into normal ones.

One must learn how to make some changes after installing the poppler-data package;
otherwise squares will sometimes appear in some applications, including Evince.

What we wish is to help newcomers stay clear of these square character problems in any application they use.

Aron Xu (happyaron) wrote :

It looks reasonable. Could you upload a sample PDF file that cannot be shown properly? That will help a lot when we are hunting down the problem.

poloshiao (poloshiao) wrote :

Attn: Mr. Aron Xu
I am uploading 2 PDF documents for your reference.

The first document corresponds to the left page, called A.
http://activity.ntsec.gov.tw/activity/race-1/46/elementary/0815/081562.pdf

The second document corresponds to the right page, called B.
http://activity.ntsec.gov.tw/activity/race-1/45/senior/0402/040216.pdf

It is essential to set hant (TW) as the default language in the system settings rather than in the user settings; with only the user setting, the square characters will not necessarily appear.

poloshiao (poloshiao) wrote :
Aron Xu (happyaron)
Changed in ubuntu-translations:
importance: Undecided → Medium
status: New → Triaged
Walter Cheuk (wwycheuk) wrote :

No need to remove the entries, just move the Chinese typefaces to a higher priority. This is more reasonable since typeface priority should match the locale.
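As a rough illustration of the reordering suggested here, the following Python sketch moves a chosen set of families to the front of a preference list while keeping the relative order of everything else. It does not read or write any fontconfig file; the function name `promote` and the sample family list are assumptions made for illustration, not the actual contents of 69-language-selector-zh-tw.conf.

```python
def promote(preference, preferred):
    """Stable reorder: families in `preferred` first, all others after.

    The relative order inside each group is kept, so nothing is removed
    from the list; only the priority changes.
    """
    front = [family for family in preference if family in preferred]
    back = [family for family in preference if family not in preferred]
    return front + back


# Hypothetical preference list (English families first, Chinese ones after).
current = [
    "DejaVu Sans",
    "Bitstream Vera Sans",
    "WenQuanYi Micro Hei",
    "AR PL UMing TW",
]

# Hypothetical set of Chinese families to move to a higher priority.
chinese = {"WenQuanYi Micro Hei", "AR PL UMing TW"}

reordered = promote(current, chinese)
```

The English families stay in the list as lower-priority entries, which is the difference from the proposal to delete them.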

poloshiao (poloshiao) wrote :

Thanks to Walter Cheuk for comment #6. Even if some Chinese typefaces are moved to a higher priority than Bitstream Vera and DejaVu, Evince will still show squares if those Chinese typefaces are not installed or have been removed from the system. So the best way to solve the square problem thoroughly is to remove the Bitstream Vera and DejaVu entries.
This comment was suggested by xenomorph0525; poloshiao helped translate and post it.

Aron Xu (happyaron) wrote :

I got some time to test the proposed solution. The PDFs are rendered correctly, but English text in other applications looks uglier because a Chinese font is used.

Aron Xu (happyaron) wrote :

I found that xpdf with xpdf-chinese-* installed works without touching /etc/fonts. I think we should ask for a solution in poppler (which is used by evince and okular to work with PDFs) instead of making changes to fontconfig settings.

Changed in fontconfig (Ubuntu):
status: New → Invalid
Changed in poppler-data (Ubuntu):
status: New → Confirmed
Changed in language-selector (Ubuntu):
status: New → Invalid
poloshiao (poloshiao) wrote :

The following was originally posted by xenomorph0525 at
http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?post_id=159564#forumpost159564
I helped translate it into English and post it here.

English text in those applications looks uglier because of the way the adopted Chinese font draws English letters.

If you dislike them, you may give a higher priority to Chinese fonts whose English letters look comfortable, to rule out the ugly ones.

Even though xpdf-chinese-* can solve the square character problem in Chinese PDF files, it still cannot solve the square character problems in Chinese when one uses other applications.

However, it would be better to remove the fonts without Chinese characters from the file /etc/fonts/conf.avail/69-language-selector-zh-*.conf so that the annoying square character problem never appears in a Chinese OS again.

Aron Xu (happyaron) wrote :

> Those applications using English text are displayed uglier owing to their original
> appearance of English letter character from the adopted chinese font.
>
> You may put those chinese fonts with comfortable appearance of English letter
> character into superior order to rule out those ugly ones, if you dislike them.

This is an unacceptable resolution, because people may change their preferred English font to any font other than DejaVu or Bitstream Vera; you cannot force them to use a Chinese font that does not provide the best experience.

Fontconfig has the concept of priority, which is designed to make everything *just work*: when a character is missing from the higher-priority fonts, applications should go on and look in the lower-priority fonts.

I've read the forum post you linked. At least for now, 69-language-selector-zh-*.conf shouldn't be removed, because nobody has stepped forward to propose a better solution that does not produce regressions.

> It still can not solve the square character problems in chinese when one applies other
> application even though xpdf-chinese-* could solve the square character problems in
> chinese PDF files.

I guess the "other applications" you mention most likely mean evince, okular, nautilus preview, etc. But please be aware that they all use the poppler library, and they all use poppler-data for CJK typefaces. We should consider poppler-data buggy, so the issue should ultimately be fixed within poppler*. xpdf shows us that everything can work without breaking the user experience.

You might be aware that Ubuntu's default font settings are accepted by most users and considered the best defaults. If we do remove 69-language-selector-zh-*.conf, Chinese font appearance will most likely differ only slightly from Debian, which does not have any tweaks for Chinese.

> However it should be better to wipe out those fonts without chinese characters from the
> file /etc/fonts/conf.avail/69-language-selector-zh-*.conf so that the annoying square
> character problems in chinese OS never appear forever.

No, at least not now. The buggy thing is poppler-data, which does not have correct maps as xpdf-chinese-* (and its dependencies) do. Because poppler is widely used, it creates the impression that "other applications" are affected; xpdf alone survives because it is a special case.

hemiscy (scy-hemi) wrote :

I am a zh-tw user and I am here to support Aron's comment #11. The proposal to remove English fonts from 69-language-selector-zh-*.conf can only be considered a hack or workaround, which unfortunately produces regressions. The real solution is to fix poppler-data so that applications render characters in both languages correctly, whether or not English fonts are included in 69-language-selector-zh-*.conf.

poloshiao (poloshiao) wrote :

This post was originally written by xenomorph0525 on 2011/1/30.
Its URL is http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?post_id=162666#forumpost162666 and it is
followed by discussion posts.
I helped translate it into English and post it here.

...............................................................................................................................................................
Ubuntu is easy to use.
Ubuntu does everything you need it to.
These claims on the page http://www.ubuntu.com/desktop/why-use-ubuntu have a strong appeal for everyone.

We can supply better defaults for per-language font settings, so that the majority of users don't need to have it.
That wish, shown on the page https://wiki.ubuntu.com/font-selector, is absolutely the main reason why we, as users of the Traditional Chinese
locale, chose Ubuntu.

In fact Ubuntu has achieved a great deal in localization. But there is still
much to do to advance towards these respectable goals. One item is the font selection mechanism, the 69-language-selector-zh-tw.conf file.

When installing Ubuntu, a user who chooses 『traditional-chinese』 in the first
window has chosen to display messages
on the monitor with the default locale zh_TW. In turn, he automatically accepts 69-language-selector-zh-tw.conf as the main mechanism for ordering the priority of installed fonts. The majority of Traditional Chinese users choose 69-language-selector-zh-tw.conf by default and willingly.

It is a great design! But there are still some things to improve to achieve
the great goal: easy to use and per-language font settings.
Yes, 69-language-selector-zh-tw.conf is one of the things to improve!

.....................................................

One of our posters asked in his post:

What should he do when his monitor shows square characters when he runs pppoeconf? (http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?post_id=152366#forumpost152366)
He had just finished installing Ubuntu 10.10 server edition with
Traditional Chinese in text mode.
I looked for the reason why it shows squares.
By default it installs only the DejaVu font, in the /usr/share/fonts/truetype/ttf-dejavu directory, and uses 69-language-selector-zh-tw.conf for font selection.
So it must show squares when it tries to display Chinese characters.
This has bothered him a lot!
He must learn something extra to get rid of the squares, which is not the "don't need to have it" state described above.

If you enter 69-language-selector-zh-tw.conf into the Google
search box, there are tens of thousands of results.
Most of these pages try to teach Ubuntu users who chose Traditional Chinese
as their default locale how to edit their 69-language-selector-zh-tw.conf.

This is far from the appeal: Ubuntu does everything you need it to, and
better defaults for per-language font settings.

We know a minority of users need to use English fonts, like DejaVu and Bitstream Vera Sans, as their default font selection. Most of them are experts. They may choose English, i.e. en_US, as their default locale, etc. Most of them belongs to mi...


Aron Xu (happyaron) wrote :

Hello, thanks for your efforts and happy new year! Let's make a summary of discussions till now so we can probably move on ;-)

The facts (and problems):
1. The special case in those PDF files is that they do not have Chinese fonts embedded. Many everyday files are of this kind (such as research papers on CNKI, which is very widely used in China's academic circles).
2. The problem exists not only in Chinese PDF rendering, but also in some other applications that hard-code the use of "sans-serif" fonts.
3. With the current 69-language-selector-zh-xy.conf, and with poppler-data installed, the Chinese text in those PDF files is shown as squares.
4. xpdf with xpdf-chinese-* installed can render those PDF files correctly; Adobe Reader can as well.
5. We cannot reproduce this problem on Debian when we change the font settings manually in gnome-appearance-properties to make the user experience just like Ubuntu's.

More on technical details:

Programs cannot handle fontconfig configurations correctly when they need to show Chinese content. Fontconfig assumes applications will look up font faces in list order: if a lookup fails in an earlier font, they move on and look in later ones. But those buggy applications only look in the first font listed.

Ubuntu's approach to font settings is to have an English font (commonly Dejavu or Bitstream Vera) with higher priority, and Chinese fonts with lower priority. Users benefit from such settings because English typefaces in Chinese fonts are usually not that good-looking. Most of the desktop applications work well.
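The difference between the intended fallback and the buggy behaviour can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the coverage sets are invented for illustration, real fonts and fontconfig are not consulted, and the function names are hypothetical.

```python
# Hypothetical coverage tables: which characters each font can draw.
# These sets are assumptions for the sketch, not real font data.
FONT_COVERAGE = {
    "DejaVu Sans": set("ABCabc "),
    "WenQuanYi Micro Hei": set("ABCabc 中文字"),
}

# Preference order resembling the default described in this comment:
# an English font first, a Chinese font as the lower-priority fallback.
PREFERENCE = ["DejaVu Sans", "WenQuanYi Micro Hei"]

MISSING = "\u25a1"  # the "square" drawn when no glyph is found


def pick_font_with_fallback(char, preference=PREFERENCE):
    """Return the first font in preference order that covers char, else None."""
    for font in preference:
        if char in FONT_COVERAGE[font]:
            return font
    return None


def render_with_fallback(text, preference=PREFERENCE):
    """Intended behaviour: walk down the preference list for every character."""
    return "".join(
        c if pick_font_with_fallback(c, preference) else MISSING for c in text
    )


def render_first_font_only(text, preference=PREFERENCE):
    """Buggy behaviour: only the first listed font is ever consulted."""
    first = FONT_COVERAGE[preference[0]]
    return "".join(c if c in first else MISSING for c in text)
```

With the default order, `render_with_fallback("abc 中文")` keeps every character, while `render_first_font_only("abc 中文")` replaces the two Chinese characters with squares. Calling `render_first_font_only` with the Chinese font listed first also keeps every character, which is why reordering or removing the English entries hides the bug without fixing the lookup.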

Proposed solutions, with pros and cons:

1. Remove English fonts from /etc/fonts/conf.avail/69-language-selector-zh-xy.conf.
    This would make those PDF files display correctly, but it forces people to use the English typefaces provided by Chinese fonts, which lowers the user experience, so it is a regression. This is a quick but ugly solution.

2. Fix poppler-data and all relevant applications so they can honor fontconfig settings correctly.
    This is not a quick-and-dirty fix that users can apply themselves like the previous one. But it should be the correct way: xpdf and Adobe Reader do well, so there is no reason that poppler can't once the relevant code is fixed. This is technically a "better" solution, but takes longer to achieve.

summary: - subject: Remove the string tags include Bitstream Vera、DejaVu inside in
- the setting file /etc/fonts/conf.avail/69-language-selector-zh-xy.conf
- where xy represents cn, hk, mo, sg and tw individually.
+ Chinese characters in PDFs without embedded fonts are shown as squares
Aron Xu (happyaron)
description: updated
hemiscy (scy-hemi) wrote :

Thanks for poloshiao's effort and Aron Xu's explanation.

IMHO, there could be better solutions (or workarounds) than removing English fonts from 69-language-selector-zh-*.conf, for the reasons below.

Firstly, 69-language-selector-zh-*.conf works *perfectly* with the GNOME desktop, nautilus, gedit, Firefox and so many other applications. Apparently, it is not 69-language-selector-zh-*.conf that causes the bug. The problem is that the buggy applications do not handle the font settings correctly, not that the font settings are wrong.

Certainly, if we remove English fonts from 69-language-selector-zh-*.conf, the buggy applications will render CJK fonts correctly. But at what cost? The current font settings, which work well; a lot of applications that work well with them; and the users who are happy with them. We can say that displaying correctly is more important than displaying beautifully, but there are people who do care about that.

According to the experience of the zh-tw community, several applications have the Sans font setting issue: Evince, Tetravex, and the Flash plugin.

To my knowledge, Tetravex has been updated and the problem no longer exists. Versions of the Flash plugin later than 10.1.53.64 are reported to work well (there are still some incorrect cases, but that is another coding bug which 69-language-selector-zh-*.conf could not fix). Users who do not have the good versions can also solve the Flash problem by modifying 49-sansserif.conf.

Therefore, Evince is probably the only one left that we need to deal with so far. I think we have a choice other than 69-language-selector-zh-*.conf, namely Adobe Reader.

Chinese users usually have some problems with PDFs: not only the squares, but also fragmented characters. Adobe Reader can solve both of these issues, but 69-language-selector-zh-*.conf can only deal with the squares.

69-language-selector-zh-*.conf is a system-wide setting. Too many applications will be affected when it is changed. Since we have other choices, it would be better to leave 69-language-selector-zh-*.conf as it is currently. At the same time, we should investigate which part is really buggy and try to fix it. (It may not be poppler, because Tetravex does not use poppler.)

Thank you.

poloshiao (poloshiao) wrote :

#13 :
The above test was done under Ubuntu 9.10 with the original PDF file: http://dl.dropbox.com/u/2300246/%E9%84%AD%E6%BC%A2%E6%96%87%20%E8%98%AD%E5%B6%BC%E7%9A%84%E8%B2%9D%E6%96%87%E5%8C%96.pdf

The above URL has some problems being downloaded, so I am sending the original PDF file directly as an attached file.

Cheng-Chia Tseng (zerng07) wrote :

@ Aron Xu:
Comment on 1. Remove English fonts from /etc/fonts/conf.avail/69-language-selector-zh-xy.conf.
> This one would make it able to show those PDF files correctly, but forces people to use English typefaces provided by Chinese fonts, which lower the user experience so it is a regression. This is a quick but ugly solution.

Forcing Traditional Chinese users to use English typefaces provided by Chinese fonts is NEVER a problem, because the setting now refers to WenQuanYi Micro Hei, which is derived from Droid, and Droid actually provides good English typefaces. Otherwise, mobile phone manufacturers wouldn't adopt Droid on their Android phones, right?

Furthermore, using the typefaces provided by Chinese fonts gives Traditional Chinese users a better-looking appearance with more consistency and harmony. I think people who care about artwork and eye candy might agree with me.

All in all, this solution is not actually a solution, because the problem is not caused by fontconfig settings. It is actually an "Enhancement" or "Feature request"; you have missed the point, and the real bugs should always be removed.

> 2. Fix poppler-data and all relevant applications so they can honor fontconfig settings correctly.
Yes, that is the real solution! We should file bugs against them to get all the buggy things solved!

By doing this, no more people will suffer from programs honoring fontconfig settings.

Cheng-Chia Tseng (zerng07) wrote :

Correction: By doing this, no more people will suffer from programs *not* honoring fontconfig settings.

Aron Xu (happyaron) wrote : Re: [Bug 659280] Re: Chinese characters in PDFs without embedded fonts are shown as squares

On Sat, Feb 5, 2011 at 13:58, Cheng-Chia Tseng <email address hidden> wrote:
> Forcing Traditional Chinese users to use English typefaces provided by
> Chinese fonts is NEVER a problem, because now the setting refer to
> WenQuanYi Micro Hei which is derived from Droid, and Droid provides
> good English typefaces actually. Otherwise, mobile phone manufacturers
> won't adopt droid on their Android phone, right?
>

Please bear in mind that not all people love the Droid fonts. They *SHOULD*
have the freedom and possibility to choose what they like, rather than
being forced into anything because of a dirty workaround.

> Furthermore, using the typefaces provided by Chinese fonts gives
> Traditional Chinese user a better looking appearance with more
> consistency and harmony. I think people who cares about the artwork and
> eye candy might agree with me.
>

Font preferences vary because different people have different
tastes; it's just your own feeling unless you can prove it's
representative.

> All in all, this solution is not a solution actually, because the problem
> is not caused by fontconfig settings. This solution is a "Enhancement"
> or "Feature request" actually, you have missed the point, and the real
> bugs should always be removed.
>

This isn't an "Enhancement" or a "Feature request"; it's a bug.
"Enhancement" and "Feature request" are meant to make things
better when the original ones don't break things. Now the buggy
applications prevent people from reading their files, so it's a BUG.

>> 2. Fix poppler-data and all relevant applications so they can honor fontconfig settings correctly.
> Yes, that is the real solution! We should file bugs for them to solve all the buggy things!
>
> By doing this, there won't be more people suffered by the programs
> honored fontconfig settings.
>
>

I've already said, it's a quick-and-dirty workaround which causes regression.

--
Regards,
Aron Xu

Cheng-Chia Tseng (zerng07) wrote :

I am not talking about whether or not you like the DejaVu fonts.

What I am talking about is using ONE font to display English, Chinese, Japanese and other characters, to keep consistency.

Using different fonts for different scripts must break the harmony when all the different typefaces show up together: DejaVu English and WenQuanYi Micro Hei Chinese characters on one screen will never be more harmonious than WenQuanYi Micro Hei alone for both English and Chinese. This is about art, about eye candy, and about harmony.

I don't know why the developers chose to use DejaVu for English and WQY Micro Hei for Chinese for Chinese users. Does anyone know the reason?

By the way, I think editing the fontconfig settings that Ubuntu ships is something like an "Enhancement" or "Feature Request", for which people should file a new bug, and removing the bugs that cause the problem is what should be done. I don't know whether you are confused by what I am saying or not...

Aron Xu (happyaron) wrote :

On Sat, Feb 5, 2011 at 17:59, Cheng-Chia Tseng <email address hidden> wrote:
> I am not talking about preference with the dejavu fonts you like or not.
>
> Using ONE font to display English, Chinese, Japanese and other
> characters is what I am talking about to keep the consistency.
>
> Using different fonts to display characters respectively must break the
> harmony when all the different typefaces show up, such as you see DejaVu
> English and WenQuanYi Micro Hei Chinese characters are at one screen will
> never be more harmony with Only WenQuanYi Micro Hei English and Chinese
> are at one screen. This is about art, about eye candy, and about
> harmony.
>
> I don't know why the developers tend to use DejaVu for English and WQY
> Micro Hei for Chinese for Chinese users. Is there anyone knows the
> reason?
>

As I have already said, it's your own preference unless you can prove
that the majority of people share your opinion.

Here is a fact:
Many people prefer the current settings, and many major distros
already use similar solutions, and people are happier than before.

> By the way, I think editing the fontconfig settings which Ubuntu ships
> is something like "Enhancement" or "Feature Request" that people should
> file a new one, and removing the bugs which cause the problems is what
> should do. I don't know whether you are confused by what I am saying or
> not...
>

It's a bug and it breaks things, which has little to do with a "feature".

--
Regards,
Aron Xu

hemiscy (scy-hemi) wrote :

Aron Xu wrote:
> Here is a fact:
> Many people prefer the current settings, and many major distros have
> already use similar solutions - then people are happier than before.

Aye. I do like the current font settings. DejaVu Sans looks good and, IMHO, it works "harmoniously" with WQY Micro Hei.

***

I wonder whether we are talking about "beauty" at all. All the posts above from zh-tw mention that some applications cannot display Chinese characters correctly. It's about a bug, not about personal or majority font taste.

I also wonder whether it is necessary to make a system-wide change just because of a bug in one or two applications.

Adobe could fix the square issue in the Flash plugin; we can also fix the same problem in evince.

By the way, the font configuration gives users freedom and possibilities. Whether or not people like the English fonts listed in the font config file, they can do anything they like to the file themselves. I am always for this freedom.

Cheng-Chia Tseng (zerng07) wrote :

Well, there is no evidence or report on which you can base the claim that most people prefer the current settings. According to the comment, it is your own preference for the English typefaces of DejaVu over those of WQY Micro Hei.

So, please just leave preference aside. As far as I know, WQY Micro Hei's character coverage is larger than DejaVu's and contains all of it, so there is no need to use DejaVu first for the English typefaces and then fall back to Chinese typefaces for Chinese users, right?

Now, imagine a text containing several languages at the same time, such as English, Chinese, Japanese and Korean. You have one font that covers all the characters to display, another for English only, another for Japanese and yet another for Korean. I believe that using the single font that can display all the characters is more harmonious than displaying the characters of each language in its own font. If you don't think so, please consult the art team to confirm whether what I am saying is right or not. I have studied graphic art, and I know that using the same font to display everything it covers is simply better than mixing different fonts on the same screen, which makes readers feel the result is cluttered and inconsistent. If you have other material that says I am wrong about this, please contact me privately, and I will appreciate that a lot.

I won't comment more on the font issue.

By the way, if you want to discuss the fontconfig settings for Chinese users, I suggest you file a new bug against the language-selector product to improve it (or you can call it an "enhancement" or something else here instead of a "bug"). Editing the fontconfig settings won't change the fact that "Chinese characters in PDFs without embedded fonts are shown as squares" is caused by Evince or something else not respecting the fallback mechanism.

hemiscy (scy-hemi) wrote :

Cheng-Chia Tseng wrote:
> Editing the fontconfig settings won't change the fact that "Chinese characters in PDFs without
> embedded fonts are shown as squares" caused by Evince or something else not respecting the
> fallback mechanism.

Ja, that's the point. Changing 69-language-selector-zh-*.conf won't help with the bug. Let's stop talking about font configuration here. If someone wants to remove English fonts for aesthetic reasons, it would be better to file a new bug.

Aron Xu (happyaron) wrote :

@Cheng-Chia Tseng,
You've misunderstood my point. Please read back and find where I said something like "I don't like this workaround because I like DejaVu". No, there is no such statement.

What I am emphasizing is that "people should have the freedom and possibility to choose what font they prefer", and the workaround will force people to lose that freedom. Also, please don't argue that people can change the fontconfig configuration back to the current one after your workaround if they prefer; right now you are able to change it to whatever you like, too. Whether or not you have studied art is not our topic here.

I've read the pages of arguments on ubuntu-tw.org on this topic. It appears that several individuals really do care more about correctly displaying documents than about the best default experience, and they are the ones who have been trying to push the changes. I know their opinion is "workaround first, rather than just waiting for the bug to be fixed".

From a developer's point of view, a workaround is acceptable as long as it helps with some problems and does not cause regressions. But here your proposal does break users' freedom, and does break some people's preferred font configurations. As I've said, it's quick-and-*dirty*, and being a dirty hack makes it an unacceptable solution.

Cheng-Chia Tseng (zerng07) wrote :

For people who want to discuss the fontconfig settings of [...]-zh-XX.conf, please go to:
https://bugs.launchpad.net/ubuntu/+source/language-selector/+bug/713950

poloshiao (poloshiao) wrote :

This post was originally written by xenomorph0525 and was posted here:
http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?post_id=163566#forumpost163566
I helped translate it into English and post it to this bug report.
.................................................................................

1. About 69-language-selector-zh-*.conf
zh_TW users care more about Chinese fonts displaying normally than about whether English fonts look nice. Furthermore, viewpoints about which fonts look good or ugly are subjective and highly diverse. Some agree that it looks better to show Chinese characters with Chinese fonts and English characters with DejaVu fonts. Others think it is better to show English and Chinese characters with the same font.

2. Keeping the strings that include Bitstream Vera and DejaVu in 69-language-selector-zh-*.conf prevents Chinese characters from being displayed normally in some programs, where they are partially replaced with squares. We have carried out a case study and posted a detailed report:
https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/13

3. We are holding a pinned topic
http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?topic_id=35654
to survey Traditional Chinese users on whether they favor or oppose the proposal to edit 69-language-selector-zh-tw.conf, 69-language-selector-zh-hk.conf and 69-language-selector-zh-mo.conf. We explain clearly that the question is whether they favor removing the strings for the Serif, Sans and Monospace variants of the DejaVu fonts and using those from Chinese fonts instead when displaying English characters in the Traditional Chinese locale.
We will tally the votes for and against on 2/24/2011, one week before the release of Ubuntu 11.04 Alpha 3, and on 3/24/2011, one week before the release of Ubuntu 11.04 Beta. We will post the resulting data here as a reference for the language-selector maintainers.

4. At the same time, we support without a doubt fixing those programs that have problems displaying Chinese characters.

5. Our ultimate hope is that there will be no mojibake with the default font settings in the Ubuntu 11.04 Traditional Chinese locale. We expect to achieve this by removing the strings including Bitstream Vera and DejaVu from 69-language-selector-zh-tw.conf, 69-language-selector-zh-hk.conf and 69-language-selector-zh-mo.conf. Your kind attention will be very much appreciated.

Revision history for this message
Qianqian Fang (fangq) wrote :

poloshiao:

can you please post the output of the following command?

 FC_DEBUG=1029 evince Document_B.pdf

where "Document_B.pdf" is the one that had display problems in your original report. Please do the above test with the "original" 69-language*.conf files.

I personally doubt it was caused by 69-language-selector-zh*.conf, but I can be wrong.
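
For reference, the FC_DEBUG value in the command above is a bitmask, so 1029 switches on several debug categories at once. A minimal sketch that splits the value into its bits; the category names in the comments are how I read the fontconfig user documentation and should be checked against the installed version:

```shell
#!/bin/sh
# Split an FC_DEBUG value into the individual bits it sets.
# Assumed meaning of the bits (from the fontconfig user documentation):
#   1    = MATCH  (brief information about font matching)
#   4    = EDIT   (monitor match/test/edit execution)
#   1024 = CONFIG (monitor which config files are loaded)
mask=1029
bit=1
flags=""
while [ "$bit" -le "$mask" ]; do
    if [ $(( mask & bit )) -ne 0 ]; then
        flags="$flags $bit"
    fi
    bit=$(( bit * 2 ))
done
echo "FC_DEBUG=$mask sets bits:$flags"
```

If that reading is right, the requested output should show which font was matched, which substitution rules were applied, and which configuration files were loaded.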

Revision history for this message
poloshiao (poloshiao) wrote :

In #28, Qianqian Fang asked me to post the output of the following command:
FC_DEBUG=1029 evince Document_B.pdf
where Document_B.pdf is the second PDF document posted at https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/4:
It is the document shown in the right-hand page, called B.
http://activity.ntsec.gov.tw/activity/race-1/45/senior/0402/040216.pdf

The following are our results:

1. FC_DEBUG=1029 evince 040216.pdf > original-1.txt
with the original 69-language-selector-zh-tw.conf

original-1.txt is attached to this post.

2. FC_DEBUG=1029 evince 040216.pdf > fixed-1.txt
with the strings removed from 69-language-selector-zh-tw.conf by the command:
sudo sed -i '/DejaVu/d ; /Bitstream Vera/d ; /WenQuanYi Bitmap Song/d' /etc/fonts/conf.avail/69-language-selector-zh-tw.conf

fixed-1.txt is attached to the following post.

Thank you for your kind attention !
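
The effect of the sed expression above can be previewed on a scratch file before it is run with sudo and -i on the real configuration. A minimal sketch; the sample lines below are invented for illustration and are not the real content of 69-language-selector-zh-tw.conf:

```shell
#!/bin/sh
# Preview the deletion on a scratch file instead of editing /etc/fonts in place.
# NOTE: the sample content is hypothetical, made up only to exercise the expression.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<family>Bitstream Vera Sans</family>
<family>DejaVu Sans</family>
<family>AR PL UMing TW</family>
<family>WenQuanYi Bitmap Song</family>
<family>WenQuanYi Zen Hei</family>
EOF

# Same expression as in the command above, but without -i:
# the edited text goes to stdout and the file itself is left untouched.
result=$(sed '/DejaVu/d ; /Bitstream Vera/d ; /WenQuanYi Bitmap Song/d' "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Only the lines that contain none of the three patterns remain. Keeping a backup copy of the real file before using -i makes it easy to restore the original setting for comparison.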

Revision history for this message
poloshiao (poloshiao) wrote :

2. FC_DEBUG=1029 evince 040216.pdf > fixed-1.txt
with the strings removed from 69-language-selector-zh-tw.conf by the command:
sudo sed -i '/DejaVu/d ; /Bitstream Vera/d ; /WenQuanYi Bitmap Song/d' /etc/fonts/conf.avail/69-language-selector-zh-tw.conf

fixed-1.txt is attached to this post.

Revision history for this message
poloshiao (poloshiao) wrote :

This post was originally written by xenomorph0525 and was posted here:
http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?post_id=164272#forumpost164272
I have translated it into English and am posting it to this bug report.

................................................................
The 69-language-selector-*.conf files first appeared in Ubuntu 8.04. Before that, there was no problem of Chinese characters turning into squares when displayed with the Sans font. The squares arise from the way Chinese and English fonts are mixed in 69-language-selector-zh-*.conf.

Two main factors cause Chinese characters to turn into squares when displayed with the Sans font:
1. the mixing of Chinese and English fonts in 69-language-selector-zh-*.conf
2. some applications not supporting the font fallback that 69-language-selector-zh-*.conf relies on.

The problem will be fully resolved only when both factors are removed. In any case, we have to remove the Bitstream Vera and DejaVu strings from 69-language-selector-zh-*.conf so that applications affected by the second factor no longer display Chinese characters incorrectly.

After all, it is hard to control in advance how programmers design their software. Of the two factors, at least the first is under our control. Of course, besides addressing the first factor, we should also fix the programs with known bugs of the second kind.

Mixing Chinese and English fonts in one configuration file cannot be a good setting.
Had the proposal to edit 69-language-selector-zh-*.conf been made when Ubuntu 8.04 was released, users would have seen it as a return to the normal setting (Chinese and English fonts not mixed in one file).

Perhaps some people consider it a poor solution only because it is being proposed too late, two years on. But it is only the timing that creates a different impression. It is in fact still a proposal to revert to the normal setting.

Revision history for this message
Aron Xu (happyaron) wrote :

@poloshiao, thanks for your hard work on translating the messages :)

@xenomorph0525,
You've tried to predict what I'll reply to your comments, and it seems you'd like to ask me a question: which is more important, displaying Chinese characters correctly through a workaround, or waiting for the buggy software to be fixed?

From your point of view, displaying the characters correctly is more important. I understand your opinion very clearly, so please stop suspecting that I've misunderstood you.

But why is this problem so critical in your mind? I guess the answer is that the problem exists in the application that renders PDF files, which you must use for your work or study. So it's critical for you, and for people in similar situations. But fixing a problem should not affect other ordinary users who don't care about your problem; their right (to have these three lines in the fontconfig configuration) should be respected first.

If there is a problem with font preference, that's another issue, and you should convince people that your choice better suits common taste. You should also be clear that working around a bug in another application should never be the reason to push such a change. Font configuration matters to the people who care about it, and you may break things for them this way (if it doesn't affect them, it'll be okay). Also, you can't argue that after such a change people still have the freedom to add back whatever they like, because by the same logic you already have the freedom to change whatever you like now.

So I think you have wasted a lot of time here arguing about whether we should change fontconfig to adopt a dirty workaround. Why not go and find a fix for poppler(-data)? That's the universal way, which makes everyone happy. (I think you agree that it can be fixed.)

There is another issue with a very similar problem:
Many MP3 files have Big5/GBK tags, but no media player accepts a workaround such as detecting Big5/GBK before assuming the tags are UTF-8 (even a locale hack is never accepted), because such detection has problems, and people who use UTF-8 files might suffer problems they should never have to face.

Now, applying your proposal to work around the PDF rendering problem would harm people who care about their font settings, so it's not acceptable. Please, put all your efforts to make poppler(-data), that is a much more effective (and maybe correct) way to solve your problem, but not keep arguing or asking me "which is more important".

Revision history for this message
Aron Xu (happyaron) wrote :

Excuse me, let me correct the last sentence:

Please put all your effort into making poppler(-data) behave correctly; that is a much more effective (and maybe the correct) way to solve your problem, rather than continuing to argue or asking me "which is more important".

Revision history for this message
poloshiao (poloshiao) wrote :

In the pinned topic at
http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?topic_id=35654
12 users have expressed support for the proposal to edit 69-language-selector-zh-tw.conf, and 1 user has expressed opposition.

In total, 13 users have given their opinions on editing 69-language-selector-zh-tw.conf, 69-language-selector-zh-hk.conf and 69-language-selector-zh-mo.conf at http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?topic_id=35654.

You are welcome to visit the pinned topic at http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?topic_id=35654&forum=2.

The next statistics will be reported here on 03/24/2011.

Revision history for this message
poloshiao (poloshiao) wrote :

In the pinned topic at
http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?topic_id=35654
15 users have expressed support for the proposal to edit 69-language-selector-zh-tw.conf, and 4 users have expressed opposition.

In total, 19 users have given their opinions on editing 69-language-selector-zh-tw.conf, 69-language-selector-zh-hk.conf and 69-language-selector-zh-mo.conf at http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?post_id=169324#forumpost169324.

You are welcome to visit the pinned topic at http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?topic_id=35654&forum=2.

The next statistics will be reported here on 04/21/2011.

Revision history for this message
Gunnar Hjalmarsson (gunnarhj) wrote :

poppler-data has been added to the dependency list of language-selector.
https://launchpad.net/ubuntu/+source/language-selector/0.27

Revision history for this message
poloshiao (poloshiao) wrote :

It is now 2011/04/21, one week before the scheduled release date of Ubuntu 11.04.
The final statistics, as of 2011/04/21, on the proposal to fix the square characters shown in place of Chinese characters are as follows:

In the pinned topic at http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?topic_id=35654, 22 users have expressed support for the proposal to edit 69-language-selector-zh-tw.conf, and 6 users have expressed opposition.

In total, 28 users have given their opinions on editing 69-language-selector-zh-tw.conf, 69-language-selector-zh-hk.conf and 69-language-selector-zh-mo.conf at http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?topic_id=35654&forum=2&post_id=175120#forumpost175120.

You are welcome to visit the pinned topic at http://www.ubuntu-tw.org/modules/newbb/viewtopic.php?topic_id=35654&forum=2.

Ubuntu is one of the most popular Linux distributions, and its development team emphasizes user-friendliness, so the above statistics should mean something important to the Ubuntu developer team.

Revision history for this message
An Yang (euroford) wrote :

Hi everybody,

When you meet any font-related problem, you should first find out which fonts are missing.
Here is the font info of 081526.pdf, see attachment.

It uses sans/serif/bold and other fonts, so to display the characters, first make fontconfig use the right fonts.

To check it, you can use the following commands:
fc-match sans
fc-match serif
fc-match tahoma
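
The same check can be scripted for several names at once. A minimal sketch, assuming fc-match prints one line of the form file: "Family" "Style" per query; the helper name resolved_family is made up for this example:

```shell
#!/bin/sh
# resolved_family: extract the family name (the first quoted string)
# from one line of fc-match output.
resolved_family() {
    printf '%s\n' "$1" | sed -n 's/^[^"]*"\([^"]*\)".*$/\1/p'
}

# Report which family each requested name resolves to.
# The loop is skipped if fontconfig's fc-match is not installed.
if command -v fc-match >/dev/null 2>&1; then
    for name in sans serif tahoma; do
        line=$(fc-match "$name")
        echo "$name -> $(resolved_family "$line")"
    done
fi
```

If a name that the PDF asks for resolves to a family without Chinese glyphs, that is a likely candidate for the squares.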

Revision history for this message
An Yang (euroford) wrote :

Here's my screenshot of 081526.pdf

Revision history for this message
An Yang (euroford) wrote :

040216.pdf uses the PMingLiU font.
2004lanyu.pdf uses DFLiShu, DFMing, MingLiU and PMingLiU.

In /etc/fonts/conf.d, fontconfig only has support for MingLiU/PMingLiU, but not for DFLiShu and DFMing.

Revision history for this message
An Yang (euroford) wrote :

Detailed font info:
081562.pdf:

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
Tahoma                               TrueType          no  no  no     280  0
NGMPBD+DFKaiShu-SB-Estd-BF           CID TrueType      yes yes yes    282  0
BKASVS+新細明體,Bold                 TrueType          yes yes yes    174  0
TimesNewRoman                        TrueType          no  no  no     160  0
新細明體                             TrueType          no  no  no     158  0
新細明體,Bold                        TrueType          no  no  no     152  0
BKASVS+新細明體                      TrueType          yes yes yes    171  0
BKASVS+新細明體                      TrueType          yes yes yes    155  0
BKASVS+新細明體                      TrueType          yes yes yes    167  0
BKASVS+新細明體                      TrueType          yes yes yes    181  0
BKASVS+新細明體,Bold                 TrueType          yes yes yes    194  0
TimesNewRoman,Bold                   TrueType          no  no  no     186  0
Arial                                TrueType          no  no  no     196  0
BKASVS+新細明體                      TrueType          yes yes yes    185  0
BKASVS+新細明體                      TrueType          yes yes yes    198  0
BKASVS+新細明體,Bold                 TrueType          yes yes yes    211  0
BKASVS+新細明體                      TrueType          yes yes yes    212  0
新細明體,Italic                      TrueType          no  no  no     210  0
BKASVS+新細明體                      TrueType          yes yes yes    218  0
BKASVS+新細明體                      TrueType          yes yes yes    224  0
BKASVS+新細明體                      TrueType          yes yes yes    227  0
BKASVS+新細明體,Bold                 TrueType          yes yes yes    233  0
BKASVS+TimesNewRoman                 TrueType          yes yes yes    232  0
KILDGJ+DFKaiShu-SB-Estd-BF           CID TrueType      yes yes yes    294  0

040216.pdf:
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
JCPBIC+DFKaiShu-SB-Estd-BF           CID TrueType      yes yes yes   4723  0
JCPBPL+PMingLiU                      CID TrueType      yes yes yes   4718  0
TimesNewRoman                        TrueType          no  no  no       6  0
標楷體                               CID TrueType      no  no  no       8  0
標楷體,Bold                          CID TrueType      no  no  no      12  0
TimesNewRoman,Bold                   TrueType          no  no  no      14  0
新細明體                             CID TrueType      no  no  no      17  0
新細明體,Bold                        CID TrueType      no  no  no      42  0
AGHMHK+Wingdings3                    TrueType          yes yes no      44  0
細明體                               CID TrueType      no  no  no      57  0
@新細明體                            CID TrueType      no  no  no     105  0
華康儷粗黑,Bold                      CID TrueType      no  no  no    4614  0
EGMPAP+DFKaiShu-SB-Estd-BF           CID TrueType      yes yes yes   4706  0

2004lanyu.pdf:
name type emb sub uni object ID
...
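
Since this bug is about fonts that are not embedded, the rows that matter in listings like these are the ones with "no" in the emb column. A minimal sketch that filters them, assuming the listing is pdffonts output (poppler-utils) and that the last five fields of each row are emb, sub, uni and the two-part object ID; the helper name not_embedded is made up, and the sample rows are copied from the 040216.pdf listing:

```shell
#!/bin/sh
# List fonts that are NOT embedded, given pdffonts-style rows on stdin.
# Fields are counted from the right because the type column may itself
# contain a space ("CID TrueType"): emb is the 5th field from the end.
# Limitation: only the first word of a font name is printed.
not_embedded() {
    awk 'NF >= 7 && $(NF-4) == "no" { print $1 }'
}

sample='JCPBPL+PMingLiU CID TrueType yes yes yes 4718 0
TimesNewRoman TrueType no no no 6 0
新細明體 CID TrueType no no no 17 0
AGHMHK+Wingdings3 TrueType yes yes no 44 0'

missing=$(printf '%s\n' "$sample" | not_embedded)
printf '%s\n' "$missing"
```

On a real file this would be used as: pdffonts 040216.pdf | not_embedded. Each name it prints is one that fontconfig has to substitute, so it is a candidate for a follow-up fc-match check.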


Revision history for this message
An Yang (euroford) wrote :

Evince displays squares because of a faulty locale function: it cannot determine the width of the Chinese characters in the file.

affects: poppler-data (Ubuntu) → langpack-locales (Ubuntu)
Revision history for this message
An Yang (euroford) wrote :

If you install poppler-data, evince will use the charmaps in it, and the buggy locales will be bypassed.
So if you would rather not install poppler-data, please look at my bug report: http://sourceware.org/bugzilla/show_bug.cgi?id=13064

Revision history for this message
Gunnar Hjalmarsson (gunnarhj) wrote :

Even if not everyone applauds it, bug 713950 was fixed as suggested a few weeks ago. As a side effect, the problem description of this bug is currently not correct, and it's unclear to me if it's motivated to still keep this bug open.

Of course, as long as the issues mentioned in the above comments have not been properly addressed, I'm not opposed to keep discussing them in a bug. Just wondering if there possibly are better bug reports for the purpose. If not, I would suggest that the summary and description of this bug are changed to reflect the current problem(s).

Revision history for this message
poloshiao (poloshiao) wrote :

We posted a test report in this comment
https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/2 showing that installing the poppler-data package without editing the configuration file /etc/fonts/conf.avail/69-language-selector-zh-tw.conf does not always succeed.

So the hypothesis "If you install poppler-data, evince will use the charmaps in it, the buggy locales will be jumped." at https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/43 needs to be verified with more evidence.

Revision history for this message
poloshiao (poloshiao) wrote :

The original links to the pictures in comment 2 of bug 659280
https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/2

picture 1. http://img257.imageshack.us/img257/4631/screenshotcox.png
picture 2. http://img821.imageshack.us/img821/8851/screenshot1sy.png
picture 3. http://img231.imageshack.us/img231/7329/screenshot2c.png
no longer load reliably for some users.

So I have changed them to the new links below:
picture 1. http://a.imageshack.us/img257/4631/screenshotcox.png
picture 2. http://a.imageshack.us/img821/8851/screenshot1sy.png
picture 3. http://a.imageshack.us/img231/7329/screenshot2c.png
I hope these links load correctly for all users.

I apologize for the oversight!

Revision history for this message
An Yang (euroford) wrote :

Hi poloshiao,

Could you show me the result when you execute LANG=zh_TW.big5 evince 040216.pdf ?

Revision history for this message
An Yang (euroford) wrote :

I can see the pictures in comment #46.
I notice that all the characters in 細明體 are displayed as small squares.
Could you show us the output of `fc-match 細明體`?

Revision history for this message
poloshiao (poloshiao) wrote :

The tests suggested in
https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/47
https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/48
by An Yang (euroford) are being scheduled, and the results will be reported here as soon as possible.

Two test reports related to poppler and poppler-data were also posted in
https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/13

For the same reason as described in
https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/46,
I have edited the picture links from
https://bugs.launchpad.net/ubuntu-translations/+bug/659280/comments/13
so that all users can view them easily.

.... Report 1........................

1. The 1st report describes how tetravex, which is unrelated to PDF,
displayed Chinese characters as squares under the zh_TW locale with 69-language-selector-zh-tw.conf.
1-1. with 69-language-selector-zh-tw.conf as default:
The Chinese characters appeared as squares.
http://a.imageshack.us/img193/5210/tetravex2280.png
1-2. with poppler-data installed and with 69-language-selector-zh-tw.conf as default:
The Chinese characters appeared as squares.
http://a.imageshack.us/img11/8430/tetravex2280poppler.png
1-3. with poppler-data installed and with the Bitstream Vera and DejaVu strings removed from 69-language-selector-zh-tw.conf:
The Chinese characters appeared correctly.
http://a.imageshack.us/img716/6640/tetravex2280poppler69.png
1-4. without poppler-data installed and with the Bitstream Vera and DejaVu strings removed from 69-language-selector-zh-tw.conf:
The Chinese characters appeared correctly.
http://a.imageshack.us/img232/2470/tetravex228069.png

Conclusion:
Chinese characters were displayed correctly only when the Bitstream Vera and DejaVu strings were removed from 69-language-selector-zh-tw.conf (regardless of whether poppler-data was installed).

.... Report 3........................

3. The 3rd report concerns the different behavior of xpdf within the same page:
3-1. with 69-language-selector-zh-tw.conf as default:
http://a.imageshack.us/img89/854/xpdf302.png
3-2. with poppler-data installed and with 69-language-selector-zh-tw.conf as default:
http://a.imageshack.us/img573/7048/xpdf302poppler.png
3-3. with poppler-data installed and with the Bitstream Vera and DejaVu strings removed from 69-language-selector-zh-tw.conf:
http://a.imageshack.us/img3/4677/xpdf302poppler69.png
3-4. without poppler-data installed and with the Bitstream Vera and DejaVu strings removed from 69-language-selector-zh-tw.conf:
http://a.imageshack.us/img3/1472/xpdf30269.png

Conclusion:
xpdf, with xpdf-chinese-traditional installed, displays the document content with correct Chinese characters in all cases, but it displays the names of directories and files incorrectly in all cases.

We hope Mr. An Yang (euroford) can give us some hints or suggestions on how to interpret the test results.

Revision history for this message
poloshiao (poloshiao) wrote :

Sorry, the link to picture 3-2 is broken, but the picture is identical to picture 3-1:

3-2. with poppler-data installed and with 69-language-selector-zh-tw.conf as default:
http://a.imageshack.us/img573/7048/xpdf302poppler.png

3-1. with 69-language-selector-zh-tw.conf as default:
http://a.imageshack.us/img89/854/xpdf302.png

Revision history for this message
An Yang (euroford) wrote :

xpdf is a very old X application. I have not used it since gpdf was replaced by evince.

I just remember that it does not use pango and has its own cMap.

So changes to fontconfig and poppler-data will never affect it.

Revision history for this message
hemiscy (scy-hemi) wrote :

Could anyone explain why a newly installed Japanese Ubuntu 11.04 (without update) can correctly display the PDF files in question? In /etc/fonts/conf.avail/69-language-ja-jp.conf, the DejaVu fonts are listed before CJK fonts as well.

Revision history for this message
An Yang (euroford) wrote :

Hi hemiscy,

Something is wrong in the charmap/locales of zh_TW/zh_CN.

Revision history for this message
Cheng-Chia Tseng (zerng07) wrote :

I tested LANG=zh_TW.big5 evince 040216.pdf on my Ubuntu 11.04 in VirtualBox.

All settings are the defaults, with no edits.

Test result:
Page 1, see http://www.flickr.com/photos/pswo10680/6074457463/in/photostream
Page 2, see http://www.flickr.com/photos/pswo10680/6074457465/in/photostream

The output of 'fc-match 細明體' is DejaVuSans.ttf: "DejaVu Sans" "Book".

Revision history for this message
Cheng-Chia Tseng (zerng07) wrote :

After poppler-data was installed, the result is different, see below.

Test result:
Page 1, see http://www.flickr.com/photos/pswo10680/6075067018/in/photostream
Page 2, see http://www.flickr.com/photos/pswo10680/6075067014/in/photostream

The output of 'fc-match 細明體' is the same.

PS. I do not have the zh_TW.big5 locale, so LANG=zh_TW.big5 falls back to the C locale. But using LANG=zh_TW.utf8 gives the same result.

Revision history for this message
An Yang (euroford) wrote :

>The output of 'fc-match 細明體' is DejaVuSans.ttf: "DejaVu Sans" "Book".

That's the point.

And I'm sorry, you should check whether "zh_TW BIG5" is in /var/lib/locales/supported.d/local.
If not, just add it, and run sudo locale-gen.

Then you can run LANG=zh_TW evince xxx.pdf
zh_TW uses BIG5 by default, so do not use zh_TW.BIG5.
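
The check-and-add step can be sketched as follows. This is only an illustration run against a scratch file with made-up starting content; the helper name ensure_line is invented here. On a real system the target would be /var/lib/locales/supported.d/local, edited as root and followed by sudo locale-gen:

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE unless an identical line
# is already present (whole-line, fixed-string match).
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Demonstration on a scratch file (hypothetical starting content).
scratch=$(mktemp)
printf '%s\n' 'en_US.UTF-8 UTF-8' 'zh_TW.UTF-8 UTF-8' > "$scratch"

ensure_line "$scratch" 'zh_TW BIG5'
ensure_line "$scratch" 'zh_TW BIG5'   # second call changes nothing

cat "$scratch"
count=$(grep -cxF -- 'zh_TW BIG5' "$scratch")
rm -f "$scratch"
```

Matching the whole line avoids adding a duplicate entry when the step is repeated.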

Revision history for this message
Cheng-Chia Tseng (zerng07) wrote :

I have done what you wrote in #56.
After locale-gen, the system returned "zh_TW.BIG5... ??????????? "/usr/lib/locale/locale-archive": ????????? failed".

Although it said that it failed, I could still run LANG=zh_TW evince 040216.pdf successfully.
The output is the same as in #55. The only difference is in the interface translation.

Poppler-data installed.

Test result:
Page 1, see http://www.flickr.com/photos/pswo10680/6080146194/
Page 2, see http://www.flickr.com/photos/pswo10680/6080146200/

The output of 'fc-match 細明體' is still the same.

Revision history for this message
An Yang (euroford) wrote :

Hi hemiscy (scy-hemi)

Could you paste the output of 'fc-match 細明體' or 'fc-match MingLiU' in Japanese ubuntu?

Revision history for this message
Walter Cheuk (wwycheuk) wrote :

For those who are still concerned with this bug, it should have been fixed with #713950

Revision history for this message
David Planella (dpm) wrote :

Marked as Fix Released in translations, as per the last comment.

Changed in ubuntu-translations:
status: Triaged → Fix Released