The Japanese kana in ptt.cc are converted to Private Use Area code points in Unicode

Bug #424677 reported by LI Daobing
This bug affects 1 person
Affects                  Status   Importance  Assigned to                  Milestone
Ubuntu Translations      Invalid  Medium      Ubuntu Japanese Translators
fqterm (Ubuntu)          Invalid  Undecided   Unassigned
pcmanx-gtk2 (Ubuntu)     Invalid  Undecided   Unassigned
qterm (Ubuntu)           Invalid  Undecided   Unassigned
ttf-wqy-zenhei (Ubuntu)  Invalid  Undecided   Unassigned

Bug Description

Binary package hint: ttf-wqy-zenhei

The following characters display incorrectly in "WenQuanYi Zen Hei" and "WenQuanYi Zen Hei Mono":
"☆"

Many other Japanese characters also have this problem.

Three screenshots taken in leafpad are attached; each file name is the name of the font used.

Tags: fonts i18n
Qianqian Fang (fangq) wrote :

What you typed are not Japanese characters. Their code points belong to the Private Use Area (PUA) and are defined arbitrarily by specific fonts. The corresponding Unicode values are:
  "U+F73F U+F703 U+2606 U+F70F U+F715"

What you should be using is "らき☆すた", with the corresponding Unicode values
  "U+3089 U+304D U+2606 U+3059 U+305F"

These are the true Japanese kana; Zen Hei and many other CJK fonts display them correctly.
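
Every character in the two code-point lists above except U+2606 differs by the same offset, which may help a client undo the conversion. A minimal Python sketch, assuming (from these four pairs only) that the offset 0xC6B6 also holds for every code point between the lowest and highest PUA values quoted; `PUA_OFFSET`, `PUA_LOW`, `PUA_HIGH` and `remap_pua_kana` are hypothetical names, and the range should be checked against the real mapping table before use:

```python
# Code points quoted in the comment above.
PUA_POINTS = [0xF73F, 0xF703, 0x2606, 0xF70F, 0xF715]
KANA_POINTS = [0x3089, 0x304D, 0x2606, 0x3059, 0x305F]


def observed_offsets(pua, kana):
    """Return the set of (PUA - kana) differences for pairs that differ."""
    return {p - k for p, k in zip(pua, kana) if p != k}


# Assumption: the single offset observed in the four pairs also holds for
# every code point between the lowest and highest PUA values quoted.
PUA_OFFSET = 0xC6B6
PUA_LOW, PUA_HIGH = 0xF703, 0xF73F


def remap_pua_kana(text):
    """Shift characters in the assumed PUA range back to hiragana."""
    return "".join(
        chr(ord(ch) - PUA_OFFSET) if PUA_LOW <= ord(ch) <= PUA_HIGH else ch
        for ch in text
    )


if __name__ == "__main__":
    garbled = "".join(chr(cp) for cp in PUA_POINTS)
    expected = "".join(chr(cp) for cp in KANA_POINTS)
    print(observed_offsets(PUA_POINTS, KANA_POINTS) == {PUA_OFFSET})
    print(remap_pua_kana(garbled) == expected)
```

Characters outside the assumed range, such as U+2606, pass through unchanged.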

Perhaps Arne should consider removing these PUA glyphs from Uming. Keeping them causes more confusion; I can find a number of reports on this:

http://wenq.org/forum/viewtopic.php?f=5&t=744&p=3985
http://wenq.org/forum/viewtopic.php?f=12&t=874

Another frequently encountered problem is U+E5D9 (mapped to U+23EBF). This character is often seen at the beginning of a paragraph (possibly as a result of using MS Word) and causes confusion, for example:

http://forum.ubuntu.org.cn/viewtopic.php?f=8&t=221913&p=1423546
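
Text can be checked for such characters before a font is blamed. A minimal sketch using Python's standard `unicodedata` module; `find_private_use` is a hypothetical helper name. It lists every character whose Unicode general category is `Co` (private use):

```python
import unicodedata


def find_private_use(text):
    """Return (index, 'U+XXXX') for each private-use character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"
    ]


if __name__ == "__main__":
    # U+E5D9 at the start of a paragraph, as described above.
    sample = "\ue5d9" + "paragraph text"
    print(find_private_use(sample))
```

An empty result means the text contains no private-use code points, so a display problem would have another cause.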

I understand that Uming defines these code points based on the HKSCS-2004 Annex (http://www.ogcio.gov.hk/ccli/unicode/hkscs/download/2003cmp.txt). However, since Uming has far fewer code points than the 65535 limit, it is entirely possible to restore these code points to their official Unicode encoding.

Arne, do you want to comment on this?

LI Daobing (lidaobing) wrote :

Please don't remove these symbols from ttf-arphic-uming until at least one BBS client can correctly convert these characters in ptt.cc to the correct Unicode code points.

summary: - Failed to display some Japanese characters
+ The Japanese kana in ptt.cc are converted to Private Use Area code points in Unicode
Aron Xu (happyaron)
tags: added: fonts i18n
Changed in ubuntu-translations:
importance: Undecided → Medium
status: New → Triaged
Pierre Slamich (pierre-slamich) wrote :

Subscribing Ubuntu Japanese Translators.

Changed in ubuntu-translations:
assignee: nobody → Ubuntu Japanese Translators (ubuntu-l10n-ja)
Fumihito YOSHIDA (hito) wrote :

@pierre-slamich,

This seems to be a problem on the font/cmap side; it is *not* translation-related. So I have a question: can Japanese translators resolve this issue?

I would be very grateful if you could correct my understanding.
 - Added as subscriber: reasonable.
 - Changed assignee: why?

dino99 (9d9) wrote :

That version is no longer maintained.

Changed in ubuntu-translations:
status: Triaged → Invalid
Changed in fqterm (Ubuntu):
status: New → Invalid
Changed in pcmanx-gtk2 (Ubuntu):
status: New → Invalid
Changed in qterm (Ubuntu):
status: New → Invalid
Changed in ttf-wqy-zenhei (Ubuntu):
status: New → Invalid