Unicode Phoenician block, 1090X, not displayed in correct direction

Bug #459991 reported by Phil Stone
This bug affects 1 person
Affects: openoffice.org (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Binary package hint: openoffice.org

Ubuntu 9.10 added a system font for Phoenician so it displays correctly.

OpenOffice can display the characters, but it should lay them out like Hebrew, advancing right-to-left. This is not happening, and that is the bug.

Note that the right-to-left icons do allow whole paragraphs of Phoenician to be set right-to-left, but the individual words should advance right-to-left on their own, just as Hebrew words do.
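
For anyone who wants to check what direction the Unicode data itself assigns to these characters, the bidirectional class can be looked up directly. This is only a minimal sketch using Python's standard unicodedata module; the helper name bidi_classes is made up for illustration, and the Hebrew and Latin letters are included purely for comparison.

```python
import unicodedata

def bidi_classes(text):
    """Return (code point, bidi class) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]

# U+10900 is the first code point of the Phoenician block,
# U+05D0 is Hebrew alef, and "A" is a plain left-to-right letter.
sample = "\U00010900\u05D0A"
for code_point, bidi in bidi_classes(sample):
    print(code_point, bidi)
```

A class of "R" means the character is strongly right-to-left, the same class Hebrew letters carry, so a layout engine that follows the Unicode bidirectional algorithm would be expected to treat the two scripts alike.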

Phil

ProblemType: Bug
Architecture: i386
Date: Sat Oct 24 12:56:25 2009
DistroRelease: Ubuntu 9.10
Package: openoffice.org-core 1:3.1.1-5ubuntu1
ProcEnviron:
 LANGUAGE=
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-14.48-generic
SourcePackage: openoffice.org
Uname: Linux 2.6.31-14-generic i686
XsessionErrors: (<unknown>:3577): Gdk-CRITICAL **: gdk_window_get_origin: assertion `GDK_IS_WINDOW (window)' failed

Chris Cheney (ccheney) wrote :

Did you actually put the document into that language setting? If not it is probably set to your default language of US English...

Changed in openoffice.org (Ubuntu):
status: New → Incomplete
Chris Cheney (ccheney)
tags: added: karmic
Chris Cheney (ccheney) wrote :

We're closing this bug since it has been some time with no response from the original reporter. However, if the issue still exists, please feel free to reopen it with the requested information. Also, if you could, please test against the latest development version of Ubuntu, since this confirms the bug is one we may be able to pass upstream for help.

Changed in openoffice.org (Ubuntu):
status: Incomplete → Expired
Phil Stone (philstone) wrote :

Let me give a few closing comments for anyone that may find this bug/thread in the future and wonder what happened.

This was one of several bugs I found while investigating the Phoenician Unicode block. I was setting out to typeset a Bible using Phoenician, since this was the alphabet the Bible was originally written in. I also needed a complete tool chain for dealing with this alphabet. I eventually did get that done, link below. If anyone reading this in the future doesn't have a Launchpad account and needs the Phoenician resources I mention below, please use the contact links from the following website/page.

http://www.bibletimepress.com/bibles

This Open Office bug was by far the smallest of the bugs I found, though it was an early one since it was easy to test parts of the needed tool-chain.

Turns out most Java apps cannot handle it either, the key one being Eclipse, because apparently nobody respects the surrogate pairs that are used for values in this code block. Surrogate pairs were added after the original Java language specification was written. Remember Phoenician is 1090X, a value above the 16-bit range. So Eclipse was full of bugs related to syntax highlighting and editing whenever a Unicode value needing a surrogate pair was entered. Once these values made it onto a line in a file edited by Eclipse, the line could no longer be safely edited.
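
To make the surrogate pair issue concrete, here is a small sketch of how a code point above U+FFFF is split into two 16-bit units in UTF-16. It is written in Python only for brevity; the function name to_surrogate_pair is hypothetical, and the arithmetic follows the standard UTF-16 encoding rule rather than anything specific to Eclipse or Java.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point (above U+FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000       # what remains fits in 20 bits
    high = 0xD800 + (offset >> 10)      # top 10 bits go in the high surrogate
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits go in the low surrogate
    return high, low

alf = 0x10900  # first code point in the Phoenician block
high, low = to_surrogate_pair(alf)
print(f"U+{alf:X} -> {high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder.
encoded = chr(alf).encode("utf-16-be")
print(encoded.hex())
print(len(encoded) // 2)  # number of 16-bit code units for one letter
```

Any code that counts 16-bit units as characters sees two units for this single letter, which is the kind of mismatch that can break cursor movement and highlighting in an editor.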

I opened a bug there too, and over the months learned a lot. The problem is so pervasive that the Eclipse guys seem to think it will never be solved. I would add that it will never be solved in Java apps because it is not in the control of the Java team to fix; they've set surrogate pair standards that nobody follows in practice. Early Java language educational resources get this wrong, and so do most Java programmers.

I also found that the use of these values in web browsers is not supported well enough for any practical use, especially with server-side fonts, and the MS .eot file format does not handle them, at least not using open source .ttf to .eot conversion tools. It may be that Windows cannot handle anything more than 16-bit Unicode, though I don't know for sure. The system for displaying the Unicode value in a box for missing characters does not work above 16 bits.

The default font used for Phoenician in the Unicode standard, and thus used in Ubuntu, is from the last known historical inscription, about 318 AD. That is probably the worst choice that could have been made, as the script had been in use for 1800 years before that with a very different, much better, and much more common letter form.

Kate, the Kubuntu text editor, could not handle these Unicode values either.

LaTeX, the typesetting program, was also unable to handle this range well, though with some unusual, pre-alpha macro packages designed primarily for typesetting the Koran, it came close.

I also found that this particular code block is missing several very important values, including the most important inter-word separator, so the block itself is defective. Since they only assigned 5 bits, or 32 possible values, there isn't room to fix this and also include the missing vowels.

I also found that this alphabet was originally bi-directional, boustrophedon. T...

Chris Cheney (ccheney) wrote :

Phil,

Thanks for the information. I am uncertain but perhaps the graphite library may be working towards supporting these types of languages?

http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&cat_id=RenderingGraphite

If so then it is supported in OpenOffice.org but I doubt there is font support for the language you are interested in yet.

Chris
