Let me give a few closing comments for anyone who finds this bug/thread in the future and wonders what happened.

This was one of several bugs I found while investigating the Phoenician Unicode block. I set out to typeset a Bible in Phoenician, since this was the alphabet the Bible was originally written in, and I needed a complete tool chain for dealing with this alphabet. I eventually did get that done, link below. If anyone reading this in the future doesn't have a Launchpad account and needs the Phoenician resources I mention below, please use the contact links on the following page: http://www.bibletimepress.com/bibles

This Open Office bug was by far the smallest of the bugs I found, though it was an early one, since it was easy to test this part of the needed tool chain. It turns out most Java apps cannot handle these characters either, the key one being Eclipse, because apparently nobody respects the surrogate pairs that are used for these code block values. Surrogate pairs were added after the original Java language specification was written. Remember that Phoenician is U+1090x, a value that does not fit in 16 bits, so in UTF-16 each letter is stored as a surrogate pair. So Eclipse was full of bugs related to syntax highlighting and editing whenever a Unicode value that needs a surrogate pair was entered. Once these values made it onto a line in a file edited by Eclipse, that line could no longer be safely edited. I opened a bug there too, and over the months learned a lot. The problem is so pervasive that the Eclipse guys seem to think it will never be solved. I would add that it will never be solved in Java apps because it is not in the control of the Java team to fix: they've set surrogate pair standards that nobody follows in practice. Early Java language educational resources get this wrong, and so do most Java programmers.
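For future readers, here is a rough sketch of the surrogate pair arithmetic behind this problem. It is written in Python purely for illustration and is not taken from Eclipse or Open Office; the helper names `to_surrogates` and `utf16_units` are made up for this example. It splits U+10900, the first value in the Phoenician block, into its two UTF-16 code units, and counts code units the way a UTF-16 based string length (such as Java's `String.length()`) does.

```python
# Illustrative sketch only: helper names are hypothetical, not from any tool
# mentioned in this thread.

def to_surrogates(code_point):
    """Split a code point above U+FFFF into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000        # what remains fits in 20 bits
    high = 0xD800 + (offset >> 10)       # top 10 bits go in the high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits go in the low surrogate
    return high, low


def utf16_units(text):
    """Count 16-bit code units, which is what a UTF-16 string length reports."""
    return len(text.encode("utf-16-be")) // 2


first_letter = chr(0x10900)              # first value in the Phoenician block
high, low = to_surrogates(0x10900)
print(f"U+10900 -> {high:04X} {low:04X}")
print("code points:", len(first_letter))
print("UTF-16 code units:", utf16_units(first_letter))
```

Code that indexes, deletes, or moves the cursor by code unit can land between the two halves of the pair, which is consistent with the kind of editing breakage described above.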
I also found that support for these values in web browsers is not good enough for any practical use, especially with server-side fonts: the MS .eot file format does not handle them, at least not using the open source .ttf to .eot conversion tools. It may be that Windows cannot handle anything more than 16 bit Unicode, though I don't know for sure. The system for displaying the Unicode value in a box for missing characters does not work above 16 bits.

The default font used for Phoenician in the Unicode standard, and thus used in Ubuntu, is from the last known historical inscription, about 318 AD. That is probably the worst choice that could have been made, as the language had been in use for the 1800 years before that with a very different, much better, and much more common letter form.

Kate, the Kubuntu text editor, could not handle these Unicode values either. LaTeX, the typesetting program, was also unable to handle this range well, though with some unusual, pre-alpha macro packages designed primarily for typesetting the Koran, it came close.

I also found that this particular code block is missing several very important values, including the most important inter-word separator, so the block itself is defective. Since only 5 bits, or 32 possible values, were assigned, there isn't room to fix that and also include the missing vowels.

I also found that this alphabet was originally bi-directional, boustrophedon. There is essentially no support, anywhere, for that. But that also suggested that many problems would be solved by using it left-to-right in most cases. This turned out to solve a bunch of problems, including making the language easier to learn.

My fix was to rebuild the Phoenician code page at 0xEF00, within the 16 bit Unicode private use area, left-to-right, with better choices of letter placements, including the inter-word space character and Phoenician vowels.
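To give an idea of what moving text into the private use area involves, here is a minimal sketch in Python. It assumes a straight offset mapping from U+10900..U+1091F onto 0xEF00..0xEF1F, which is NOT my actual layout (that one rearranges the letters and adds the separator and vowels); the function name `to_private_use` is made up for this example, and it does nothing about text direction.

```python
# Illustrative sketch only. The straight offset mapping is an assumption for
# this example; the real 0xEF00 layout places letters differently.

PHOENICIAN_START = 0x10900   # start of the standard Phoenician block
PHOENICIAN_END = 0x1091F     # end of the block (32 values in total)
PUA_BASE = 0xEF00            # base of the rebuilt code page

def to_private_use(text):
    """Move Phoenician block characters to the private use area, one for one."""
    out = []
    for ch in text:
        cp = ord(ch)
        if PHOENICIAN_START <= cp <= PHOENICIAN_END:
            cp = PUA_BASE + (cp - PHOENICIAN_START)
        out.append(chr(cp))   # everything else passes through unchanged
    return "".join(out)


sample = chr(0x10900) + " " + chr(0x1091F)
remapped = to_private_use(sample)
print([f"U+{ord(c):04X}" for c in remapped])
# Every remapped character now fits in a single 16-bit UTF-16 code unit.
print(len(remapped.encode("utf-16-be")) // 2 == len(remapped))
```

Because every value ends up below 0x10000, tools that only cope with 16 bit values never see a surrogate pair.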
I have the X keyboard files needed for this block, plus various fonts and related tools, should someone need to type or display these characters. The keyboard layout is designed for English language touch typists and can be learned in an hour. Open Office works fine with this solution, as do Eclipse, LaTeX (XeTeX), Kate, and even KMail. The .ttf to .eot format converters also work, and so do all the browsers that someone might be using (though it is still not easy to style). This solution risks collisions with other private use area code blocks, but that has not been a problem in practice.

Thanks to everyone who looked at this hard problem. It was the tip of an iceberg.

Phil