Comment 37 for bug 608631

verdy_p (verdy-p) wrote :

Reply to comment #35:

It does not matter whether the character is rendered incorrectly. The decision to include it or not is a matter of policy for each project.

And when Bruno Patri said that the character was rendered as a lozenge in TTYs, this makes no sense, because the result depends entirely on the character set conversion that occurs within the terminal emulator, either on the server side (the user's current locale setting, or the system/application locale setting) or in the client terminal (with its configured fonts).

Most terminal emulators today support UTF-8 as the encoding of choice on the network, and so do servers (at least Linux and most FreeBSD distributions and, for many years now, Sun/Oracle Solaris and IBM AIX as well).

The main problem was not the presence of NNBSP (which could already be entered), but the lack of an accurate way to identify it when it is rendered: we need a visual clue to tell distinct spaces apart.
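As a minimal sketch of such a visual clue, the snippet below replaces each special space with a visible label before display. The label strings and the `mark_spaces` name are my own illustration, not what Launchpad actually uses.

```python
# Hypothetical visible labels for spaces that look alike on screen.
SPACE_MARKERS = {
    "\u00a0": "[NBSP]",    # NO-BREAK SPACE
    "\u2009": "[THINSP]",  # THIN SPACE
    "\u202f": "[NNBSP]",   # NARROW NO-BREAK SPACE
}


def mark_spaces(text: str) -> str:
    """Replace each special space with its label; leave U+0020 untouched."""
    return "".join(SPACE_MARKERS.get(ch, ch) for ch in text)


print(mark_spaces("Bonjour\u202f!"))
```

A translator reviewing the string then sees at a glance which space was entered, whatever the font does with it.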

Then it's up to each translation project to determine which characters it accepts. Most browsers now render it accurately, so there's no reason to refuse it. Correct display of whitespace is in fact a minimum requirement now for HTML5, which describes the minimum subset of Unicode character properties that browsers should support. This includes recognition of the non-breaking property as well as the whitespace property, both of them being STABLE, and both must also be recognized to implement IDNA safely.
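To make the two properties concrete, here is a small sketch using Python's standard `unicodedata` module. It reads the general category (`Zs` for space separators) and uses the `<noBreak>` compatibility decomposition as a stand-in for the non-breaking property. That stand-in is an assumption on my part: the real line-breaking class comes from the Unicode line breaking data, which the standard library does not expose. The `space_properties` name is illustrative.

```python
import unicodedata


def space_properties(ch):
    """Return (general category, True if decomposition is tagged <noBreak>)."""
    category = unicodedata.category(ch)
    # Approximation: a <noBreak> decomposition marks the non-breaking spaces.
    no_break = unicodedata.decomposition(ch).startswith("<noBreak>")
    return category, no_break


for ch in ("\u0020", "\u00a0", "\u2009", "\u202f"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), space_properties(ch))
```

All four characters share the same category, and only NBSP and NNBSP carry the `<noBreak>` tag, which is exactly the distinction a renderer has to honour.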

For environments that can't support Unicode-encoded output and that perform charset conversions, handling these characters is the job of the system libraries for charset conversion. Terminals should not have to worry about these characters. After all, the same terminals will also have problems rendering Chinese or Russian if they can't support Unicode or extended character sets, and projects that support translations to Chinese, Russian, Arabic, Hebrew, Indic scripts, or Thai should not have to prohibit NNBSP for the sake of correct display.
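A rough sketch of what such a conversion step could do, under my own assumptions: try to keep NNBSP, fall back to NBSP if the target charset has it, and only then to a plain space. The function name and the fallback order are hypothetical, and this is not a description of how libiconv or any other library actually behaves.

```python
# Hypothetical degradation order: NNBSP, then NBSP, then plain space.
FALLBACKS = ("\u202f", "\u00a0", " ")


def encode_with_space_fallback(text: str, encoding: str) -> bytes:
    """Encode text, degrading NNBSP only as far as the charset requires."""
    for candidate in FALLBACKS:
        try:
            candidate.encode(encoding)
        except UnicodeEncodeError:
            continue  # the target charset lacks this space, try the next one
        return text.replace("\u202f", candidate).encode(encoding)
    # No candidate fits: let the codec report the error itself.
    return text.encode(encoding)


print(encode_with_space_fallback("10\u202f%", "utf-8"))
print(encode_with_space_fallback("10\u202f%", "latin-1"))
print(encode_with_space_fallback("10\u202f%", "ascii"))
```

Note that the last fallback loses the non-breaking behaviour, so it is a last resort for charsets that have no non-breaking space at all.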

Free fonts and free rendering libraries are now available, and should be updated to support these characters, even when a font does not map NNBSP but only THINSP. This includes the recent fonts shipped with Windows 7 such as "Segoe UI", which only maps THINSP, whereas "Times New Roman" has mapped NNBSP for a long time. A font mapping is no longer needed on Windows 7, because its built-in renderer can automatically map NNBSP to the same glyph as THINSP, and can otherwise emulate all whitespace characters present in Unicode 5.0 at least.
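The kind of renderer-side substitution described here can be sketched as follows. Everything in it is hypothetical: the fallback table, the `resolve_glyph` name, and the toy character map standing in for a font that maps THINSP but not NNBSP. It is not the actual Windows or Qt logic.

```python
from typing import Dict, Optional

# Hypothetical substitution order when a font has no glyph for NNBSP.
GLYPH_FALLBACKS = {0x202F: (0x2009, 0x00A0, 0x0020)}


def resolve_glyph(codepoint: int, cmap: Dict[int, str]) -> Optional[str]:
    """Return the glyph name for codepoint, trying fallbacks if unmapped."""
    if codepoint in cmap:
        return cmap[codepoint]
    for substitute in GLYPH_FALLBACKS.get(codepoint, ()):
        if substitute in cmap:
            return cmap[substitute]
    return None  # the renderer would draw its missing-glyph box here


# Toy character map for a font that maps THINSP but not NNBSP.
toy_cmap = {0x0020: "space", 0x2009: "thinspace"}
print(resolve_glyph(0x202F, toy_cmap))
```

With a table like this the font itself never needs updating; only the renderer has to know which glyph is an acceptable substitute.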

Qt is possibly late in its support, but I think this is because it does not really implement the text renderer itself: it depends on an old version of a library, which in turn depends on the target system for which it was built. Qt on Windows works and renders NNBSP correctly, even with the "Segoe UI" font that does not map the character. Such a mapping can also be performed on Linux/*nix by X11 font servers, even if the text rendering library does not implement it. Updating XFree86 (or similar) will resolve the issue. It is also very simple to update the terminal emulator (XTerm, Telnet), because these are basic user applications that have very little system dependency and that work entirely in user space (not at the kernel level).

The only remaining environments will be some interfaces for embedded systems with limited rendering capabilities. The need for NNBSP is sufficiently demonstrated that they should add support for it in their firmware updates. But nothing will happen if there's no incentive for their authors to implement it. The best we can do is to document it, and thanks to the patch, they will know that support for it is wanted and expected: if they have been able to adapt their applications to support the zillions of Han ideographs (and multiple presentation forms for Traditional or Simplified Chinese, or Korean, or Japanese), they should be able to manage a simple whitespace character that is needed for frequently used languages like French, Spanish, or Russian.

Someone has to start the work. The others will follow quickly. We can't make progress on such issues if nobody wants to take the first step (and in fact the first steps have largely been taken already by all 5 major browsers, and in Windows).

It's not reasonable to block this in Launchpad, when in fact this will just depend on each translation project. Most of them should accept the character without any problem now, or will be able to create their own patch if needed without much difficulty: by upgrading the libraries they depend on, by implementing exceptions to support all the whitespace characters of Unicode 5, by using custom fonts that are compatible with the character and are already available with the mapping, or by changing their built-in charset converters. libiconv, for example, has the necessary support if you have the correct version installed on your system, and updating it is in fact necessary for supporting recent versions of web browsers and X11 desktop environments for more languages. More than 60% of the content on the web or in emails is now encoded with Unicode, and legacy 8-bit encodings are disappearing at an accelerating pace, given that most development done today uses standard web rendering technologies that depend heavily on Unicode support, with UTF-8 support now mandatory in all the new IETF protocols.