Comment 2 for bug 2002290

Egmont Koblinger (egmont-gmail) wrote :

(I am the one who designed [1] and implemented RTL (right-to-left) and BiDi (bidirectional) text support in VTE.)

The two issues you report here are totally independent.

Re bug 1:

Terminal emulators, by their very nature and their legacy of maybe ~50 years, _have to_ operate in a strict rectangular grid of character cells. If you try to break out of this grid, you break pretty much everything.

Sticking to such a grid has quite a few advantages and quite a few disadvantages. The visual disadvantages are more prominent with scripts that connect the letters to each other, such as Arabic.

I'm sure that there's room for improvement in rendering, but it probably doesn't belong in VTE. Or maybe it belongs in VTE only to the extent of switching to a different font rendering engine (e.g. from freetype to harfbuzz; there's an upstream bug about it).

However, by the very nature of the grid layout, no rendering engine could perform magic and end up with a beautiful rendering if it starts with a font that doesn't have the letters of the desired width.

Long story short: You'll need to find a high quality monospace Arabic font. Or, in fact, one where the English and the Arabic letters all have the same width (or somehow merge two such fonts, an English and an Arabic one, via fontconfig; I'm not at all familiar with how to do that).

For testing, I happened to use a font where the Arabic text didn't look anywhere near as bad as in your screenshot. I probably used "Monospace 9", but I don't know whether this font itself contains Arabic glyphs or whether they were substituted from another font, and if so, which one. Comparing the layout to that of, say, web browsers (which don't have the fixed-width constraint), and bearing in mind that I cannot read Arabic, I am confident in saying that the rendering was way better than the one in your screenshot. At the very least, letters were connected or not connected exactly as in the browser, and the overall look was also reasonably close.

So keep looking for the font that works for you.

Re bug 2:

This one is not about joining or not joining adjacent letters; this one is about figuring out how to shuffle the order of the character cells to make sure that words and sentences aren't "sdrawkcab" (backwards).

Pretty much every terminal behaves differently when it comes to RTL or BiDi text. That is, unfortunately it is literally impossible for an app to emit RTL or BiDi text and expect it to appear correctly in every terminal. Also, some applications have different requirements from the terminal than others.

Overly simplified story: Some apps need to emit logical order and expect the terminal emulator to rearrange the cells. If the terminal doesn't rearrange, the output will be "nekorb" (broken). Some other apps need to reorder the cells themselves, and if the terminal also reorders them then the output will be, again, "nekorb". No matter which approach a terminal emulator picks (i.e. to rearrange or not to rearrange according to the BiDi algorithm), one set of applications has no chance of implementing RTL. Or, rather, no application could ever implement proper RTL, because it could not tell which kind of terminal it is running on. (Fun: I've also come across an app where half of the strings were in logical order and half were in visual order, making it absolutely certain that no terminal would ever display the entire app correctly.)
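The four combinations described above can be sketched as a toy model in Python. This is only an illustration, not how any terminal implements BiDi: the real Unicode BiDi Algorithm is replaced by a plain string reversal (i.e. the whole line is assumed to be a single RTL run), and the names `reorder_rtl`, `on_screen` and `WANTED` are made up for this example.

```python
def reorder_rtl(cells: str) -> str:
    """Toy stand-in for BiDi reordering: treat the line as one RTL run and reverse it."""
    return cells[::-1]


def on_screen(emitted: str, terminal_reorders: bool) -> str:
    """What ends up in the grid, left to right, for a given kind of terminal."""
    return reorder_rtl(emitted) if terminal_reorders else emitted


# What the reader should see on screen (same convention as "nekorb" vs. "broken").
WANTED = "broken"

logical = reorder_rtl(WANTED)  # what an app relying on the terminal would emit
visual = WANTED                # what an app that reorders by itself would emit

results = {
    (app, term): on_screen(text, reorders)
    for app, text in (("logical", logical), ("visual", visual))
    for term, reorders in (("reordering", True), ("non-reordering", False))
}

for (app, term), shown in results.items():
    verdict = "ok" if shown == WANTED else "wrong"
    print(f"{app:8} app on {term:15} terminal -> {shown!r} ({verdict})")
```

Each kind of app comes out right on exactly one kind of terminal, and neither can tell from the inside which kind it is talking to.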

The only proper attempt to fix RTL for terminals dates back ~30 years, preceding the Unicode BiDi Algorithm, and contained many flaws (not to mention that it's a document with no known implementation). All the other attempts were much more fundamentally broken (like forcing the terminal to rearrange the cells, making it literally impossible to implement e.g. BiDi-aware terminal-based text editors). I elaborate on this in my BiDi proposal.

I gathered information, studied the subject thoroughly, evaluated earlier "solutions", and then came up with a proposal that would address all the issues. This is a ~50 page document, referring to the also quite enormous Unicode BiDi Algorithm as one of its building blocks. Then I implemented it in VTE.

It's important to understand that no RTL behavior for terminals could ever magically fix the output of all the existing utilities (such as "apt" as seen in your screenshot). It would be literally impossible to do that.

My RTL/BiDi proposal makes conforming terminals a _platform_ capable of properly handling the RTL/BiDi needs of every kind of application. But as a next step, applications on top of them have to use the provided features wisely.

So VTE is the first, and perhaps still only (I'm not sure about that) terminal whose RTL / BiDi capabilities make it possible for all of "apt", "vim", "tmux" etc. (insert thousands of other apps here) to implement RTL / BiDi, despite their vastly different needs.

Now it's the applications' turn to use the available RTL / BiDi features as they require.
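As a rough illustration of what "using the features" could look like from the application side, here is a minimal Python sketch. It assumes that the relevant controls are the ECMA-48 BDSM mode (mode 8, set for implicit and reset for explicit handling) and the SCP sequence for paragraph direction, and that implicit mode is the default to restore; these exact byte sequences are an assumption and should be verified against [1] before use. The helper names are made up for this example.

```python
import sys

CSI = "\x1b["

# Assumed sequences; verify against [1] before relying on them.
BDSM_IMPLICIT = CSI + "8h"  # terminal runs the BiDi algorithm on what it receives
BDSM_EXPLICIT = CSI + "8l"  # terminal leaves cell order alone; the app did the layout
SCP_LTR = CSI + "1 k"       # paragraph direction: left-to-right
SCP_RTL = CSI + "2 k"       # paragraph direction: right-to-left
SCP_DEFAULT = CSI + "0 k"   # back to the terminal's default direction


def emit_logical(text: str, rtl_paragraph: bool = False) -> str:
    """For line-oriented tools: send logical order and let the terminal reorder."""
    direction = SCP_RTL if rtl_paragraph else SCP_LTR
    return BDSM_IMPLICIT + direction + text + SCP_DEFAULT


def emit_visual(text: str) -> str:
    """For full-screen apps that lay out cells themselves: send visual order as-is."""
    # Restoring implicit mode afterwards assumes that implicit is the default.
    return BDSM_EXPLICIT + text + BDSM_IMPLICIT


if __name__ == "__main__":
    # Hebrew "shalom" in logical order, written with escapes to keep the source ASCII.
    sys.stdout.write(emit_logical("\u05e9\u05dc\u05d5\u05dd", rtl_paragraph=True) + "\n")
```

A simple utility would stay on the `emit_logical` path, while something like a text editor that needs full control over every cell would take the `emit_visual` path and do the BiDi work itself.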

(By the way, if you copy the given "apt" output into a browser or a graphical BiDi-aware terminal, I'm quite certain that they would result in the same order of letters as in VTE. I presume that the broken final rendering is absolutely unrelated to the terminal. It's just what the Unicode BiDi Algorithm says to do with the given text.)

If the order of the letters in a word, or the order of words in a sentence, is incorrect, you need to ask the developers of the app that emits the string to fix their software.
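When reporting such a problem to an app's developers, it can help to show exactly which characters the app emitted and how the Unicode BiDi Algorithm classifies them. A small sketch using Python's standard `unicodedata` module follows; the helper name `bidi_classes` and the sample string are made up for this example, and the sample is not actual "apt" output.

```python
import unicodedata


def bidi_classes(text: str) -> list[tuple[str, str]]:
    """Return (code point, BiDi class) pairs in the order the characters were emitted."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]


# Latin letter, Hebrew letter, Arabic letter, digit, space (escapes keep the source ASCII).
sample = "a\u05d0\u06451 "

for code_point, bidi_class in bidi_classes(sample):
    print(code_point, bidi_class)
```

Classes such as L (left-to-right), R and AL (right-to-left), EN (European number) and WS (whitespace) are the input the BiDi algorithm works from, so the emitted order plus these classes is what determines the on-screen order in any conforming renderer.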

> but it can be seen in any Ubuntu version and in any terminal version as well (it has been there since forever)

Surely not. My BiDi work, which modified the default rendering of RTL text, landed in VTE in mid-2019, i.e. 3.5 years ago.

[1] https://terminal-wg.pages.freedesktop.org/bidi/