Text base direction is incorrect for RTL text

Bug #1653234 reported by Dov Grobgeld
This bug affects 3 people
Affects    Status       Importance  Assigned to  Milestone
Inkscape   In Progress  Medium      Unassigned

Bug Description

For right-to-left languages the display is determined by the Unicode bidirectional algorithm. Part of this algorithm is the base direction, which influences how the text is reordered for display. There are two common ways of determining the base direction. One is to provide an interactive control that lets the user choose it; the other, which GTK uses, is to derive it from the content by searching for the first "strong" character (in the sense of the Unicode bidi algorithm) and using that character's direction as the base direction.
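
As a rough illustration of the first-strong heuristic described above, here is a minimal C++ sketch. It is not GTK's or Pango's implementation: `BaseDir`, `strongDirection`, and `findBaseDir` are hypothetical names, and the classification is an assumption that covers only ASCII letters, Hebrew letters (U+05D0 to U+05EA), and Arabic letters (U+0620 to U+064A) rather than the full Unicode bidi class table.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration; not taken from GTK, Pango, or Inkscape.
enum class BaseDir { LTR, RTL, Neutral };

// Deliberately tiny approximation of the "strong" bidi classes (assumption):
// only ASCII letters count as strong LTR, and only Hebrew and Arabic letters
// count as strong RTL. Everything else is treated as neutral or weak.
static BaseDir strongDirection(char32_t cp) {
    if ((cp >= U'A' && cp <= U'Z') || (cp >= U'a' && cp <= U'z')) return BaseDir::LTR;
    if (cp >= 0x05D0 && cp <= 0x05EA) return BaseDir::RTL;  // Hebrew letters
    if (cp >= 0x0620 && cp <= 0x064A) return BaseDir::RTL;  // Arabic letters
    return BaseDir::Neutral;
}

// Decode UTF-8 one code point at a time and return the direction of the
// first strong character, or Neutral if the text contains none.
BaseDir findBaseDir(const std::string &utf8) {
    std::size_t i = 0;
    while (i < utf8.size()) {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { ++i; continue; }                // stray byte: skip it
        if (i + extra >= utf8.size()) break;   // truncated sequence: stop
        for (std::size_t k = 1; k <= extra; ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        }
        i += extra + 1;
        BaseDir dir = strongDirection(cp);
        if (dir != BaseDir::Neutral) return dir;  // first strong character wins
    }
    return BaseDir::Neutral;
}
```

Under these assumptions, a paragraph that starts with Hebrew letters resolves to RTL even when English follows, and leading digits or punctuation are skipped because they are not strong characters. A real implementation would rely on complete Unicode data instead of these hand-picked ranges.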

When running the latest Inkscape today, inkscape-0.92-7.pre3 under Fedora 25, the base direction is determined incorrectly, and text starting with Hebrew is wrongly shown as an LTR paragraph. I believe (though my memory may fail me) that this is a regression and that it used to work in earlier versions.

I'm including a screenshot showing the same text in the gtk text viewer widget (correctly) and the incorrect rendering in the inkscape canvas.

Dov Grobgeld (dov-grobgeld) wrote :
su_v (suv-lp)
tags: added: text

Dov Grobgeld (dov-grobgeld) wrote :

Added an svg file illustrating the problem.

In the rendered text the Hebrew should be on the right-hand side and the English on the left-hand side. I.e. in the browser, if capitals are taken to be Hebrew characters:

abc CBA

jazzynico (jazzynico) wrote :

Thanks for taking the time to write a report!

Bug reproduced on Windows XP (32-bit), Inkscape 0.91 and lp:inkscape/0.92.x rev. 15301.
Not reproduced with 0.48.5, confirming a regression.

Changed in inkscape:
importance: Undecided → Medium
status: New → Triaged
tags: added: regression

jazzynico (jazzynico) wrote :

Related: Bug #1169348 "Hebrew text-anchor reversed?"
https://bugs.launchpad.net/inkscape/+bug/1169348

jazzynico (jazzynico)
tags: added: rtl

Dov Grobgeld (dov-grobgeld) wrote :

I investigated this bug a bit and it seems that the `text_source->style->direction.computed` is not calculated correctly. It is LTR even if the paragraph starts with a strong RTL character. The following patch overrides this value with a result resolved by pango and seems to be a functioning workaround for the bug:

diff --git a/src/libnrtype/Layout-TNG-Compute.cpp b/src/libnrtype/Layout-TNG-Compute.cpp
index 6b1aba5..104f228 100644
--- a/src/libnrtype/Layout-TNG-Compute.cpp
+++ b/src/libnrtype/Layout-TNG-Compute.cpp
@@ -1099,8 +1099,9 @@ void Layout::Calculator::_buildPangoItemizationForPara(ParagraphInfo *para) con
     if (_flow._input_stream[para->first_input_index]->Type() == TEXT_SOURCE) {
         Layout::InputStreamTextSource const *text_source = static_cast<Layout::InputStreamTextSource *>(_flow._input_stream[para->first_input_index]);

-        para->direction = (text_source->style->direction.computed == SP_CSS_DIRECTION_LTR) ? LEFT_TO_RIGHT : RIGHT_TO_LEFT;
-        PangoDirection pango_direction = (text_source->style->direction.computed == SP_CSS_DIRECTION_LTR) ? PANGO_DIRECTION_LTR : PANGO_DIRECTION_RTL;
+        PangoDirection pango_direction = pango_find_base_dir(para_text.data(),-1);
+        para->direction = (pango_direction == PANGO_DIRECTION_LTR) ? LEFT_TO_RIGHT : RIGHT_TO_LEFT;
+
         pango_items_glist = pango_itemize_with_base_dir(_pango_context, pango_direction, para_text.data(), 0, para_text.bytes(), attributes_list, NULL);
     }

Dov Grobgeld (dov-grobgeld) wrote :

Will it speed up the handling of this bug if I create a proper pull request? But note that I don't know (I didn't probe) why `text_source->style->direction.computed` is wrong. I simply ignored it.

jazzynico (jazzynico) wrote :

@Dov - What would speed up things is having someone with good knowledge of that part of the code review your patch and test for regressions. I'm targeting the report to 0.93 for now so that we have time to test with the development code.
Given that it worked with 0.48.5, there's also a possibility the bug was introduced when trying to fix another one, so we must be very careful with what we change now.

Thanks!

Changed in inkscape:
milestone: none → 0.93
status: Triaged → In Progress

jazzynico (jazzynico) wrote :

Related: Bug #1658510 "Inkscape 0.92 RTL text problem when writing mixed Arabic/Latin or Arabic/Numbers texts"
https://bugs.launchpad.net/inkscape/+bug/1658510

Note to self: test both reports on Xubuntu to confirm whether 0.91 is affected.

Tavmjong Bah (tavmjong-free) wrote :

Inkscape must follow CSS, which uses the "direction" property to set the base text direction. If you add 'direction:rtl' to the style attribute of the text element, the text is rendered as you expect.

Unfortunately, there is no GUI at the moment to set the "direction" property.

Note, both Firefox and Chrome render the test svg file in the same way as Inkscape.

Tavmjong Bah (tavmjong-free) wrote :

Added support for setting 'direction' in the text toolbar (far right side). Trunk commit r15466.

Dov Grobgeld (dov-grobgeld) wrote :

Though manual setting of the direction is a substantial improvement, it is in my opinion not sufficiently user-friendly. This seems to be a case of following the letter of the law rather than what is reasonable behavior as far as the user is concerned. Further, it breaks files that used to work in earlier Inkscape versions. Note that HTML5 already has a dir="auto" attribute value, see https://www.w3.org/International/tests/repository/html5/the-dir-attribute/results-dir-auto, and there is already a discussion on whether to import this into CSS as well. Today all chat systems, web widgets, desktop native entry widgets on Windows and Linux/X11, and mobile entry widgets automatically choose the direction based on the contents. There is no reason why Inkscape cannot do that.

Isn't it possible to set the direction to "auto" by default within Inkscape, save this choice in an Inkscape-specific attribute, and then export the resolved direction in the CSS direction property?

Tavmjong Bah (tavmjong-free) wrote :

Looking at the HTML 5 documentation:

"The heuristic used by this state is very crude (it just looks at the first character with a strong directionality, in a manner analogous to the Paragraph Level determination in the bidirectional algorithm). Authors are urged to only use this value as a last resort when the direction of the text is truly unknown and no better server-side heuristic can be applied. [BIDI]"

So they are not particularly recommending the use of "auto".

If you wish Inkscape to support "auto" then it would probably be best to get the CSS group to add the value to the 'direction' property. (I thought about doing this but then read the comment in the HTML spec.) You can add a CSS issue at:

https://github.com/w3c/csswg-drafts/issues/

The spec in question is [css-writing-modes-3]:

https://drafts.csswg.org/css-writing-modes-3/#propdef-direction

Tav
